Harness Engineering

The layer of prompts, context, tools, and governance that makes multi-step AI work reliable.

Harness engineering names the shift from treating AI as a single prompt-response event to treating it as a governed system of context, tools, reusable instructions, permissions, interfaces, and execution loops. The essay Prompt, Context, Harness tells that history as three nested layers, each emerging because the previous one hit a ceiling.

The first stage was prompt engineering, which “isn’t about commanding the model — it’s about how to communicate effectively to total strangers.” Prompts work because a large model is, at heart, a probability generator guided by your input; persona, examples, and constraints all change the surface of generation. But prompts can only activate what is already inside the model. They are good at eliciting behaviour and bad at supplying knowledge — the source of the familiar twin failures, hallucination and generic answers. The essay sums up the limit cleanly: “Prompts solve communication problems. Not information problems.”

The second stage, context engineering, treats the model’s working memory as a supply chain: “what the model doesn’t know, the system must feed it — at the right time.” That involves more than stuffing background into the prompt. The fixed context window forces choices about compression, summarization, multi-session memory, RAG (which the essay treats as both a real solution and a demonstration of how hard the problem actually is), and the deliberate withholding of capabilities. Skills, in this telling, exist because dumping a dozen tool definitions into context degrades performance: “If you dump a dozen tool descriptions and parameter definitions into the model upfront, it theoretically knows more; in practice it often performs worse. Skills solves this through progressive loading… Capabilities on demand, not capabilities in bulk.”
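Progressive loading can be sketched in a few lines. This is a minimal illustration, not how any real Skills runtime works: the `Skill` type, the `SKILLS` list, and `build_context` are hypothetical names, and the assumption is that each skill carries a one-line summary that is always visible plus a longer body that enters context only when selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    summary: str  # one line, always placed in context
    body: str     # full instructions, loaded only on demand


# Hypothetical skill library, for illustration only.
SKILLS = [
    Skill("polish", "Tighten prose.", "Polish: keep the meaning, cut filler, fix grammar."),
    Skill("cite", "Format references.", "Cite: normalize every reference to one style."),
]


def build_context(task: str, skills: list[Skill], selected: set[str]) -> str:
    """Always list summaries; append full bodies only for the selected skills."""
    lines = [f"Task: {task}", "Available skills:"]
    lines += [f"- {s.name}: {s.summary}" for s in skills]
    for s in skills:
        if s.name in selected:
            lines += ["", f"[{s.name}]", s.body]
    return "\n".join(lines)


if __name__ == "__main__":
    # Only the 'polish' body is loaded; 'cite' stays a one-line summary.
    print(build_context("Edit this paragraph", SKILLS, {"polish"}))
```

The design choice is the one the essay describes: the model always sees that a capability exists, but pays the context cost of its full instructions only when the task calls for it.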

The third stage is the one the essay names. “By 2026, agentic AI had moved from marketing pitch to working reality… The challenge wasn’t any single agent failing at its task. It was how a team of agents holds together.” Even a well-prompted model with correct information will drift mid-execution, misread a tool’s return value, or quietly deviate over a long chain. Context engineering, at its ceiling, is still about a single agent’s view; harness engineering is about what supervises an agent, constrains it, and catches it when it strays. The medical analogy in the essay — “a brilliant surgeon with perfect information still can’t also be the anesthesiologist” — is what makes this the right level of description: many real tasks are not harder versions of one-agent work but structurally different.
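The supervising role described above can be sketched as a loop that checks each step's output before letting it become the new state. This is a toy under stated assumptions: steps and checks are plain Python callables standing in for agents and validators, and `run_with_harness` is a hypothetical name, not an API from any agent framework.

```python
from typing import Callable

Step = Callable[[str], str]    # stand-in for an agent step: state in, state out
Check = Callable[[str], bool]  # stand-in for a validator on the step's output


def run_with_harness(
    steps: list[tuple[Step, Check]],
    state: str,
    max_retries: int = 1,
) -> tuple[str, list[str]]:
    """Run steps in order; accept an output only if its check passes.

    A rejected output is retried up to max_retries times. If every
    attempt fails, the run halts and the last accepted state is returned.
    """
    log: list[str] = []
    for i, (step, check) in enumerate(steps):
        for attempt in range(1, max_retries + 2):
            candidate = step(state)
            if check(candidate):
                state = candidate
                log.append(f"step {i}: ok (attempt {attempt})")
                break
            log.append(f"step {i}: rejected (attempt {attempt})")
        else:
            # No attempt passed the check: stop instead of drifting onward.
            log.append(f"step {i}: halted")
            return state, log
    return state, log
```

The point of the sketch is where the authority sits: the step never decides for itself whether its output is acceptable, and a failure stops the chain rather than propagating quietly through it.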

The essay Claude Skills, Commands, Agents toward a Unified Mission shows the harness assembling itself in product form. Skills, slash commands, and agents are all “essentially just prompts under the hood — markdown files containing instructions,” but they differ in who initiates them and how they load. The essay foregrounds intent matching (“If you say ‘polish:’ followed by some text, Claude knows you’re not talking about nail polish”) as the property that turns a personal library of prompts into something more like a reusable component system. It reads Anthropic’s December 2025 move to make Skills an open standard as the moment instructions stop being one-off prompts and start behaving like portable software pieces.
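The difference in who initiates an instruction can be made concrete with a small dispatcher. This is a deliberately crude stand-in: real intent matching is done by the model, whereas this sketch uses a literal `name:` prefix. The `LIBRARY` contents and the `resolve` function are hypothetical.

```python
from typing import Optional

# Hypothetical library: instruction name -> markdown body.
LIBRARY = {
    "polish": "# Polish\nTighten the prose without changing its meaning.",
    "summarize": "# Summarize\nCondense the text to three sentences.",
}


def resolve(message: str) -> Optional[tuple[str, str, str]]:
    """Return (how_invoked, instruction_name, remaining_text), or None.

    '/name text'  -> explicit command, initiated by the user.
    'name: text'  -> intent-style trigger (prefix match as a stand-in
                     for the model's own judgment).
    """
    head, _, rest = message.partition(" ")
    if head.startswith("/") and head[1:] in LIBRARY:
        return ("command", head[1:], rest)
    name, sep, rest = message.partition(":")
    key = name.strip().lower()
    if sep and key in LIBRARY:
        return ("intent", key, rest.strip())
    return None
```

Under these assumptions, `resolve("polish: teh draft")` selects the polish instruction, while `resolve("I need nail polish")` returns `None`, because the word appears without the trigger form. Both paths end at the same markdown body, which is the essay's point that the mechanisms differ in invocation, not in substance.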

Two more essays, How We Build Software in the Age of AI and Claude Code and The Rise of CLI, push the same idea outward into product design. The CLI piece argues that the terminal won not because of fashion but because Unix’s “small programs that do one thing well, connected by pipes” turn out to be exactly the kind of substrate an agent can compose against. The software piece pushes further: software has historically been built for human users, with humans as the integration layer threading disconnected tools together. AI calls both assumptions into question. Once tasks extend across tools, sessions, and permissions, the decisive engineering problem is no longer just the model. It is the harness around the model.
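The composability the CLI piece credits to Unix can be imitated in miniature. The sketch below is an analogy in Python, not shell: `grep`, `head`, and `pipe` are hypothetical helper functions named after their Unix counterparts, each taking and returning a stream of text lines so that any stage can feed any other.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

Stage = Callable[[Iterable[str]], Iterable[str]]


def grep(needle: str) -> Stage:
    """Keep only the lines that contain needle."""
    return lambda lines: (line for line in lines if needle in line)


def head(n: int) -> Stage:
    """Keep only the first n lines."""
    def stage(lines: Iterable[str]) -> Iterator[str]:
        for i, line in enumerate(lines):
            if i >= n:
                break
            yield line
    return stage


def pipe(*stages: Stage) -> Stage:
    """Chain stages left to right, like 'a | b | c' in a shell."""
    return lambda lines: reduce(lambda acc, stage: stage(acc), stages, lines)


if __name__ == "__main__":
    log = ["err disk", "ok", "err net", "err cpu"]
    print(list(pipe(grep("err"), head(2))(log)))
```

Because every stage shares one interface (lines in, lines out), a caller, human or agent, can assemble new behavior without any stage knowing about the others.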

Read Next

  • Prompt, Context, Harness: The Three Phases of AI Engineering

    AI engineering has evolved through three compensatory phases (prompt, context, and harness), each addressing a failure the previous layer couldn't fix. Harness Engineering is the governance layer that keeps teams of agents coherent across complex, long-running tasks.

  • Claude Skills, Commands, Agents toward a unified mission

    This article traces the evolution of Claude's Skills, Commands, and Agents, analyzing the fundamental tension between intent-matching intelligence and explicit-command reliability, and arguing that their merger points toward compositional AI behavior.

  • How We Build Software in the Age of AI

    This essay argues that AI is reshaping software at an architectural level, moving from human-centered applications to a composable agentic ecosystem where CLIs, Skills, and MCP form distinct layers that agents invoke as primary users.

  • Claude Code and The Rise of CLI

    Why did developers abandon polished IDEs for a terminal tool? The answer is less about AI than about Unix: a 50-year-old design philosophy of composable text tools that proves to be the perfect substrate for machine intelligence, and a preview of the AUI paradigm ahead.

Dong Liang
Authors
Learning Technologist / Instructional Designer / Elearning Developer