Are We Drifting? — Part 9: One Operation, Three Interfaces

Jun 9, 2026


The data parts are behind us. The shape is not. Parts 2–8 showed one source → many projections → drift gates governing values, models, and state. Now watch it govern behavior — the operations a service exposes.

The same shape, at the operations layer: one handler, three transports, one contract test.

Are we drifting here?

A modern backend capability has at least three audiences:

the app, which wants a REST endpoint;
agents, which want an MCP tool they can discover and call;
humans, who want a CLI command (which also serves as the fallback when MCP is flaky).

The lazy version writes three surfaces. A REST handler here, an MCP tool registration there, a CLI subcommand somewhere else — each with its own copy of the input shape, its own validation, its own idea of what the operation is called.

Then they drift. A field is added to the REST body but not the MCP tool’s schema. A new operation ships as an MCP tool but was never wired into REST, so the app cannot reach it and it never appears in the operation listing. The CLI validates differently from the server. Three faces of one capability, slowly disagreeing about what the capability is.

One source

The fix is to define the operation once — its input schema, its output schema, and its handler — and treat every transport as a projection of that one definition. Our op-server package does exactly this:

defineOperation({
  name: 'run.record',
  input: RunRecordInput,    // a Zod schema
  output: RunStateSchema,   // a Zod schema
  handler: async (ctx) => { /* the one implementation */ },
})

The operation is the source. The input and output are schemas, not prose — the same “validation is declared, not hand-written” move from the config part, now at the boundary of an operation.
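To make "define it once" concrete, here is a minimal sketch of what such a definition could look like under the hood. It is not the op-server implementation: the `Schema` interface is a dependency-free stand-in for a Zod schema (anything with a `parse` that returns the value or throws), the `{ input }` shape of `ctx` is an assumption, and `invoke` is a hypothetical helper that shows where validation would sit.

```typescript
// Stand-in for a Zod schema: parse() returns the typed value or throws.
interface Schema<T> {
  parse(value: unknown): T;
}

interface Operation<I, O> {
  name: string;
  input: Schema<I>;
  output: Schema<O>;
  handler: (ctx: { input: I }) => Promise<O>; // ctx shape is an assumption
}

// Hypothetical: the real defineOperation may do more than return its argument.
function defineOperation<I, O>(op: Operation<I, O>): Operation<I, O> {
  return op;
}

// Hypothetical single call path that every transport would funnel through.
async function invoke<I, O>(op: Operation<I, O>, raw: unknown): Promise<O> {
  const input = op.input.parse(raw); // reject bad input before the handler runs
  const result = await op.handler({ input });
  return op.output.parse(result); // hold the handler to its declared output
}

type RunState = { runId: string; status: string };

// Illustrative field names; the real RunRecordInput is not shown in this post.
const RunRecordInput: Schema<{ runId: string }> = {
  parse(value) {
    if (typeof value !== "object" || value === null) {
      throw new Error("expected an object");
    }
    const runId = (value as Record<string, unknown>).runId;
    if (typeof runId !== "string") throw new Error("runId must be a string");
    return { runId };
  },
};

const RunStateSchema: Schema<RunState> = {
  parse(value) {
    const v = value as RunState;
    if (typeof v.runId !== "string" || typeof v.status !== "string") {
      throw new Error("invalid run state");
    }
    return v;
  },
};

const runRecord = defineOperation({
  name: "run.record",
  input: RunRecordInput,
  output: RunStateSchema,
  handler: async ({ input }) => ({ runId: input.runId, status: "recorded" }),
});
```

The point of the sketch is the ordering inside `invoke`: the schema runs before the handler and again after it, so no transport needs its own validation.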

Many projections

A service collects its operations into one registry, and the transports are plugins that consume that same array:

createOpServer({ operations })       // the registry
createRestPlugin({ /* … */ })        // → REST endpoints
createMcpServer({ /* … */ })         // → MCP tools
// + a CLI plugin                     // → CLI commands

REST and MCP both read the same operations. There is one handler behind all three faces. This is what “all tools are MCP-driven” means in practice: every backlog and devportal-state operation is automatically an MCP tool because it is in the registry — agents reach the repo’s live state through the very same operations the app calls over REST and a human calls from the terminal. You do not build an MCP surface; you declare operations, and the MCP surface is a projection of them.

A consequence worth stating: a tool registered only on one transport can never be reached on the others, and won’t appear in the registry’s own operations.list introspection. The single registry is what keeps the three faces from being three different APIs.
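A rough sketch of "transports as projections" follows. The route pattern, the tool-name mangling, and the CLI spelling are invented for illustration, as is the `backlog.list` operation name; the real plugins choose their own conventions. What the sketch shows is only that each face is computed from the same array.

```typescript
interface OperationMeta {
  name: string;
}

interface Projection {
  operation: string;
  rest: string;
  mcpTool: string;
  cli: string;
}

// The one registry. "backlog.list" is a hypothetical operation name.
const operations: OperationMeta[] = [
  { name: "run.record" },
  { name: "backlog.list" },
];

// Each face is a pure function of the registry entry (conventions are assumptions).
function projectAll(ops: OperationMeta[]): Projection[] {
  return ops.map((op) => ({
    operation: op.name,
    rest: `POST /ops/${op.name}`,
    mcpTool: op.name.replace(/\./g, "_"),
    cli: op.name.split(".").join(" "),
  }));
}

const surface = projectAll(operations);
```

Because nothing in `projectAll` is written per operation, adding an entry to `operations` is the only step needed to give it every face the projection covers.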

The drift gate: a contract test against the real SDK

Here is the gate, and it is a good one because it was paid for in production.

op-server ships a test contract — defineMcpServerContract — that boots the real MCP server against the live SDK and asserts that the registered tool surface matches the operations registry exactly. Every operation’s input is run through the same extractShape the server uses, so the SDK receives a proper schema. A tool that was hand-registered outside the registry shows up as an “extra” and fails the test. An operation whose schema the SDK would reject fails here, at unit-test time, instead of at boot.

Why this specific gate exists: hand-calling the SDK’s registerTool with a JSON-Schema-shaped object — instead of going through the registry — passes lint and type-check but crashes the SDK at boot, taking the entire MCP server down. That regression happened in production once (the devportal-state projection tool). So the rule is now structural and enforced:

Do not call registerTool directly. Add an entry to the operations registry and let the server register it. Living outside the registry means living outside the safety net.

The registry is the single source; the contract test is the gate that refuses any tool that tried to exist outside it. Same shape, again.
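The comparison at the heart of such a gate can be sketched in a few lines. This is not defineMcpServerContract, which boots the real server against the SDK; it only illustrates the "surface must equal registry" assertion, and every name in it (`diffToolSurface`, `assertSurfaceMatches`, the sample tool names) is hypothetical.

```typescript
interface SurfaceDiff {
  missing: string[]; // in the registry, but not registered as a tool
  extra: string[]; // registered as a tool, but not in the registry
}

function diffToolSurface(
  registryNames: string[],
  registeredTools: string[],
): SurfaceDiff {
  const registry = new Set(registryNames);
  const registered = new Set(registeredTools);
  return {
    missing: registryNames.filter((name) => !registered.has(name)),
    extra: registeredTools.filter((name) => !registry.has(name)),
  };
}

// Fails loudly on any mismatch in either direction.
function assertSurfaceMatches(
  registryNames: string[],
  registeredTools: string[],
): void {
  const { missing, extra } = diffToolSurface(registryNames, registeredTools);
  if (missing.length > 0 || extra.length > 0) {
    throw new Error(
      `tool surface drifted: missing=[${missing.join(", ")}] extra=[${extra.join(", ")}]`,
    );
  }
}

// A hand-registered tool shows up as an "extra".
const drift = diffToolSurface(
  ["run.record", "backlog.list"],
  ["run.record", "state.projection"],
);
```

Checking both directions is what makes the gate structural: an operation that never reached the server and a tool that bypassed the registry fail the same test.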

The three-interface picture has a fourth failure mode that only shows up in long Claude Code sessions: MCP is not bulletproof.

Tools disappear from the parent agent’s registry while subagents keep working.

A tool call hangs.

When that happens, the question is not whether the registry kept the three faces in sync — it did — but whether a CLI fallback exists so an agent can recover without restarting. The answer is not symmetric:

| Operation surface | MCP transport | CLI fallback |
| --- | --- | --- |
| Backlog ops (`backlog-mcp`) | `mcp__backlog__*` tools | `pnpm liftmere backlog ...` (every tool has an exact equivalent: same handlers, same events) |
| Devportal-state ops (`devportal-state-mcp`) | `mcp__devportal_state__*` tools | None (recovery is `/mcp` reconnect or delegation to a subagent) |

The single registry guarantees the three transport faces agree on input shape. It does not guarantee all three transports exist for every operation.

A runbook that says “if MCP hangs, use the CLI” is correct for backlog operations and wrong for devportal-state operations.

Where this goes next is parity: a CLI projection for devportal-state operations, so that every operation has all three faces and the runbook holds everywhere.
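One way to keep a runbook honest about this asymmetry is to encode the recovery path as data keyed by MCP server, rather than as a blanket sentence. The sketch below is an assumption throughout: the `recoveryHint` helper, the lookup table, and the sample tool names `list` and `get` are illustrative, and only the `mcp__backlog__*` / `mcp__devportal_state__*` prefixes and the recovery paths come from the table above.

```typescript
type Recovery =
  | { kind: "cli"; command: string }
  | { kind: "no-cli"; steps: string[] };

// Keyed by the server segment of the MCP tool name (mcp__<server>__<tool>).
const recoveryByServer: Record<string, Recovery> = {
  backlog: { kind: "cli", command: "pnpm liftmere backlog ..." },
  devportal_state: {
    kind: "no-cli",
    steps: ["/mcp reconnect", "delegate to a subagent"],
  },
};

function recoveryHint(toolName: string): string {
  // Lazy match so "devportal_state" is captured whole, up to the next "__".
  const match = /^mcp__(.+?)__.+$/.exec(toolName);
  const server = match?.[1];
  const recovery: Recovery | undefined =
    server !== undefined ? recoveryByServer[server] : undefined;
  if (!recovery) return "unknown tool: no documented recovery path";
  return recovery.kind === "cli"
    ? `retry via CLI: ${recovery.command}`
    : `no CLI fallback: ${recovery.steps.join(" or ")}`;
}
```

With this shape, reaching parity later is a one-line change to the table rather than a rewrite of the runbook.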

The velocity payoff

The coherence check this removes is “did I update all three transports, and do they still agree?”

With operations as the source, you add a capability once. The moment it is in the registry it is a REST endpoint the app can call and an MCP tool an agent can discover, and, wherever a CLI projection exists, a CLI command a human can run, all with one validated input shape and one handler. A capability cannot quietly ship on the MCP surface alone, because the contract test boots the real surface and compares it to the registry.

That is a large multiplier for agentic work specifically: an agent gains a new ability the instant an operation is defined, and the ability it gains is provably the same one the app uses. No separate “expose this to the agent” step, no separate schema for the agent to drift against.

What’s next

If one operation can project onto three transports, one agent definition can project onto every AI tool that runs it. Part 10, The Agent OS, climbs to the workflow layer, where a single manifest becomes a Claude Code agent, a Cursor rule, and a provider-agnostic execution tier, all kept in sync by the same kind of gate.

And because operation inputs are already Zod schemas, there is one more face waiting. Where this goes next is the fourth projection: a frontend form generated from the same operation schema.