Are We Drifting? — Part 3: buildmere, a Codegen Kernel

Part 2: The Tagged Vocabulary named the atom: a tagged vocabulary, declared once. This part is about the machine that turns that one declaration into typed code on every side — and, more importantly, keeps the copies from drifting after it does.

A manifest only earns the name “source of truth” if the copies generated from it cannot quietly disagree with it.

A generator that you run by hand, whose output you commit and then edit “just this once,” is not a source of truth. It is a suggestion with extra steps. The generated file drifts from the manifest the moment someone touches it, and nothing notices.

So the real question for any codegen system is not “can it generate?” It is “what stops the generated code from drifting away from its source?”

buildmere is a standalone Go module — github.com/buildmere/vocab — that does exactly one thing: declare a closed tagged vocabulary once in a YAML manifest, and generate typed code, validation, schemas, and reference files across the stack from that single source.

Two rules keep it honest and portable, and they are worth stating verbatim because everything else follows from them:

1. buildmere imports nothing from its consumers. It is imported by products like Liftmere; never the reverse.

2. Generated code imports nothing project-specific — only context / fmt from the standard library. Binding generated code to a project’s runtime is done by ~60 lines of adapter the project owns, not by the generator.

The first rule means buildmere can be lifted into its own repository with a git mv. The second means every generated artifact is adoptable by any project, not just ours — the generated code is inert until a small, hand-owned adapter wires it to the real metrics backend or error transport.
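
A minimal sketch of the second rule, under stated assumptions: the top half stands in for what a zero-import generated metric set could look like, and the bottom half is the kind of small adapter a project would own. The names `Counter`, `Metrics`, `NewMetrics`, `memFactory`, and the metric name `workouts_created_total` are hypothetical; only the idea of a `Factory` interface comes from the text.

```go
package main

import "fmt"

// ---- stand-in for generated code: no project imports (hypothetical shape) ----

// Counter is the smallest instrument surface the generated set needs.
type Counter interface{ Add(n int64) }

// Factory is implemented by the consuming project, never by the generator.
type Factory interface {
	Counter(name string, labels ...string) Counter
}

// Metrics knows metric names and labels, but nothing about a backend.
type Metrics struct{ WorkoutsCreated Counter }

// NewMetrics builds the metric set from whatever Factory the project supplies.
func NewMetrics(f Factory) *Metrics {
	return &Metrics{WorkoutsCreated: f.Counter("workouts_created_total", "source")}
}

// ---- hand-owned adapter: binds the inert code to a real backend ----

// memCounter is an in-memory backend used here in place of a real one.
type memCounter struct{ total int64 }

func (c *memCounter) Add(n int64) { c.total += n }

// memFactory records every counter it hands out, keyed by metric name.
type memFactory struct{ counters map[string]*memCounter }

func (f *memFactory) Counter(name string, labels ...string) Counter {
	c := &memCounter{}
	f.counters[name] = c
	return c
}

// Compile-time assertion: the build breaks if the adapter stops
// satisfying the generated Factory interface.
var _ Factory = (*memFactory)(nil)

func main() {
	f := &memFactory{counters: map[string]*memCounter{}}
	m := NewMetrics(f)
	m.WorkoutsCreated.Add(2)
	fmt.Println(f.counters["workouts_created_total"].total)
}
```

Swapping `memFactory` for an adapter over a real metrics library would change only the bottom half; the generated half stays untouched.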

The architecture is a kernel plus plugins. The kernel ships zero kinds.

The kernel is small and generic:

Manifest, Entry           the parsed source: a list of vocabulary entries
Generator                 interface: given a Manifest, return Artifacts
Artifact                  an in-memory file: path + contents (not written yet)
Registry                  maps a kind ("enum", "error", …) to its generators
KindDescriptor            how a kind declares its metadata schema
Generate / Write / Check  the pipeline
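
One way these pieces could fit together, as a rough sketch rather than buildmere's actual API. The type names follow the list above; every field, the method signatures, and the toy `listGen` generator are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is one vocabulary item (hypothetical fields).
type Entry struct {
	Name string
}

// Manifest is the parsed source: a kind, an output path, and its entries.
type Manifest struct {
	Kind    string
	Output  string
	Entries []Entry
}

// Artifact is an in-memory file: a path plus contents, not yet written.
type Artifact struct {
	Path     string
	Contents []byte
}

// Generator turns a Manifest into Artifacts without touching disk.
type Generator interface {
	Generate(m Manifest) ([]Artifact, error)
}

// Registry maps a kind string to the generators registered for it.
type Registry map[string][]Generator

// Generate resolves the manifest's kind at runtime and runs each generator.
func (r Registry) Generate(m Manifest) ([]Artifact, error) {
	gens, ok := r[m.Kind]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", m.Kind)
	}
	var out []Artifact
	for _, g := range gens {
		arts, err := g.Generate(m)
		if err != nil {
			return nil, err
		}
		out = append(out, arts...)
	}
	// Sort by path so the result does not depend on registration order.
	sort.Slice(out, func(i, j int) bool { return out[i].Path < out[j].Path })
	return out, nil
}

// listGen is a toy generator that emits one entry name per line.
type listGen struct{}

func (listGen) Generate(m Manifest) ([]Artifact, error) {
	var b []byte
	for _, e := range m.Entries {
		b = append(b, e.Name+"\n"...)
	}
	return []Artifact{{Path: m.Output + "/names.txt", Contents: b}}, nil
}

func main() {
	reg := Registry{"enum": {listGen{}}}
	m := Manifest{Kind: "enum", Output: "gen", Entries: []Entry{{Name: "push"}, {Name: "pull"}}}
	arts, err := reg.Generate(m)
	if err != nil {
		panic(err)
	}
	for _, a := range arts {
		fmt.Printf("%s (%d bytes)\n", a.Path, len(a.Contents))
	}
}
```

Note that the kernel-side code never names a kind; the only place `"enum"` appears is where the registry is populated.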

A kind is a plugin: a root package that knows how to parse its kind-specific metadata, plus one sub-package per target language. The kernel never hard-codes the list of kinds; the registry resolves a kind string to its generators at runtime. Adding a new output language to a kind is a new package under generators/ — it touches nothing that already works.

The kinds that ship today:

| Kind | The manifest declares | Generators |
| --- | --- | --- |
| enum | a closed value set | Go constants, TypeScript, Zod, JSON-Schema, SQL CHECK |
| config | an env-var schema + validation | Go Config struct + Load/Validate, .env example |
| error | a domain error vocabulary + fields | zero-import Go error structs (Code/Wire/Fields/Is/Wrap) |
| metric | an OTel instrument set + labels | zero-import Go metric set + a Factory interface |

The detail that makes the drift gate trustworthy is that generators are pure.

A generator does not write to disk. It takes a manifest and returns artifacts — path-plus-bytes values, held in memory. The framework collects them, and then does one of two things:

  • Write — write the artifacts to disk (gen).
  • Check — compare the artifacts against what is already committed, and exit non-zero if a single byte differs (check).

Because a generator is a pure function from manifest to bytes, the same call that writes your code in development is the call that verifies it in CI. There is no second, separately-maintained “linter” that could itself drift from the generator. The generator is the spec.
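
The Write/Check split can be sketched as follows. This is an illustration of the idea, not buildmere's code: `generate`, `Write`, `Check`, `ErrDrift`, and `roundTrip` are hypothetical names, and the toy generator stands in for a real one.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Artifact is an in-memory file: a path plus its contents.
type Artifact struct {
	Path     string
	Contents []byte
}

// generate stands in for a real generator: a pure function of its input.
func generate(values []string) []Artifact {
	var b bytes.Buffer
	b.WriteString("// Code generated. DO NOT EDIT.\n")
	for _, v := range values {
		fmt.Fprintf(&b, "const %s = %q\n", v, v)
	}
	return []Artifact{{Path: "vocab.gen.go", Contents: b.Bytes()}}
}

// Write puts every artifact on disk under root.
func Write(root string, arts []Artifact) error {
	for _, a := range arts {
		p := filepath.Join(root, a.Path)
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return err
		}
		if err := os.WriteFile(p, a.Contents, 0o644); err != nil {
			return err
		}
	}
	return nil
}

// ErrDrift marks a file whose bytes differ from what the generator produces.
var ErrDrift = errors.New("drift detected")

// Check byte-compares each artifact with the file already on disk.
func Check(root string, arts []Artifact) error {
	for _, a := range arts {
		got, err := os.ReadFile(filepath.Join(root, a.Path))
		if err != nil {
			return fmt.Errorf("%s: %w", a.Path, err)
		}
		if !bytes.Equal(got, a.Contents) {
			return fmt.Errorf("%s: %w", a.Path, ErrDrift)
		}
	}
	return nil
}

// roundTrip writes generated output to a temp dir, checks it, simulates a
// hand-edit, and checks again. It returns both Check results.
func roundTrip() (clean, edited error) {
	root, err := os.MkdirTemp("", "drift")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(root)

	arts := generate([]string{"push", "pull"})
	if err := Write(root, arts); err != nil {
		return err, err
	}
	clean = Check(root, arts)

	p := filepath.Join(root, arts[0].Path)
	if err := os.WriteFile(p, []byte("// just this once\n"), 0o644); err != nil {
		return clean, err
	}
	return clean, Check(root, arts)
}

func main() {
	clean, edited := roundTrip()
	fmt.Println("after gen:      ", clean)
	fmt.Println("after hand-edit:", edited)
}
```

Both `Write` and `Check` consume the same `[]Artifact`, so there is no second code path that could disagree with the first.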

# from packages/buildmere (GOWORK=off keeps the tool's deps out of the app graph)
GOWORK=off go run ./cmd/buildmere gen <manifest.yaml> # write artifacts
GOWORK=off go run ./cmd/buildmere check <manifest.yaml> # CI drift gate
GOWORK=off go run ./cmd/buildmere gen-all <root-dir> # discover + generate every manifest
GOWORK=off go run ./cmd/buildmere check-all <root-dir> # drift gate over a whole tree

Each manifest declares its own output: path, so discovery needs no per-manifest wiring — gen-all walks a directory, finds every buildmere manifest, and generates each to where it says it belongs.
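
Discovery along these lines could be sketched as below. Several things here are assumptions: the text does not say how buildmere recognizes a manifest, so the `.vocab.yaml` suffix is hypothetical, the sample paths are invented, and the line scan for `output:` is a stand-in for real YAML parsing (the Go standard library has no YAML decoder).

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// Found pairs a manifest path with the output path it declares.
type Found struct {
	Manifest string
	Output   string
}

// discover walks fsys and returns every file whose name ends in suffix
// and that contains a top-level "output:" line.
func discover(fsys fs.FS, suffix string) ([]Found, error) {
	var found []Found
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(p, suffix) {
			return nil
		}
		data, err := fs.ReadFile(fsys, p)
		if err != nil {
			return err
		}
		sc := bufio.NewScanner(bytes.NewReader(data))
		for sc.Scan() {
			line := sc.Text()
			if strings.HasPrefix(line, "output:") {
				out := strings.TrimSpace(strings.TrimPrefix(line, "output:"))
				found = append(found, Found{Manifest: p, Output: out})
				break
			}
		}
		return sc.Err()
	})
	return found, err
}

// sampleTree is an in-memory directory tree with two manifests and one
// unrelated file that must be ignored.
func sampleTree() fs.FS {
	return fstest.MapFS{
		"services/workout/status.vocab.yaml": {Data: []byte("kind: enum\noutput: internal/status\n")},
		"services/billing/errors.vocab.yaml": {Data: []byte("kind: error\noutput: internal/errs\n")},
		"README.md":                          {Data: []byte("output: not a manifest\n")},
	}
}

func main() {
	found, err := discover(sampleTree(), ".vocab.yaml")
	if err != nil {
		panic(err)
	}
	for _, f := range found {
		fmt.Printf("%s -> %s\n", f.Manifest, f.Output)
	}
}
```

Because each manifest carries its own destination, the walker needs no table mapping manifests to output directories.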

In our repo, two make targets wrap the kernel:

make gen-vocab -> buildmere gen-all (regenerate everything; what you run locally)
make check-vocab -> buildmere check-all (the CI gate; fails on any drift)

check-vocab is not optional or advisory. It sits inside the backend lint umbrella that every pull request runs:

lint-be: contracts lint-migrations check-vocab check-mocks check-sqlc \
check-sql-scope check-connect-errors

If you hand-edit a generated .gen.go, check-vocab regenerates it in memory, sees that your edit does not match what the manifest produces, and fails the build. The only way to change generated code is to change the manifest and regenerate. That is the enforcement leg of the governance model (structure, enforcement, docs), and without it the other two are decoration.

buildmere also ships its own internal drift guards, so the generator itself cannot silently change shape: a compile-time assertion that the metric Factory matches, an error-code round-trip test, and pinned generator signatures.

The same invariant — generated code cannot silently disagree with its source — appears twice in the backend, enforced two different ways, and the contrast is the interesting part.

buildmere outputs are committed to the repo, which is why check-vocab can catch edits to them: it regenerates the files in memory and byte-compares against what is committed, so any hand-edit shows up as a diff and fails the build.

buf generate outputs are gitignored, so they cannot drift because they are never stored in the first place — gen/go/ and gen/ts/ are produced fresh on every make gen-be, and any hand-edit to them simply disappears the next time the target runs.

Two different mechanisms. The same refusal.

Generated code cannot silently disagree with its source. In this backend, that invariant is enforced twice over: once by a check that diffs what is committed, once by a storage strategy that makes committing impossible.

| Drift scenario | Gate | Tier |
| --- | --- | --- |
| Hand-edit a committed .gen.go vocab file | check-vocab inside lint-be: regenerates in memory, byte-compares, exits non-zero | coverage-script (tier 3) |
| Add an RPC to the proto, skip the handler method | var _ workoutv1connect.WorkoutServiceHandler = (*Service)(nil) (build fails) | compile-enforced (tier 1) |
| Delete a proto method, handler still references the old interface | same compile assertion, same instant failure | compile-enforced (tier 1) |
| Hand-edit anything under gen/go/ or gen/ts/ | make gen-be overwrites it; gen/ is gitignored so the edit was never committed | structural (gitignore) |

None of these gates requires developer discipline to fire on a PR: make lint-be lists contracts as a prerequisite, so buf generate runs before the linters, and the compile assertion runs before that lint step even starts.

Adding a new closed set used to mean a decision and a chore: where do the constants live, who writes the validator, who remembers to update the SQL constraint, who keeps the frontend list in sync.

With a kernel in place, a new vocabulary is a new manifest plus a make gen-vocab. The generator for that kind already exists. Every projection appears at once, correct by construction, and check-vocab guarantees it stays that way.

And because generators are pure functions tested as pure functions, extending the machine is itself cheap: a new language emitter is a small package with golden-file tests, not a research project. This is the economic inversion from “Part 1: The Drift Problem”, made concrete: the rails are cheap to lay, cheap to extend, and they remove a standing coherence tax.
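
To give a sense of scale, here is a hedged sketch of what a small emitter plus its golden comparison might look like. `emitTS`, the `Status` vocabulary, and the exact TypeScript shape are illustrative assumptions, not buildmere's real output; in a real package the golden string would normally live in a testdata file and be compared inside a Go test.

```go
package main

import (
	"fmt"
	"strings"
)

// Artifact is an in-memory file: a path plus its contents.
type Artifact struct {
	Path     string
	Contents []byte
}

// emitTS renders a closed value set as a TypeScript const array plus a
// union type derived from it. It is pure: same input, same bytes.
// Note: %q uses Go string escaping, which matches TypeScript only for
// simple values like the ones used here.
func emitTS(name string, values []string) Artifact {
	var b strings.Builder
	b.WriteString("// Code generated. DO NOT EDIT.\n")
	fmt.Fprintf(&b, "export const %sValues = [", name)
	for i, v := range values {
		if i > 0 {
			b.WriteString(", ")
		}
		fmt.Fprintf(&b, "%q", v)
	}
	b.WriteString("] as const;\n")
	fmt.Fprintf(&b, "export type %s = (typeof %sValues)[number];\n", name, name)
	return Artifact{
		Path:     strings.ToLower(name) + ".gen.ts",
		Contents: []byte(b.String()),
	}
}

// golden is the expected output, pinned byte for byte.
const golden = `// Code generated. DO NOT EDIT.
export const StatusValues = ["draft", "active"] as const;
export type Status = (typeof StatusValues)[number];
`

func main() {
	got := emitTS("Status", []string{"draft", "active"})
	if string(got.Contents) != golden {
		fmt.Println("golden mismatch for", got.Path)
		return
	}
	fmt.Println("golden match for", got.Path)
}
```

The comparison needs no filesystem, no mocks, and no running service, which is what keeps a new emitter cheap to test.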

The kernel is abstract on purpose. Part 4: Enums as Shared Vocabulary makes it concrete with the simplest kind — enum — and follows one manifest all the way out to Go, TypeScript, Zod, JSON Schema, and a SQL CHECK constraint.

Where this goes is more reach on the same kernel: additional kinds like permission, route, and event, more language emitters per kind (error → TypeScript is the obvious next one), a typed Kind discriminator, and lifting buildmere into its own open-source repository with a git mv. The plugin model was built for exactly these extensions.