Are We Drifting? — Part 3: buildmere, a Codegen Kernel
Part 2: The Tagged Vocabulary named the atom: a tagged vocabulary, declared once. This part is about the machine that turns that one declaration into typed code on every side and, more importantly, keeps the copies from drifting after it does.
Are we drifting here?
A manifest only earns the name “source of truth” if the copies generated from it cannot quietly disagree with it.
A generator that you run by hand, whose output you commit and then edit “just this once,” is not a source of truth. It is a suggestion with extra steps. The generated file drifts from the manifest the moment someone touches it, and nothing notices.
So the real question for any codegen system is not “can it generate?” It is “what stops the generated code from drifting away from its source?”
buildmere, in one paragraph
buildmere is a standalone Go module (github.com/buildmere/vocab) that does exactly one thing: declare a closed tagged vocabulary once in a YAML manifest, and generate typed code, validation, schemas, and reference files across the stack from that single source.
Two rules keep it honest and portable, and they are worth stating verbatim because everything else follows from them:
1. buildmere imports nothing from its consumers. It is imported by products like Liftmere; never the reverse.
2. Generated code imports nothing project-specific, only context/fmt from the standard library. Binding generated code to a project’s runtime is done by ~60 lines of adapter the project owns, not by the generator.
The first rule means buildmere can be lifted into its own repository with a git mv. The second means every generated artifact is adoptable by any project, not just ours — the generated code is inert until a small, hand-owned adapter wires it to the real metrics backend or error transport.
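To make the second rule concrete, here is a minimal sketch of how a generated metric set and a hand-owned adapter might split the work. Every name in it (Counter, Factory's method, WorkoutMetrics, memFactory, the metric name) is hypothetical and chosen for illustration; buildmere's real generated API may look different. The point is only the direction of the dependency: the generated half imports context and fmt, and the project supplies the backend.

```go
package main

import (
	"context"
	"fmt"
)

// --- Generated side (hypothetical shape): standard library imports only. ---

// Counter is the smallest instrument the generated code needs.
type Counter interface {
	Add(ctx context.Context, n int64)
}

// Factory is implemented by the consuming project, not by the generator.
type Factory interface {
	Counter(name string) (Counter, error)
}

// WorkoutMetrics stands in for a generated metric set.
type WorkoutMetrics struct {
	SessionsStarted Counter
}

// NewWorkoutMetrics builds every instrument through the supplied Factory.
func NewWorkoutMetrics(f Factory) (*WorkoutMetrics, error) {
	c, err := f.Counter("workout.sessions_started")
	if err != nil {
		return nil, fmt.Errorf("create counter: %w", err)
	}
	return &WorkoutMetrics{SessionsStarted: c}, nil
}

// --- Project side: the hand-written adapter. ---

// memCounter is a stand-in backend; a real adapter would wrap a metrics SDK.
type memCounter struct{ total int64 }

func (c *memCounter) Add(_ context.Context, n int64) { c.total += n }

// memFactory remembers each counter it hands out so callers can inspect it.
type memFactory struct{ counters map[string]*memCounter }

func (f *memFactory) Counter(name string) (Counter, error) {
	c := &memCounter{}
	f.counters[name] = c
	return c, nil
}

func main() {
	f := &memFactory{counters: map[string]*memCounter{}}
	m, err := NewWorkoutMetrics(f)
	if err != nil {
		panic(err)
	}
	m.SessionsStarted.Add(context.Background(), 2)
	fmt.Println(f.counters["workout.sessions_started"].total) // prints 2
}
```

Swapping memFactory for an adapter over a real metrics backend would not touch the generated half, which is what keeps the generated code adoptable elsewhere.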
A kernel that knows nothing about kinds
The architecture is a kernel plus plugins. The kernel ships zero kinds.
The kernel is small and generic:
```
Manifest, Entry            the parsed source: a list of vocabulary entries
Generator                  interface: given a Manifest, return Artifacts
Artifact                   an in-memory file: path + contents (not written yet)
Registry                   maps a kind ("enum", "error", …) to its generators
KindDescriptor             how a kind declares its metadata schema
Generate / Write / Check   the pipeline
```

A kind is a plugin: a root package that knows how to parse its kind-specific metadata, plus one sub-package per target language. The kernel never hard-codes the list of kinds; the registry resolves a kind string to its generators at runtime. Adding a new output language to a kind is a new package under generators/; it touches nothing that already works.
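As a rough illustration of that shape, the sketch below models the kernel types in a single file. The type names follow the list above, but the fields, method signatures, and the toy goEnum plugin are assumptions made for the example, not buildmere's actual definitions.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is one vocabulary item from the manifest (fields are assumed).
type Entry struct {
	Name  string
	Value string
}

// Manifest is the parsed source: a kind, an output path, and entries.
type Manifest struct {
	Kind    string
	Output  string
	Entries []Entry
}

// Artifact is an in-memory file; nothing is written to disk here.
type Artifact struct {
	Path     string
	Contents []byte
}

// Generator turns a manifest into artifacts without side effects.
type Generator interface {
	Generate(m Manifest) ([]Artifact, error)
}

// Registry maps a kind string to the generators registered for it.
type Registry map[string][]Generator

// Generate resolves the manifest's kind at runtime and runs each generator.
func (r Registry) Generate(m Manifest) ([]Artifact, error) {
	gens, ok := r[m.Kind]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", m.Kind)
	}
	var out []Artifact
	for _, g := range gens {
		arts, err := g.Generate(m)
		if err != nil {
			return nil, err
		}
		out = append(out, arts...)
	}
	return out, nil
}

// goEnum is a toy plugin that emits Go string constants for an enum manifest.
type goEnum struct{}

func (goEnum) Generate(m Manifest) ([]Artifact, error) {
	var b strings.Builder
	b.WriteString("package vocab\n\nconst (\n")
	for _, e := range m.Entries {
		fmt.Fprintf(&b, "\t%s = %q\n", e.Name, e.Value)
	}
	b.WriteString(")\n")
	return []Artifact{{Path: m.Output + "/enum.gen.go", Contents: []byte(b.String())}}, nil
}

func main() {
	// The kernel knows no kinds; the caller registers them.
	reg := Registry{"enum": {goEnum{}}}
	m := Manifest{Kind: "enum", Output: "gen", Entries: []Entry{{"StatusActive", "active"}, {"StatusDone", "done"}}}
	arts, err := reg.Generate(m)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n%s", arts[0].Path, arts[0].Contents)
}
```

In this sketch, a second target language for the enum kind would be another value appended to the "enum" slice, and Registry.Generate would not change.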
The kinds that ship today:
| Kind | The manifest declares | Generators |
|---|---|---|
| enum | a closed value set | Go constants, TypeScript, Zod, JSON-Schema, SQL CHECK |
| config | an env-var schema + validation | Go Config struct + Load/Validate, .env example |
| error | a domain error vocabulary + fields | zero-import Go error structs (Code/Wire/Fields/Is/Wrap) |
| metric | an OTel instrument set + labels | zero-import Go metric set + a Factory interface |
Generators are pure functions
This is the detail that makes the drift gate trustworthy.
A generator does not write to disk. It takes a manifest and returns artifacts — path-plus-bytes values, held in memory. The framework collects them, and then does one of two things:
- Write: write the artifacts to disk (gen).
- Check: compare the artifacts against what is already committed, and exit non-zero if a single byte differs (check).
Because a generator is a pure function from manifest to bytes, the same call that writes your code in development is the call that verifies it in CI. There is no second, separately-maintained “linter” that could itself drift from the generator. The generator is the spec.
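The Write and Check halves of the pipeline can be sketched in a few lines. This is a minimal sketch under stated assumptions, not buildmere's code: the function names and signatures are ours, and "committed" is approximated by whatever is on disk in the working tree, which is what a CI checkout contains.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Artifact is an in-memory file: a relative path plus its bytes.
type Artifact struct {
	Path     string
	Contents []byte
}

// Write persists every artifact under root, creating directories as needed.
func Write(root string, arts []Artifact) error {
	for _, a := range arts {
		p := filepath.Join(root, a.Path)
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return err
		}
		if err := os.WriteFile(p, a.Contents, 0o644); err != nil {
			return err
		}
	}
	return nil
}

// Check returns the paths whose on-disk bytes differ from the artifacts.
// A missing file counts as drift.
func Check(root string, arts []Artifact) ([]string, error) {
	var drifted []string
	for _, a := range arts {
		got, err := os.ReadFile(filepath.Join(root, a.Path))
		if err != nil && !errors.Is(err, os.ErrNotExist) {
			return nil, err
		}
		if err != nil || !bytes.Equal(got, a.Contents) {
			drifted = append(drifted, a.Path)
		}
	}
	return drifted, nil
}

// runDemo writes one artifact, checks it, hand-edits the file, and checks again.
func runDemo() (before, after []string, err error) {
	root, err := os.MkdirTemp("", "drift")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)

	arts := []Artifact{{Path: "vocab/status.gen.go", Contents: []byte("package vocab\n")}}
	if err = Write(root, arts); err != nil {
		return nil, nil, err
	}
	if before, err = Check(root, arts); err != nil {
		return nil, nil, err
	}
	// Simulate someone editing the generated file "just this once".
	edited := filepath.Join(root, "vocab", "status.gen.go")
	if err = os.WriteFile(edited, []byte("package vocab // edited\n"), 0o644); err != nil {
		return nil, nil, err
	}
	after, err = Check(root, arts)
	return before, after, err
}

func main() {
	before, after, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println(len(before), after) // prints: 0 [vocab/status.gen.go]
}
```

Both functions consume the same []Artifact value, so there is no second code path for the check to diverge from. A CLI wrapper would exit non-zero whenever Check returns a non-empty slice.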
```
# from packages/buildmere (GOWORK=off keeps the tool's deps out of the app graph)
GOWORK=off go run ./cmd/buildmere gen <manifest.yaml>        # write artifacts
GOWORK=off go run ./cmd/buildmere check <manifest.yaml>      # CI drift gate
GOWORK=off go run ./cmd/buildmere gen-all <root-dir>         # discover + generate every manifest
GOWORK=off go run ./cmd/buildmere check-all <root-dir>       # drift gate over a whole tree
```

Each manifest declares its own output: path, so discovery needs no per-manifest wiring. gen-all walks a directory, finds every buildmere manifest, and generates each to where it says it belongs.
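Discovery of this kind can be approximated with a directory walk. The sketch below is illustrative only: the .vocab.yaml suffix is a hypothetical naming convention (the article does not say how buildmere recognizes a manifest), the kind: key in the sample files is assumed, and the output: key is read with a naive line scan instead of a real YAML parser to keep the example dependency-free.

```go
package main

import (
	"bufio"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// manifestSuffix is a hypothetical naming convention used only in this sketch.
const manifestSuffix = ".vocab.yaml"

// Found pairs a discovered manifest with the output path it declares.
type Found struct {
	Manifest string
	Output   string
}

// outputOf naively scans for a top-level "output:" line. A real tool would
// parse the YAML; the line scan just avoids a third-party dependency here.
func outputOf(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if line := sc.Text(); strings.HasPrefix(line, "output:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "output:")), nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("%s: no output: key", path)
}

// Discover walks root and returns every manifest with its declared output.
func Discover(root string) ([]Found, error) {
	var found []Found
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(d.Name(), manifestSuffix) {
			return nil
		}
		out, err := outputOf(p)
		if err != nil {
			return err
		}
		found = append(found, Found{Manifest: p, Output: out})
		return nil
	})
	return found, err
}

// demoOutputs builds a small temporary tree and returns the discovered outputs.
func demoOutputs() ([]string, error) {
	root, err := os.MkdirTemp("", "discover")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	files := map[string]string{
		"api/status.vocab.yaml":        "kind: enum\noutput: internal/status\n",
		"api/errors/errors.vocab.yaml": "kind: error\noutput: internal/apperr\n",
		"README.md":                    "not a manifest\n",
	}
	for rel, body := range files {
		p := filepath.Join(root, rel)
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, []byte(body), 0o644); err != nil {
			return nil, err
		}
	}
	found, err := Discover(root)
	var outs []string
	for _, f := range found {
		outs = append(outs, f.Output)
	}
	return outs, err
}

func main() {
	outs, err := demoOutputs()
	if err != nil {
		panic(err)
	}
	// WalkDir visits entries in lexical order, so the result is deterministic.
	fmt.Println(outs) // prints [internal/apperr internal/status]
}
```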
The drift gate is one make target
In our repo, two make targets wrap the kernel:
```
make gen-vocab    -> buildmere gen-all     (regenerate everything; what you run locally)
make check-vocab  -> buildmere check-all   (the CI gate; fails on any drift)
```

check-vocab is not optional or advisory. It sits inside the backend lint umbrella that every pull request runs:

```
lint-be: contracts lint-migrations check-vocab check-mocks check-sqlc \
         check-sql-scope check-connect-errors
```

If you hand-edit a generated .gen.go, check-vocab regenerates it in memory, sees that your edit does not match what the manifest produces, and fails the build. The only way to change generated code is to change the manifest and regenerate. That is the third leg of the governance model (structure, enforcement, docs), and without it the first two are decoration.
buildmere also ships its own internal drift guards, so the generator itself cannot silently change shape: a compile-time assertion that the metric Factory matches, an error-code round-trip test, and pinned generator signatures.
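Two of those guards rest on standard Go idioms, sketched here in miniature. The Code type, its wire strings, and the Wirer interface are hypothetical stand-ins, not buildmere's generated output; only the `var _ Interface = value` assertion pattern is taken as given, since it is the same idiom as the handler assertion in our backend.

```go
package main

import "fmt"

// Code is a hypothetical generated error code.
type Code int

const (
	CodeNotFound Code = iota + 1
	CodeConflict
)

// wire maps each code to the string sent over the wire.
var wire = map[Code]string{
	CodeNotFound: "not_found",
	CodeConflict: "conflict",
}

// Wire returns the wire string for a code.
func (c Code) Wire() string { return wire[c] }

// ParseCode is the inverse of Wire.
func ParseCode(s string) (Code, bool) {
	for c, w := range wire {
		if w == s {
			return c, true
		}
	}
	return 0, false
}

// Wirer is the contract callers depend on.
type Wirer interface{ Wire() string }

// Compile-time guard: the build breaks on this line if Code ever stops
// implementing Wirer, so the drift is caught before any test runs.
var _ Wirer = Code(0)

// RoundTrips reports whether every code survives Wire followed by ParseCode.
func RoundTrips() bool {
	for c := range wire {
		got, ok := ParseCode(c.Wire())
		if !ok || got != c {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(RoundTrips()) // prints true
}
```

The assertion costs nothing at runtime, and the round-trip check would fail if two codes were ever given the same wire string.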
The same invariant — generated code cannot silently disagree with its source — appears twice in the backend, enforced two different ways, and the contrast is the interesting part.
buildmere outputs are committed to the repo, which is why check-vocab can catch them: it regenerates the files in memory and byte-compares against what is committed — any hand-edit is visible as a diff and fails the build.
buf generate outputs are gitignored, so they cannot drift because they are never stored in the first place — gen/go/ and gen/ts/ are produced fresh on every make gen-be, and any hand-edit to them simply disappears the next time the target runs.
Two different mechanisms. The same refusal.
Generated code cannot silently disagree with its source. In this backend, that invariant is enforced twice over: once by a check that diffs what is committed, once by a storage strategy that keeps the generated output out of version control.
| Drift scenario | Gate | Tier |
|---|---|---|
| Hand-edit a committed .gen.go vocab file | check-vocab inside lint-be: regenerates in memory, byte-compares, exits non-zero | coverage-script (tier 3) |
| Add an RPC to the proto, skip the handler method | var _ workoutv1connect.WorkoutServiceHandler = (*Service)(nil) — build fails | compile-enforced (tier 1) |
| Delete a proto method, handler still references the old interface | same compile assertion, same instant failure | compile-enforced (tier 1) |
| Hand-edit anything under gen/go/ or gen/ts/ | make gen-be overwrites it; gen/ is gitignored so the edit was never committed | structural (gitignore) |
Neither of these gates requires developer discipline to fire on a PR: make lint-be lists contracts as a prerequisite, so buf generate runs before the linters, and the compile assertion runs before that lint step even starts.
The velocity payoff
Adding a new closed set used to mean a decision and a chore: where do the constants live, who writes the validator, who remembers to update the SQL constraint, who keeps the frontend list in sync.
With a kernel in place, a new vocabulary is a new manifest plus a make gen-vocab. The generator for that kind already exists. Every projection appears at once, correct by construction, and check-vocab guarantees it stays that way.
And because generators are pure functions tested as pure functions, extending the machine is itself cheap — a new language emitter is a small package with golden-file tests, not a research project. This is the economic inversion from Part 1: The Drift Problem made concrete: the rails are cheap to lay, cheap to extend, and they remove a standing coherence tax.
What’s next
The kernel is abstract on purpose. Part 4: Enums as Shared Vocabulary makes it concrete with the simplest kind, enum, and follows one manifest all the way out to Go, TypeScript, Zod, JSON Schema, and a SQL CHECK constraint.
Where this goes is more reach on the same kernel: additional kinds like permission, route, and event, more language emitters per kind (error → TypeScript is the obvious next one), a typed Kind discriminator, and lifting buildmere into its own open-source repository with a git mv. The plugin model was built for exactly these extensions.