Are We Drifting? — Part 4: Enums as Shared Vocabulary

Part 3: buildmere, a Codegen Kernel introduced the kernel in the abstract. The enum kind is the simplest place to make it concrete — so let us follow one enum from its manifest all the way out to every side that has to agree about it.

An enum is a closed set of named values. It is also the single most common thing to drift, because it shows up in the most places: a Go type, a TypeScript union, a dropdown, a validator, a database column, an analytics dimension, a model’s structured output.

The drift is always the same story. Someone adds a value on one side. The other sides do not hear about it. A row appears in the database that the frontend cannot render; a model returns a status the backend rejects; a dashboard silently stops counting a state that was renamed.

So the test for the enum kind is direct: can a value exist on one side that another side has never heard of? The answer should be structurally no.

In this codebase, here is what “structurally no” looks like in practice:

  • A new VideoStatus value added to the YAML manifest → make gen-vocab regenerates Go, TS, Zod, SQL CHECK in one pass; make check-vocab fails the PR if any committed projection is out of date.
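  • A VideoStatus string typed by hand in handler code → tsc rejects any literal outside the VideoStatus union, and Go refuses to assign a plain string variable to a VideoStatus without an explicit conversion. Go does still accept an untyped string literal or a conversion, so IsValid() is the check for values that arrive from outside the program.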
  • A row inserted with an unknown status → the database CHECK constraint rejects it before it lands.
  • A backlog ticket assigned to an owner not in enums.mjs → the Zod schema on the operation input rejects the value at the REST/MCP/CLI boundary.

Here is a real enum from our backend — the lifecycle of an uploaded video. This is the entire source of truth:

kind: enum
output: "../enums"
module: media
name: VideoStatus
entries:
  - name: Uploading
    wire: uploading
    label: Uploading
  - name: Processing
    wire: processing
    label: Processing
  - name: Ready
    wire: ready
    label: Ready
  - name: Failed
    wire: failed
    label: Failed
    metadata:
      terminal: true
  - name: Archived
    wire: archived
    label: Archived

Five values, each with a code-facing name, a boundary-facing wire, and a human label. One of them is flagged terminal. That is the whole declaration.
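
To make the shape of that declaration concrete, here is a minimal sketch of how a parsed manifest might be typed and sanity-checked before anything is projected from it. The interfaces and `validateManifest` are hypothetical names for illustration, not buildmere's actual internals; the only fields used are the ones visible in the manifest above.

```typescript
// Hypothetical shape of an enum manifest after YAML parsing.
// Field names mirror the manifest above; the types are an assumption.
interface EnumEntry {
  name: string; // code-facing identifier
  wire: string; // value at the boundary (JSON, database, events)
  label: string; // human-readable text
  metadata?: Record<string, unknown>;
}

interface EnumManifest {
  kind: "enum";
  output: string;
  module: string;
  name: string;
  entries: EnumEntry[];
}

// Returns a list of problems; an empty list means names and wires are unique.
function validateManifest(manifest: EnumManifest): string[] {
  const problems: string[] = [];
  const seenNames = new Set<string>();
  const seenWires = new Set<string>();
  for (const entry of manifest.entries) {
    if (seenNames.has(entry.name)) problems.push(`duplicate name: ${entry.name}`);
    if (seenWires.has(entry.wire)) problems.push(`duplicate wire: ${entry.wire}`);
    seenNames.add(entry.name);
    seenWires.add(entry.wire);
  }
  return problems;
}

const videoStatusManifest: EnumManifest = {
  kind: "enum",
  output: "../enums",
  module: "media",
  name: "VideoStatus",
  entries: [
    { name: "Uploading", wire: "uploading", label: "Uploading" },
    { name: "Processing", wire: "processing", label: "Processing" },
    { name: "Ready", wire: "ready", label: "Ready" },
    { name: "Failed", wire: "failed", label: "Failed", metadata: { terminal: true } },
    { name: "Archived", wire: "archived", label: "Archived" },
  ],
};
```

Uniqueness matters because every projection keys on either the name or the wire; two entries sharing a wire would collapse into one value at the boundary.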

Running the generator turns that manifest into typed code. This is the actual generated Go — not a sketch:

// Code generated by buildmere; DO NOT EDIT.

package enums

type VideoStatus string

const (
	VideoStatusUploading  VideoStatus = "uploading"
	VideoStatusProcessing VideoStatus = "processing"
	VideoStatusReady      VideoStatus = "ready"
	VideoStatusFailed     VideoStatus = "failed"
	VideoStatusArchived   VideoStatus = "archived"
)

func (v VideoStatus) IsValid() bool {
	switch v {
	case VideoStatusUploading,
		VideoStatusProcessing,
		VideoStatusReady,
		VideoStatusFailed,
		VideoStatusArchived:
		return true
	}
	return false
}

var videoStatusLabels = map[VideoStatus]string{
	VideoStatusUploading:  "Uploading",
	VideoStatusProcessing: "Processing",
	VideoStatusReady:      "Ready",
	VideoStatusFailed:     "Failed",
	VideoStatusArchived:   "Archived",
}

func (v VideoStatus) Label() string { return videoStatusLabels[v] }

func AllVideoStatus() []VideoStatus {
	return []VideoStatus{
		VideoStatusUploading,
		VideoStatusProcessing,
		VideoStatusReady,
		VideoStatusFailed,
		VideoStatusArchived,
	}
}

Note the name/wire split surviving into the code: VideoStatusReady (the name) is the constant; "ready" (the wire) is its value. Code reads VideoStatusReady; the string on the wire is "ready"; the label "Ready" is for humans. Three concerns, one declaration, no place for them to disagree.

The same manifest is what the enum kind projects to the other sides. A TypeScript union and label map:

export const VideoStatus = {
  Uploading: "uploading",
  Processing: "processing",
  Ready: "ready",
  Failed: "failed",
  Archived: "archived",
} as const;

export type VideoStatus = (typeof VideoStatus)[keyof typeof VideoStatus];

export const videoStatusLabels: Record<VideoStatus, string> = {
  uploading: "Uploading",
  processing: "Processing",
  ready: "Ready",
  failed: "Failed",
  archived: "Archived",
};
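
One thing the union buys on the TypeScript side is exhaustiveness checking. The sketch below is illustrative only: `badgeColor` is a hypothetical UI helper, not part of the generated output, and the union is repeated so the example stands alone. The point is the `never` assignment in the default branch, which stops compiling as soon as a regenerated union contains a value the switch does not handle.

```typescript
const VideoStatus = {
  Uploading: "uploading",
  Processing: "processing",
  Ready: "ready",
  Failed: "failed",
  Archived: "archived",
} as const;

type VideoStatus = (typeof VideoStatus)[keyof typeof VideoStatus];

// Hypothetical helper: maps each status to a badge colour.
function badgeColor(status: VideoStatus): string {
  switch (status) {
    case VideoStatus.Uploading:
    case VideoStatus.Processing:
      return "blue";
    case VideoStatus.Ready:
      return "green";
    case VideoStatus.Failed:
      return "red";
    case VideoStatus.Archived:
      return "gray";
    default: {
      // Every member is handled above, so `status` has type `never` here.
      // If a new value joins the union, this assignment becomes a type error.
      const unreachable: never = status;
      throw new Error(`unhandled VideoStatus: ${String(unreachable)}`);
    }
  }
}
```

The runtime `throw` is a second line of defence for values that bypassed the type checker, for example data cast from an unvalidated response.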

A Zod schema for runtime validation at the edge:

export const VideoStatusSchema = z.enum([
  "uploading", "processing", "ready", "failed", "archived",
]);

And a SQL CHECK constraint so the database itself refuses an unknown value:

ALTER TABLE videos
  ADD CONSTRAINT videos_status_check
  CHECK (status IN ('uploading', 'processing', 'ready', 'failed', 'archived'));
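
The article does not show how the SQL projection is rendered, but the idea is small enough to sketch. Assuming a hypothetical helper named `renderCheckConstraint` that receives the table, the column, and the wire values from the manifest, the constraint above could be produced like this:

```typescript
// Hypothetical projection helper: renders a CHECK constraint from wire values.
// The function name and the constraint naming scheme are assumptions.
function renderCheckConstraint(table: string, column: string, wires: string[]): string {
  // Double any single quote so a wire value cannot break out of the SQL literal.
  const literals = wires.map((w) => `'${w.replace(/'/g, "''")}'`).join(", ");
  return [
    `ALTER TABLE ${table}`,
    `  ADD CONSTRAINT ${table}_${column}_check`,
    `  CHECK (${column} IN (${literals}));`,
  ].join("\n");
}

const videoStatusCheck = renderCheckConstraint("videos", "status", [
  "uploading",
  "processing",
  "ready",
  "failed",
  "archived",
]);
```

Because the list of literals is derived from the same entries as the Go constants and the TypeScript union, the constraint cannot name a value the code does not know about, or miss one that it does.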

The dropdown iterates AllVideoStatus() (or the TS union). The validator is VideoStatusSchema. The column is constrained. None of these is hand-maintained; all of them are the same five entries, projected.
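
As a rough illustration of the dropdown case, the options can be derived from the generated object and label map rather than listed by hand. `DropdownOption`, `videoStatusOptions`, and `isVideoStatus` are hypothetical names; the guard is a dependency-free stand-in for what VideoStatusSchema does at the edge, included only so the sketch runs without Zod.

```typescript
const VideoStatus = {
  Uploading: "uploading",
  Processing: "processing",
  Ready: "ready",
  Failed: "failed",
  Archived: "archived",
} as const;

type VideoStatus = (typeof VideoStatus)[keyof typeof VideoStatus];

const videoStatusLabels: Record<VideoStatus, string> = {
  uploading: "Uploading",
  processing: "Processing",
  ready: "Ready",
  failed: "Failed",
  archived: "Archived",
};

interface DropdownOption {
  value: VideoStatus;
  label: string;
}

// Options come from the generated object, so a regenerated status list
// shows up in the dropdown without touching this function.
function videoStatusOptions(): DropdownOption[] {
  return Object.values(VideoStatus).map((value) => ({
    value,
    label: videoStatusLabels[value],
  }));
}

// Type guard for untrusted input such as a query parameter or form field.
function isVideoStatus(input: unknown): input is VideoStatus {
  return typeof input === "string" && (Object.values(VideoStatus) as string[]).includes(input);
}
```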

A closed set is only safe if changing it is disciplined. The rules are simple and they are the same ones a careful DBA already follows:

  • Values are append-only by default. Adding Suspended is safe. The manifest grows; every projection regenerates; the CHECK constraint is widened by a new migration.
  • Renaming is not a rename. Changing a wire value is a data migration in disguise — existing rows still hold the old string. You deprecate the old value and add the new one, then migrate, then remove.
  • Removal waits for the data. A value cannot leave the vocabulary while a single row or in-flight event still references it. Mark it deprecated with a note, migrate, and only then delete.

Deprecation is itself just metadata on the entry — deprecated: true with a deprecation_note — so “this value is on its way out” is a fact the generated code and docs can carry, not tribal knowledge.
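
Part of these rules can be checked mechanically by comparing the manifest before and after a change. The following is a sketch under assumptions: `VocabEntry`, `Verdict`, and `reviewVocabChange` are hypothetical names, and only the `deprecated` and `deprecation_note` fields come from the text above. It cannot see whether rows or in-flight events still reference a value, so it covers the "deprecate first" step, not the "wait for the data" step.

```typescript
interface VocabEntry {
  wire: string;
  deprecated?: boolean;
  deprecation_note?: string;
}

interface Verdict {
  wire: string;
  change: "added" | "removed";
  allowed: boolean;
  reason: string;
}

// Classifies each difference between two versions of an enum's entries.
// A rename shows up as one removal plus one addition, which is the point:
// the removal half is rejected unless the old value was deprecated first.
function reviewVocabChange(before: VocabEntry[], after: VocabEntry[]): Verdict[] {
  const beforeWires = new Set(before.map((e) => e.wire));
  const afterWires = new Set(after.map((e) => e.wire));
  const verdicts: Verdict[] = [];

  for (const entry of after) {
    if (!beforeWires.has(entry.wire)) {
      verdicts.push({ wire: entry.wire, change: "added", allowed: true, reason: "append-only addition" });
    }
  }
  for (const entry of before) {
    if (afterWires.has(entry.wire)) continue;
    const allowed = entry.deprecated === true;
    verdicts.push({
      wire: entry.wire,
      change: "removed",
      allowed,
      reason: allowed ? "was deprecated before removal" : "removed without a deprecation step",
    });
  }
  return verdicts;
}
```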

The database CHECK constraint is the backstop. Application code can be wrong; a migration can lag; but a column constrained from the same manifest will reject a value that was never declared. The truth has a floor.

Codegen is the right tool when a vocabulary crosses language boundaries. Some of ours do not, and they use a lighter mechanism.

Our backlog’s enums — ticket statuses, areas, efforts, owners — live in a single enums.mjs module that every consumer imports directly. The frontend reaches them over REST; the CLI and the backlog services import the file. Some are static (STATUSES, AREAS); one, OWNERS, is derived at call time by scanning the agent definitions on disk, so it can never go stale against the set of agents that actually exist.
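
The "derived at call time" part is the interesting one, so here is a minimal sketch of the idea. Everything specific in it is an assumption: the article does not show enums.mjs, so the one-file-per-agent layout, the `.md` extension, and the `deriveOwners` name are all hypothetical, and the directory listing is injected as a function so the example runs without touching a real disk.

```typescript
// Assumption: each agent is defined by one file named "<owner>.md".
// The real layout of the agent definitions is not shown in this article.
function deriveOwners(listAgentFiles: () => string[]): string[] {
  return listAgentFiles()
    .filter((file) => file.endsWith(".md"))
    .map((file) => file.slice(0, -".md".length))
    .sort();
}

// Stand-in for a directory on disk.
const agentFiles = ["planner.md", "reviewer.md", "README.txt"];
const listAgentFiles = () => agentFiles;

const ownersNow = deriveOwners(listAgentFiles);

// A new agent definition appears; the next call sees it without any edit
// to the enum module, because nothing was cached at import time.
agentFiles.push("archivist.md");
const ownersLater = deriveOwners(listAgentFiles);
```

The design choice is to export a function rather than a constant: a constant would freeze the owner list at import time, which is exactly the second copy the module is meant to avoid.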

And there is a fourth consumer the others do not have: an agent, which discovers the set at runtime rather than compiling it in. The same enums are queryable over MCP — an agent asks for the live owners or statuses (backlog_list_enums) instead of hard-coding them, and a stale agent never invents an owner that does not exist. An enum, in other words, is the smallest instance of a vocabulary that systems serialize, humans speak, and agents discover — the same closed set, reached three different ways.

Different mechanism, identical principle: one source, many consumers, no second copy to drift. Codegen is how you get there across languages; a shared module is how you get there within one; runtime discovery is how an agent gets there without compiling at all. The discipline is the same.

The check this removes is the most tedious one in code review: you added a status — did you update the type, the dropdown, the validator, and the constraint?

That question disappears. You add an entry to the manifest, regenerate, and every side updates together. The reviewer does not audit five files for consistency, because consistency is not something a human is maintaining. And if anyone hand-edits a generated file to “just add it here,” check-vocab fails the build.
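
The article does not show how check-vocab is implemented. One common way to build such a gate is to regenerate every projection in memory and compare the result with what is committed. The sketch below assumes that approach; `findStaleProjections` and the file paths in the example are hypothetical.

```typescript
// Compares committed projections with what the generator would write now.
// Both arguments map a file path to that file's full content.
function findStaleProjections(
  committed: Record<string, string>,
  regenerated: Record<string, string>,
): string[] {
  const stale: string[] = [];
  for (const path of Object.keys(regenerated)) {
    // A missing file, an out-of-date file, and a hand-edited file all
    // look the same here: the committed bytes differ from the generated ones.
    if (committed[path] !== regenerated[path]) stale.push(path);
  }
  return stale.sort();
}

// Illustrative paths and contents only.
const regeneratedFiles = {
  "enums/video_status.go": "// generated: 6 values",
  "enums/video_status.ts": "// generated: 6 values",
};
const committedFiles = {
  "enums/video_status.go": "// generated: 6 values",
  "enums/video_status.ts": "// generated: 5 values", // never regenerated
};

const staleFiles = findStaleProjections(committedFiles, regeneratedFiles);
// A CI step would fail the build when staleFiles is non-empty.
```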

You can let an agent add a video state and trust the diff, because the only way to add one wrong is a way the gate rejects.

Enums are the base vocabulary. Part 5: The Error Manifest adds the first layer of kind-specific metadata and lands the series’ clearest cross-stack example: the error manifest, where one declaration stitches a backend failure to a frontend’s handling of it.