The Judgment Log in Practice: One Chain, Four Stations

I ended the last post with a question: the next challenge is not building the Judgment Log. It is whether anyone writes in it once the deadline is two hours away.

That question only has a useful answer if the artifact is light enough to actually use. So instead of arguing for it further, I want to show it.

Take a fictional but familiar scenario. A checkout flow. A promotional window. A promo code validation feature built using AI-assisted development. It shipped. Three weeks later, it broke when the campaign introduced expired codes. In the post-mortem, nobody could answer the three questions that mattered: what did the PM cut and why, what did the designer choose between, and what did the engineer override.

This post shows what that investigation looks like when the answers were written down at the time. Four entries. Four people. One chain.

The Chain, Not the Document

The Judgment Log is not a document with an owner. It is a chain, the same way a codebase is a chain of commits, each linking to the ticket that prompted it, each ticket linking to the epic above it.

Nobody owns the chain. Each contributor owns their link.

Each station in the development cycle (PM, design, engineering, and review) writes one entry. Each entry answers three questions: what was decided, what the AI’s role was, and which prior judgment this entry responds to.

That third question is what makes it a chain rather than four disconnected notes. A link alone is not enough. The entry needs one sentence explaining the relationship. Not “this PR closes JRA-412.” Something closer to: “this implementation accepts the PM’s decision to drop the expiry fallback, which I flagged as a risk and was told to defer.”

The difference between a reference and a rationale is the difference between a chain and a pile of links.
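One way to picture an entry is as a small record. The sketch below is hypothetical: the names `JudgmentEntry`, `responds_to`, and `relationship` are assumptions made for illustration, not a schema from any tool. It holds the three questions and refuses a link that arrives without its one-sentence rationale.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JudgmentEntry:
    """One link in the chain. All field names are illustrative assumptions."""
    station: str                        # "pm", "design", "engineering", or "review"
    decided: str                        # what was decided
    ai_role: str                        # what the AI's role was
    responds_to: Optional[str] = None   # id or URL of the prior entry, if any
    relationship: str = ""              # one sentence on how this entry relates to it

    def __post_init__(self) -> None:
        # A bare link is a reference, not a rationale: require the sentence.
        if self.responds_to and not self.relationship.strip():
            raise ValueError("a link to a prior entry needs a one-sentence rationale")


engineering = JudgmentEntry(
    station="engineering",
    decided="Ship validation for the active-code path only.",
    ai_role="Generated the validation logic; its expiry handling was rejected.",
    responds_to="pm-entry",  # placeholder id, not a real ticket
    relationship="Accepts the PM's decision to drop the expiry fallback, "
                 "which I flagged as a risk and was told to defer.",
)
```

The first entry in a chain has nothing above it, so `responds_to` is optional; only an entry that links somewhere is required to say why.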

Station One: PM Intent

The PM’s judgment calls are the most consequential and the least documented. By the time the engineer opens the PRD, the reasoning behind the requirements has evaporated. An AI-generated PRD makes this worse: the document appears complete, but the rationale for what was cut is invisible.

The PM entry does not replace the PRD. It attaches to it. One paragraph, written when the scope decision is made.

AI-generated spec included three error states for promo code validation: invalid code, expired code, and usage limit reached. Reduced to one for this sprint: invalid code only. Expiry and usage-limit handling deferred. Assumption: all promo codes in this promotional window are active and have no usage cap. Confirmed with marketing for this campaign only. If the promo scope expands to include evergreen codes, the validation logic needs to be revisited. Engineering should know this edge case exists and is deliberately out of scope.

Eighty words. Three minutes. The scope cut is on record, the assumption is named, and engineering has been told.

Station Two: Design Rationale

The design judgment calls that carry the most downstream risk are not aesthetic; they are structural: interaction states, information architecture, and deviations from the design system. When these go undocumented, engineering either builds the wrong thing or makes a call that the designer should have made.

AI tools make this worse at the design layer too. Generative tools produce layouts quickly. They do not produce error, empty, or loading states unless explicitly prompted. A designer who ships an AI-generated layout without noting what was added or deliberately omitted is handing off a spec with invisible gaps.

The design entry attaches to the spec or handoff notes and links to the PM entry above it.

Evaluated modal and inline approaches for promo code entry. Chose inline: the modal interrupted the checkout flow in user testing. AI-generated prototype omitted all error and empty states. Added invalid code error state. Expiry and usage-limit error states are not in scope for this sprint, per the PM scope decision [link]. Engineering should not build handling for those states; if they appear, the current design has no answer for them.

Two things the spec cannot do: explain the direction choice, and explicitly name what is missing so engineering does not fill the gap by guessing.

Station Three: Engineering Judgment

This station receives the most in-depth treatment in the next post. Here, the point is structural.

A developer using Cursor or Copilot makes dozens of accept/reject/modify decisions in a session. Most are not judgment calls; they are autocomplete. The Judgment Log is not a record of every suggestion. It is a record of the non-trivial ones: the decisions that, if wrong, would matter.

The threshold: if the reasoning behind a decision would be useful to the engineer who touches this code in six months, write it down.

The entry attaches to the PR and links to the design entry above it.

AI-generated validation logic accepted for the active code path; correct and within confirmed scope. Rejected the AI’s expiry handling suggestion: the model assumed expiry windows would always be present, but PM scoped that out for this campaign [link]. Accepted the simpler path. Left a comment in the code flagging the gap. Design has no error state for expiry [link]. If the promo scope expands, this needs to be addressed before it ships; the next campaign may not have the same constraints.

The code contains none of that reasoning. The entry does.
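For a sense of what the flagged gap might look like in the code itself, here is a hypothetical sketch. The function name, the `ACTIVE_CODES` set, and the error string are all invented for illustration; the scenario does not specify an implementation. The comment marks where the gap is. Who decided it, what was rejected, and why still live in the entry.

```python
from typing import Optional

# Hypothetical stand-in for this campaign's code list (assumption for illustration).
ACTIVE_CODES = {"SPRING10", "WELCOME5"}


def validate_promo_code(code: str) -> Optional[str]:
    """Return an error message for an unknown code, or None if the code is accepted."""
    normalized = code.strip().upper()
    if normalized not in ACTIVE_CODES:
        return "invalid code"
    # GAP (deliberate): no expiry or usage-limit check.
    # Scoped out for this campaign; see the Judgment Log entry on the PR.
    # Revisit before any campaign that can contain expired or capped codes.
    return None
```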

Station Four: Review

The reviewer is the last human in the chain before code ships. A PR approval is a timestamp and a green checkmark. It records that a review happened, not what the reviewer thought.

The review entry is the shortest in the chain. It is not a summary of the PR but a record of where the reviewer exercised judgment rather than verified mechanics.

Flagged the missing expiry handling; the engineer confirmed it is a deliberate scope decision per PM [link]. Approved with that understanding. Did not test the promo code path end-to-end in staging; the environment has no valid promo campaign configured. Functional verification was done by the engineer. Approved on that basis.

Five short sentences, two judgment calls. If the expiry bug surfaces six months later, the post-mortem does not reconstruct intent. The intent is already there, written by the person who held it, at the moment they held it.

What the Chain Looks Like

The post-mortem opens. Someone pulls the Judgment Log.

The PM entry: expiry handling was deliberately out of scope. Active codes only. Marketing confirmed. Engineering was told.

The design entry: no error state exists for expiry. Engineering was told not to build one.

The engineering entry: expiry logic was rejected from the AI output. The gap was flagged. The constraint was noted as campaign-specific.

The review entry: the reviewer saw the missing handling. The engineer confirmed it was deliberate. The reviewer approved on that basis.

The investigation shifts. The question is no longer “why did the code fail?” It is “why did the campaign introduce expired codes when the entire feature was scoped to assume it would not?” That is a solvable problem, and it has a clear starting point.

Four people. One paragraph each. Written at the moment the decision was made. That is the entire practice.
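Pulling the log in a post-mortem amounts to following each entry's link back to the one it responds to. A minimal sketch, assuming entries are stored in a dictionary keyed by made-up ids, each with a `responds_to` field (both are assumptions for illustration, and the summaries paraphrase the four entries in this post):

```python
from typing import Dict, List, Optional, TypedDict


class Entry(TypedDict):
    station: str
    summary: str
    responds_to: Optional[str]  # id of the prior entry; None at the top of the chain


# Hypothetical ids and storage; a real log could live in PR descriptions or tickets.
LOG: Dict[str, Entry] = {
    "pm": {"station": "PM", "summary": "Expiry handling deliberately out of scope.", "responds_to": None},
    "design": {"station": "Design", "summary": "No error state exists for expiry.", "responds_to": "pm"},
    "eng": {"station": "Engineering", "summary": "AI expiry logic rejected; gap flagged.", "responds_to": "design"},
    "review": {"station": "Review", "summary": "Missing handling seen; approved as deliberate.", "responds_to": "eng"},
}


def trace(log: Dict[str, Entry], start: str) -> List[Entry]:
    """Follow responds_to links from `start` to the top, returned in intent-to-ship order."""
    chain: List[Entry] = []
    seen = set()
    current: Optional[str] = start
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in chain at {current!r}")
        if current not in log:
            raise KeyError(f"broken link: no entry {current!r}")
        seen.add(current)
        chain.append(log[current])
        current = log[current]["responds_to"]
    return list(reversed(chain))


for entry in trace(LOG, "review"):
    print(f"{entry['station']}: {entry['summary']}")
```

A broken link raises rather than returning a partial chain, since a chain with a missing station is exactly the case a post-mortem needs to notice.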

The Close

The chain is the artifact. Not any single entry, the chain.

An engineer’s entry without a PM entry above it is a note. A PM entry without an engineering entry below it is an intention that may or may not have survived implementation. Together, from intent to shipped code, they are the institutional record that no commit log, token dashboard, or architecture decision record (ADR) library has ever been designed to produce.

The next post goes deep on the engineering station: what to write, what to skip, and how to make the threshold call between autocomplete and a genuine judgment moment.
