Architecture Decision Records: Why We Did It That Way

June 30, 2026 · 9 min read

Greenbox delivers weekly produce boxes from local farms to 3,000+ subscribers. The team of fourteen has drawn bounded contexts, built decision tables, and started shipping faster, but new developers keep asking “why does it work this way?” and nobody can give a straight answer.

Ravi starts at Greenbox on a Monday. He’s developer number eleven. Eight years of experience, mostly in backend systems at a Melbourne consultancy. His wife Meera is six months pregnant with their first child. He left the consultancy for Greenbox because the salary was better, the equity was real, and the commute from North Perth was twenty minutes instead of an hour. He has a private timeline: prove himself indispensable before the baby arrives, so that when he takes paternity leave, nobody questions whether he’s pulling his weight.

By Wednesday, he’s reading the billing code. He notices something odd. The payment system charges customers on delivery day, not signup day. That’s unusual.

Ravi asks in Slack: “Why do we charge on delivery day? Seems like it’d be simpler to charge on a fixed billing cycle.”

Tom: “That’s just how it works.”

Priya: “I think there was a reason but I don’t remember.”

Maya: “Lee suggested it during the Event Storming session. Something about variable box contents?”

Nobody can give a definitive answer. The decision was made months ago, by a team that was half the current size.

The LLM explains it wrong

Ravi pastes the billing code into his LLMA neural network trained to predict the next token in a sequence, large enough that it generalises to tasks it wasn’t explicitly trained for. : “Explain why this payment system charges on delivery day instead of signup day.”

The LLM produces a confident explanation: the system charges on delivery day to align with the subscription renewal cycle, reducing disputes over undelivered items.

It sounds plausible. Ravi nods. He’s three days in. Questioning it would mean going back to Slack with a follow-up that might make him look like he hasn’t done his homework. Meera asked last night how it was going and he said “really well.”

The LLM’s explanation is also wrong.

The real reason is that box contents vary week to week based on farm availability. The per-box cost can differ depending on substitutions. Charging at signup means charging for a box whose contents aren’t yet known. The Event Storming session revealed that the billing point should be after supply matching (Tuesday evening), when actual contents and costs are known.

That reasoning lives only in the fading memories of the people who were there.

Two weeks later, Ravi implements a subscription pause feature. Because he believes billing aligns with a fixed cycle, the LLM’s explanation, he implements the pause by skipping the renewal charge. It accidentally breaks the variable pricing logic. The bug doesn’t surface for three weeks.

The fix is straightforward. But the root cause is serious: a reasonable decision based on a plausible but incorrect understanding of why the system works the way it does.

Institutional memory

Charlotte hears about the bug.

“This is the most common scaling problem I see,” she tells the team at the Friday retro. “Not code quality. Not architecture. Memory.”

She draws three boxes on the whiteboard:

6 months ago
Maya, Tom, Priya, Jas, Sam
5/5 carry context = 100%
Today
Same 5 + Kai, Anika, Ravi, +6
5/14 carry context = 36%
6 months from now
Same 5 in a team of 25
5/25 carry context = 20%

“And it’s worse than the numbers suggest. Even the original five don’t remember everything perfectly. Meanwhile, every new person asks the LLM. The LLM guesses. Sometimes it’s right. Sometimes it’s wrong. And you don’t find out which until something breaks.”

Architecture Decision Records

Charlotte introduces ADRs. Architecture Decision Records. The concept comes from Michael Nygard. A short document capturing one decision: what was decided, why, what was considered, and what follows.

Title. Date. Status (accepted, superseded, deprecated). Context (what was going on). Decision (what was chosen). Consequences (positive and negative).

One decision per record. A few paragraphs. Five minutes to write. Five minutes to read.

“The Context section is the most important part. It tells a future reader what the world looked like when you decided. Constraints change. A decision that was right six months ago might be wrong today, but you can only evaluate that if you know the original constraints.”

The first ADR


ADR-001: Charge on delivery day, not signup day

Date: March 2026

Status: Accepted

Context:

Greenbox box contents vary week to week based on farm availability. Supply matching happens on Tuesday, and actual box contents, including substitutions, are finalised Tuesday evening. The per-box cost can vary depending on what’s included. Charging at signup means charging for a box whose contents aren’t yet known.

Lee raised this during the Event Storming session. Three options were considered.

Alternatives considered:

  1. Charge on signup, fixed price. Simplest. But absorbs cost variance, and customers pay for a box they haven’t received.
  2. Charge on signup, adjust on delivery. Complex. Confusing multiple charges.
  3. Charge on delivery day. Single charge, accurate amount, aligned with value delivery.

Decision: Charge on delivery day (Thursday), after box contents are finalised (Tuesday evening).

Consequences:

  • Positive: Accurate billing. No adjustments or refunds.
  • Positive: Payment aligns with value delivery, reducing early cancellations.
  • Negative: Revenue less predictable week to week.
  • Negative: Pause and cancellation logic must account for the billing-delivery coupling.

“Ravi,” Charlotte says. “Read that. Does it change how you’d have implemented the pause?”

Ravi reads it. The bit about variable contents. The bit about billing coupled to delivery.

“Yes. Completely.”

LLMs help write ADRs

The team has months of decisions behind them. Charlotte has a shortcut.

Tom pulls up the original PR that implemented charge-on-delivery. The description references the Event Storming session. Comments link to a Slack thread where Lee explained the reasoning. He feeds these to the LLM:

Draft an ADR for the decision to charge on delivery day instead of signup day. Here’s the PR description, the Slack thread, and the commit messages.

The LLM produces a draft. It’s 80% there, misses some nuance about premium substitution costs, invents an alternative that wasn’t considered. Tom and Maya correct it in ten minutes.

Git History
PRs, commits
Slack Threads
Discussions
LLM
Drafts ADR
Team Reviews
& Corrects
Published ADR

The team writes twelve ADRs over the following week. About twenty minutes each, including review.

That same week, Ravi deploys the pause feature update and forgets to set the config flag, all 3,000 subscribers can suddenly see “Pause Subscription.” Three pause immediately. Tom: “We need a proper feature flag system.” They add a simple environment variable for visibility. Charlotte insists they record it: ADR-013, “Features deploy dark by default.” It’s the first new ADR after the retrospective batch, the first one that captures a decision as it happens rather than months later.

Tom’s ADR that wasn’t good enough

Tom writes one the following week.


ADR-014: Use Stripe for payments

Date: February 2026 Status: Accepted

Context: We needed a payment provider.

Decision: Use Stripe.

Consequences: It works.


Charlotte pushes back. “This tells a future reader nothing they couldn’t figure out from the imports. Why Stripe? What else did you consider?”

“It was obvious. Stripe is the standard.”

“Was it? Did you consider Square? Direct bank transfers? Did anyone raise concerns about lock-in?”

Tom pauses. “Actually, Maya wanted a local provider because the fees were lower. And Priya pointed out Stripe’s webhook system was more reliable for delivery-day billing.”

“That’s the ADR. Not ‘Stripe because it’s popular.’ Stripe because webhook reliability for delivery-day billing outweighed the fee advantage. That’s a real trade-off.”

Tom rewrites it with the actual reasoning: three alternatives considered, webhook reliability as the deciding factor, fee trade-off acknowledged, vendor lock-in noted as a consequence. Now someone reading it in a year knows exactly which assumptions to revisit if they want to switch.

ADRs as LLM context

When Kai asks the LLM to implement a new billing feature, he includes the relevant ADRs in the PromptThe input you hand to an LLM – system instructions, user message, examples, retrieved documents, tool descriptions, the lot. . The LLM produces code that respects the constraints because the ADRs explain why the system works the way it does, not just how.

ADRs aren’t just for humans. They’re context that makes LLM output more accurate. Bounded contexts give the LLM structural boundaries. ADRs give it reasoning. Both make the generated code more likely to be correct.

When to write ADRs

Charlotte’s rule: write an ADR whenever you make a decision that a new team member would question six months from now. If the answer requires a story, a constraint, a trade-off, a workshop insight, you need one. If the answer is “because it was obvious,” you probably don’t.

ADRs aren’t permanent, decisions can be superseded. They’re not consensus documents, dissent goes in too. And they’re not a substitute for conversation. They’re the output of conversations, preserved so the conversation doesn’t have to happen again.

Code is how. ADRs are why. When LLMs are generating the code, you need the why more than ever, because the LLM will always produce confident code. The question is whether it’s confidently right or confidently wrong.

The team now has bounded contexts, decision tables, and ADRs. But a new question is emerging: should Greenbox build its own delivery tracking system, or buy one? Tom thinks the LLM can build it in a week. Charlotte thinks that’s the wrong question. She suggests Wardley Mapping, a way to figure out whether you should build, buy, or borrow before anyone writes a line of code.

The next chapter, Value Stream Mapping: Finding Where to Focus, publishes around 11 August.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.