The decision log is the most boring, least-loved, most under-maintained artifact in almost every team I've worked with — and the one whose absence causes the most expensive mistakes.
I'll admit the obvious: keeping a decision log feels like bureaucracy. When you're two months into a build, fighting a deadline, trying to get the spec clean before the build window closes, the last thing you want is another markdown file to update. So teams skip it. Three months later, when a production behavior is doing something nobody remembers designing, the decision log doesn't exist — and neither does the institutional memory that would have told you whether the behavior is a bug or a deliberate choice from a long-forgotten conversation.
This is also true of solo work. I am, most of the time, the only human on my projects. I keep a decision log anyway. Past-me is functionally a different collaborator from present-me, and the only interface between us is the artifacts we leave behind.
Why This Matters More in Agent Systems
In a human-only engineering team, the decision log is useful. In an agent-augmented team, it's infrastructure. Three reasons.
Recoverability. When an agent makes a decision during a build — choosing overwrite semantics for a duplicate row, picking a library version, deciding what "validate" means in a spec that didn't define it — that decision gets baked into code but not into anyone's head. The agent won't remember next session. You won't remember next week. If the decision was wrong, the decision log is where the alternative was recorded — which means it's where you can go to roll back, rather than reconstruct.
Audit trail. Tier 3 and Tier 4 systems get asked "why did this system do that?" The answer "because the agent generated it that way" is not legally or commercially adequate. The answer "here is the spec section, here is the intent contract, here is the decision log entry where we chose this approach, here is the reviewer's sign-off" is. This is the same structure that ISO 42001 and the EU AI Act's high-risk provisions demand. A decision log is cheap insurance for the day a certifier, an auditor, or a regulator asks.
Knowledge extraction. Every project produces two outputs: the system and the lessons. The decision log is how you capture the second one. Without it, you ship the system and lose the lessons — which means the next project repeats the same mistakes. VZYN Labs and TravelOS repeated two separate mistakes I had already made on earlier projects. I didn't have the log. I have it now.
The Template
### DEC-[NNN] — [Short Title]
- **Date**: YYYY-MM-DD
- **Area**: [project / venture / cross-cutting]
- **Context**: What situation triggered this decision? What constraints exist?
- **Options Considered**:
1. Option A — [description, pros, cons]
2. Option B — [description, pros, cons]
3. Option C — [description, pros, cons]
- **Decision**: [which option and why]
- **Reasoning**: Why this option over the others
- **Trade-offs Accepted**: What you gave up
- **Reversibility**: Cheap (<1 day) / Moderate (1-5 days) / Hard (weeks) / Irreversible
- **Blast Radius**: Which systems, users, contracts are affected if this is wrong?
- **Approved By**: Who signed off (self-approval is valid for solo work — record it)
- **Expected Outcome**: What should happen if this is right
- **Review Trigger**: When to revisit — date, metric, or event
- **Actual Outcome**: [fill in later]
- **Status**: pending | validated | invalidated | mixed
- **Lesson**: [fill in later]
Two fields are easy to skip and worth defending. Reversibility is the single most useful field in the template because it tells future-you how careful to be. A cheap-to-reverse decision deserves a lighter review than an irreversible one. Blast radius is the second — it forces you to write down what breaks if you're wrong, which is often a more honest test of the decision than the reasoning itself.
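The reversibility-drives-review-depth idea can be operationalized. A minimal sketch in Python — every name here (`Decision`, `REVIEW_DEPTH`, the review-depth strings) is hypothetical illustration, not part of any real tooling — showing how the template's reversibility buckets could map to how much scrutiny a decision gets before it ships:

```python
from dataclasses import dataclass
from enum import Enum

class Reversibility(Enum):
    CHEAP = "cheap"            # < 1 day to undo
    MODERATE = "moderate"      # 1-5 days
    HARD = "hard"              # weeks
    IRREVERSIBLE = "irreversible"

# Hypothetical policy: review depth scales with the cost of being wrong.
REVIEW_DEPTH = {
    Reversibility.CHEAP: "self-review, log and move on",
    Reversibility.MODERATE: "second pair of eyes before commit",
    Reversibility.HARD: "external reviewer plus explicit sign-off",
    Reversibility.IRREVERSIBLE: "full review: spec, blast radius, sign-off",
}

@dataclass
class Decision:
    dec_id: str
    title: str
    reversibility: Reversibility
    blast_radius: list[str]  # systems/users/contracts affected if wrong

    def required_review(self) -> str:
        return REVIEW_DEPTH[self.reversibility]

pivot = Decision(
    dec_id="DEC-001",
    title="Architecture pivot to single agent + skills",
    reversibility=Reversibility.HARD,
    blast_radius=["entire product", "investor relationship"],
)
print(pivot.required_review())  # external reviewer plus explicit sign-off
```

The point is not the specific policy strings but the coupling: reversibility is recorded once, at write-time, and everything downstream (how hard to review, how fast to approve) can key off it.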
When to Write an Entry
- Any architectural choice that affects more than one module
- Any decision where two or more options were seriously considered
- Any reversal of a prior decision (always log, even if brief)
- Any pivot — abandoning a direction is a decision
- Any trust tier classification for a new system
- Any intent contract revision
- Any model swap, major prompt change, or harness change for Tier 3-4 systems
Not every small choice needs an entry. The bar is: would a reasonable successor, reading the code in six months, be able to understand why this is the way it is without the log? If yes, skip. If no, log.
When to Update an Entry
Two passes per entry, at minimum.
First pass: write-time. Context, options, decision, reasoning, trade-offs, reversibility, blast radius, approval, expected outcome, review trigger. Everything above "actual outcome" in the template. This takes ten to twenty minutes for a meaningful decision. If it takes longer, the decision itself probably isn't clear enough yet.
Second pass: review-time. When the review trigger fires, you come back and fill in actual outcome, status, and lesson. This is the pass teams skip most often, and it's the one that converts the log from a record into a learning system. Without the second pass, you have history. With it, you have patterns.
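The second pass can be nudged mechanically. A sketch, assuming entries live in a markdown file using the template's `### DEC-NNN` headers and `- **Review Trigger**:` / `- **Status**:` bullets; the parsing is deliberately naive and the function name is my own, not from any existing tool:

```python
import re
from datetime import date

def overdue_reviews(log_text: str, today: date) -> list[str]:
    """Return IDs of entries whose date-based review trigger has passed
    while status is still 'pending'. Assumes the template's field names."""
    overdue = []
    # Split on the template's '### DEC-NNN' headers; drop text before the first.
    for block in re.split(r"(?m)^### ", log_text)[1:]:
        dec_id = block.split(" ")[0].strip()
        trigger = re.search(r"\*\*Review Trigger\*\*:\s*(\S+)", block)
        status = re.search(r"\*\*Status\*\*:\s*(\w+)", block)
        if not (trigger and status):
            continue
        try:
            due = date.fromisoformat(trigger.group(1))
        except ValueError:
            continue  # trigger is a metric or event, not a date
        if status.group(1) == "pending" and due < today:
            overdue.append(dec_id)
    return overdue

sample = """\
### DEC-007 — Engineers at END
- **Review Trigger**: 2026-09-17
- **Status**: pending
"""
print(overdue_reviews(sample, date(2026, 10, 1)))  # ['DEC-007']
```

Run from a cron job or a pre-commit hook, this turns "review on schedule" from a discipline problem into a reminder — which is the only reliable way the second pass actually happens.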
Worked Example: DEC-001, VZYN Labs Architecture Pivot
- Date: 2026-03-09
- Area: VZYN Labs
- Context: Thirteen-agent system built on hexagonal architecture. Two months of engineering. MVP less than exciting. Over budget. Investor concerned. An independent architect reviewed the codebase and flagged it as over-engineered. Engineers proposed to fix it in place.
- Options Considered:
  1. Keep iterating — improve each of the 13 agents individually.
  2. Reduce to fewer agents (3-5) with clearer boundaries.
  3. Pivot to single agent + skill catalog + playbooks (Ramp model).
- Decision: Option 3 — single agent, 57 skills, 5 deterministic playbooks.
- Reasoning: Ramp proved single-agent-plus-skills works at scale. The problem wasn't individual agent quality — it was orchestration complexity. Skills are simpler to build, test, and maintain than agent-to-agent communication. Playbooks give deterministic reliability for known workflows.
- Trade-offs Accepted: Two months of prior engineering discarded. Investor relationship strained. Engineers' work invalidated.
- Reversibility: Hard. The architecture change is weeks of work and requires rewriting most of the spec. Reverting would cost more than continuing.
- Blast Radius: Entire product. All 13 agent teams. Existing engineering commitments. Investor relationship.
- Approved By: Self, with independent architect as external reviewer.
- Expected Outcome: Simpler codebase, faster iteration, spec-driven development possible, focus shifts from orchestration plumbing to user value.
- Review Trigger: Eight weeks from decision date, or when MVP hits a demo-able state.
- Actual Outcome (partial, at eight-week review): Pivot worked. MVP reached demo-ready in substantially less time than the original architecture had consumed. Engineers adapted to the new methodology with difficulty but adapted.
- Status: mixed — architecture validated; organizational cost higher than predicted.
- Lesson: Tier-classify first. A Tier 2 marketing tool doesn't need enterprise architecture. Over-engineering is a failure mode that looks like competence. And — this is the deeper lesson — agent-per-user parity is a complexity multiplier. Any design where agents need to communicate with other agents to do their job is too complex. One agent, many skills, user as coordinator.
Worked Example: DEC-007, Engineers at END (Validate + Maintain), Not at BUILD
- Date: 2026-03-17
- Area: Dark Factory / all ventures
- Context: VZYN Labs pivot revealed that experienced engineers, given a simplification spec and twenty hours, produced a half-baked result. They resisted executing someone else's design. The role mismatch was structural, not individual.
- Options Considered:
  1. Keep engineers in the full-cycle role (understand → design → build → validate → maintain).
  2. Pair engineers with AI agents as co-builders.
  3. Move engineers to the END of the pipeline — validate what agents build from specs.
- Decision: Option 3 — engineers as Software Validators, not Software Engineers.
- Reasoning: The spec architect's superpower is problem understanding and spec articulation. Traditional engineers want to re-solve the problem their way. The mindset for "solve from scratch" is incompatible with "verify this matches spec." Maps cleanly to the Spec–Tests–Code triangle: architect owns spec, agent owns code, validator owns tests.
- Trade-offs Accepted: Validators have less ownership and may be harder to attract as senior talent. Architecture changes go back through the spec cycle (slower for hotfixes). Depends on spec quality staying consistently high.
- Reversibility: Moderate. Role definitions can change; hires are harder to unwind. Irreversible for hires who self-select out of the validator role.
- Blast Radius: Entire hiring pipeline, team composition, every future project staffing decision.
- Approved By: Self (founder-level org decision).
- Expected Outcome: Faster delivery, lower cost, higher spec fidelity, no more "I would have designed it differently" friction.
- Review Trigger: First three hires under the new role definition, or six months from decision.
- Actual Outcome: Pending at time of writing.
- Status: pending.
- Lesson (L-002, preliminary): Don't hire problem-solvers to verify solutions. Solvers need creative freedom. Verifiers need attention to detail and respect for constraints. One person rarely excels at both, and asking a solver to verify feels like a demotion. When hiring validators, look for meticulous, security-conscious, spec-literate — not creative, architectural, ownership-driven.
What Patterns Look Like
After twelve to fifteen entries, patterns appear. Mine, as of this writing: a preference for integration over separation; a preference for simplification; a willingness to discard prior work when the evidence warrants it; a reliance on proven patterns (who has solved this at scale?); a habit of building on top rather than underneath (use existing infra, own the methodology layer); a bias toward niche over broad.
The patterns are not prescriptions. They are observations about how I decide — and they give me leverage, because when a new decision comes up, I can check it against the patterns. "This decision goes against my usual instinct to simplify — am I doing it for a good reason, or am I about to over-engineer?" That's the self-audit the log makes possible.
There are also anti-patterns the log surfaced: the Parity Trap (agent-per-user matching), the Elegant Complexity Trap (designs that mirror human team structures), the Solver-as-Verifier Trap (asking creative engineers to validate someone else's spec). Each of these was a real mistake that cost real time. I don't expect to avoid every recurrence. I do expect to catch them earlier next time because the log named them.
What This Can't Do
A decision log doesn't make decisions for you, doesn't catch bad reasoning in the moment, and doesn't prevent the same mistake twice if you forget to review it. It's an artifact, not a process. The process — write decisions, review them, extract patterns — is what turns the artifact into leverage.
And it only works if you write the entry before you know the outcome. Writing it after the fact, when you already know whether the decision was right, is rationalization, not reasoning. The value of the log is the frozen expectation: what you thought would happen, captured at the moment you chose. The gap between expectation and outcome is where the lesson lives.
Keep the log. Write entries honestly. Review them on schedule. The three months in which nothing bad happens will feel like wasted effort. The one moment when everything is on fire and someone asks "why does this system behave this way" will pay for every hour spent maintaining it, several times over.