Capstone — draw and label your own agent loop — The Agent Loop — what's actually running when an agent "works for hours"

The diagram comes before the build decision.

Not because diagrams are valuable in themselves — most diagrams are not. Because the act of drawing the loop forces every implicit decision into the open where it can be examined, debated, and made deliberately rather than inherited by accident.

The teams that produce the most reliable agent systems are not the ones with the most experienced engineers. They are the ones who drew the loop before writing the first line of code. The diagram is not a deliverable; it is a thinking tool. Its value is in the conversation it forces, not in the artifact it produces.

You have now seen the five-component loop, the context-window band model, the toolbelt design discipline, the four stopping conditions, the six hour-by-hour failure regimes, two SDK reference implementations, and the multi-agent quota mechanics. The capstone asks you to apply all of it to a real system you are actually being asked to build, scope, or evaluate.

Not a toy. Not a hypothetical. A system in your current backlog, vendor pipeline, or engineering spec queue.

The capstone produces four artifacts. Each one is useful on its own; together they constitute a system-design package that a leadership audience can evaluate without reading the underlying code.

Artifact 1: The loop diagram. Five components labelled, arrows annotated with their failure modes, stopping condition named and stated as a function, budget meter drawn. The diagram should be legible to someone who has not read this course — label everything, annotate the arrows, name the stopping conditions explicitly. If you are diagramming a multi-agent system, add a second layer showing the parent-child topology and the shared quota meter.
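
One way to read "stated as a function" is sketched below. It assumes a loop with only two illustrative stopping conditions (goal met, budget exhausted), not the course's full set of four, and every name in it (`BudgetMeter`, `should_stop`, the step and token ceilings) is hypothetical rather than taken from any SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetMeter:
    """Tracks what the loop has spent against hypothetical ceilings."""
    max_steps: int
    max_tokens: int
    steps_used: int = 0
    tokens_used: int = 0

    def charge(self, tokens: int) -> None:
        # Record one loop iteration and the tokens it consumed.
        self.steps_used += 1
        self.tokens_used += tokens

    def exhausted(self) -> bool:
        # Either ceiling being reached counts as an exhausted budget.
        return (self.steps_used >= self.max_steps
                or self.tokens_used >= self.max_tokens)


def should_stop(goal_met: bool, meter: BudgetMeter) -> Optional[str]:
    """Return the name of the stopping condition that fired, or None to continue."""
    if goal_met:
        return "goal_met"
    if meter.exhausted():
        return "budget_exhausted"
    return None
```

Returning the name of the condition, rather than a bare boolean, gives the diagram label and the telemetry event the same vocabulary.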

Artifact 2: The defaults inventory. Every implicit decision the framework or SDK is making for you, listed as a two-column table: the default on the left, your override (or deliberate acceptance) on the right. If an entry in the default column is blank because you do not know what the default is, that is the first thing to find out. Inherited unknowns are a leading cause of "the agent behaved strangely when..." incidents.
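
The two-column table can also be kept as data, which makes the unknowns countable. The sketch below is a minimal illustration: the `DefaultEntry` type, the setting names, and the example values are all assumptions for demonstration, not the defaults of any particular SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DefaultEntry:
    setting: str
    default: Optional[str]   # None means "we do not know what the SDK does"
    decision: Optional[str]  # an override, or "accept"; None means undecided


def unknowns(inventory: List[DefaultEntry]) -> List[str]:
    """Settings whose default nobody on the team has looked up yet."""
    return [e.setting for e in inventory if e.default is None]


def undecided(inventory: List[DefaultEntry]) -> List[str]:
    """Settings whose default is known but not yet overridden or accepted."""
    return [e.setting for e in inventory
            if e.default is not None and e.decision is None]


# Hypothetical inventory; the values are placeholders, not real SDK behavior.
inventory = [
    DefaultEntry("retry count on transient API failures", None, None),
    DefaultEntry("context-compression strategy", None, None),
    DefaultEntry("max loop iterations", "unbounded (assumed)", "override: 40"),
]

print(len(unknowns(inventory)), "unknown defaults to look up")
```

An entry leaves the `unknowns` list only when someone has read the default, and leaves `undecided` only when someone has chosen to override or accept it.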

Artifact 3: The failure-mode map. For each arrow in the loop diagram, the failure mode if the arrow breaks, drops a message, or returns garbage. For each component in the diagram, the failure mode if the component is unavailable, returns stale data, or exceeds its budget. For multi-agent systems, add: the failure mode if a child never sends a completion signal, and the failure mode if two children write to the same resource simultaneously.
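
A simple coverage check can confirm that no arrow or component was skipped. The sketch below assumes the diagram elements are written down as plain strings; the element names and failure descriptions are hypothetical examples, not the course's five components.

```python
from typing import Dict, List


def uncovered(elements: List[str], failure_map: Dict[str, List[str]]) -> List[str]:
    """Return diagram elements that have no failure mode written down."""
    return [e for e in elements if not failure_map.get(e)]


# Hypothetical arrows and components from a loop diagram.
elements = ["model -> tool", "tool -> model", "memory", "budget meter"]

failure_map = {
    "model -> tool": ["malformed tool arguments"],
    "tool -> model": ["partial result that looks like a success"],
    "memory": ["returns stale data"],
}

print(uncovered(elements, failure_map))  # ['budget meter']
```

An element with an empty list is treated the same as a missing one, so a placeholder row does not count as coverage.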

Artifact 4: The one-page memo. Written for a leadership audience that will not read the diagrams. Four paragraphs, one for each question: what this system is and what it does; what it is not and what it explicitly defers; what could break and what the blast radius is; what it costs to run and which telemetry signals the team is watching. Aim for one page and treat two pages as the hard ceiling. If the memo runs longer, you have not yet decided what is load-bearing.

The picture

The published version of this lesson will show one anonymized worked example, a real PL scoping session documented end to end: the diagram, the defaults inventory, the failure-mode map, and the one-page memo, in sequence.

The system being scoped: the PL learning-path recommendation agent, which evaluates a learner's progress state and proposes next courses and cases. It is a moderately complex single-agent loop with a retrieval step, a structured output requirement, and a need to handle learner-state changes between recommendations gracefully.

That worked example will appear here when the lesson page ships. Until then, this course is the worked example: the loop that authored these eight lessons is itself a demonstration of the pattern.

Why it matters now

The output of this course is not "you understand agent loops." It is "you can produce the four-artifact package the next time somebody asks you to ship one."

Most teams skip directly from idea to implementation because the design artifact is not required by any process. The four-artifact package is not required either. It is useful. The distinction between "impressive demo" and "shippable system" lives almost entirely in whether the defaults inventory and failure-mode map exist before code is written, or are discovered through incidents after it ships.

Making the diagram is the cheapest insurance against the most common class of agent production incidents. It takes two hours with an engineering partner. It is reusable — the next system will rhyme with this one.

The four-artifact package also solves a communication problem that pure implementation does not. Engineers can read code and understand what a system does. Leadership cannot. The one-page memo translates the loop into a decision-quality document: what this enables, what it costs, what could go wrong, what we measure. That document gets the system approved, funded, and reviewed at the right level.

A source you should trust

The seven preceding lessons of this course. The diagram is built from Lesson 1; the defaults inventory requires the SDK reading from Lesson 6; the failure-mode map uses the taxonomy from Lesson 5; the one-page memo draws on the rules from all seven lessons. You cannot complete the capstone by skimming forward to this lesson.
Amazon's six-page memo discipline. The writing model for the one-page memo. Not the six-page format — the discipline of narrative-first, no slideware, executive audience, explicit unknowns. A well-written one-page memo can be reviewed in seven minutes and debated in twenty. A well-designed slide deck takes forty minutes to present and is ambiguous about what decision is being asked for.
Your own incident log. If your team has shipped any AI feature and has a record of what broke, that log is the highest-signal source for your failure-mode map. Failure modes you have already experienced are the ones most worth designing against in the next system.

A recipe

A two-hour capstone working session with one engineering partner:

Pick the system. (15 min) Choose a real system from your current backlog or pipeline. Avoid systems that are already built, because the diagram is most useful before implementation. Write one sentence describing the system's job to be done (JTBD).
Draw the loop together. (30 min) Both people should have a blank sheet of paper. Draw independently for ten minutes, then compare. The differences in your diagrams are the implicit assumptions that would have become disagreements in code review.
Inventory the defaults. (20 min) List every default you inherit from the framework or SDK. If you do not know a default, write a question mark. Count the question marks. Each one is a production incident waiting to be scheduled.
Map failure modes. (25 min) For each arrow and each component, name the failure. Focus on failures that are non-obvious — not "the API is down" but "the API returns a partial result that looks like a success."
Write the one-page memo. (30 min) Do not write it as a slide deck. Write it as prose. Four paragraphs. The memo should be able to stand alone without the diagrams.
Sleep on it. Revisit the next morning. The thing that bothers you at 9am about the diagram you drew at 4pm is the real risk. Write it down.
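
The non-obvious failure named in the failure-mode step, a partial result that looks like a success, can be made concrete with a small classifier. This is a sketch under stated assumptions: the response shape (`status`, `items`, `truncated`) is a hypothetical tool contract invented for illustration, not the format of any real API.

```python
def classify_tool_result(response: dict, expected_count: int) -> str:
    """Classify a hypothetical tool response as 'ok', 'partial', or 'error'."""
    if response.get("status") != "ok":
        return "error"
    items = response.get("items", [])
    # A success status with a truncation flag, or with fewer items than
    # requested, is a partial result and must not be treated as success.
    if response.get("truncated") or len(items) < expected_count:
        return "partial"
    return "ok"


print(classify_tool_result({"status": "ok", "items": [1]}, expected_count=3))
# partial
```

The useful property is that "partial" is its own outcome: the loop has to decide what to do with it rather than passing it through as a success.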

The smell of it going wrong

The diagram has fewer than five components. Either you have not seen the whole loop yet, or the system is genuinely simpler than an agent loop (in which case, use a workflow, not an agent).
The defaults inventory has no question marks. Either the team has genuinely read every default in the SDK — possible — or the inventory was filled in with assumptions. Ask the question-mark question: "are you certain, or are you assuming?"
The failure-mode map skips the hour-three regimes — context eviction and silent-success hallucination. These are the hardest to surface from first principles and the most common in production. If they are missing from your map, borrow them directly from Lesson 5.
The one-page memo is longer than two pages. The excess is usually the team trying to hedge risk in prose rather than designing it out. Cut anything that is not answering one of the four paragraph questions.
The capstone was done solo, not with a partner. Two people drawing independently and comparing is where the implicit disagreements surface. Solo diagramming produces a coherent artifact that reflects one mental model. It does not find the gaps.

A judgment call from real work

The PL harness-engineering Path — the learning path that teaches the subagent worktree pattern, the CLAUDE.md harness conventions, and the multi-agent quota mechanics — was scoped using exactly this four-artifact discipline before any lessons were commissioned.

The loop diagram identified the memory layer as the most underspecified component: the MEMORY.md auto-memory system had evolved through several incident-driven iterations and was not yet explicitly documented as a designed system with eviction rules and injection priorities. Drawing the diagram made that gap visible in ten minutes. It had been invisible in the codebase for months.

The defaults inventory surfaced three SDK defaults that the harness had been relying on without knowing: the default context-compression strategy, the default model-selection behavior when a tool call returns an error, and the default retry count on transient API failures. All three had been correct by coincidence. Making them explicit transformed them from luck into policy.

The failure-mode map produced the checkpoint requirement that became part of the subagent spawn protocol — the requirement that children commit work-in-progress before advancing past a complexity threshold. The failure mode it was designed against: a child agent failing to send a completion signal within its wall-clock timeout, leaving the parent uncertain whether the work was complete, partial, or lost.
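
The parent's three-way uncertainty (complete, partial, or lost) can be sketched as a classification over the child's last known state. The code below is an illustration only: `ChildState`, `classify_child`, and the idea of a checkpoint identifier are assumptions made for this example, not the actual spawn protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChildState:
    started_at: float                      # seconds, from any monotonic clock
    completed: bool = False                # True once the completion signal arrives
    last_checkpoint: Optional[str] = None  # e.g. a commit id (hypothetical)


def classify_child(child: ChildState, now: float, timeout_s: float) -> str:
    """Decide what the parent can assume about a child's work."""
    if child.completed:
        return "complete"
    if now - child.started_at < timeout_s:
        return "running"
    # Timed out with no completion signal: a checkpoint means some work
    # survives and can be resumed; no checkpoint means it must be redone.
    return "partial" if child.last_checkpoint else "lost"
```

Under this sketch the checkpoint requirement is what turns "lost" into "partial": without a committed work-in-progress marker, a timed-out child gives the parent nothing to resume from.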

The one-page memo was the brief that authorized the course — it was what turned a diagram and a checklist into a funded piece of PL's curriculum. The diagram was for the engineer. The memo was for the decision.

Rules from this lesson

The diagram comes before the implementation, not after; making it takes two hours, and the alternative is discovering the implicit disagreements in production.
Every default is a decision; surface them and inherit deliberately, not by accident.
The one-page memo is the artifact leadership actually reads; budget the writing time, not just the diagramming time.
Capstones are reusable templates; save your diagram, because the next system you scope will rhyme with this one.