There are two kinds of AI writing: the kind that tells you what's possible, and the kind that tells you what to do on Monday morning. This manual is the second kind.
This is not a manual about how to use ChatGPT. It is a manual about judgment — the judgment a builder, PM, or founder needs to ship AI products that survive contact with users, budgets, and regulators.
Each chapter ends with a numbered set of Rules. Rules are short, opinionated, citable, and contestable. If you disagree, fork them. If you agree, link to them. The Rule IDs (ai-1, ai-2, …) are permanent — they will outlive any model name in this manual.
Read in order if you are new to building with AI. Jump in by chapter if you have a specific decision in front of you today.
The fourteen chapters
1. When AI is the right answer (and when it isn't)
The first decision is not which model — it is whether the problem needs a model at all. Most teams reach for AI when they should reach for a SQL query, a checklist, or a one-line rule. This chapter sharpens the judgment of when to stop reading vendor decks and write an if statement instead.
2. The model-selection ladder
Opus, Sonnet, Haiku, GPT-4o, GPT-4o-mini, Gemini Flash, Llama, your own fine-tune. The ladder is not "biggest model wins." It is "smallest model that clears the bar." This chapter teaches you to start at the bottom and climb only on evidence — because the cost gap between rungs is 10× to 100×, and your unit economics will not survive a default-to-frontier habit.
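The "start at the bottom and climb only on evidence" idea can be sketched in a few lines. This is a minimal illustration, not the chapter's method: the rung names, the scores, and the `pick_model` helper are all hypothetical, and a real `score` function would run your own eval set against each model.

```python
from typing import Callable, Optional, Sequence


def pick_model(
    ladder: Sequence[str],
    score: Callable[[str], float],
    bar: float,
) -> Optional[str]:
    """Return the first (cheapest) rung whose eval score clears the bar."""
    for model in ladder:  # assumed to be ordered cheapest -> most expensive
        if score(model) >= bar:
            return model
    return None  # nothing clears the bar: revisit the problem, not just the model


# Hypothetical rungs and eval scores, for illustration only.
fake_scores = {"small": 0.71, "medium": 0.88, "frontier": 0.93}
ladder = ["small", "medium", "frontier"]

chosen = pick_model(ladder, lambda m: fake_scores[m], bar=0.85)
print(chosen)
```

The loop stops at the first rung that passes, so the frontier model is only ever evaluated when the cheaper rungs have failed on evidence.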
3. Prompt design as product design
A prompt is a spec, not a wish. This chapter treats prompts the way the rest of the manual treats PRDs: with structure, constraints, examples, and a quality bar. By the end you should be able to review a teammate's prompt the way you review their copy — line by line, asking what each sentence is buying you.
4. Eval before launch
If you can't measure whether the AI is right, you can't ship it — you can only hope. This chapter walks through building a small, sharp evaluation set before any AI feature touches production, and treating it as a regression suite you run on every prompt or index change. Shipping without an eval set is the AI-era equivalent of shipping without tests.
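As a rough sketch of what "an eval set you run as a regression suite" might look like, assuming the simplest possible grader (a substring check) and a stubbed model call. `EvalCase`, `pass_rate`, `gate`, and `stub_generate` are illustrative names, not part of any library, and the cases are invented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    must_contain: str  # simplest possible check; real suites often need richer graders


def pass_rate(cases: List[EvalCase], generate: Callable[[str], str]) -> float:
    """Fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(
        1 for c in cases if c.must_contain.lower() in generate(c.prompt).lower()
    )
    return passed / len(cases)


def gate(cases: List[EvalCase], generate: Callable[[str], str], threshold: float) -> bool:
    """True if this prompt or index change is allowed to ship."""
    return pass_rate(cases, generate) >= threshold


# Invented cases and a stub standing in for the real model call.
CASES = [
    EvalCase("What is the refund window?", "30 days"),
    EvalCase("Do you ship to Canada?", "yes"),
]


def stub_generate(prompt: str) -> str:
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days.",
        "Do you ship to Canada?": "Yes, we ship to Canada.",
    }
    return canned.get(prompt, "")


print(gate(CASES, stub_generate, threshold=0.9))
```

Swapping `stub_generate` for the real call, and running `gate` in CI on every prompt or index change, is what turns the set into a regression suite.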
5. Hallucination as a product problem
Hallucination is not a bug to be fixed in the next model release. It is a permanent property of how these systems work, and the PM's job is to design around it — with grounding, with confidence signals, with UI that shows sources, with topic guards on high-stakes categories. This chapter turns hallucination from an engineering surprise into a product constraint you spec for.
6. Tool use, function calling, agents — the maturity ladder
A chatbot that answers questions is one thing. A system that books your flight, refunds your customer, or files your GST return is another. This chapter walks the maturity ladder from single-prompt to function-calling to multi-step agents — and teaches you when each rung adds value and when it adds blast radius. Most teams skip rungs and pay for it in production.
7. RAG, fine-tune, or context window?
There are three ways to give a model your data. In increasing order of cost and operational drag, they are the context window, RAG, and fine-tuning. This chapter gives you a decision tree: start with the context window if your data fits, move to RAG when it doesn't, fine-tune only when prompts and retrieval are both exhausted. The wrong choice here costs months, not days.
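The decision tree described here is small enough to write down directly. The sketch below is one possible reading of it; the function name, the token-count inputs, and the two "exhausted" flags are assumptions made for illustration, and deciding when an approach is truly exhausted is a judgment call the code cannot make for you.

```python
def choose_data_strategy(
    data_tokens: int,
    context_budget_tokens: int,
    prompts_exhausted: bool,
    retrieval_exhausted: bool,
) -> str:
    """Pick the cheapest way to give a model your data."""
    # Fine-tune only when prompting and retrieval have both been tried and failed.
    if prompts_exhausted and retrieval_exhausted:
        return "fine-tune"
    # Otherwise prefer the cheapest option that can hold the data.
    if data_tokens <= context_budget_tokens:
        return "context window"
    return "RAG"


# Hypothetical sizes: a 10k-token handbook vs. a 5M-token document archive.
print(choose_data_strategy(10_000, 100_000, False, False))
print(choose_data_strategy(5_000_000, 100_000, False, False))
```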
8. AI UX patterns that work
Streaming responses, "thinking" indicators, citations panels, suggestion chips, undo affordances, confidence pills, the "regenerate" button. AI UX is its own subfield now, and most teams reinvent it badly. This chapter is a pattern library with opinions — what to copy, what to skip, what to invent only when the problem is genuinely new.
9. Cost & latency as first-class product constraints
Inference is not free, and a three-second response is not a feature — it is a churn driver. This chapter treats cost per inference and p95 latency the way the SaaS manual treats CAC and conversion rate: as numbers you put on the dashboard and answer for in every review. If you cannot recite your cost-per-user-per-month, you do not have a product, you have a science fair project.
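Both dashboard numbers reduce to short arithmetic. The sketch below assumes per-million-token pricing and uses the nearest-rank method for p95; the prices, token counts, and request volume are made-up inputs, not real vendor rates.

```python
from typing import Sequence


def cost_per_user_per_month(
    requests_per_user: int,
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Monthly inference cost for one user, in USD."""
    per_request = (
        input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    ) / 1_000_000
    return requests_per_user * per_request


def p95(latencies_ms: Sequence[float]) -> float:
    """95th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Hypothetical usage: 200 requests per month, 1,500 tokens in and 400 out,
# at $1 and $4 per million tokens.
monthly = cost_per_user_per_month(200, 1_500, 400, 1.0, 4.0)
print(f"${monthly:.2f} per user per month")
print(p95(range(1, 101)), "ms")
```

Multiplying the monthly figure by active users gives the line item to compare against revenue per user.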
10. Safety, privacy, compliance for shipping teams
What goes in the prompt, what comes out, where it gets logged, who can see it, and what happens when the regulator calls. This chapter is the shortest path through GDPR, DPDP Act (India), HIPAA-adjacent risks, and the practical safety controls — content filters, PII scrubbing, audit logs, human approval gates — that keep a shipping team out of the news.
11. Building with AI vs. building AI products
The two are not the same. Using Copilot to ship faster is a productivity choice. Building a product where AI is the value proposition is a business choice with a different cost structure, hiring plan, and moat. This chapter separates the two cleanly so you stop accidentally hiring an ML team to do work an API would have done.
12. The 2026 model landscape
A short, opinionated atlas of where the frontier sits today — what each lab is good at, where the open-source line is, and what is likely to be commodity in 18 months. This chapter has a shorter half-life than the others by design; it is the page we update every quarter so the rest of the manual can stay timeless.
13. Learning in the AI Step-Change
Benchmarks shift gradually; step-changes are rare. The AI step-change demands metacognition and epistemic hygiene, skills most professionals have never had to build before. This chapter is the meta-skill underneath all the AI skill-building: knowing what to learn, when to learn it, and whom to trust.
14. Harness Engineering
The cage that lets autonomous agents run safely for hours or days. If you are building something that runs unsupervised — a coding agent, a research agent, a multi-step automation — this is the chapter on the five layers your system needs before you can sleep through the night: the loop, the eval suite, tools and memory, orchestration, and the production observability layer that turns demos into systems you can maintain.
How the Rules system works
Every chapter ends with numbered Rules in this shape:
<Rule id="ai-1">
Use AI for unstructured problems. For structured ones, use code.
</Rule>
Rule IDs are permanent. The body text may sharpen over time. When you write a PRD, a post-mortem, or a strategy doc, cite Rules by ID (ai-1, ai-12, ai-37) so the reader can jump straight to the reasoning. The Rule permalinks PR (forthcoming) will formalize the component and routing.
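Because the IDs follow a fixed `ai-N` shape, citations in a PRD or post-mortem can be found mechanically. The snippet below is only a sketch of that idea: `cited_rule_ids` is a hypothetical helper, and it does not assume anything about the forthcoming permalink format.

```python
import re
from typing import List

RULE_ID = re.compile(r"\bai-\d+\b")


def cited_rule_ids(text: str) -> List[str]:
    """Return each cited Rule ID once, in order of first appearance."""
    return list(dict.fromkeys(RULE_ID.findall(text)))


# Invented document text for illustration.
doc = "Per ai-1 and ai-12, we skip the model here; see ai-1 for the reasoning."
print(cited_rule_ids(doc))
```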
Companion reading from the rest of the manual
The AI manual does not stand alone. It assumes you already think like a PM. If you don't, start here first:
- What Is Product Management — the job, defined.
- Idea to Launch Process — the loop from backlog to ship to measurement.
- Product Prioritization — how to choose, which applies tenfold when AI is on the table.
- Working with Engineers — most AI work happens at this seam.
- Ethical PM — the substrate under chapter 10.
What's drafted today
This file (the ToC) and Chapter 1, When AI is the right answer, are the first installments. Chapters 2 through 14 will land as separate PRs, each in the same shape: opinionated body, real cases, numbered Rules at the end. The seven legacy AI files in this folder (ai-fundamentals.mdx, ai-product-strategy.mdx, building-ai-features.mdx, etc.) remain as the raw material these chapters consolidate from; they will be redirected once the new spine is complete.