Lesson 5.1: Sector-Specific Use Cases: Healthcare, Finance, and E-Commerce — Course 5: Industry Applications and Deployment

Most teams think they are shipping one AI system. In practice they are shipping three different kinds of risk — clinical, financial, and commercial.

Talvinder Singh, from a Pragmatic Leaders session on AI deployment

A RAG system is never just a RAG system. In healthcare it becomes a clinical safety surface. In finance it becomes an audit surface. In e-commerce it becomes a speed and freshness surface.

If you miss that, you build the wrong product with the right model. The demo works. The deployment fails. A doctor gets an answer without the latest lab context. A risk analyst cannot trace why the system surfaced a policy clause. A shopper sees an item marked in stock that sold out nine minutes ago. Same stack. Different failure. Different damage.

The same model becomes three different products

Most AI teams start from the architecture diagram. Vector store, retriever, ranker, generator, guardrails. That is reasonable for engineering. It is weak product thinking.

Your first question is simpler: what is the cost of a wrong answer in this sector? Every other design decision is downstream of that answer.

Sector	If the system is wrong	What the PM must optimise for	Typical Indian context
Healthcare	A clinician misses context, delays care, or trusts a summary that should have been reviewed	Safety, provenance, role-based access, human review	Apollo 24/7, hospital chains, diagnostics platforms, ABDM-linked records
Finance	A user is denied incorrectly, an analyst cannot explain a decision, or sensitive data is exposed	Auditability, policy traceability, permissions, immutable logs	Razorpay, PhonePe, NBFC workflows, RBI and internal policy checks
E-commerce	Users see stale inventory, poor recommendations, or irrelevant search results and abandon	Freshness, latency, ranking quality, catalog hygiene	Flipkart, Meesho, Zepto, dark-store inventory and search

The trap is building one “enterprise RAG platform” and assuming the sectors are just different document collections. They are not. The same retrieval miss means three different things. In healthcare it is unsafe. In finance it is indefensible. In commerce it is a conversion leak.

Healthcare punishes confident wrongness

Healthcare is where PMs learn the hardest lesson fastest: partial context can be more dangerous than no context.

If your assistant summarizes a patient chart but misses the latest creatinine result, the model may still produce a polished answer. That polish is the problem. Clinicians are not reacting to the internal confidence score. They are reacting to the fluency of the output and the time pressure of their day.

// scene:

Clinical workflow review at a Bengaluru hospital group piloting an internal AI assistant for discharge summaries.

Clinical Informatics Lead: “The summary reads well. My concern is source coverage. Which systems are being searched?”

PM: “The assistant retrieves the discharge notes, current medication list, and the latest lab bundle from the hospital information system.”

Consultant Physician: “Latest lab bundle according to which timestamp? The blood gas can land after the summary draft starts. If that result is missing, I sign the wrong document faster.”

PM: “Then the draft should hard-block if the latest lab reconciliation has not completed. No draft is better than a draft built on stale inputs.”

That changed the design. The team stopped treating retrieval completeness as an engineering metric and started treating it as a patient-safety gate.

// tension:

A fluent summary built on incomplete records speeds up the wrong decision.

This is why healthcare RAG needs stricter defaults than most teams expect:

The retrieval source set must be narrow and explicit. Pull from verified clinical systems, not every document someone can upload.
Every answer needs provenance visible to the clinician. Not hidden in logs. Visible in the interface.
High-stakes outputs need human sign-off. Draft the discharge summary, medication counseling note, or visit recap if you want. Do not silently finalize it.
Freshness is a safety property. A record that is only two hours stale may already be clinically unsafe.
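
The hard-block rule from the workflow review can be sketched as a small gate that runs before any draft is generated. This is a minimal illustration, not the hospital group's implementation: `SourceStatus`, `draft_gate`, the source names, and the 30-minute threshold are all hypothetical, and the real threshold is a clinical decision.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class SourceStatus:
    name: str                            # hypothetical label, e.g. "labs"
    last_reconciled: Optional[datetime]  # None: reconciliation has not completed


def draft_gate(
    sources: List[SourceStatus], now: datetime, max_age: timedelta
) -> Tuple[bool, List[str]]:
    """Allow a draft only if every required source is reconciled and fresh."""
    reasons = []
    for src in sources:
        if src.last_reconciled is None:
            reasons.append(f"{src.name}: reconciliation not completed")
        elif now - src.last_reconciled > max_age:
            reasons.append(f"{src.name}: last reconciled too long ago")
    # An empty reasons list means the gate is open.
    return (not reasons, reasons)


# Example: medications are fresh, but the lab bundle is still in flight.
now = datetime(2024, 1, 1, 12, 0)
sources = [
    SourceStatus("medications", now - timedelta(minutes=5)),
    SourceStatus("labs", None),
]
allowed, reasons = draft_gate(sources, now, max_age=timedelta(minutes=30))
print(allowed, reasons)
```

The gate returns the reasons along with the decision so the interface can tell the clinician why no draft was produced, instead of failing silently.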

In the Indian context, this often sits on top of fragmented systems. A hospital group may have one EMR for OPD, another system for labs, a third for radiology, and ABDM-style record-sharing expectations on top of all of it. The actual job is not “add AI.” It is deciding which clinical truths are allowed to enter the answer path, and when.

Apollo 24/7 and similar health platforms are a useful mental model here. The product surface looks digital. The trust surface is still clinical. If the doctor cannot tell where the answer came from, the product should not ask for trust.

Finance does not forgive missing traceability

In finance, a wrong answer is bad. An untraceable answer is worse.

Support copilots, risk-review assistants, policy search tools, collections guidance, merchant compliance helpers — these are all reasonable finance use cases for RAG. But a finance workflow does not end when the answer looks useful. It ends when somebody can defend that answer to a manager, an auditor, or a regulator.

// thread: #risk-ops-ai — Internal discussion at a Mumbai fintech after a policy assistant surfaces the wrong collections script

Meera (Risk Ops): The assistant told an agent to use the old collections script for overdue BNPL accounts.

Karthik (PM): Did it cite the policy source?

Meera (Risk Ops): It cited a PDF from January. Compliance updated the script in March.

Rahul (Compliance): Then the issue is not just answer quality. We have no defensible retrieval policy. Archived policies should not be eligible unless explicitly requested.

Karthik (PM): Agreed. We need effective-date filtering, source whitelisting, and an answer card that shows the exact policy version on screen. No citation, no answer.

Here is the uncomfortable reality. In finance, “helpful” is not a sufficient product standard. You need four things together:

Policy version control. The retriever must know which policy is current, which is archived, and who is allowed to see each one. An answer built from the wrong version is not “almost right.” It is operational debt turned into user-facing guidance.

Decision traceability. If a collections lead, credit analyst, or support agent acts on the answer, you need a record of query, retrieved sources, answer, and user action. This is not optional process overhead. It is the product.

Tight permissions. Finance assistants often touch transaction data, user identity data, internal risk rules, or underwriting playbooks. Your retrieval layer cannot assume “internal user” means “authorized user.”

Explicit refusal boundaries. If the system cannot retrieve a valid source, it should say so. A refusal is cheaper than an answer that causes a wrong customer action.
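
Three of these rules (version control, permissions, refusal) can be sketched as a retrieval filter that runs before any answer is generated. This is an illustration under stated assumptions: `PolicyDoc`, `APPROVED_SOURCES`, the role names, the document IDs, and the dates are all hypothetical, and a production system would also write the query, retrieved sources, answer, and user action to an audit log.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, List, Optional

APPROVED_SOURCES = {"compliance-repo"}  # hypothetical source allowlist


@dataclass(frozen=True)
class PolicyDoc:
    doc_id: str
    topic: str
    version: str
    effective_from: date
    archived: bool
    source: str
    allowed_roles: FrozenSet[str]


def current_policy(
    docs: List[PolicyDoc], topic: str, role: str, today: date
) -> Optional[PolicyDoc]:
    """Pick the newest policy that is approved, in effect, and visible to this role."""
    eligible = [
        d for d in docs
        if d.topic == topic
        and d.source in APPROVED_SOURCES   # source allowlisting
        and not d.archived                 # archived versions are never eligible here
        and d.effective_from <= today      # effective-date filtering
        and role in d.allowed_roles        # "internal" does not imply "authorized"
    ]
    return max(eligible, key=lambda d: d.effective_from, default=None)


def answer_card(
    docs: List[PolicyDoc], topic: str, role: str, today: date
) -> Dict[str, str]:
    """No citation, no answer: refuse when no valid source can be retrieved."""
    policy = current_policy(docs, topic, role, today)
    if policy is None:
        return {"status": "refused", "reason": "no current approved policy for this role"}
    return {
        "status": "answered",
        "policy_id": policy.doc_id,
        "version": policy.version,
        "effective_from": policy.effective_from.isoformat(),
    }


agents = frozenset({"agent"})
docs = [
    PolicyDoc("col-001", "collections", "jan", date(2024, 1, 10), True, "compliance-repo", agents),
    PolicyDoc("col-002", "collections", "mar", date(2024, 3, 5), False, "compliance-repo", agents),
]
card = answer_card(docs, "collections", "agent", date(2024, 4, 1))
print(card)
```

Filtering before generation means the archived January script can never reach the answer, and the refusal path is an ordinary return value that the interface can display.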

Razorpay, PhonePe, and Zerodha each operate in very different parts of finance, but the pattern is consistent: money products are trust products. A finance RAG system that cannot explain itself has stopped being a product assistant and started being a liability.

E-commerce punishes latency and stale context

E-commerce is lower stakes than healthcare or finance. That does not mean it is easy. It means the failure shows up in behavior instead of escalation.

Users do not file a clinical review when search is bad. They bounce. They open another tab. They buy from a different seller. The product pain is commercial, not procedural.

That changes the architecture. In commerce, you usually care less about long-form reasoning and more about whether the system can retrieve the right context fast enough to still matter.

Flipkart-style catalog search, Meesho-style semantic discovery, Zepto-style inventory-aware recommendations, seller-support assistants, return-policy Q&A — all of these look like RAG candidates. But they only work if you respect three constraints.

Freshness beats cleverness. A beautiful answer about a product that is now out of stock is worse than a blunt answer that reflects live inventory. For quick commerce, ten minutes can make the answer stale.

Latency is part of the feature. If a shopper waits three seconds for an “AI search explanation,” you have already lost more value than you created. Fast retrieval and ranking matter more than eloquence.

Catalog quality determines answer quality. If the underlying product titles, attributes, and inventory states are messy, the AI layer inherits the mess. Most commerce “AI failures” start as catalog failures.
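
The freshness constraint can be sketched as a filter that runs before ranking, with a fallback to standard search when nothing fresh survives. This is a minimal illustration: `Candidate`, `rank_fresh`, `search`, the SKU names, and the 600-second default (taken from the ten-minute figure above) are assumptions, and the relevance scores stand in for whatever ranker is actually in use. A latency budget would normally be enforced with a timeout around retrieval, which is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    sku: str
    score: float            # relevance score from a hypothetical ranker
    in_stock: bool
    inventory_age_s: float  # seconds since the inventory state was last confirmed


def rank_fresh(candidates: List[Candidate], max_age_s: float, top_k: int) -> List[Candidate]:
    """Drop out-of-stock or stale items first, then rank what remains by relevance."""
    fresh = [c for c in candidates if c.in_stock and c.inventory_age_s <= max_age_s]
    return sorted(fresh, key=lambda c: c.score, reverse=True)[:top_k]


def search(
    candidates: List[Candidate],
    standard_search: Callable[[], List[str]],
    max_age_s: float = 600.0,
    top_k: int = 3,
) -> Tuple[str, List[str]]:
    """Return AI-ranked SKUs, or degrade to standard search if none are fresh."""
    ranked = rank_fresh(candidates, max_age_s, top_k)
    if not ranked:
        return "standard_search", standard_search()
    return "ai_ranked", [c.sku for c in ranked]


candidates = [
    Candidate("milk-1l", 0.95, in_stock=False, inventory_age_s=30),   # best match, sold out
    Candidate("milk-500ml", 0.90, in_stock=True, inventory_age_s=900),  # stale inventory state
    Candidate("curd-400g", 0.70, in_stock=True, inventory_age_s=45),
]
mode, skus = search(candidates, standard_search=lambda: ["keyword-result"])
print(mode, skus)
```

The highest-scoring item loses to a lower-scoring one whose availability is confirmed, which is the trade the section describes.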

You can see this pattern across Indian commerce companies. Meesho has to contend with inconsistent seller data. Flipkart has to make sense of massive catalog breadth. Zepto has to combine recommendation with real-time availability in a dark-store model. Different operating context, same PM lesson: your retriever is only as good as the operational truth it sits on.

The architecture should follow the consequence, not the hype

Once you understand the failure mode, the design choices become much cleaner.

Design choice	Healthcare default	Finance default	E-commerce default	Indian context cue
Retrieval scope	Narrow, verified clinical systems only	Current policy and approved data sources only	Broad but freshness-ranked product and inventory sources	Fragmented hospital systems, regulated policy repositories, messy seller catalogs
Output style	Draft with source references	Answer with versioned citation and log trail	Fast answer or ranked suggestions	Doctors need proof, risk teams need traceability, shoppers need speed
Fallback	Escalate to clinician review	Refuse and route to manual check	Fall back to standard search or rules	Better to degrade gracefully than bluff
Primary metric	Unsafe answer rate	Uncited answer rate and audit completeness	Conversion, latency, stale-answer rate	Each sector measures damage differently

What I tell PMs is simple: do not start by picking a model. Start by writing the refusal policy, freshness rule, and escalation path for the sector you are in. That document will tell your engineers more about the real product than a page of vendor names.
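
That document can be as small as three fields per sector. The sketch below is one possible shape, not a prescribed template: `SectorPolicy`, `POLICIES`, and `unanswered` are hypothetical names, and the rule text paraphrases the defaults discussed in this lesson.

```python
from dataclasses import dataclass, fields
from typing import Dict, List


@dataclass(frozen=True)
class SectorPolicy:
    refusal_rule: str
    freshness_rule: str
    escalation_path: str


POLICIES: Dict[str, SectorPolicy] = {
    "healthcare": SectorPolicy(
        refusal_rule="No draft unless all verified clinical sources are reconciled",
        freshness_rule="Block until the latest lab reconciliation has completed",
        escalation_path="Escalate to clinician review",
    ),
    "finance": SectorPolicy(
        refusal_rule="No citation, no answer",
        freshness_rule="Only the currently effective policy version is eligible",
        escalation_path="Refuse and route to manual check",
    ),
    "ecommerce": SectorPolicy(
        refusal_rule="Do not recommend items without a confirmed inventory state",
        freshness_rule="Inventory state confirmed within minutes",
        escalation_path="Fall back to standard search or rules",
    ),
}


def unanswered(policy: SectorPolicy) -> List[str]:
    """List the fields that are still blank, i.e. decisions not yet made."""
    return [f.name for f in fields(policy) if not getattr(policy, f.name).strip()]


# A sector with blank fields is still at the demo stage.
draft = SectorPolicy(refusal_rule="", freshness_rule="Hourly sync", escalation_path=" ")
print(unanswered(draft))
```

Keeping the policy in a structured form lets a team check for gaps before engineering starts, rather than discovering them in a pilot.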

// exercise: 15 min

Stress-test one sector before you build

Pick one use case you are considering. Keep it concrete: discharge-summary drafting, policy search for risk ops, catalog search assistant, something real.

Then answer these five questions in writing:

1. If the system is wrong, who gets hurt first? Name the person, not the team.
2. What source of truth is allowed into retrieval? List exact systems or document classes.
3. How fresh does the answer need to be to remain useful? Minutes, hours, days.
4. What must the interface show so the user can trust or reject the answer? Citation, timestamp, confidence, policy version, inventory state.
5. When does the system refuse instead of answering? Write the refusal rule before launch.

If you cannot answer #2 or #5, you are not ready to build. You are still at the demo stage.

Sector choice is a product strategy decision

A lot of teams ask which sector is “best for AI.” That is the wrong question.

The better question is which sector your team is equipped to serve with discipline. Healthcare demands domain review and safety gates. Finance demands policy governance and auditable logs. Commerce demands operational freshness and speed. The model layer is rarely the bottleneck. The surrounding product system is.

This is why two companies with the same LLM can get wildly different outcomes. One builds a hospital assistant and discovers the real work is permissions, source reconciliation, and clinician approval. Another builds a finance assistant and discovers the real work is policy versioning and evidence trails. A third builds a commerce assistant and discovers the real work is catalog cleanup and sub-second retrieval.

Most teams confuse vertical opportunity with deployment readiness. Healthcare may sound prestigious. Finance may sound high value. Commerce may sound easier. None of that matters if your team cannot support the operational discipline the sector requires.

Test yourself: Which pilot do you launch first?

You are not choosing a market in theory. You are choosing the first place your product earns or loses trust.

// learn the judgment

You are a Senior PM in Bangalore earning ₹38 LPA CTC at a Series B enterprise AI startup. Your product category is workflow copilots built on RAG for Indian businesses. This week you have three pilot requests: a hospital group wants discharge-summary drafting on top of its EHR and lab systems, a Pune NBFC wants a policy assistant for collections and risk operations, and a Mumbai quick-commerce company wants a catalog and inventory assistant for dark-store support teams. Your CEO wants one shared architecture and asks you to launch the easiest pilot within 6 weeks.

The call: Which pilot do you choose first, and what non-negotiable design constraints do you set before engineering starts?

Your reasoning:

Where to go next

If you need the model and RAG basics before deployment decisions: AI Fundamentals for PMs
If you are choosing prompts, RAG, agents, or fine-tuning for a real feature: Building AI Features
If you need the ethics and risk lens behind these sector choices: AI Ethics & Responsible AI
If you want the healthtech operating context behind clinical workflows and trust: Healthtech Product Management
If you want the fintech operating context behind auditability and regulation: Fintech Product Management
If you want the commerce operating context behind freshness, search, and inventory truth: E-commerce & Quick Commerce PM