Skip to main content
Back to projects

Multilingual RAG at 100K+ DAU — +40% knowledge-base precision

Grounded retrieval with frozen-eval regression checks for a multilingual user base. Precision is measured before every rollout. AWS Bedrock + pgvector.

PythonAWS BedrockpgvectorFrozen-eval harnessRAG
Scale
100K+ daily active users
Knowledge-base precision
+40% after rollout
Release gate
frozen-eval — no silent regression
Retrieval
grounded, multilingual

The problem

A 100K-DAU multilingual assistant can't regress silently. A prompt or model change that helps one language can quietly degrade another, and nobody notices until the users do.

The solution

Grounded retrieval anchors answers to the knowledge base, not the model's memory. A frozen-eval regression suite measures precision on a held-out question set before any rollout. A change that improves one cohort can't silently degrade another. AWS Bedrock for inference, pgvector for retrieval.

fig. 01decision record
Constraint
At 100K DAU across many languages, a change that helps one cohort can silently degrade another. You find out from the users.
Decision
Ground every answer in the knowledge base and gate rollouts behind a frozen evaluation set, so precision is measured on held-out questions before shipping. Rejected ungated prompt iteration (silent regressions) and ungrounded generation (hallucination at scale).
Outcome
+40% knowledge-base precision at 100K+ daily active users. A frozen-eval gate blocks silent regressions before they reach users.

Overview

A retrieval-augmented assistant serving 100K+ daily active users across a multilingual population. Grounded retrieval keeps answers tied to the knowledge base. A frozen evaluation set gates every rollout: precision is measured on held-out questions before a change ships, so the knowledge base improves monotonically instead of regressing silently. AWS Bedrock + pgvector under the hood.