Compliance-grade tax-filing agent — Brazilian IRPF
A typed-tool LLM agent for regulated tax filings. One hallucinated number is a compliance failure, so every step is schema-validated and every figure cites its source.
The problem
A regulated tax-filing agent cannot hallucinate even a single numerical field. Every output must remain replayable throughout a five-year audit window under LGPD, on-premise, with no data egress. General-purpose RAG fails this twice: it can't prove where a number came from, and it can't bound how long it 'thinks'.
The solution
The loop is capped at 40 turns per filing. The router rejects any cross-year retrieval hit. Forced citation plus post-hoc span validation means the model's arithmetic is never trusted unchecked. 80 deterministic anomaly rules catch out-of-policy values before they land, and a hash-chained audit ledger is anchored to a public transparency log. Retrying a single failed field, rather than re-running the whole filing, cut LLM cost by ~80%. Rejected: a single large-context prompt (no provenance) and an open-ended agent loop (no audit ceiling).
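As a minimal sketch of how a turn ceiling and per-field retry can fit together, the Python below counts every tool call against a single budget and retries only the field that failed. The names `fill_fields`, `FilingRun`, `TurnBudgetExceeded`, the `extract` callback, and the `max_retries` default are assumptions made for illustration; only the 40-turn ceiling comes from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

MAX_TURNS = 40  # per-filing ceiling stated in the text


class TurnBudgetExceeded(RuntimeError):
    """Raised when a filing would need more than MAX_TURNS tool calls."""


@dataclass
class FilingRun:
    turns_used: int = 0
    values: Dict[str, float] = field(default_factory=dict)
    failed: List[str] = field(default_factory=list)


def fill_fields(
    field_names: List[str],
    extract: Callable[[str], Optional[float]],  # hypothetical: one tool call per invocation
    max_retries: int = 2,                       # hypothetical default
) -> FilingRun:
    """Fill each field in turn, retrying only the field that failed."""
    run = FilingRun()
    for name in field_names:
        for _attempt in range(1 + max_retries):
            # Every call counts against the filing-wide budget, retries included.
            if run.turns_used >= MAX_TURNS:
                raise TurnBudgetExceeded(
                    f"budget of {MAX_TURNS} turns spent at field {name!r}"
                )
            run.turns_used += 1
            value = extract(name)
            if value is not None:  # None stands for "failed validation"
                run.values[name] = value
                break
        else:
            # Retries exhausted: record the field instead of restarting the filing.
            run.failed.append(name)
    return run
```

In this sketch a failed field costs one extra turn, while the fields already accepted are left alone. That is the mechanism the cost saving above relies on, although the production retry policy may differ.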
- Constraint: LGPD plus a five-year audit-replay mandate, on-premise, zero data egress. One hallucinated numerical field is a regulatory failure.
- Decision: Bound the agent at ≤40 typed-tool turns per filing. Scope retrieval per filing year and reject cross-year hits at the router. Force every number to cite, then re-validate it against its source span before it lands. Rejected a single large-context prompt (no provenance) and an unbounded loop (no audit ceiling).
- Outcome: Zero hallucinated numerical fields across 18 months in production. ~10K filings/day at peak. 80 deterministic anomaly rules gate every value, and per-field retry cut LLM cost ~80%. Every field replays to the source span that produced it.
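The year-scoped routing and span re-validation described in the Decision can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: the `Hit` and `Citation` types, the function names, and the Brazilian-format number parser are all hypothetical.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Hit:
    doc_id: str
    filing_year: int
    text: str


@dataclass(frozen=True)
class Citation:
    doc_id: str
    start: int  # character offsets into Hit.text, end exclusive
    end: int


def route(hits: List[Hit], filing_year: int) -> List[Hit]:
    """Keep only hits from the filing year; cross-year hits are dropped."""
    return [h for h in hits if h.filing_year == filing_year]


# Assumed format: Brazilian amounts such as "12.345,67" or "1234,56".
_BRL = re.compile(r"\d[\d.]*,\d{2}")


def parse_brl(text: str) -> Optional[Decimal]:
    """Parse the first Brazilian-format amount in text, or return None."""
    match = _BRL.search(text)
    if match is None:
        return None
    return Decimal(match.group(0).replace(".", "").replace(",", "."))


def validate(value: Decimal, cite: Citation, scoped_hits: List[Hit]) -> bool:
    """Accept a value only if the cited span, re-read from source, parses to it."""
    by_id = {h.doc_id: h for h in scoped_hits}
    hit = by_id.get(cite.doc_id)
    if hit is None:  # citation points outside the year-scoped set
        return False
    if not (0 <= cite.start < cite.end <= len(hit.text)):
        return False
    found = parse_brl(hit.text[cite.start:cite.end])
    return found is not None and found == value
```

The check uses `Decimal` rather than floats so that the comparison against the source text is exact. A value whose citation resolves to a different year's document fails in the same way as a value that does not match its span.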
Overview
An LLM agent that prepares and validates Brazilian income-tax (IRPF) filings. The loop is bounded at 40 turns per filing. Each step is a schema-validated tool call. Retrieval is scoped to the filing year. Every emitted number cites its source span and is re-validated against that span before it lands in the return. The decision log is hash-chained and anchored to a transparency log, so any filing can be replayed years later.
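A hash-chained decision log can be illustrated with a short Python sketch. The entry layout, the `GENESIS` constant, and the function names are assumptions, and anchoring the head hash to a transparency log is left out because the text does not describe how that is done.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(ledger: List[Dict[str, Any]], payload: Dict[str, Any]) -> str:
    """Append a decision and return the new head hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    entry_hash = _digest(prev_hash, payload)
    ledger.append({"prev": prev_hash, "payload": payload, "hash": entry_hash})
    return entry_hash


def verify(ledger: List[Dict[str, Any]]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, publishing only the latest head hash commits to the entire history before it. In this sketch that head hash is the value an external transparency log would need to record.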