Risqly: AI Underwriter
A document intelligence pipeline that reads insurance policy PDFs, extracts structured data, and flags rule violations against the underwriting manual, turning a multi-hour manual review into a background job.
The Challenge
Underwriters reviewing insurance policy submissions face a slow, manual grind: read a policy PDF (often scanned, multi-format, dozens of pages), extract dozens of data points (coverages, deductibles, dwelling details, insured info), and cross-check each one against the rules in an internal underwriting manual to catch non-compliant or risky terms.
It doesn't scale, it's error-prone, and the rules vary by manual and product line. Any automation has to be both accurate and trustworthy enough that underwriters won't second-guess it.
What Was Built
Risqly is a full-stack platform (Angular 19 + FastAPI/LangGraph) where an underwriter uploads a policy PDF and selects a manual, and within minutes gets back a structured breakdown of the policy plus a list of flagged issues, each tied to the exact rule it violates.
The system handles two document pipelines end to end: policy PDFs (parsed, anonymized, validated) and insurance manuals (parsed into structured, machine-checkable rule sets). Users track jobs in real time via a polling progress UI, review results in a dashboard, browse report history, and operate within per-user processing budgets, all behind JWT auth.
How It Was Solved
The core is a LangGraph state machine running behind an async job queue, so uploads return instantly and a worker pool (5 concurrent workers) processes them in the background:
PDF → page images (pdf2image) → OCR (AWS Textract, FORMS+LAYOUT) → PII masking → LLM classification into 6 policy sections → LLM rule validation against the manual → flagged issues in DynamoDB
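The queue-plus-worker-pool shape described above can be sketched with plain asyncio. This is a minimal illustration under stated assumptions, not Risqly's actual code: `process_job`, `worker`, and `run_pool` are hypothetical names, and the real pipeline stages (pdf2image, Textract, LLM calls) are replaced by a placeholder.

```python
import asyncio
from typing import Dict, List

NUM_WORKERS = 5  # matches the pool size described above


async def process_job(job_id: str) -> str:
    """Hypothetical placeholder for the real pipeline (OCR, masking, validation)."""
    await asyncio.sleep(0)  # stand-in for I/O-bound work
    return f"{job_id}:done"


async def worker(queue: asyncio.Queue, results: Dict[str, str]) -> None:
    """Pull job ids off the queue until the task is cancelled."""
    while True:
        job_id = await queue.get()
        try:
            results[job_id] = await process_job(job_id)
        except Exception as exc:  # one failed job must not kill the worker
            results[job_id] = f"{job_id}:failed:{exc}"
        finally:
            queue.task_done()


async def run_pool(job_ids: List[str]) -> Dict[str, str]:
    queue: asyncio.Queue = asyncio.Queue()
    results: Dict[str, str] = {}
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(NUM_WORKERS)]
    for job_id in job_ids:
        queue.put_nowait(job_id)  # enqueueing returns immediately, like the upload call
    await queue.join()  # block until every queued job has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

In a real deployment the workers would run for the lifetime of the service and write status updates somewhere the polling UI can read them; here they are cancelled once the queue drains so the sketch terminates.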
Key engineering decisions that made it work:
- Hybrid OCR + LLM extraction: Textract handles structured key-value pairs; when FORMS extraction comes up short (unusual layouts, low yield), an LLM-based fallback extractor kicks in automatically based on a configurable threshold.
- Privacy by design: a regex-based masker strips names, addresses, phone numbers, SSNs, emails, and full birth dates before anything reaches OpenAI, while preserving what's actually needed downstream (birth year for age calculations, city/postal code for risk context).
- Rule-driven, not free-form, validation: manuals run through their own pipeline that converts prose into structured rules, and the analysis prompt is deliberately constrained to strict rule application (no inference, no rule-chaining, no flagging on missing data). As a result, the LLM behaves more like a deterministic checker than an over-eager reviewer, which matters in a regulated domain where false positives destroy trust faster than missed issues.
- Cost governance: every Textract call and OpenAI completion is metered, aggregated per job, charged against per-user budgets, and capped (page limits, hard spending limits), so unit economics stay predictable as volume grows.
- Observability and iteration: Langfuse traces every LLM call, and ~31KB of prompts live in a single externalized YAML file, so tuning extraction or analysis behavior doesn't mean touching application code.
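The threshold-based fallback from the hybrid OCR + LLM decision above might look roughly like the following. Everything here is an assumption for illustration: the names `extract_fields` and `MIN_FIELDS_THRESHOLD`, the definition of "low yield" as a count of non-empty key-value pairs, and the merge policy in which OCR values win on conflict.

```python
from typing import Callable, Dict

# Hypothetical threshold: minimum number of non-empty key-value pairs
# expected from FORMS extraction before the LLM fallback is triggered.
MIN_FIELDS_THRESHOLD = 10


def extract_fields(
    forms_pairs: Dict[str, str],
    page_text: str,
    llm_extract: Callable[[str], Dict[str, str]],
    threshold: int = MIN_FIELDS_THRESHOLD,
) -> Dict[str, str]:
    """Use the OCR key-value pairs when there are enough of them, else fall back."""
    # Drop pairs whose value is empty or whitespace-only.
    usable = {k: v for k, v in forms_pairs.items() if v and v.strip()}
    if len(usable) >= threshold:
        return usable
    # Low yield: ask the LLM-based extractor to read the raw text, then
    # overlay the pairs OCR did find (assumed policy: OCR wins on conflict).
    fallback = llm_extract(page_text)
    return {**fallback, **usable}
```

Passing the LLM extractor in as a callable keeps the decision logic testable without a network call, and the threshold stays a plain parameter that configuration can override.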
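As a rough illustration of the regex-based masking idea above, the sketch below handles only the pattern-shaped identifiers (emails, SSNs, phone numbers, full birth dates) and keeps the birth year, as the privacy bullet describes. The patterns, placeholder tokens, and US-style formats are assumptions, and the sketch does not attempt names or addresses.

```python
import re

# Illustrative patterns only. The formats (US-style SSN and phone numbers,
# MM/DD/YYYY dates) and the placeholder tokens are assumptions.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)")
DOB_RE = re.compile(r"\b\d{1,2}/\d{1,2}/(\d{4})\b")


def mask_pii(text: str) -> str:
    """Replace pattern-shaped PII with placeholders, keeping the birth year."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    # Keep only the year so age can still be computed downstream.
    text = DOB_RE.sub(r"[DOB:\1]", text)
    return text
```

For example, `mask_pii("SSN 123-45-6789, born 04/12/1985 in Toronto.")` returns `"SSN [SSN], born [DOB:1985] in Toronto."`, leaving the city intact for risk context.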
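The cost-governance decision above (meter each call, aggregate per job, enforce page caps and spending limits) could be modeled along these lines. `JobMeter`, `BudgetExceeded`, and the dollar amounts in the usage note are hypothetical; real Textract and OpenAI prices are deliberately not hard-coded.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Tuple


class BudgetExceeded(Exception):
    """Raised when a job would break a page cap or spending limit."""


@dataclass
class JobMeter:
    remaining_budget: Decimal  # what the user has left, in USD
    max_pages: int             # hard page cap per document
    charges: List[Tuple[str, Decimal]] = field(default_factory=list)

    @property
    def total(self) -> Decimal:
        """Aggregate cost of every metered call recorded for this job."""
        return sum((cost for _, cost in self.charges), Decimal("0"))

    def check_pages(self, page_count: int) -> None:
        """Reject oversized documents before any paid call is made."""
        if page_count > self.max_pages:
            raise BudgetExceeded(f"{page_count} pages exceeds cap of {self.max_pages}")

    def charge(self, service: str, cost: Decimal) -> None:
        """Record one metered call, refusing it if it would overrun the budget."""
        if self.total + cost > self.remaining_budget:
            raise BudgetExceeded(f"{service} call would exceed the remaining budget")
        self.charges.append((service, cost))
```

A job would call `check_pages` once after rasterizing the PDF, then `charge("textract", ...)` and `charge("openai", ...)` as calls complete. `Decimal` is used instead of floats so that many small per-call amounts add up exactly.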
Results
On a field-by-field benchmark against hand-labeled ground truth, the pipeline extracts policy data at 93.3% overall accuracy (83 of 89 fields). Core sections such as Policy Information, Insured Information, and Co-Insured Information score a clean 100%, while structurally harder sections (Coverages, Dwelling Information) land in the high 80s to low 90s.
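A field-by-field benchmark of this kind can be scored with a few lines of Python. The sketch below assumes a field counts as correct when it matches the hand-labeled value exactly after whitespace and case normalization; the source does not say how matches were judged, and the `score` and `normalize` names and the nested-dict layout are hypothetical.

```python
from typing import Dict, Tuple

Fields = Dict[str, Dict[str, str]]  # section -> field name -> value


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case before comparing values."""
    return " ".join(value.split()).casefold()


def score(extracted: Fields, truth: Fields) -> Tuple[Dict[str, float], float]:
    """Return per-section accuracy and overall accuracy against ground truth."""
    per_section: Dict[str, float] = {}
    correct_total = 0
    field_total = 0
    for section, fields in truth.items():
        if not fields:
            continue  # nothing to score in an empty section
        got = extracted.get(section, {})
        correct = sum(
            1
            for name, expected in fields.items()
            if normalize(got.get(name, "")) == normalize(expected)
        )
        per_section[section] = correct / len(fields)
        correct_total += correct
        field_total += len(fields)
    overall = correct_total / field_total if field_total else 0.0
    return per_section, overall
```

With 83 of 89 fields correct, the overall figure is 83 / 89 ≈ 0.9326, which rounds to the 93.3% reported above.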
The PII masker reliably catches names, phone numbers, emails, SSNs, and addresses before any text reaches the LLM. The net effect: a multi-hour, page-by-page manual review is now a queued background job that returns in minutes with a structured policy summary and a prioritized, rule-referenced list of issues, turning underwriting from "find the problems" into "confirm the findings."