From 5-Year Loss Runs to Renewal Pricing: How Structured Claims Ingestion Powers Data-Driven Rate Changes
How InsightUW's InsightXtract pipeline transforms 47 pages of Mercy Health Partners loss run PDFs into structured year-by-year claims data — detecting a deteriorating loss ratio from 35% to 82% — and automatically triggering a rate_inadequacy flag that feeds the AI rate recommendation engine to suggest an +18% rate increase, all before the underwriter opens the file.
The Problem
Renewal pricing in medical malpractice is supposed to be data-driven. In practice, it is PDF-driven. The underwriter receives a 5-year loss run from the broker — a dense, multi-page PDF with inconsistent formatting, varying column headers, and claims scattered across tables that were never designed for machine consumption.
The typical renewal pricing workflow looks like this:
- Manual extraction: The underwriter opens the loss run PDF, reads each claim line, and types values into a spreadsheet. For a hospital system with 45 claims over 5 years, this takes 60–90 minutes.
- Inconsistent categorization: One underwriter codes a claim as "surgical error," another codes the same type as "operative complication." There is no enforced taxonomy.
- Year-by-year trends are invisible: The spreadsheet shows individual claims but does not automatically compute loss ratios by policy year, trend lines, or deterioration velocity.
- Rate recommendations are gut-feel: The underwriter looks at the spreadsheet, consults their experience, and proposes a rate change. Two underwriters looking at the same data propose different rates.
- No audit trail: Three years later, when someone asks "why did we only increase rates by 8% when the loss ratio was 82%?", there is no structured record of what data drove the decision.
The result is a renewal pricing process that is slow, inconsistent, and disconnected from the actual claims data that should drive it.
The InsightUW Approach
InsightUW's renewal pricing pipeline connects three systems: InsightXtract (structured document extraction), the Loss Summary Engine (trend analysis and flagging), and the AI Rate Recommendation Engine (data-driven pricing). Loss run PDFs flow in one end; a defensible rate recommendation comes out the other.
The InsightXtract Extraction Pipeline
When a loss run PDF arrives attached to a renewal submission, InsightXtract classifies it as a loss run document and routes it through the claims extraction pipeline. The pipeline handles the structural chaos of real-world loss run PDFs: multi-page tables with merged cells, continuation headers, varying date formats, and inconsistent claim status labels.
Step 2: Structured Claim Extraction
Each claim is extracted into a normalized record with consistent field names, regardless of the PDF source format.
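A minimal sketch of what that normalization could look like follows. It is not InsightXtract's actual schema: the `ClaimRecord` fields, the `HEADER_ALIASES` and `STATUS_MAP` vocabularies, and the accepted date formats are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal

# Hypothetical alias tables; real loss runs use many more header and status variants.
HEADER_ALIASES = {
    "claim #": "claim_number", "claim no": "claim_number", "claim number": "claim_number",
    "date of loss": "loss_date", "dol": "loss_date", "loss date": "loss_date",
    "status": "status", "claim status": "status",
    "paid": "paid", "total paid": "paid",
    "reserve": "reserved", "reserves": "reserved", "outstanding": "reserved",
}
STATUS_MAP = {"open": "OPEN", "o": "OPEN", "closed": "CLOSED", "c": "CLOSED", "reopened": "REOPENED"}
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%m/%d/%y")  # assumed set of source formats


@dataclass(frozen=True)
class ClaimRecord:
    claim_number: str
    loss_date: date
    status: str
    paid: Decimal
    reserved: Decimal

    @property
    def incurred(self) -> Decimal:
        # Incurred is treated here as paid plus outstanding reserve.
        return self.paid + self.reserved


def parse_date(raw: str) -> date:
    """Try each known format in turn; fail loudly on anything unrecognized."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def parse_money(raw: str) -> Decimal:
    """Strip currency symbols and thousands separators; empty cells become 0."""
    cleaned = raw.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned or "0")


def normalize_row(row: dict[str, str]) -> ClaimRecord:
    """Map one extracted table row (source header -> cell text) to a ClaimRecord."""
    canon = {}
    for header, value in row.items():
        key = HEADER_ALIASES.get(header.strip().lower())
        if key is not None:
            canon[key] = value
    return ClaimRecord(
        claim_number=canon["claim_number"].strip(),
        loss_date=parse_date(canon["loss_date"]),
        status=STATUS_MAP[canon["status"].strip().lower()],
        paid=parse_money(canon["paid"]),
        reserved=parse_money(canon.get("reserved", "0")),
    )
```

Under these assumptions, two rows from differently formatted tables (one headed `DOL` / `Total Paid`, another `Loss Date` / `Paid`) produce records with identical field names, which is what the downstream aggregation relies on.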
The Loss Summary API: Year-by-Year Breakdown
Once claims are extracted and structured, the Loss Summary Engine computes year-by-year aggregations that reveal the trend the underwriter needs to see.
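The aggregation itself is simple arithmetic: for each policy year, the loss ratio is total incurred losses divided by earned premium. The sketch below assumes claims arrive as `(policy_year, incurred)` pairs and that earned premium per year is known; the function name and output keys are hypothetical, not the Loss Summary API's actual response shape.

```python
from collections import defaultdict


def loss_ratios_by_year(claims, earned_premium):
    """Aggregate claims by policy year and compute a loss ratio per year.

    claims: iterable of (policy_year, incurred_amount) pairs.
    earned_premium: dict mapping policy_year -> earned premium for that year.
    """
    incurred = defaultdict(float)
    counts = defaultdict(int)
    for year, amount in claims:
        incurred[year] += amount
        counts[year] += 1

    summary = {}
    for year in sorted(earned_premium):
        premium = earned_premium[year]
        summary[year] = {
            "claim_count": counts[year],
            "total_incurred": incurred[year],
            # A year with no premium on record gets no ratio rather than a division error.
            "loss_ratio": incurred[year] / premium if premium else None,
        }
    return summary
```

Years with premium but no claims still appear in the output with a loss ratio of zero, so the trend line has no gaps.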
The Scenario
Mercy Health Partners operates 3 hospitals with 450 total beds in the Midwest. Their Medical Malpractice policy is up for renewal on June 1, 2026. The broker (Marsh) submits the renewal package including a 47-page loss run PDF covering the five policy years 2021 through 2025. The current premium is $3.8M.
Timeline: From PDF to Rate Recommendation
| Time | Event | System | Output |
|---|---|---|---|
| 8:15 AM | Broker email arrives with loss run PDF attached | Email Ingestion | Submission created, documents classified |
| 8:15:02 AM | InsightXtract classifies PDF as LOSS_RUN (MedMal) | InsightXtract | Document type = LOSS_RUN, 7 tables detected |
| 8:15:18 AM | Table extraction begins — 7 tables across 47 pages | InsightXtract | Layout parser identifies column headers per table |
| 8:16:45 AM | 45 claims extracted and normalized | InsightXtract | 45 structured claim records, 93% confidence |
| 8:16:50 AM | Taxonomy mapping applies ICD-10 codes | Claims Engine | Claim types standardized across all 5 years |
| 8:17:00 AM | Year-by-year aggregation computed | Loss Summary Engine | 5 policy years, loss ratios: 21.7% → 99.7% |
| 8:17:01 AM | rate inadequacy flag triggered | Flag Engine | Loss ratio > 70% in 2 consecutive years |
| 8:17:01 AM | frequency increase flag triggered | Flag Engine | 250% claim frequency increase |
| 8:17:02 AM | severity spike flag triggered | Flag Engine | 108% average severity increase |
| 8:17:05 AM | AI Rate Recommendation engine invoked | AI Rate Engine | Model ingests loss data + book benchmarks |
| 8:17:12 AM | Rate recommendation generated: +18% | AI Rate Engine | $3.8M → $4.484M recommended |
| 8:17:13 AM | Underwriter notification: 3 flags, rate recommendation ready | Notification Engine | Bell icon: HIGH severity alert |
Total elapsed time from PDF receipt to rate recommendation: 2 minutes, 58 seconds.
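The three flag rules in the timeline can be sketched as follows. Only the rate inadequacy rule (loss ratio above 70% in 2 consecutive years) is stated in the timeline. The frequency and severity thresholds, the first-year-to-last-year comparison, and the flag names other than `rate_inadequacy` are assumptions made for illustration.

```python
def pct_increase(first, last):
    """Percentage change from the first value to the last."""
    return (last - first) / first * 100.0


def evaluate_flags(years, *, ratio_threshold=0.70,
                   freq_threshold_pct=100.0, sev_threshold_pct=50.0):
    """Return the list of flags raised by a year-by-year loss summary.

    years: list of dicts ordered oldest to newest, each with the keys
           "loss_ratio", "claim_count", and "avg_severity".
    The frequency and severity thresholds are hypothetical defaults.
    """
    flags = []

    # Rate inadequacy: any two adjacent policy years both above the threshold.
    ratios = [y["loss_ratio"] for y in years]
    if any(a > ratio_threshold and b > ratio_threshold
           for a, b in zip(ratios, ratios[1:])):
        flags.append("rate_inadequacy")

    # Frequency: claim count growth from the oldest year to the newest (assumed basis).
    if pct_increase(years[0]["claim_count"], years[-1]["claim_count"]) >= freq_threshold_pct:
        flags.append("frequency_increase")

    # Severity: average severity growth from the oldest year to the newest (assumed basis).
    if pct_increase(years[0]["avg_severity"], years[-1]["avg_severity"]) >= sev_threshold_pct:
        flags.append("severity_spike")

    return flags
```

As a sanity check on the percentages, a book that goes from 4 claims in its oldest year to 14 in its newest shows a 250% frequency increase under this definition.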
Department-Level Claims Breakdown
The extraction pipeline does not just aggregate — it structures claims by facility and department, revealing concentration risk:
| Facility | Department | Claims (5yr) | Total Incurred | Avg Severity | Trend |
|---|---|---|---|---|---|
| Mercy General Hospital | Orthopedics | 8 | $1,840,000 | $230,000 | Stable |
| Mercy General Hospital | OB/GYN | 6 | $1,620,000 | $270,000 | Increasing |
| Mercy St. Luke's | Emergency Medicine | 14 | $3,180,000 | $227,143 | Rapidly Increasing |
| Mercy St. Luke's | Radiology | 5 | $680,000 | $136,000 | Stable |
| Mercy Children's | Pediatrics | 7 | $920,000 | $131,429 | Stable |
| Mercy Children's | NICU | 5 | $500,000 | $100,000 | Decreasing |
The data immediately reveals that Mercy St. Luke's Emergency Medicine is the primary driver of deterioration: it accounts for 14 of the 45 claims and $3,180,000 of incurred losses, and its trend is rapidly increasing.
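The concentration check can be reproduced directly from the table. The sketch below uses the six rows above as input and ranks departments by their share of total incurred losses; the function name and output keys are hypothetical.

```python
# Rows copied from the department table: (facility, department, claims, total_incurred)
ROWS = [
    ("Mercy General Hospital", "Orthopedics", 8, 1_840_000),
    ("Mercy General Hospital", "OB/GYN", 6, 1_620_000),
    ("Mercy St. Luke's", "Emergency Medicine", 14, 3_180_000),
    ("Mercy St. Luke's", "Radiology", 5, 680_000),
    ("Mercy Children's", "Pediatrics", 7, 920_000),
    ("Mercy Children's", "NICU", 5, 500_000),
]


def concentration(rows):
    """Rank departments by share of total incurred, largest first."""
    total_incurred = sum(r[3] for r in rows)
    total_claims = sum(r[2] for r in rows)
    report = []
    for facility, department, claims, incurred in rows:
        report.append({
            "facility": facility,
            "department": department,
            "avg_severity": round(incurred / claims),
            "share_of_claims": claims / total_claims,
            "share_of_incurred": incurred / total_incurred,
        })
    return sorted(report, key=lambda r: r["share_of_incurred"], reverse=True)


top = concentration(ROWS)[0]
```

With these figures, `top` is Emergency Medicine at Mercy St. Luke's, carrying a little over a third of all incurred losses across the three hospitals.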
The AI Rate Recommendation
When the rate inadequacy flag fires, the AI Rate Recommendation engine ingests the structured loss data alongside book-level benchmarks and market data.
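One way to picture a decomposed recommendation is as a sum of per-factor contributions applied to the current premium. The sketch below assumes a simple additive model, which may not be how the AI Rate Recommendation engine actually combines factors. The individual contribution values are placeholders invented for illustration; only the +18% total and the $3.8M premium come from the scenario.

```python
from decimal import Decimal


def recommend_premium(current_premium, contributions_pct):
    """Sum per-factor contributions (in percentage points) and apply them to the premium.

    Returns (total_rate_change_pct, recommended_premium rounded to whole dollars).
    """
    total_pct = sum(contributions_pct.values(), Decimal("0"))
    recommended = current_premium * (Decimal("1") + total_pct / Decimal("100"))
    return total_pct, recommended.quantize(Decimal("1"))


# Placeholder contributions, NOT real model output; chosen only so they total +18%.
ILLUSTRATIVE_CONTRIBUTIONS = {
    "loss_trend": Decimal("9.0"),
    "frequency": Decimal("4.0"),
    "severity": Decimal("3.5"),
    "book_benchmark": Decimal("2.5"),
    "market": Decimal("-1.0"),
}

total_pct, recommended = recommend_premium(Decimal("3800000"), ILLUSTRATIVE_CONTRIBUTIONS)
```

Keeping each factor as a separate, signed entry is what lets an underwriter adjust or override one input and see the effect on the final figure.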
The Extraction-to-Pricing Pipeline in Detail
Metrics: Before and After Structured Claims Ingestion
| Metric | Before InsightUW | After InsightUW | Improvement |
|---|---|---|---|
| Loss run extraction time (45 claims) | 60–90 minutes (manual) | 90 seconds (automated) | 98% faster |
| Claim categorization consistency | 62% agreement between UWs | 97% (enforced taxonomy) | +35 points (56% relative improvement) |
| Year-by-year trend visibility | Manual spreadsheet (if done) | Automatic with flags | Always available |
| Time from PDF to rate recommendation | 2–4 hours | 2 minutes 58 seconds | 98% faster |
| Rate recommendations with data backing | ~30% (most are gut-feel) | 100% (model-driven) | 3.3x improvement |
| Audit trail for pricing decisions | None | Complete (every factor logged) | Full traceability |
| Claims missed in extraction | 5–12% (human error) | < 1% (AI + validation) | At least 80% reduction |
| Departmental concentration detection | Rarely done (too time-consuming) | Automatic | New capability |
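The percentages in the Improvement column follow from two small formulas, sketched here so the figures can be rechecked against the Before and After columns. The helper names are hypothetical.

```python
def pct_reduction(before, after):
    """Percentage decrease from before to after (used for times and error rates)."""
    return (before - after) / before * 100


def relative_gain(before, after):
    """Percentage increase from before to after (used for agreement rates)."""
    return (after - before) / before * 100


# Extraction time: 60 to 90 minutes manually versus 90 seconds automated (in seconds).
extraction_low = pct_reduction(60 * 60, 90)
extraction_high = pct_reduction(90 * 60, 90)

# Categorization consistency: 62% agreement versus 97%.
consistency_gain = relative_gain(62, 97)
```

The extraction figures land between 97.5% and about 98.3%, which the table rounds to 98%, and the consistency gain is roughly 56% in relative terms.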
Key Takeaways
- Loss runs are the foundation of renewal pricing, but they are locked in PDFs. InsightXtract converts unstructured loss run documents into structured, queryable claims data automatically, in under 2 minutes.
- Year-by-year trend analysis is the signal, not individual claims. The Loss Summary Engine computes loss ratios per policy year and detects deterioration velocity, giving the underwriter the trend line, not just data points.
- Flags create urgency and accountability. The rate inadequacy flag is not a suggestion; it is a system-generated alert that the current rate is mathematically insufficient based on loss experience. It appears on the renewal, in the queue, and in the audit trail.
- AI rate recommendations are defensible because they are decomposed. The recommendation is not a black box. Each factor (loss trend, frequency, severity, book benchmark, market) is weighted and shown, so the underwriter can agree, adjust, or override with full transparency.
- The pipeline is the audit trail. Every step (extraction, normalization, aggregation, flagging, recommendation) is logged with timestamps and inputs. Three years from now, you can reconstruct exactly what data drove the pricing decision.
Ready to turn loss run PDFs into data-driven renewal pricing? InsightUW's extraction-to-pricing pipeline transforms 47 pages of claims history into a defensible rate recommendation in under 3 minutes.