Business Problem
- Labor-Intensive Extraction: Underwriters had to manually extract claims history from inconsistently formatted loss run documents, a slow and error-prone process.
- Data Inconsistencies: Unstructured data made it difficult to standardize extraction, leading to inaccurate risk assessments.
- Delayed Risk Analysis: Slow processing times hindered the ability to assess risk promptly and negotiate policy premiums effectively.
- Compliance Challenges: Inaccurate data extraction could result in non-compliance with industry regulations, impacting insurance decision-making.
About Ledgebrook
Loss run documents contain the historical claims data required for risk assessment. For Ledgebrook, manually extracting this data was time-consuming and prone to errors, so goML built an automated loss run extraction service to streamline the process.
Solution
goML implemented a fully automated loss run data extraction pipeline:
- AI-Based Text Extraction: AWS Textract extracted key claims data from unstructured PDFs and scanned documents.
- Real-Time Search & Insights: Fast querying and retrieval of historical claims data for underwriting teams.
- Lambda-Powered Data Processing: AWS Lambda functions processed the extracted data, standardizing it for structured storage (see the sketch after this list).
- Automation & Workflow Integration: The solution integrated seamlessly into Ledgebrook's underwriting workflows, reducing dependency on manual processing.
- Centralized Data Repository: Extracted loss run data was stored in a PostgreSQL RDS database for easy retrieval and analysis.
- AI-Powered Segregation & Classification: AWS Bedrock processed the extracted text to segregate policies and claims.
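As a rough illustration of the Lambda standardization step referenced above, the sketch below normalizes a few common loss run fields into a consistent record. The event shape, field names, and parsing rules are illustrative assumptions, not Ledgebrook's actual schema.

```python
# Hypothetical sketch of the Lambda standardization step. The event shape,
# field names, and normalization rules are illustrative assumptions.
import json
from datetime import datetime

def normalize_claim(raw: dict) -> dict:
    """Map extracted key-value pairs onto a consistent claim record."""
    return {
        "claim_number": raw.get("Claim #") or raw.get("Claim Number"),
        "loss_date": _parse_date(raw.get("Date of Loss", "")),
        "paid_amount": _parse_amount(raw.get("Paid", "0")),
        "status": (raw.get("Status") or "UNKNOWN").strip().upper(),
    }

def _parse_date(value: str) -> str | None:
    # Loss runs mix date formats; try a few common ones.
    for fmt in ("%m/%d/%Y", "%Y-%m-%d", "%m-%d-%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def _parse_amount(value: str) -> float:
    cleaned = value.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else 0.0

def lambda_handler(event, context):
    # 'claims' is assumed to be the list of key-value maps produced upstream.
    claims = [normalize_claim(c) for c in event.get("claims", [])]
    return {"statusCode": 200, "body": json.dumps(claims)}
```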
Architecture
- User Interaction & API Triggers
  - POST /store-lossruns → triggered inside the Document Service.
  - The user uploads Loss Run files, which are stored in AWS S3.
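A minimal sketch of this upload step, assuming a bucket named loss-run-uploads and a UUID session token (both hypothetical choices; the source does not specify them):

```python
# Hypothetical sketch of the POST /store-lossruns handler. Bucket name,
# session-token scheme, and return shape are illustrative assumptions.
import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = "loss-run-uploads"  # assumed bucket name

def store_lossruns(file_bytes: bytes, filename: str) -> dict:
    """Store an uploaded loss run file in S3 and hand back a session token."""
    session_token = str(uuid.uuid4())
    key = f"lossruns/{session_token}/{filename}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=file_bytes)
    # Downstream Textract processing is keyed off this S3 location.
    return {"aiDocumentSessionToken": session_token, "s3Key": key}
```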
- Document Processing & Text Extraction
  - AWS Textract extracts:
    - Text details from the documents.
    - Tabular data and form structures as key-value pairs.
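A hedged sketch of the extraction call using boto3's synchronous AnalyzeDocument with TABLES and FORMS enabled; production-scale multipage loss runs would more likely use the asynchronous StartDocumentAnalysis flow:

```python
# Sketch of the Textract step: synchronous AnalyzeDocument with TABLES and
# FORMS features. Multipage documents would need the async API instead.
import boto3

textract = boto3.client("textract")

def extract_blocks(bucket: str, key: str) -> dict:
    response = textract.analyze_document(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["TABLES", "FORMS"],  # tabular data + key-value pairs
    )
    blocks = response["Blocks"]
    # Group raw blocks by type for downstream processing.
    return {
        "lines": [b["Text"] for b in blocks if b["BlockType"] == "LINE"],
        "key_value_sets": [b for b in blocks if b["BlockType"] == "KEY_VALUE_SET"],
        "tables": [b for b in blocks if b["BlockType"] == "TABLE"],
    }
```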
- AI-Powered Segregation & Classification
  - AWS Bedrock processes the extracted text:
    - A segregation pipeline splits the text into multiple policies and claims.
    - Generates a schema for policies and claims segregation.
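The source names Bedrock but not a specific model, prompt, or output schema, so all three are assumptions in the sketch below, which invokes an Anthropic Claude model through the Bedrock runtime and expects JSON back:

```python
# Illustrative sketch of the Bedrock segregation call. Model ID, prompt,
# and expected JSON schema are assumptions, not the documented pipeline.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def segregate(text: str) -> dict:
    prompt = (
        "Split the following loss run text into policies and their claims. "
        'Return JSON of the form {"policies": [{"policy_number": "...", '
        '"claims": ["..."]}]}.\n\n' + text
    )
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [{"role": "user", "content": prompt}],
    })
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model
        body=body,
    )
    payload = json.loads(response["body"].read())
    # The model's reply text is expected to be the JSON described above.
    return json.loads(payload["content"][0]["text"])
```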
- Data Structuring & Storage
  - Bedrock extracts a Data Dictionary for every claim's text chunk.
  - Structured Loss Run responses are stored in the PostgreSQL RDS database.
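A minimal storage sketch, assuming a hypothetical loss_run_claims table in the PostgreSQL RDS instance; the table and column names are invented for illustration:

```python
# Storage sketch against an assumed table:
#
#   CREATE TABLE loss_run_claims (
#       session_token TEXT,
#       claim_number  TEXT,
#       loss_date     DATE,
#       paid_amount   NUMERIC,
#       payload       JSONB
#   );
import json
import psycopg2

def store_claims(conn, session_token: str, claims: list[dict]) -> None:
    with conn.cursor() as cur:
        for claim in claims:
            cur.execute(
                """
                INSERT INTO loss_run_claims
                    (session_token, claim_number, loss_date, paid_amount, payload)
                VALUES (%s, %s, %s, %s, %s)
                """,
                (
                    session_token,
                    claim.get("claim_number"),
                    claim.get("loss_date"),
                    claim.get("paid_amount"),
                    json.dumps(claim),  # keep the full record alongside columns
                ),
            )
    conn.commit()
```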
- Webhook Integration & Execution
  - A Webhook Endpoint is triggered once processing is completed.
  - The structured Loss Run response is passed to the Document Service for final execution.
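A small sketch of the webhook callback; the payload shape and field names are assumptions rather than Ledgebrook's actual contract:

```python
# Webhook callback sketch. URL and payload fields are illustrative.
import json
import urllib.request

def notify_document_service(webhook_url: str, session_token: str, result: dict) -> None:
    payload = json.dumps(
        {"aiDocumentSessionToken": session_token, "lossRunResponse": result}
    ).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # the Document Service takes over from here
```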
- Data Retrieval & Response Generation
  - GET /loss-runs/{aiDocumentSessionToken} allows users to fetch the Loss Run response.
  - The query retrieves stored Loss Run data from the database.
  - Returns the structured response to the user.
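And a matching retrieval sketch for the GET endpoint, reusing the hypothetical loss_run_claims table from the storage example above:

```python
# Retrieval sketch keyed on the session token; table name is assumed.
def get_loss_runs(conn, session_token: str) -> list[dict]:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT payload FROM loss_run_claims WHERE session_token = %s",
            (session_token,),
        )
        return [row[0] for row in cur.fetchall()]
```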