
Genpact CAT Modeling

Deveshi Dabbawala

February 15, 2025

Business Problem

Genpact faced several challenges in handling large-scale data related to CAT (Catastrophe) modeling: 

  • Manual Data Processing: Traditional data handling required significant manual effort to clean, transform, and interpret Excel-based datasets. 
  • Error-Prone Workflows: The risk of human error in data entry, validation, and structuring led to inconsistencies in insights. 
  • Inefficient Decision-Making: Lack of automation in data visualization and analytics delayed actionable insights. 
  • Scalability Issues: The existing approach was not scalable, requiring high resource allocation for data management tasks. 

About Genpact CAT Modeling

Genpact is a global professional services firm specializing in digital transformation, analytics, and AI-driven solutions. With a strong focus on operational efficiency, Genpact partners with businesses to streamline complex processes and drive data-driven decision-making. 

Solution

goML partnered with Genpact to develop an automated CAT modeling solution that streamlined data ingestion, processing, visualization, and interactive analysis through AI-driven automation. 

Automated Data Preprocessing

To eliminate the need for manual data cleansing, goML implemented automated preprocessing pipelines using Python (Pandas, NumPy) for data transformation, AWS Lambda for serverless execution, and AWS RDS for structured data storage. This ensured standardized, high-quality data processing while reducing human effort.
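A minimal sketch of what such a preprocessing step might look like. The column names (`location_id`, `latitude`, `tiv`) and the `lambda_handler` payload shape are illustrative assumptions, not the actual Genpact schema:

```python
# Sketch of an automated preprocessing pipeline step. Column names
# ("latitude", "longitude", "tiv") are hypothetical examples of a CAT
# exposure sheet; S3/RDS I/O is omitted for brevity.
import io

import pandas as pd


def preprocess_exposures(raw: pd.DataFrame) -> pd.DataFrame:
    """Standardize headers, coerce numeric fields, and drop empty rows."""
    df = raw.copy()
    # Normalize headers so downstream mapping templates match reliably.
    df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]
    # Coerce numeric columns; bad cells become NaN instead of failing the run.
    for col in ("latitude", "longitude", "tiv"):
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors="coerce")
    # Drop rows that carry no usable exposure data.
    return df.dropna(how="all").reset_index(drop=True)


def lambda_handler(event, context):
    """Illustrative AWS Lambda entry point: read an uploaded Excel payload,
    preprocess it, and report the cleaned row count."""
    raw = pd.read_excel(io.BytesIO(event["body"]))
    clean = preprocess_exposures(raw)
    return {"rows": len(clean)}
```

Running the cleaning step inside Lambda keeps the pipeline serverless: each uploaded file triggers an isolated, stateless execution.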

Efficient Data Validation & Mapping

Automated data validation and mapping processes were implemented using Python (Geopandas), AWS S3, and AWS RDS, ensuring accurate geocoding validation, air_code checks, and standardized data mapping through predefined templates. This significantly improved data quality and consistency.
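The validation logic can be sketched with plain pandas (the production pipeline used Geopandas for the geospatial checks); the `air_code` lookup set and the coordinate column names below are assumptions:

```python
# Hedged sketch of the geocoding and air_code validation step.
# VALID_AIR_CODES is an illustrative lookup; a real deployment would
# load the valid codes from a predefined template in RDS or S3.
import pandas as pd

VALID_AIR_CODES = {"101", "102", "103"}


def validate_locations(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rows whose coordinates or air_code fail basic checks."""
    checks = pd.DataFrame(index=df.index)
    checks["lat_ok"] = df["latitude"].between(-90, 90)
    checks["lon_ok"] = df["longitude"].between(-180, 180)
    checks["air_code_ok"] = df["air_code"].astype(str).isin(VALID_AIR_CODES)
    out = df.copy()
    out["is_valid"] = checks.all(axis=1)
    return out
```

Flagging rows rather than dropping them lets analysts review and correct failures instead of silently losing exposure records.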

Dynamic Data Visualization

The processed data was converted into interactive, real-time visualizations using Highcharts, allowing analysts to dynamically filter dimensions and measures. This enabled quick data exploration and pattern identification, significantly improving decision-making efficiency.
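One way the backend might assemble chart options for the UI is to aggregate a measure by a dimension and emit a dict mirroring Highcharts' JSON options format. The dimension and measure names here are illustrative:

```python
# Sketch of server-side assembly of a Highcharts column-chart config
# from an aggregated DataFrame. The returned dict follows Highcharts'
# standard options structure and would be serialized to JSON for the UI.
import pandas as pd


def build_chart_config(df: pd.DataFrame, dimension: str, measure: str) -> dict:
    """Aggregate `measure` by `dimension` and emit Highcharts options."""
    grouped = df.groupby(dimension, sort=True)[measure].sum()
    return {
        "chart": {"type": "column"},
        "title": {"text": f"{measure} by {dimension}"},
        "xAxis": {"categories": grouped.index.tolist()},
        "series": [{"name": measure, "data": grouped.tolist()}],
    }
```

Because the dimension and measure are parameters, the UI's filter controls can request a re-aggregated config on every selection change.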

Seamless Workspace Management

Users could create and manage dedicated workspaces using AWS EC2, React with TypeScript, and Node.js, making it easier to organize, retrieve, and track processed datasets. This enhanced operational efficiency and streamlined data accessibility across multiple projects.

Conversational AI for Data Interaction 
A chatbot-driven interface was integrated using GPT-3.5, LangChain, and AWS Lex to assist users in querying and interpreting data through natural language interactions. This eliminated the need for manual report generation and provided real-time, AI-driven insights. 
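A sketch of how a user's question might be grounded in the processed data before the model call. The LangChain/Lex wiring and the prompt wording are assumptions; the actual GPT-3.5 invocation is omitted:

```python
# Illustrative prompt-grounding step for the data chatbot: embed a
# compact dataset summary alongside the user's question so the model
# answers from the uploaded data rather than from general knowledge.
import pandas as pd


def build_data_prompt(df: pd.DataFrame, question: str) -> str:
    """Combine a dataset summary with the user's natural-language question."""
    summary = df.describe(include="all").to_string()
    columns = ", ".join(df.columns)
    return (
        "You are a CAT-modeling data assistant.\n"
        f"Available columns: {columns}\n"
        f"Dataset summary:\n{summary}\n\n"
        f"Question: {question}\n"
        "Answer using only the data above."
    )
```

The resulting string would be passed to the GPT-3.5 client (e.g. via a LangChain chain), keeping the model's answers tied to the validated dataset.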

Architecture

  • Users upload Excel files through a Web UI hosted on AWS EC2, enabling easy data ingestion.
  • Authenticated users can create and manage workspaces, facilitating structured data processing and organization.
  • Admin users define valid columns, mandatory fields, preprocessing rules, mappings, and coding templates to standardize data handling.
  • Users perform file uploads, data preprocessing, validation, mapping, and analytics via the UI, ensuring automated data transformation.
  • Processed data, rules, and files are stored in AWS S3 for unstructured data and AWS RDS for structured relational storage.
  • The React with TypeScript UI app, hosted on AWS EC2, acts as an interface between users and stored data, providing seamless interaction.
  • The Conversational AI (GPT-3.5 Client) enables users to query data using natural language, improving data accessibility and interpretation.
  • AI-driven chatbot assists in data analysis, validation insights, and operational recommendations, enhancing usability and reducing manual efforts.
  • Backend processing leverages Python (Pandas, NumPy, Geopandas) and AWS Lambda for automated data cleansing, validation, and structuring.
  • The overall system ensures a scalable, automated, and AI-powered data processing pipeline, improving accuracy and decision-making efficiency.  

Outcomes

  • 82% reduction in time spent on data processing and visualization
  • 96% decrease in human errors through automated validation
  • 67% reduction in manual effort