Back

Genpact CAT Modeling

Deveshi Dabbawala

February 25, 2025
Table of contents

Business Problem

Genpact faced several challenges in handling large-scale data related to CAT (Catastrophe) modeling: 

  • Manual Data Processing: Traditional data handling required significant manual effort to clean, transform, and interpret Excel-based datasets. 
  • Error-Prone Workflows: The risk of human error in data entry, validation, and structuring led to inconsistencies in insights. 
  • Inefficient Decision-Making: Lack of automation in data visualization and analytics delayed actionable insights. 
  • Scalability Issues: The existing approach was not scalable, requiring high resource allocation for data management tasks. 

About Genpact

Genpact is a global professional services firm specializing in digital transformation, analytics, and AI-driven solutions. With a strong focus on operational efficiency, Genpact partners with businesses to streamline complex processes and drive data-driven decision-making. 

Solution

goML partnered with Genpact to develop an automated CAT modeling solution that streamlined data ingestion, processing, visualization, and interactive analysis through AI-driven automation. 

Automated Data Preprocessing 
To eliminate the need for manual data cleansing, goML implemented automated preprocessing pipelines using Python (Pandas, NumPy) for data transformation, AWS Lambda for serverless execution, and AWS RDS for structured data storage. This ensured standardized, high-quality data processing while reducing human effort. 

Efficient Data Validation & Mapping 
Automated data validation and mapping processes were implemented using Python (Geopandas), AWS S3, and AWS RDS, ensuring accurate geocoding validation, air_code checks, and standardized data mapping through predefined templates. This significantly improved data accuracy and consistency. 

Dynamic Data Visualization 
The processed data was converted into interactive, real-time visualizations using Highcharts, allowing analysts to dynamically filter dimensions and measures. This enabled quick data exploration and pattern identification, significantly improving decision-making efficiency. 

Seamless Workspace Management 
Users could create and manage dedicated workspaces using AWS EC2, React with TypeScript, and Node.js, making it easier to organize, retrieve, and track processed datasets. This enhanced operational efficiency and streamlined data accessibility across multiple projects. 

Conversational AI for Data Interaction 
A chatbot-driven interface was integrated using GPT-3.5, LangChain, and AWS Lex to assist users in querying and interpreting data through natural language interactions. This eliminated the need for manual report generation and provided real-time, AI-driven insights. 

Architecture

  • Data Processing & Storage 
    CRM (Customer Relationship Management System): Source of raw customer data. 
    AWS Glue: Performs data cleansing and preprocessing. 
    Amazon S3: Stores the cleaned data for further processing. 
  • Model Training & Deployment 
    Amazon SageMaker Training: Performs hierarchical clustering to train the model. 
    Trained Model: Stored for inference after training. 
  • Model Inference & API Integration 
    API Gateway: Receives customer data and forwards it for processing. 
    AWS Lambda: Processes API requests and interacts with the inference system. 
    Amazon RDS: Stores metadata and predicted results from inference. 
    SageMaker Inference Endpoint: Deploys the trained model for making predictions. 
  • Monitoring & Continuous Learning 
    SageMaker Monitoring: Observes model performance and triggers re-training. 
    AWS Glue (Re-Training): Re-trains the model if required based on monitored feedback. 
  • Security & Monitoring 
    AWS IAM: Manages access control. 
    AWS KMS: Ensures encryption and security of data. 
    Amazon CloudWatch: Monitors system logs and performance. 

Outcomes

82%
Reduction in time spent on data processing and visualization
96%
Decrease in human errors through automated validation
67%
Reduction in manual effort