Facio is a Brazilian fintech focused on micro-loans, serving over 4 million customers and backed by a strong financial data infrastructure. The company processes loan applications using one year of bank statement data per user, stored as Parquet files on AWS S3, alongside bureau scores and customer metadata.
Problem: manual workflows limit scalability of credit scoring model
Manual workflows limited the scalability of the credit scoring model at Facio. The entire pipeline relied on Jupyter notebooks, which led to inconsistent training processes and poor reproducibility. Teams lacked a structured way to compare different versions of the model, making it difficult to identify the best-performing approach. Monitoring was minimal, so the team had little visibility into data drift and performance degradation of the deployed credit scoring model.
Manual intervention also increased the time required to retrain and update the credit scoring model, slowing down loan decision cycles. In addition, model governance was weak, with no standardized versioning or approval workflows. These gaps reduced the reliability of the credit scoring model and directly impacted the speed and quality of loan approvals.
Solution: automated MLOps pipeline for credit scoring model
Facio partnered with GoML to build an automated MLOps system for the credit scoring model, covering data ingestion, training, benchmarking, and monitoring. The solution uses GoML’s Data Analytics Accelerator to standardize data pipelines and automate model development.
It converts raw financial data into structured datasets, trains and compares XGBoost credit scoring models, and selects the best version with proper versioning. The system also monitors performance using drift detection and triggers retraining when required, enabling faster and more reliable model updates.
End-to-end pipeline automation
The system automates the complete workflow for the credit scoring model:
- Data ingestion from S3 parquet files and Athena queries
- Feature engineering from bank transaction data
- Automated dataset merging and temporal train/validation/test splits
- XGBoost-based credit scoring model training and evaluation
- Model registration and deployment to SageMaker endpoints
- Versioning of all artifacts related to the credit scoring model
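The temporal splitting mentioned above can be sketched as a simple chronological partition, so that validation and test sets always come after the training window. This is a minimal illustration; the field name and cutoff dates are assumptions, not Facio's actual schema.

```python
from datetime import date

def temporal_split(rows, train_end, val_end, date_key="statement_date"):
    """Split records chronologically: train < train_end <= val < val_end <= test.

    Hypothetical helper illustrating a temporal train/validation/test split;
    the date_key field name is an illustrative assumption.
    """
    train = [r for r in rows if r[date_key] < train_end]
    val = [r for r in rows if train_end <= r[date_key] < val_end]
    test = [r for r in rows if r[date_key] >= val_end]
    return train, val, test

# Toy records standing in for aggregated bank-statement features
rows = [
    {"statement_date": date(2023, 1, 15), "amount": 120.0},
    {"statement_date": date(2023, 6, 10), "amount": 80.0},
    {"statement_date": date(2023, 11, 2), "amount": 45.5},
]
train, val, test = temporal_split(rows, date(2023, 4, 1), date(2023, 9, 1))
```

A chronological split avoids the look-ahead leakage that a random split would introduce into a credit scoring model.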
The pipeline follows a structured, multi-step architecture:
- Step 1: Feature extraction from Athena
- Step 2: Data consolidation and splitting
- Step 3: Credit scoring model training with hyperparameter optimization
- Step 4: Model registration in SageMaker Model Registry
- Step 5: Automated performance reporting
This ensures repeatable and scalable development of the credit scoring model.
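The five steps above can be sketched as an ordered chain of stages, each passing its outputs forward through a shared context. The stage bodies here are stubs standing in for the real Athena, XGBoost, and SageMaker calls; all names and values are illustrative assumptions.

```python
def extract_features(ctx):
    # Step 1: in the real pipeline this would query Athena; stubbed here
    ctx["features"] = [{"avg_balance": 1200.0, "label": 0},
                       {"avg_balance": -50.0, "label": 1}]
    return ctx

def consolidate_and_split(ctx):
    # Step 2: merge sources and split (temporal logic omitted in this stub)
    ctx["train"], ctx["test"] = ctx["features"][:1], ctx["features"][1:]
    return ctx

def train_model(ctx):
    # Step 3: stand-in for XGBoost training with hyperparameter optimization
    ctx["model"] = {"threshold": 0.0}
    return ctx

def register_model(ctx):
    # Step 4: stand-in for registration in the SageMaker Model Registry
    ctx["model_version"] = 1
    return ctx

def report(ctx):
    # Step 5: automated performance reporting
    ctx["report"] = f"v{ctx['model_version']} trained on {len(ctx['train'])} rows"
    return ctx

PIPELINE = [extract_features, consolidate_and_split, train_model,
            register_model, report]

def run_pipeline():
    ctx = {}
    for step in PIPELINE:
        ctx = step(ctx)
    return ctx
```

Keeping each stage as a pure function of the context makes individual steps easy to test and rerun, which is what turns a notebook workflow into a repeatable pipeline.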
Model benchmarking and selection
A benchmarking framework enables systematic evaluation of the credit scoring model:
- Automated evaluation metrics generation
- Comparison across multiple credit scoring model versions
- Statistical testing for performance validation
- Hyperparameter tuning for optimizing the credit scoring model
- Selection of the best-performing credit scoring model for deployment
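Champion/challenger selection of the kind listed above can be sketched as a comparison over candidate metrics, where a challenger only replaces the champion if it beats it by a minimum margin (a simple stand-in for the statistical validation step; the metric values and margin are illustrative assumptions).

```python
def select_best(candidates, metric="auc", min_improvement=0.005):
    """Pick the best model version: a challenger replaces the current
    champion only if it improves the metric by at least min_improvement,
    a crude guard against promoting noise-level gains."""
    best, *challengers = candidates
    for c in challengers:
        if c[metric] - best[metric] >= min_improvement:
            best = c
    return best

# Hypothetical benchmark results for three credit scoring model versions
candidates = [
    {"version": "v1", "auc": 0.861},
    {"version": "v2", "auc": 0.872},
    {"version": "v3", "auc": 0.874},
]
best = select_best(candidates)
```

Here v3's 0.002 gain over v2 falls below the margin, so v2 remains the champion; in a production system the margin would come from a proper statistical test rather than a fixed constant.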
Monitoring and evaluation
The system tracks the health of the credit scoring model in production:
- Data drift detection using statistical metrics
- Population Stability Index (PSI) thresholds
- Outcome drift monitoring
- Automated alerts using CloudWatch and SNS
- Explainability reports for the credit scoring model using SHAP
These capabilities improve trust and transparency in the credit scoring model.
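The Population Stability Index referenced above is a standard drift metric: bin the baseline and current distributions, then sum (actual% − expected%) × ln(actual% / expected%) over the bins. A minimal sketch, with the bin count and smoothing constant as assumptions:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a current sample.

    PSI = sum((a - e) * ln(a / e)) over shared bins; values above 0.2 are
    a common heuristic for significant drift (thresholds vary by team).
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # fall back if all values are equal

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # floor empty bins at a small constant to avoid log(0)
        return [max(c / len(values), 1e-4) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical distributions score 0; a shifted population scores well above the 0.2 alert threshold, which is the condition that would feed the CloudWatch/SNS alerting described above.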
Automated retraining loop
The pipeline supports continuous improvement of the credit scoring model:
- Monitoring jobs evaluate drift on a scheduled basis
- Retraining is triggered automatically when thresholds are exceeded
- Latest data is used to retrain the credit scoring model
- New versions are registered and evaluated before promotion
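The trigger condition for the retraining loop can be sketched as a threshold check over the monitored drift metrics; in the described architecture this logic would run on a schedule (for example via EventBridge and Lambda) and publish an alert before kicking off retraining. The metric names and thresholds here are assumptions.

```python
PSI_THRESHOLD = 0.2  # common industry heuristic; Facio's actual value is unknown

def should_retrain(drift_metrics, psi_threshold=PSI_THRESHOLD,
                   outcome_drift_threshold=0.05):
    """Decide whether a scheduled monitoring job should trigger retraining.

    drift_metrics is a dict such as {"psi": 0.25, "outcome_drift": 0.01};
    missing metrics are treated as no drift.
    """
    return (drift_metrics.get("psi", 0.0) > psi_threshold
            or drift_metrics.get("outcome_drift", 0.0) > outcome_drift_threshold)
```

Keeping the decision in one pure function makes the retraining policy auditable and easy to tune, which matters for the governance gaps the manual workflow had.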
Infrastructure and deployment
The solution uses a scalable AWS based architecture:
- Amazon SageMaker for training, hosting, and managing the credit scoring model
- AWS S3 for storing datasets and model artifacts
- Amazon Athena and Glue for feature querying
- AWS Lambda and EventBridge for automation
- CloudWatch and SNS for monitoring and alerts
- Python based pipeline components
The system supports both real-time inference and batch scoring for the credit scoring model.
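Real-time scoring against a SageMaker endpoint typically goes through the `sagemaker-runtime` `invoke_endpoint` API. A hedged sketch, where the endpoint name, feature names, and payload format are all assumptions rather than Facio's actual contract:

```python
import json

def to_payload(features):
    """Serialize one feature vector for a JSON-accepting endpoint.

    The {"instances": [...]} shape is an illustrative assumption; the real
    payload format depends on the deployed model's inference container.
    """
    return json.dumps({"instances": [features]})

def score(features, endpoint_name="credit-scoring-model"):  # hypothetical name
    """Invoke a SageMaker real-time endpoint (requires AWS credentials)."""
    import boto3  # imported lazily so the helper above stays dependency-free
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=to_payload(features),
    )
    return json.loads(response["Body"].read())
```

For batch scoring the same serialized records would instead be written to S3 and processed by a SageMaker Batch Transform job rather than a live endpoint.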
Quality assurance
Validation ensures reliability of the credit scoring model:
- End-to-end pipeline testing
- Data validation and schema checks
- Model evaluation on unseen datasets
- Inference testing for prediction accuracy
- System validation for artifacts and outputs
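The data validation and schema checks listed above can be sketched as a per-record type check that reports every violation instead of failing on the first one. The field names and types are illustrative assumptions, not Facio's actual schema.

```python
# Illustrative schema; real field names and types are assumptions
EXPECTED_SCHEMA = {"customer_id": str, "avg_balance": float, "bureau_score": int}

def validate_row(row, schema=EXPECTED_SCHEMA):
    """Return a list of schema violations for one record (empty = valid)."""
    errors = []
    for field, expected_type in schema.items():
        if field not in row:
            errors.append(f"missing field: {field}")
        elif not isinstance(row[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}, "
                          f"got {type(row[field]).__name__}")
    return errors
```

Collecting all violations per record gives the pipeline a complete picture of bad data in one pass, which is more useful for alerting than fail-fast validation.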
Impacts
- 60-70% reduction in credit scoring model training time
- 2-3X faster loan decision cycles
- 80% improved accuracy and consistency of the credit scoring model
- Better visibility into model performance through monitoring
- Reduced manual effort for ML teams
Before MLOps and after MLOps
“Facio transformed its credit scoring model pipeline into a scalable system that improves speed, consistency, and decision quality.”
Prashanna Rao, Head of Engineering, GoML
Key takeaways for fintech companies
Common challenges
- Manual workflows slow down credit scoring model updates
- Lack of monitoring reduces trust in credit scoring models
- Difficulty in benchmarking model performance
Practical guidance
- Automate the lifecycle of the credit scoring model
- Implement benchmarking before deployment
- Monitor drift and retrain proactively
- Use model registries for governance
Ready to scale your credit scoring model with automated MLOps?
Partner with GoML and build production-grade ML systems using AI Matic.




