
Automated MLOps pipeline for credit scoring model improving reliability for Facio

Deveshi Dabbawala

April 23, 2026

Facio is a Brazilian fintech focused on micro-loans, serving over 4 million customers with a strong financial data infrastructure. The company processes loan applications using one year of bank statement data per user stored as parquet files on AWS S3, along with bureau scores and customer metadata.

Problem: manual workflows limit scalability of credit scoring model

Manual workflows limited the scalability of the credit scoring model at Facio. The entire pipeline relied on Jupyter notebooks, which led to inconsistent training processes and poor reproducibility. Teams lacked a structured way to compare different versions of the credit scoring model, making it difficult to identify the best-performing approach. Monitoring was minimal, so the team had limited visibility into data drift and performance degradation in the deployed credit scoring model.

Manual intervention also increased the time required to retrain and update the credit scoring model, slowing down loan decision cycles. In addition, model governance was weak, with no standardized versioning or approval workflows. These gaps reduced the reliability of the credit scoring model and directly impacted the speed and quality of loan approvals.

Solution: automated MLOps pipeline for credit scoring model

Facio partnered with GoML to build an automated MLOps system for the credit scoring model, covering data ingestion, training, benchmarking, and monitoring. The solution uses GoML’s Data Analytics Accelerator to standardize data pipelines and automate model development.

It converts raw financial data into structured datasets, trains and compares XGBoost credit scoring models, and selects the best version with proper versioning. The system also monitors performance using drift detection and triggers retraining when required, enabling faster and more reliable model updates.

End-to-end pipeline automation

The system automates the complete workflow for the credit scoring model:

  • Data ingestion from S3 parquet files and Athena queries
  • Feature engineering from bank transaction data
  • Automated dataset merging and temporal train/validation/test splits
  • XGBoost-based credit scoring model training and evaluation
  • Model registration and deployment to SageMaker endpoints
  • Versioning of all artifacts related to the credit scoring model
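The temporal split mentioned above can be illustrated with a minimal sketch. It is not Facio's implementation: the field names (`loan_id`, `applied_on`, `defaulted`) and the cutoff dates are hypothetical, and a real pipeline would operate on the merged dataset rather than an in-memory list. The point is only that rows are partitioned by date, so validation and test data are always later than the training data.

```python
from datetime import date


def temporal_split(rows, date_key, train_end, valid_end):
    """Split rows by time: train <= train_end < validation <= valid_end < test."""
    train, valid, test = [], [], []
    for row in sorted(rows, key=lambda r: r[date_key]):
        when = row[date_key]
        if when <= train_end:
            train.append(row)
        elif when <= valid_end:
            valid.append(row)
        else:
            test.append(row)
    return train, valid, test


# Hypothetical loan applications; field names are illustrative only.
applications = [
    {"loan_id": 1, "applied_on": date(2025, 1, 15), "defaulted": 0},
    {"loan_id": 2, "applied_on": date(2025, 4, 2), "defaulted": 1},
    {"loan_id": 3, "applied_on": date(2025, 7, 20), "defaulted": 0},
    {"loan_id": 4, "applied_on": date(2025, 9, 5), "defaulted": 0},
    {"loan_id": 5, "applied_on": date(2025, 11, 30), "defaulted": 1},
]

train, valid, test = temporal_split(
    applications,
    date_key="applied_on",
    train_end=date(2025, 6, 30),
    valid_end=date(2025, 9, 30),
)
print(len(train), len(valid), len(test))  # 2 2 1
```

Splitting by date rather than at random keeps later applications out of the training set, which is closer to how the model is used in production.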

The pipeline follows a structured multi-step architecture:

  • Step 1: Feature extraction from Athena
  • Step 2: Data consolidation and splitting
  • Step 3: Credit scoring model training with hyperparameter optimization
  • Step 4: Model registration in SageMaker Model Registry
  • Step 5: Automated performance reporting  

This ensures repeatable and scalable development of the credit scoring model.

Model benchmarking and selection

A benchmarking framework enables systematic evaluation of the credit scoring model:

  • Automated evaluation metrics generation
  • Comparison across multiple credit scoring model versions
  • Statistical testing for performance validation
  • Hyperparameter tuning for optimizing the credit scoring model
  • Selection of the best-performing credit scoring model for deployment
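One way to compare two model versions statistically is a paired bootstrap on a ranking metric such as ROC AUC. The sketch below is an assumption about what such a test could look like, not the framework GoML built: the "champion" and "challenger" names, the toy scores, and the choice of AUC are all illustrative.

```python
import random


def auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_win_share(labels, champion, challenger, n_resamples=500, seed=0):
    """Paired bootstrap: share of resamples where the challenger's AUC beats the champion's."""
    rng = random.Random(seed)
    n = len(labels)
    better = used = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip resamples that contain a single class
        used += 1
        gain = auc(y, [challenger[i] for i in idx]) - auc(y, [champion[i] for i in idx])
        better += gain > 0
    return better / used


# Toy hold-out labels and scores from two hypothetical model versions.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
champion = [0.2, 0.6, 0.3, 0.7, 0.5, 0.4, 0.8, 0.9]
challenger = [0.1, 0.3, 0.2, 0.4, 0.6, 0.5, 0.8, 0.9]

print(auc(labels, champion), auc(labels, challenger))  # 0.75 1.0
print(bootstrap_win_share(labels, champion, challenger))
```

Resampling the same rows for both models keeps the comparison paired, so the win share reflects differences between the models rather than differences between samples.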

Monitoring and evaluation

The system tracks the health of the credit scoring model in production:

  • Data drift detection using statistical metrics
  • Population stability index thresholds
  • Outcome drift monitoring
  • Automated alerts using CloudWatch and SNS
  • Explainability reports for the credit scoring model using SHAP

These capabilities improve trust and transparency in the credit scoring model.  
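The population stability index (PSI) compares how a variable is distributed in a baseline sample and in a recent sample: PSI = Σ (actual_i − expected_i) · ln(actual_i / expected_i), summed over bins. The sketch below is a minimal illustration with equal-width bins derived from the baseline. The bin count, the `eps` floor for empty bins, and the 0.25 alert level used in the example are assumptions (0.25 is a widely used rule of thumb, not a threshold stated by Facio).

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population stability index between a baseline sample and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            i = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[i] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical score samples: a uniform baseline and a shifted production sample.
baseline = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in baseline]

print(psi(baseline, baseline))        # identical samples give a PSI of zero
print(psi(baseline, shifted) > 0.25)  # True: the shift is large enough to alert
```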

Automated retraining loop

The pipeline supports continuous improvement of the credit scoring model:

  • Monitoring jobs evaluate drift on a scheduled basis
  • Retraining is triggered automatically when thresholds are exceeded
  • Latest data is used to retrain the credit scoring model
  • New versions are registered and evaluated before promotion
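The decision step in such a loop can be sketched as a small function that turns a drift report into a retrain/no-retrain decision with reasons attached. Everything here is hypothetical: the `DriftReport` structure, the feature names, and the limits (0.25 for PSI, 0.03 for AUC drop) are placeholders, and in the deployed system this logic would presumably run inside the scheduled monitoring job rather than as a standalone script.

```python
from dataclasses import dataclass, field


@dataclass
class DriftReport:
    feature_psi: dict = field(default_factory=dict)  # feature name -> PSI vs. training baseline
    auc_drop: float = 0.0                            # baseline AUC minus most recent AUC


def should_retrain(report, psi_limit=0.25, auc_drop_limit=0.03):
    """Return (decision, reasons); retrain when any monitored limit is exceeded."""
    reasons = [
        f"PSI for {name} is {value:.2f}, above {psi_limit}"
        for name, value in sorted(report.feature_psi.items())
        if value > psi_limit
    ]
    if report.auc_drop > auc_drop_limit:
        reasons.append(f"AUC dropped by {report.auc_drop:.3f}, above {auc_drop_limit}")
    return bool(reasons), reasons


# Illustrative reports; feature names and values are made up.
drifted = DriftReport(feature_psi={"avg_monthly_inflow": 0.31, "bureau_score": 0.04}, auc_drop=0.01)
stable = DriftReport(feature_psi={"avg_monthly_inflow": 0.05, "bureau_score": 0.02})

print(should_retrain(drifted))
print(should_retrain(stable))
```

Returning the reasons alongside the decision makes each triggered retraining auditable, which fits the governance goals described earlier.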

Infrastructure and deployment

The solution uses a scalable AWS-based architecture:

  • Amazon SageMaker for training, hosting, and managing the credit scoring model
  • AWS S3 for storing datasets and model artifacts
  • Amazon Athena and Glue for feature querying
  • AWS Lambda and EventBridge for automation
  • CloudWatch and SNS for monitoring and alerts
  • Python-based pipeline components

The system supports both real-time inference and batch scoring for the credit scoring model.

Quality assurance

Validation ensures reliability of the credit scoring model:

  • End-to-end pipeline testing
  • Data validation and schema checks
  • Model evaluation on unseen datasets
  • Inference testing for prediction accuracy
  • System validation for artifacts and outputs
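As an illustration of data validation and schema checks, the sketch below verifies that each row has the expected columns, types, and value ranges before it reaches training or scoring. The column names, types, and ranges are assumptions chosen for the example, not Facio's actual schema.

```python
def validate_rows(rows, schema):
    """Return a list of human-readable problems; an empty list means the batch passed."""
    problems = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        for column, (expected_type, in_range) in schema.items():
            if column not in row:
                continue
            value = row[column]
            if not isinstance(value, expected_type):
                problems.append(f"row {i}: {column} should be {expected_type.__name__}")
            elif not in_range(value):
                problems.append(f"row {i}: {column}={value!r} is out of range")
    return problems


# Hypothetical schema: column -> (type, range check). Names and limits are illustrative.
SCHEMA = {
    "customer_id": (str, lambda v: len(v) > 0),
    "bureau_score": (int, lambda v: 0 <= v <= 1000),
    "monthly_inflow": (float, lambda v: v >= 0.0),
}

batch = [
    {"customer_id": "c-001", "bureau_score": 640, "monthly_inflow": 2150.0},
    {"customer_id": "c-002", "bureau_score": 1200, "monthly_inflow": 980.5},
    {"customer_id": "c-003", "monthly_inflow": "n/a"},
]

for problem in validate_rows(batch, SCHEMA):
    print(problem)
```

Collecting every problem instead of stopping at the first one gives a fuller picture of a bad batch in a single run.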

Impacts

  • 60-70% reduction in credit scoring model training time
  • 2-3X faster loan decision cycles
  • 80% improved accuracy and consistency of the credit scoring model
  • Better visibility into model performance through monitoring
  • Reduced manual effort for ML teams

About

Location: Brazil

Tech stack: AWS, SageMaker, Lambda, Athena, S3, Python, XGBoost

Before MLOps and after MLOps

Area                          | Before MLOps     | After MLOps
Credit scoring model training | Manual notebooks | Automated pipeline
Model comparison              | Ad hoc           | Structured benchmarking
Monitoring                    | Minimal          | Drift detection and alerts
Retraining                    | Manual           | Automated
Deployment                    | Inconsistent     | Standardized
Governance                    | Limited          | Version controlled

“Facio transformed its credit scoring model pipeline into a scalable system that improves speed, consistency, and decision quality.”

Prashanna Rao, Head of Engineering, GoML

Key takeaways for fintech companies

Common challenges

  • Manual workflows slow down credit scoring model updates
  • Lack of monitoring reduces trust in credit scoring models
  • Difficulty in benchmarking model performance

Practical guidance

  • Automate the lifecycle of the credit scoring model
  • Implement benchmarking before deployment
  • Monitor drift and retrain proactively
  • Use model registries for governance

Ready to scale your credit scoring model with automated MLOps?  

Partner with GoML and build production-grade ML systems using AI Matic.
