
Conversational Chat Agent for Doppelio

Vimal Kumar

April 18, 2025

Business Problem

  • Doppelio's existing OpenAI implementation produced inconsistent responses, particularly when processing domain-specific documents and testing data
  • Document processing and information retrieval were inefficient, requiring manual intervention to extract and analyze data from various document formats
  • The existing solution could not reliably handle the growing number of concurrent enterprise users, leading to degraded performance during peak usage
  • Doppelio needed a secure, scalable, multi-tenant AI solution that supports concurrent users across large enterprise teams while respecting data privacy and segregation

About

Doppelio is a leading provider of AI-driven testing and automation solutions that help enterprises validate and optimize their digital experiences. As part of their AI expansion strategy, Doppelio aims to develop a scalable conversational chat agent that integrates seamlessly into enterprise workflows. This solution is designed to provide context-aware, real-time responses tailored to specific business needs, enhancing operational efficiency and delivering intelligent, data-driven insights to users.

Solution

goML developed a Conversational Chat Agent MVP leveraging Generative AI capabilities and a robust AWS-native architecture. A key part of the engagement involved migrating Doppelio’s existing OpenAI-powered solution to AWS Bedrock, significantly improving cost efficiency, latency, and accuracy while aligning with enterprise cloud strategy.

OpenAI to AWS Bedrock Migration
Doppelio’s existing OpenAI integration was migrated to AWS Bedrock, resulting in enhanced control, reduced latency, and improved model accuracy for domain-specific tasks.
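
As a rough illustration of what such a migration involves at the code level, the sketch below translates OpenAI-style chat messages into keyword arguments for the Bedrock Converse API. It is not Doppelio's or goML's actual code. The model ID, region, and function names are assumptions chosen for the example; the source only states that Claude 3.5 is used via Bedrock.

```python
from typing import Any

# Assumed model ID for illustration; the case study only says "Claude 3.5".
ASSUMED_MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"


def to_converse_request(
    openai_messages: list[dict[str, str]],
    model_id: str = ASSUMED_MODEL_ID,
    max_tokens: int = 512,
    temperature: float = 0.0,
) -> dict[str, Any]:
    """Translate OpenAI-style chat messages into Converse API keyword arguments."""
    # Converse takes system prompts separately from the conversation turns.
    system = [{"text": m["content"]} for m in openai_messages if m["role"] == "system"]
    messages = [
        {"role": m["role"], "content": [{"text": m["content"]}]}
        for m in openai_messages
        if m["role"] in ("user", "assistant")
    ]
    request: dict[str, Any] = {
        "modelId": model_id,
        "messages": messages,
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": temperature},
    }
    if system:
        request["system"] = system
    return request


def ask_bedrock(openai_messages: list[dict[str, str]], region: str = "us-east-1") -> str:
    """Send the translated request to Bedrock (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the translation above works without AWS access

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.converse(**to_converse_request(openai_messages))
    return response["output"]["message"]["content"][0]["text"]
```

Keeping the translation in a pure function means existing OpenAI-shaped prompts can be reused and checked without network access; only `ask_bedrock` touches AWS.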

Multi-modal Interface
Supports both text- and voice-based interactions, with improved accuracy and response times.

Scalable Infrastructure
Powered by FastAPI and Python, capable of handling 1,000+ concurrent user sessions with low-latency performance.
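
To make the concurrency claim concrete, here is a minimal sketch of the asynchronous pattern that frameworks such as FastAPI build on: many sessions are served on one event loop while a semaphore caps how many model calls are in flight. It uses only the standard library; the limit of 100 in-flight calls and the simulated delay are assumptions for the example, not figures from the project.

```python
import asyncio


async def handle_session(session_id: int, limiter: asyncio.Semaphore, stats: dict) -> str:
    """Simulate one chat request; the sleep stands in for an awaited model call."""
    async with limiter:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.001)  # placeholder for non-blocking I/O
        stats["active"] -= 1
    return f"session-{session_id}: ok"


async def serve(n_sessions: int, max_in_flight: int) -> tuple[list[str], int]:
    """Run all sessions concurrently and report the peak number in flight."""
    limiter = asyncio.Semaphore(max_in_flight)
    stats = {"active": 0, "peak": 0}
    replies = await asyncio.gather(
        *(handle_session(i, limiter, stats) for i in range(n_sessions))
    )
    return list(replies), stats["peak"]


def run_demo(n_sessions: int = 1000, max_in_flight: int = 100) -> tuple[list[str], int]:
    return asyncio.run(serve(n_sessions, max_in_flight))
```

Calling `run_demo()` completes 1,000 simulated sessions while never exceeding the configured in-flight limit, which is the property that protects a downstream model endpoint during peak usage.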

Natural Language Interface
Built using Claude 3.5 via AWS Bedrock, supporting both text and voice interactions.

Document Intelligence
Enabled parsing and understanding of PDFs, Word documents, and images for test-related insights.

Enhanced RAG Implementation
Built a more sophisticated Retrieval-Augmented Generation (RAG) system, optimized specifically for document processing and contextual understanding.
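
The retrieval step of a RAG pipeline can be sketched in a few lines. The example below is a toy stand-in, not the system described here: it ranks document chunks by bag-of-words cosine similarity, where a production system would typically use embedding vectors and a vector store. All function names and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Place the retrieved chunks ahead of the question so the model answers from them."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt returned by `build_prompt` is what would be sent to the model, so answer quality depends directly on how well `retrieve` selects relevant chunks.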

Multi-Tenant Secure Architecture
Ensured strict data segregation and encryption using AWS S3, DynamoDB, Lambda, CloudWatch, and API Gateway.
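
One common way to enforce tenant segregation on S3 and DynamoDB is to derive every storage key from the authenticated tenant, so that no request can name another tenant's data. The sketch below illustrates that idea only. The key layout (`tenants/<id>/uploads/...`, `TENANT#<id>`) and all names are assumptions, not the scheme used in this project.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    """Identity resolved from the caller's credentials, never from request fields."""
    tenant_id: str
    user_id: str

    def __post_init__(self) -> None:
        # Restrict tenant IDs to a safe alphabet so they cannot alter key structure.
        if not re.fullmatch(r"[A-Za-z0-9_-]+", self.tenant_id):
            raise ValueError("invalid tenant id")


def s3_key(ctx: TenantContext, filename: str) -> str:
    """Object key under the tenant's own prefix (hypothetical layout)."""
    return f"tenants/{ctx.tenant_id}/uploads/{filename}"


def conversation_key(ctx: TenantContext, conversation_id: str) -> dict[str, str]:
    """DynamoDB primary key with the tenant as partition key (hypothetical schema)."""
    return {"PK": f"TENANT#{ctx.tenant_id}", "SK": f"CONV#{conversation_id}"}


def assert_same_tenant(ctx: TenantContext, item_pk: str) -> None:
    """Reject any item whose partition key belongs to a different tenant."""
    if item_pk != f"TENANT#{ctx.tenant_id}":
        raise PermissionError("cross-tenant access denied")
```

With keys built this way, IAM policies can additionally be scoped to the tenant prefix or partition key, giving a second layer of enforcement beyond application code.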

Architecture

  • Input Layer
    • Accepts user inputs in multiple formats: text messages, voice commands, and document uploads (PDF, Word)
    • Provides a unified entry point for all user interactions regardless of format
    • Handles initial preprocessing of data before passing it to the infrastructure layer
  • Infrastructure Layer (AWS Services)
    • AWS S3: Stores uploaded documents and processed data securely, with improved encryption compared to the previous solution
    • AWS Lambda: Executes serverless functions with predictable performance characteristics and auto-scaling
    • API Gateway: Manages API endpoints with enhanced throttling controls to prevent performance degradation
    • DynamoDB: Maintains multi-tenant data with proper segregation and enterprise-grade security
    • CloudWatch: Provides comprehensive monitoring and auto-scaling triggers based on actual usage patterns
  • Application Layer
    • Conversational Chat Agent API: Built with FastAPI in Python for superior performance compared to the previous implementation
    • Enhanced RAG Processing: Optimized Retrieval-Augmented Generation specifically tuned for Claude 3.5 to improve response accuracy
    • Guardrails: Enterprise-specific content filtering and security controls
  • Intelligence Layer
    • AWS Bedrock: Provides managed foundation model access with better performance characteristics than OpenAI
    • Claude 3.5: Delivers more accurate and consistent responses, especially for technical document processing
    • User-Imported Models: Supports custom models with better integration than the previous solution
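
The way a request moves through the layers above can be sketched as a small pipeline: normalize the input, apply guardrails, then hand the text to the model. This is a simplified illustration under stated assumptions. Voice and document payloads are assumed to have been transcribed or extracted to text already, the blocked-term list is hypothetical, and the model is passed in as a plain callable in place of a Bedrock call.

```python
from dataclasses import dataclass
from typing import Callable

ALLOWED_KINDS = ("text", "voice", "document")
BLOCKED_TERMS = ("password", "api key")  # hypothetical guardrail policy


@dataclass
class ChatInput:
    kind: str  # one of ALLOWED_KINDS
    text: str  # plain text after transcription or document extraction


def normalize(kind: str, payload: str) -> ChatInput:
    """Input layer: validate the input type and collapse stray whitespace."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported input kind: {kind}")
    return ChatInput(kind, " ".join(payload.split()))


def passes_guardrails(message: ChatInput) -> bool:
    """Application layer: a simple term filter standing in for real content controls."""
    lowered = message.text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def handle(kind: str, payload: str, model: Callable[[str], str]) -> str:
    """Run one request through input, guardrail, and intelligence layers in order."""
    message = normalize(kind, payload)
    if not passes_guardrails(message):
        return "Request blocked by content policy."
    return model(message.text)
```

Passing the model as a callable keeps the flow testable offline; in a deployed system that argument would wrap the managed model endpoint.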

Outcomes

  • 67% improvement in information extraction
  • 99.95% uptime under peak loads
  • 50% higher domain-specific accuracy