Back

Sun Pharma: Microsoft Autogen powered with OpenAI models

Deveshi Dabbawala

February 14, 2025
Table of Content

Business Problem

  • Difficulty in retrieving insights from relational databases (RDBMS) containing sales, stock levels, and agent performance data.
  • Dependency on manual processes for sales analysis and profit evaluation, making decision-making slow and effort intensive. 
  • Lack of a natural language interface for business users to query data intuitively.
  • No real-time automated chart generation for quick data visualization.
  • Absence of a dedicated graph tool for customized data representation without building a full dashboard.

About Sun Pharma

Sun Pharma, a global leader in pharmaceuticals, faced challenges in analyzing and extracting insights from its vast sales data. To overcome this, they sought an AI-driven solution that could provide real-time insights, automate data visualization, and enhance decision-making.

Solution

To address these challenges, we developed a Proof of Concept (PoC) leveraging OpenAI and Autogen’s multi-agent framework, PostgreSQL as the database, and Streamlit for UI and Pygwalker for visualization. This system enabled intelligent collaboration between specialized AI agents to process and respond to user queries efficiently.

Conversational Agent
Built using OpenAI and Autogen, this chatbot interprets natural language queries and interacts with other agents.

Visualization Agent
Uses Streamlit to generate automated charts, making data insights more accessible.

Query Agent
Dynamically converts user queries into SQL statements to retrieve data from PostgreSQL.

LLM Used
The solution leveraged OpenAI's GPT-4 within the Autogen framework to power natural language understanding, query generation, and data analysis. This enabled seamless multi-agent collaboration for accurate and efficient insights extraction.

Analysis Agent
Processes and structures the retrieved data into meaningful insights, enhancing decision-making.

Architecture

  • User Interaction
    • Users input queries through a user interface.
    Two types of queries: User Query and Conversational Query.
  • Data Ingestion & Preprocessing
    API Gateway accepts external requests.
    AWS Glue orchestrates data movement and ETL processes.
    • Data Source Datastore stores structured/unstructured data.
    • Python Native Scripts perform preprocessing for both:
        a. Standard query processing
        b. BI/Data visualization
  • Data Enrichment & Optimization
    Feature Optimization enhances data attributes.
    • Embedding Models (e.g., SageMaker or custom embeddings) create vector representations.
    SageMaker hosts models.
    • BI Layer (QuickSight) provides dashboards and reports.
  • Knowledge Retrieval & Storage
    Vector DB (like StackCloud) stores embeddings.
    Focused AI Agents retrieve domain-specific insights.
    • Multiple agents interact with embedding models via RAG (Retrieval-Augmented Generation).
  • Conversational Interface & Processing
    Query Processing prepares user query for understanding.
    Hybrid LLM + Retrieval Model for accurate response generation.
    AWS Bedrock used for LLM access.
    S3 Buckets store RAG data.
    Lambda (Prompt + Knowledge) refines prompt inputs.
  • Execution & Output
    • Intent & Data Visualization handled via Bedrock + QuickSight.
    Chat Platform provides interface for conversational results.
    Parameter Execution and Generation Layer generates actionable outputs.
    Function Calling LLMs + SE & Module Crew enables downstream automation or insights.
  • Code & Deployment
    Code Repository connects to CI/CD
    (e.g., CodePipeline/Deploy) for updates.

Outcomes

85%
Faster Data Retrieval: Business users can access complex insights instantly without requiring SQL expertise.
70%
Reduction in Manual Effort: Automating sales analysis and visualization significantly cuts down manual intervention.
80%
Simplified Data Representation: The dedicated graph tool provides a lightweight alternative to traditional BI dashboards.