This system accepts input in PDF, JSON, or Email (text) format, classifies its format and intent, and routes it to the appropriate specialized agent for processing. It maintains a shared context (memory) for traceability. This version is specifically configured to use the Groq API with the llama3-8b-8192 model.
- Input:
  - Raw file (PDF, JSON, or TXT representing an email).
  - Raw text string (simulating email content).
  - Raw JSON string.
- Classifier Agent:
  - Determines the input format (PDF, JSON, or Email/Text).
  - Extracts text content (for PDF/Text).
  - Uses the Groq Llama3-8B model to classify the intent (e.g., Invoice, RFQ, Complaint, Regulation) and urgency.
  - Logs format, intent, source, and other details to shared memory.
  - Routes the input (or extracted text/parsed JSON) to a specialized agent.
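The format-detection step can be sketched roughly as follows. This is illustrative only; the actual logic lives in `agents/classifier_agent.py`, and the function name and heuristics here are assumptions:

```python
import json

def detect_format(raw_input: bytes, filename: str = "") -> str:
    """Hypothetical format heuristic: check the PDF magic bytes first,
    then try a JSON parse, and otherwise treat the input as email/text."""
    if raw_input.startswith(b"%PDF-") or filename.lower().endswith(".pdf"):
        return "PDF"
    try:
        json.loads(raw_input.decode("utf-8"))
        return "JSON"
    except (ValueError, UnicodeDecodeError):
        return "Email"
```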
- JSON Agent:
  - Accepts structured JSON payloads.
  - If the classified intent is "Invoice" (or as configured), validates the payload against a target schema (`JSON_TARGET_SCHEMA` in `config.py`).
  - Extracts/reformats data based on the schema.
  - Flags anomalies or missing fields.
  - If the intent is not "Invoice" or doesn't match the schema criteria, processes the JSON as a generic payload.
  - Logs results to shared memory.
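A minimal sketch of the schema check, assuming a simple field-to-type mapping. The real schema is `JSON_TARGET_SCHEMA` in `config.py`; the fields and helper name below are placeholders:

```python
# Hypothetical invoice schema; the project's actual schema lives in config.py.
TARGET_SCHEMA = {"invoice_id": str, "amount": float, "currency": str}

def validate_against_schema(payload: dict, schema: dict = TARGET_SCHEMA) -> dict:
    """Return the reformatted data plus a list of anomalies
    (missing or mistyped fields)."""
    data, anomalies = {}, []
    for field, expected_type in schema.items():
        if field not in payload:
            anomalies.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            anomalies.append(f"wrong type for {field}: expected {expected_type.__name__}")
        else:
            data[field] = payload[field]
    return {"data": data, "anomalies": anomalies}
```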
- Email Agent:
  - Accepts email content (text, including text extracted from PDFs).
  - Uses the Groq Llama3-8B model to extract key information (sender, refined subject, refined summary, urgency, action items, contact person/phone).
  - Formats the extracted information for CRM-style usage according to `EMAIL_CRM_FORMAT_KEYS` in `config.py`.
  - Logs results to shared memory.
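The CRM formatting step amounts to projecting the LLM's extraction onto a fixed key list. The keys below are illustrative; the authoritative list is `EMAIL_CRM_FORMAT_KEYS` in `config.py`:

```python
# Placeholder key list; the real one is EMAIL_CRM_FORMAT_KEYS in config.py.
EMAIL_CRM_FORMAT_KEYS = ["sender", "subject", "summary", "urgency", "action_items", "contact"]

def format_for_crm(extracted: dict, keys=EMAIL_CRM_FORMAT_KEYS) -> dict:
    """Keep only the configured CRM fields, filling gaps with None
    so every record has the same shape."""
    return {key: extracted.get(key) for key in keys}
```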
- Shared Memory Module:
  - A lightweight in-memory store (a Python dictionary).
  - Stores: conversation ID, source, format type, timestamp, classified intent, extracted values from agents, errors, and processing steps.
  - Accessible across all agents for context and traceability.
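A store like this can be sketched in a few lines. This is a stand-in, not the project's implementation (which lives in `memory/shared_memory.py`); the method names are assumptions:

```python
import threading
import time
import uuid

class SharedMemory:
    """Minimal in-memory store keyed by conversation ID (illustrative sketch)."""

    def __init__(self) -> None:
        self._store: dict = {}
        self._lock = threading.Lock()  # guard against concurrent agent writes

    def new_conversation(self, source: str, format_type: str) -> str:
        conv_id = str(uuid.uuid4())
        with self._lock:
            self._store[conv_id] = {
                "source": source,
                "format": format_type,
                "timestamp": time.time(),
                "intent": None,
                "extracted": {},
                "errors": [],
                "steps": [],
            }
        return conv_id

    def update(self, conv_id: str, step: str, **fields) -> None:
        with self._lock:
            record = self._store[conv_id]
            record.update(fields)
            record["steps"].append(step)  # keep a trace of each processing step

    def get(self, conv_id: str) -> dict:
        with self._lock:
            return dict(self._store.get(conv_id, {}))
```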
- Python 3.9+
- Groq API with the `llama3-8b-8192` model for all LLM tasks.
- `pdfplumber` for PDF text extraction.
- Standard Python `json` library.
- In-memory Python dictionary for shared memory (`memory/shared_memory.py`).
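The LLM calls are centralized in `utils/llm_utils.py`. A hypothetical prompt builder for the classification task might look like the following (the wording and intent list are assumptions based on the examples above, not the project's actual prompt):

```python
def build_classification_prompt(text: str,
                                intents=("Invoice", "RFQ", "Complaint", "Regulation")) -> str:
    """Assemble the instruction sent to llama3-8b-8192 (illustrative sketch)."""
    return (
        "Classify the intent of the following document as one of: "
        + ", ".join(intents)
        + ". Also rate its urgency as low, medium, or high. "
        "Respond with JSON containing the keys 'intent' and 'urgency'.\n\n"
        + text[:4000]  # truncate so the prompt stays within the model's context window
    )
```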
├── agents/ # Agent implementations (classifier, json, email)
│ ├── __init__.py
│ ├── base_agent.py
│ ├── classifier_agent.py
│ ├── json_agent.py
│ └── email_agent.py
├── memory/ # Shared memory module
│ ├── __init__.py
│ └── shared_memory.py
├── utils/ # Utility functions (LLM interaction, PDF parsing)
│ ├── __init__.py
│ ├── llm_utils.py
│ └── pdf_parser.py
├── sample_inputs/ # Example input files (JSON, TXT, and a placeholder for PDF)
│ ├── complaint_sample.txt
│ ├── invoice_sample.json
│ ├── regulation_sample.txt
│ └── rfq_sample_text_fallback.txt
├── main.py # Main script to run examples and orchestrate agents
├── config.py # Configuration (API keys, model names, schemas)
├── requirements.txt # Python dependencies
├── README.md # This file
└── .env # For API keys (gitignored)
- Clone the repository (if you have one):

  ```bash
  git clone <your-repo-url>
  cd multi_agent_system
  ```

  (If you don't have a repo, just ensure all files are in the correct structure shown above.)
- Create a virtual environment (recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate   # On Linux/macOS
  # venv\Scripts\activate    # On Windows
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
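Given the stack above, `requirements.txt` likely includes at least the following packages (the exact list and any version pins may differ; check the file itself):

```
groq
pdfplumber
python-dotenv
```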
- Set up your Groq API key:
  - Sign up at GroqCloud to get an API key if you don't have one.
  - Create a file named `.env` in the root of the `multi_agent_system` directory.
  - Add your Groq API key to it:

    ```
    GROQ_API_KEY="your_groq_api_key_here"
    ```
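`config.py` presumably reads this file at startup (commonly via `python-dotenv`). As a dependency-free illustration of what that loading step does, a minimal `.env` reader looks like this (the project's actual mechanism may differ):

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Minimal .env loader (stand-in for python-dotenv):
    puts KEY=VALUE lines into os.environ, skipping comments."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            # Don't overwrite variables already set in the environment.
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```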
Execute the `main.py` script from the root of the `multi_agent_system` directory:
python main.py