🚀 Real-Time Audio Processing API

A FastAPI-based real-time audio processing system for live customer calls that performs speaker diarization, transcription, and sentiment analysis.

Features

Real-time microphone input capture
Speaker diarization (agent vs customer)
Live speech-to-text transcription
Sentiment analysis (both text and voice)
Voice feature extraction (pitch, energy, speaking rate)
WebSocket-based real-time output streaming

Project Structure

.
├── api/              # FastAPI routes and WebSocket handlers
├── capture/          # Audio capture module
├── diarization/      # Speaker diarization module
├── transcription/    # Speech-to-text module
├── sentiment/        # Sentiment analysis module
├── tests/            # Unit tests
├── main.py           # FastAPI application entry point
├── requirements.txt  # Project dependencies
└── README.md         # This file

Setup

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Run the application:

uvicorn main:app --reload

API Endpoints

POST /api/start: Start the audio processing pipeline
POST /api/stop: Stop the audio processing pipeline
GET /api/status: Get current pipeline status
WS /ws: WebSocket endpoint for real-time output
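
The HTTP endpoints above can be exercised from a small client. The following is a minimal sketch using only the standard library, not the project's own client. It assumes the server is running on uvicorn's default address (127.0.0.1:8000) and that each endpoint replies with a JSON body; the response schema is not documented here, so the sketch simply prints whatever comes back. The helper names `build_request`, `call`, and `demo` are hypothetical.

```python
import json
import urllib.request

# Assumption: uvicorn's default host and port; adjust if the server runs elsewhere.
BASE_URL = "http://127.0.0.1:8000"


def build_request(method: str, path: str) -> urllib.request.Request:
    """Build a request for one of the documented endpoints."""
    # POST requests are sent with an empty body; GET requests carry no body.
    body = b"" if method == "POST" else None
    return urllib.request.Request(BASE_URL + path, data=body, method=method)


def call(method: str, path: str) -> dict:
    """Send the request and decode the reply, assuming the server returns JSON."""
    with urllib.request.urlopen(build_request(method, path), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo() -> None:
    """Start the pipeline, check its status, then stop it."""
    print(call("POST", "/api/start"))
    print(call("GET", "/api/status"))
    print(call("POST", "/api/stop"))
```

Call `demo()` once the application is running. The WS /ws endpoint needs a WebSocket client and is not covered by this sketch.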

Testing

Run the test suite:

pytest

Real-Time Output Format

The system streams results in real time in the following format:

12:01:23 | SPEAKER_00 | Hello, how can I help you today?
⚠️ [12:01:23] NATURAL (score=0.92)
   Text: NATURAL, Voice: NATURAL
   Voice features - Pitch: 185.23, Energy: 0.12, Rate: 0.08
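
A consumer of this output may want the lines as structured data. The sketch below parses the utterance line and the voice-features line with regular expressions. The patterns are inferred solely from the sample above, so they are an assumption about the format rather than a specification, and the names `Utterance`, `parse_utterance`, and `parse_voice_features` are hypothetical.

```python
import re
from typing import Dict, NamedTuple, Optional


class Utterance(NamedTuple):
    time: str
    speaker: str
    text: str


# Matches lines such as: 12:01:23 | SPEAKER_00 | Hello, how can I help you today?
UTTERANCE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \| (\S+) \| (.*)$")

# Matches the name/value pairs in: Voice features - Pitch: 185.23, Energy: 0.12, Rate: 0.08
FEATURE_RE = re.compile(r"(Pitch|Energy|Rate): (-?\d+(?:\.\d+)?)")


def parse_utterance(line: str) -> Optional[Utterance]:
    """Return the timestamp, speaker label, and text, or None if the line does not match."""
    match = UTTERANCE_RE.match(line.strip())
    return Utterance(*match.groups()) if match else None


def parse_voice_features(line: str) -> Dict[str, float]:
    """Return the voice features found in the line, keyed by lowercase name."""
    return {name.lower(): float(value) for name, value in FEATURE_RE.findall(line)}
```

For example, `parse_utterance` applied to the first sample line yields the time `12:01:23`, the speaker `SPEAKER_00`, and the spoken text, while lines that are not utterances return `None`.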

Requirements

Python 3.8+
Working microphone
Sufficient CPU/GPU for real-time processing
Internet connection for model downloads

License

MIT
