RealtimeAgent over WebSockets

This project demonstrates how to create a voice assistant using Python, FastAPI, WebSockets, and an AG2 RealtimeAgent. The application streams audio from a browser to a FastAPI server and enables real-time voice communication with the RealtimeAgent.

Key Features

WebSocket Audio Streaming: Direct real-time audio streaming between the browser and server.
FastAPI Integration: A lightweight Python backend for handling WebSocket traffic.

Prerequisites

Before you begin, ensure you have the following:

Python 3.9 or later (the project was tested with 3.9).
An OpenAI account and an OpenAI API key.
OpenAI Realtime API access.

Local Setup

Follow these steps to set up the project locally:

1. Clone the Repository

git clone https://github.com/ag2ai/realtime-agent-over-websockets.git
cd realtime-agent-over-websockets

2. Set Up Environment Variables

Create an OAI_CONFIG_LIST file based on the provided OAI_CONFIG_LIST_sample:

cp OAI_CONFIG_LIST_sample OAI_CONFIG_LIST

To use OpenAI Realtime API

In the OAI_CONFIG_LIST file, find the configuration tagged "gpt-4o-mini-realtime" and set its api_key to your OpenAI API key.

To use Gemini Live API

In the OAI_CONFIG_LIST file, find the configuration tagged "gemini-realtime" and set its api_key to your Gemini API key.
In realtime_over_websockets/main.py, change the tag in filter_dict to "gemini-realtime".
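As a rough illustration of how a tag picks one configuration out of the file, the sketch below filters a list of entries by tag using only the standard library. The field names model and tags and the placeholder values are assumptions about the layout of OAI_CONFIG_LIST (only api_key and the two tag strings come from the steps above), and select_by_tag and load_config_list are hypothetical helpers, not the loader the project itself uses.

```python
import json
from pathlib import Path


def load_config_list(path="OAI_CONFIG_LIST"):
    """Read a JSON file that is assumed to hold a list of config dicts."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def select_by_tag(entries, tag):
    """Return the entries whose "tags" list (assumed field name) contains `tag`."""
    return [entry for entry in entries if tag in entry.get("tags", [])]


# Illustrative in-memory data; the real file may contain different fields.
SAMPLE_ENTRIES = [
    {"model": "example-openai-model", "api_key": "sk-placeholder",
     "tags": ["gpt-4o-mini-realtime"]},
    {"model": "example-gemini-model", "api_key": "gemini-placeholder",
     "tags": ["gemini-realtime"]},
]

selected = select_by_tag(SAMPLE_ENTRIES, "gemini-realtime")
print([entry["model"] for entry in selected])
```

Running select_by_tag(load_config_list(), "gpt-4o-mini-realtime") against your own file is a quick way to confirm that exactly one entry carries the tag you intend to use before starting the server.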

(Optional) Create and use a virtual environment

To avoid cluttering your global Python environment, you can create a virtual environment. On your command line, enter:

python3 -m venv env
source env/bin/activate

3. Install Dependencies

Install the required Python packages using pip:

pip install -r requirements.txt

4. Start the Server

Run the application with Uvicorn:

uvicorn realtime_over_websockets.main:app --port 5050

Test the App

With the server running, open the client application in your browser by navigating to http://localhost:5050/start-chat/. Speak into your microphone, and the AI assistant will respond in real time.
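If the page does not load, it can help to confirm that something is accepting connections on port 5050 at all. The following is a minimal sketch using only the standard library; server_is_listening is a hypothetical helper, and it checks TCP reachability only, not that the /start-chat/ route or the agent is working.

```python
import socket


def server_is_listening(host="localhost", port=5050, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if server_is_listening():
        print("Port 5050 is accepting connections.")
    else:
        print("Nothing is listening on port 5050; is Uvicorn running?")
```

A False result usually means the Uvicorn process has not started, has exited, or was started on a different port than the one passed with --port.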

License

This project is licensed under the MIT License.
