🧠 Local Edge Voice Agent (Project Jarvis)

An offline, privacy-preserving Voice User Interface (VUI) that processes speech, logic, and intelligence entirely on local hardware.

Unlike cloud-based assistants (Alexa/Siri), this project demonstrates an Edge AI architecture: it uses OpenAI Whisper for speech recognition and Meta's Llama 3.2 for general intelligence, both running without an internet connection. A reactive GUI visualizes the agent's state (Sleeping, Listening, Thinking) in real time.

📸 Project Demo

🚀 Key Features

🔒 100% Offline Privacy: All audio processing and inference happen locally on the CPU.
🧠 Local LLM Brain: Integrated with Ollama (Llama 3.2) to handle complex, open-ended conversations.
⚡ Multi-Threaded Architecture: Decouples the GUI (Main Thread) from the Inference Engine (Worker Thread) so the interface does not freeze during heavy computation.
🗣️ Continuous Conversation: Implements a "Wake-and-Sustain" logic loop that listens for commands and auto-sleeps after 5 seconds of inactivity.
🎨 Dynamic GUI: Event-driven Tkinter interface that swaps visual assets based on the agent's internal state machine (Wake/Sleep/Think).
💻 OS Automation: Executes system commands (opening apps, opening YouTube, reporting the system time) via direct OS subprocess calls.
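
The GUI/worker split described under Multi-Threaded Architecture can be sketched with two queues: one carrying commands to the worker, one carrying state updates back. This is a minimal illustration, not the project's actual code; `inference_worker`, `drain_updates`, and `demo` are hypothetical names, and a short `time.sleep` stands in for Whisper and LLM latency.

```python
import queue
import threading
import time


def inference_worker(commands, updates):
    """Worker thread: do the slow work and report state changes via a queue."""
    while True:
        text = commands.get()
        if text is None:  # sentinel value: shut the worker down
            break
        updates.put(("state", "Thinking"))
        time.sleep(0.05)  # stand-in for Whisper / LLM latency (assumption)
        updates.put(("reply", f"echo: {text}"))
        updates.put(("state", "Listening"))


def drain_updates(updates):
    """Main-thread side: collect all pending messages without blocking."""
    drained = []
    while True:
        try:
            drained.append(updates.get_nowait())
        except queue.Empty:
            return drained


def demo():
    """Run one command through the worker and return the messages it produced."""
    commands, updates = queue.Queue(), queue.Queue()
    worker = threading.Thread(
        target=inference_worker, args=(commands, updates), daemon=True
    )
    worker.start()
    commands.put("what time is it")
    commands.put(None)
    worker.join()
    return drain_updates(updates)


if __name__ == "__main__":
    for message in demo():
        print(message)
```

In a Tkinter application, `drain_updates` would typically be called from a callback scheduled with `root.after(...)`, so that only the main thread touches widgets and swaps the state images.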

🏗️ System Architecture

The system follows a standard Wake-Listen-Think-Act pipeline optimized for consumer hardware.
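
The wake, sustain, and auto-sleep behaviour of this pipeline can be modelled as a small state machine. The sketch below is an assumption about how such a loop might be structured, not the repository's implementation: `WakeSustainSession` and its methods are hypothetical, and the clock is injectable so the 5-second timeout can be exercised without waiting.

```python
import time

SLEEPING, LISTENING, THINKING = "Sleeping", "Listening", "Thinking"


class WakeSustainSession:
    """Tracks agent state and auto-sleeps after a period of inactivity."""

    def __init__(self, wake_word="jarvis", timeout_s=5.0, clock=time.monotonic):
        self.wake_word = wake_word
        self.timeout_s = timeout_s
        self.clock = clock  # injectable for testing
        self.state = SLEEPING
        self.last_activity = clock()

    def tick(self):
        """Fall back to Sleeping once the listening window has expired."""
        idle = self.clock() - self.last_activity
        if self.state == LISTENING and idle >= self.timeout_s:
            self.state = SLEEPING
        return self.state

    def hear(self, text):
        """Feed one transcript; return the command to act on, or None."""
        self.tick()
        words = text.lower().strip()
        if self.state == SLEEPING:
            # While asleep, only the wake word has any effect.
            if self.wake_word in words:
                self.state = LISTENING
                self.last_activity = self.clock()
            return None
        if not words:
            return None
        self.state = THINKING
        return words

    def done(self):
        """Call after the reply is delivered: stay awake for follow-ups."""
        self.state = LISTENING
        self.last_activity = self.clock()
```

A driver loop would call `hear()` with each transcript, act on any returned command, call `done()` afterwards, and call `tick()` periodically so the GUI can switch back to the sleeping image.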

Data Flow Breakdown:

Input Stage: The SpeechRecognition library monitors the audio stream for the wake word ("Jarvis").
ASR Stage: The OpenAI Whisper (base) model converts the audio tensor to text.
Router:
- Deterministic Path: Simple commands (e.g., "Open Notepad") trigger Python functions immediately.
- Probabilistic Path: Complex queries (e.g., "Why is the sky blue?") are sent to the local LLM.
Output Stage: The Windows SAPI5 engine is invoked through a PowerShell subprocess for non-blocking Text-to-Speech.
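
The Router step can be illustrated with a phrase table checked before falling back to the LLM. This is a sketch under assumptions: the matching rule (substring match on lowercased text), the `COMMANDS` table, and the handler names are all hypothetical, and `ask_llm` is passed in as a callable so the sketch does not depend on any particular LLM client.

```python
import datetime
import subprocess


def tell_time():
    """Deterministic handler: report the current system time."""
    return datetime.datetime.now().strftime("It is %H:%M.")


def open_notepad():
    """Deterministic handler: launch Notepad (Windows only)."""
    subprocess.Popen(["notepad.exe"])
    return "Opening Notepad."


# Hypothetical phrase table; the real project may match commands differently.
COMMANDS = {
    "open notepad": open_notepad,
    "what time": tell_time,
}


def route(text, ask_llm, commands=COMMANDS):
    """Return (path, reply): a fixed handler if a phrase matches, else the LLM."""
    normalized = text.lower().strip()
    for phrase, handler in commands.items():
        if phrase in normalized:
            return "deterministic", handler()
    return "llm", ask_llm(text)
```

Checking the table first keeps simple commands fast and predictable, and only unmatched queries pay the latency of local LLM inference.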

🛠️ Installation & Setup

1. Prerequisites

Python 3.10+
FFmpeg (Must be installed and added to System PATH).
Ollama: Download from ollama.com and install.
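
A quick way to confirm these prerequisites is a small script that checks the Python version and looks for the executables on PATH. This is an optional helper sketch, not part of the repository; `check_prerequisites` is a hypothetical name, and it assumes the tools are installed under the command names `ffmpeg` and `ollama`.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("ffmpeg", "ollama")):
    """Return a dict mapping each prerequisite to True (found) or False."""
    report = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```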

2. Set Up the Brain

Open your terminal and download the lightweight Llama 3.2 model (`ollama run` pulls the model on first use, then opens an interactive session):

ollama run llama3.2
# Type /bye to exit once it loads
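
Once the model is available, it can also be queried from Python. The sketch below uses Ollama's local REST endpoint (`/api/generate`) with only the standard library; it assumes Ollama is running on its default port 11434, and it is not necessarily how `robot_ultimate.py` talks to the model. `build_request` and `ask_llm` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.2"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_llm(prompt, model="llama3.2", timeout=120):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


if __name__ == "__main__":
    print(ask_llm("Why is the sky blue? Answer in one sentence."))
```

Setting `"stream": False` returns the whole answer in a single JSON object, which is simpler to hand to a Text-to-Speech step than a token stream.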

Repository: prasanta10/Edge-Voice-Agent
