TubeSage is an LLM-based app that understands YouTube videos and answers questions about them.
- YouTube Video Transcription Tool: Converts video content into text.
- Text Embedding: Uses the Ollama server with Llama3.1 to embed the video transcription.
- Vector Storage: Stores the embedded vectors in the Chroma vector database.
- Q&A System: Utilizes LangChain with a custom prompt and RAG injection from the Chroma DB to answer questions about the video.
- Web Interface: Provides a Streamlit chat UI.
- Deployment: Dockerizes all components (TubeSage API, Ollama server for the LLM, and Streamlit UI) and runs them with Docker Compose. Use the provided Dockerfiles for deployment.
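Conceptually, the pipeline behind these features can be sketched end to end. The snippet below is a self-contained illustration, not TubeSage's actual code: the toy bag-of-words `embed` function stands in for the Llama3.1 embeddings served by Ollama, and `ToyVectorStore` stands in for Chroma; the real app wires these stages together through LangChain.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (stand-in for Llama3.1 embeddings via Ollama)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    """In-memory stand-in for the Chroma vector database."""
    def __init__(self):
        self._docs = []  # list of (embedding, chunk) pairs

    def add(self, chunks):
        self._docs.extend((embed(chunk), chunk) for chunk in chunks)

    def retrieve(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(self._docs, key=lambda doc: cosine(query_vec, doc[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

# Transcribe -> embed -> store -> retrieve -> prompt (RAG injection)
store = ToyVectorStore()
store.add([
    "the video explains docker compose deployment",
    "the speaker demos a streamlit chat ui",
])
context = store.retrieve("what ui does the demo use", k=1)[0]
prompt = f"Answer using this transcript excerpt:\n{context}\n\nQuestion: what ui does the demo use"
```

The retrieved chunk with the highest similarity to the question is injected into the prompt, which is what grounds the LLM's answer in the video's transcript.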
To get started with TubeSage, you will need:

- NVIDIA GPU: Required for LLM inference with Ollama.
- Docker & Docker Compose: Ensure both Docker and Docker Compose are installed on your system.
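As a quick sanity check, you can verify that the required binaries are on your PATH before bringing the stack up. This helper is illustrative, not part of TubeSage; note that `nvidia-smi` is only present on machines with the NVIDIA driver installed.

```python
import shutil

def check_prereqs(tools=("docker", "nvidia-smi")):
    """Map each required tool name to whether it is found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}

if __name__ == "__main__":
    for tool, found in check_prereqs().items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
```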
1. Clone the Repository

   ```bash
   git clone https://github.com/guscl/tubesage.git
   cd tubesage
   ```
2. Install Dependencies

   Ensure the NVIDIA Container Toolkit is installed so containers can access the GPU for CUDA support (see the NVIDIA Installation Guide).
3. Run Docker Compose

   Start all services:

   ```bash
   docker compose up
   ```
4. Access the Application

   Open your web browser and navigate to http://localhost:8501 to interact with the TubeSage interface.
5. Transcribe a Video

   Enter a YouTube video URL.
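Transcripts typically come back as many short timed segments, which are merged into larger chunks before embedding. The sketch below assumes youtube-transcript-api-style segments (dicts with `text`, `start`, and `duration` keys); TubeSage's actual transcription wrapper may differ.

```python
def chunk_transcript(segments, max_chars=200):
    """Merge short transcript segments into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for seg in segments:
        text = seg["text"].strip()
        if current and len(current) + len(text) + 1 > max_chars:
            chunks.append(current)
            current = text
        else:
            current = f"{current} {text}".strip()
    if current:
        chunks.append(current)
    return chunks

segments = [
    {"text": "welcome to the video", "start": 0.0, "duration": 2.1},
    {"text": "today we cover docker", "start": 2.1, "duration": 2.4},
]
chunks = chunk_transcript(segments, max_chars=25)
```

Larger chunks give the embedding model more context per vector, at the cost of coarser retrieval granularity.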
6. Ask Questions

   Type questions in the textbox at the bottom of the page. TubeSage will answer based on the transcribed and embedded text.
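Under the hood, answering combines the retrieved transcript chunks with your question through a custom prompt. A minimal sketch of that assembly follows; the template wording here is illustrative, not TubeSage's actual prompt.

```python
# Illustrative RAG prompt template; TubeSage's real prompt wording may differ.
TEMPLATE = (
    "You are answering questions about a YouTube video.\n"
    "Use only the transcript excerpts below; say so if they are insufficient.\n\n"
    "Excerpts:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def build_prompt(chunks, question):
    """Join retrieved transcript chunks and inject them into the template."""
    return TEMPLATE.format(context="\n---\n".join(chunks), question=question)

prompt = build_prompt(["the demo uses streamlit"], "What UI does the demo use?")
```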
Feel free to submit issues or pull requests. Contributions are welcome!
For questions, please contact correia.gustavol@gmail.com or open an issue on the GitHub repository.
- LangChain: The LangChain API is not very well organized; future experiments with LlamaIndex are planned.
- YouTube Transcription API: The transcription library wraps an undocumented YouTube API, which may become unreliable.
- Ollama: Ollama does not yet feel production-ready; future experimentation with vLLM is planned.
