KnowledgeVault

A self-hosted web service for ingesting thousands of technical documents and interacting with them through natural language chat powered by RAG (Retrieval-Augmented Generation).

Overview

KnowledgeVault enables you to:

Upload and index documents in formats: docx, xlsx, pptx, pdf, csv, sql, txt, and code files
Chat with your documents using AI-powered RAG responses
Store and retrieve memories for persistent knowledge across sessions
Search your knowledge base with semantic similarity
Self-host everything on your own infrastructure with local LLMs

Features

Feature	Description
Multi-Format Support	Process Word, Excel, PowerPoint, PDF, CSV, SQL, and text documents
Semantic Chunking	Structure-aware document processing preserves tables and code blocks
Vector Search	LanceDB-powered semantic search with relevance scoring
Memory System	SQLite FTS5-backed memory storage with natural language retrieval
Streaming Chat	Real-time AI responses with source citations
File Watcher	Automatic detection and processing of new documents
Email Ingestion	Ingest documents via email with IMAP polling and vault routing
Web UI	Modern React interface with Material 3 design
API Access	Full REST API with OpenAPI documentation

Architecture

Development Architecture

+------------------+     +------------------+     +------------------+
|   React Frontend |---->|  FastAPI Backend |---->|   LanceDB Vector |
|   (Port 5173*)   |     |   (Port 8080)    |     |   Store          |
+------------------+     +------------------+     +------------------+
                               |                           |
                               |                    +------v------+
                               |                    |  SQLite     |
                               |                    |  Memories   |
                               |                    +-------------+
                               |
                        +------v---------------------------+
                        |  Ollama (External)               |
                        |  - Embeddings (bge-m3)           |
                        |  - Chat (your choice of model)   |
                        +----------------------------------+

*Port 5173 is for development only. Production access is via port 8080.

Production Architecture

In production, the frontend is built as static files and served directly by the backend container on port 8080. Both the React frontend and FastAPI backend are accessible through a single port (8080).

Technology Stack

Component	Technology
Frontend	React 18, TypeScript, Vite, shadcn/ui, Tailwind CSS, assistant-ui
Backend	Python 3.11, FastAPI, Pydantic
Vector DB	LanceDB (embedded)
Memory DB	SQLite with FTS5
Document Processing	Unstructured.io
LLM Integration	Ollama API (OpenAI-compatible)
Deployment	Docker Compose

Quick Start

Prerequisites

Docker and Docker Compose installed
Ollama installed and running (see Ollama Setup below)
At least 8GB RAM (16GB+ recommended; larger chat models such as qwen2.5:32b need about 22GB, see Chat Models below)

1. Clone and Configure

git clone <repository-url>
cd RAGAPPv2
cp .env.example .env

Edit .env to match your setup:

# Required: Set your data directory
HOST_DATA_DIR=/path/to/your/data

# Optional: Change default models
CHAT_MODEL=llama3.2:latest

2. Start Ollama

Ensure Ollama is running on your host machine:

# macOS/Linux
ollama serve

# Windows (Ollama runs as a service by default)
# Verify with:
ollama list

3. Pull Required Models

# Required: Embedding model
ollama pull bge-m3

# Required: Chat model (choose one)
ollama pull qwen2.5:32b    # Recommended for technical content
ollama pull llama3.2:latest # Lighter alternative

4. Start KnowledgeVault

docker compose up -d

5. Access the Application

Open your browser to: http://localhost:8080

Environment Setup

Environment Variables

Variable	Default	Description
`PORT`	8080	Web server port
`HOST_DATA_DIR`	./data	Host path for data persistence
`DATA_DIR`	/app/data	Container data path
`OLLAMA_EMBEDDING_URL`	http://host.docker.internal:11434	Ollama embedding endpoint
`OLLAMA_CHAT_URL`	http://host.docker.internal:11434	Ollama chat endpoint
`EMBEDDING_MODEL`	bge-m3	Embedding model name
`CHAT_MODEL`	qwen2.5:32b	Chat model name
`CHUNK_SIZE`	512	Document chunk size (tokens)
`CHUNK_OVERLAP`	50	Chunk overlap (tokens)
`MAX_CONTEXT_CHUNKS`	10	Max chunks in RAG context
`RAG_RELEVANCE_THRESHOLD`	0.1	Minimum relevance score (0.0-1.0)
`LOG_LEVEL`	INFO	Logging level
`AUTO_SCAN_ENABLED`	true	Enable auto-scanning
`AUTO_SCAN_INTERVAL_MINUTES`	60	Scan interval
`IMAP_ENABLED`	false	Enable email ingestion
`IMAP_HOST`	-	IMAP server hostname
`IMAP_PORT`	993	IMAP server port (993 for SSL, 143 for non-SSL)
`IMAP_USE_SSL`	true	Use SSL/TLS for IMAP connection
`IMAP_USERNAME`	-	IMAP account username
`IMAP_PASSWORD`	-	IMAP account password
`IMAP_POLL_INTERVAL`	60	Email poll interval (seconds)
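
As an illustration of how these variables and their defaults fit together, here is a minimal Python sketch that reads a subset of them from the environment. The `Settings` class and `load_settings` function are hypothetical names for this example; the actual backend uses Pydantic and may load and validate its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical subset of the documented environment variables."""
    port: int
    embedding_model: str
    chat_model: str
    chunk_size: int
    chunk_overlap: int
    max_context_chunks: int
    rag_relevance_threshold: float
    auto_scan_enabled: bool


def _as_bool(raw: str) -> bool:
    """Treat common truthy spellings as True, everything else as False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, falling back to the defaults in the table."""
    env = os.environ if env is None else env
    settings = Settings(
        port=int(env.get("PORT", "8080")),
        embedding_model=env.get("EMBEDDING_MODEL", "bge-m3"),
        chat_model=env.get("CHAT_MODEL", "qwen2.5:32b"),
        chunk_size=int(env.get("CHUNK_SIZE", "512")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "50")),
        max_context_chunks=int(env.get("MAX_CONTEXT_CHUNKS", "10")),
        rag_relevance_threshold=float(env.get("RAG_RELEVANCE_THRESHOLD", "0.1")),
        auto_scan_enabled=_as_bool(env.get("AUTO_SCAN_ENABLED", "true")),
    )
    # Sanity checks chosen for this sketch, not taken from the backend.
    if not 0.0 <= settings.rag_relevance_threshold <= 1.0:
        raise ValueError("RAG_RELEVANCE_THRESHOLD must be between 0.0 and 1.0")
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    return settings
```

Calling `load_settings()` with no argument reads `os.environ`; passing a plain dict makes the function easy to exercise in tests.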

Data Directory Structure

data/
├── knowledgevault/       # Root data directory
│   ├── uploads/          # [LEGACY] Legacy flat uploads directory (deprecated)
│   ├── vaults/           # Vault-specific data directories
│   │   ├── 1/            # Vault 1 (default/orphan vault)
│   │   │   └── uploads/  # Uploads for vault 1
│   │   ├── 2/            # Vault 2
│   │   │   └── uploads/  # Uploads for vault 2
│   │   └── ...           # Additional vaults
│   ├── documents/        # Documents (legacy, kept for compatibility)
│   ├── library/          # Library files
│   ├── lancedb/          # Vector database
│   │   └── chunks.lance/
│   ├── app.db            # SQLite database
│   └── logs/
│       └── app.log

Note: Uploads are stored in vault-specific directories (/data/knowledgevault/vaults/{vault_id}/uploads/). On first startup, the system automatically migrates files from the legacy flat uploads/ directory to the appropriate vault-specific directories. Migrated files are renamed with a .migrated suffix so that a safe backup remains. A file that cannot be associated with a specific vault goes to the orphan vault (vault 1).
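
The migration described in the note can be pictured with the following Python sketch. It is not the backend's code: `migrate_legacy_uploads` and the `vault_for` lookup are hypothetical, and the sketch assumes that the file is copied into the vault directory while the legacy copy is the one that receives the .migrated suffix.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def migrate_legacy_uploads(
    root: Path,
    vault_for: Callable[[str], int] = lambda filename: 1,
) -> List[Path]:
    """Copy files from root/uploads into root/vaults/<id>/uploads.

    vault_for maps a file name to a vault ID; the default sends everything
    to the orphan vault (1). Returns the new vault-specific paths.
    """
    legacy = root / "uploads"
    migrated: List[Path] = []
    if not legacy.is_dir():
        return migrated
    for src in sorted(legacy.iterdir()):
        # Skip subdirectories and files already handled on an earlier run.
        if not src.is_file() or src.name.endswith(".migrated"):
            continue
        target_dir = root / "vaults" / str(vault_for(src.name)) / "uploads"
        target_dir.mkdir(parents=True, exist_ok=True)
        dest = target_dir / src.name
        shutil.copy2(src, dest)
        # Keep the legacy copy as a backup under a new name.
        src.rename(src.with_name(src.name + ".migrated"))
        migrated.append(dest)
    return migrated
```

Because renamed files are skipped, running the function a second time does nothing, which is the behavior one would want from a step that executes on startup.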

Ollama Models

Recommended Models

Embedding Model

bge-m3 (Required)

1024 dimensions
8192 token context
~0.5GB VRAM
Excellent for technical content

ollama pull bge-m3

Chat Models

Model	Size	RAM	Speed	Best For
qwen2.5:32b	32B	~22GB	~15 tok/s	Technical reasoning
qwen2.5:72b	72B	~45GB	~10 tok/s	Complex analysis
llama3.2:latest	3B	~4GB	~30 tok/s	General use, fast
mistral:latest	7B	~8GB	~25 tok/s	Balanced performance

# Pull your preferred chat model
ollama pull qwen2.5:32b

Verifying Ollama Connection

# Test Ollama is running
curl http://localhost:11434/api/tags

# Test embedding model
curl http://localhost:11434/api/embeddings -d '{
  "model": "bge-m3",
  "prompt": "test"
}'
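
The same embedding check can be scripted. The sketch below mirrors the curl call above using only the Python standard library; the function names are hypothetical, and it assumes Ollama listens on localhost:11434 and that the endpoint returns a JSON object with an `embedding` field, as Ollama's /api/embeddings endpoint does.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # assumed local Ollama address


def build_embedding_request(
    text: str, model: str = "bge-m3", base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Prepare the POST request that the curl example sends."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, timeout: float = 30.0) -> List[float]:
    """Send the request and return the embedding vector (needs a running Ollama)."""
    with urllib.request.urlopen(build_embedding_request(text), timeout=timeout) as resp:
        return json.load(resp)["embedding"]
```

With Ollama running and bge-m3 pulled, `len(embed("test"))` reports the dimension of the returned vector.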

Troubleshooting

Container Won't Start

Problem: docker compose up fails

Solutions:

# Check Docker is running
docker info

# Check port availability
lsof -i :8080  # macOS/Linux
netstat -ano | findstr :8080  # Windows

# View logs
docker compose logs knowledgevault

LLM Unavailable Error

Problem: Health check shows "LLM unavailable"

Solutions:

Verify Ollama is running: ollama list
Check Ollama URL in .env matches your setup
For Linux, use host IP instead of host.docker.internal:
```
OLLAMA_CHAT_URL=http://192.168.1.100:11434
```

Documents Not Processing

Problem: Uploaded files stay in "pending" status

Solutions:

Check logs: docker compose logs -f knowledgevault
Verify file format is supported
Check disk space in data directory
Restart container: docker compose restart

Out of Memory

Problem: Container crashes during document processing

Solutions:

Reduce CHUNK_SIZE in .env (e.g., 256)
Process fewer files at once
Increase Docker memory limit
Use a smaller chat model
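
To see what changing CHUNK_SIZE does, the following sketch splits a token list into overlapping windows. It is a simplified stand-in: KnowledgeVault's chunker is structure-aware, whereas this hypothetical `chunk_tokens` function uses a plain sliding window, and the "tokens" here are just list items rather than model tokens.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 512, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


tokens = [str(i) for i in range(1000)]
print(len(chunk_tokens(tokens)))                  # 3 chunks at 512/50
print(len(chunk_tokens(tokens, chunk_size=256)))  # 5 chunks at 256/50
```

A smaller chunk size yields more, shorter chunks, so each individual embedding request carries less text.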

Slow Responses

Problem: Chat responses are very slow

Solutions:

Use a smaller/faster chat model
Reduce MAX_CONTEXT_CHUNKS in .env
Increase RAG_RELEVANCE_THRESHOLD to filter more chunks
Ensure Ollama has GPU access if available
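
The effect of MAX_CONTEXT_CHUNKS and RAG_RELEVANCE_THRESHOLD on prompt size can be sketched as a filter followed by a cap. The `ScoredChunk` type and `select_context` function are hypothetical, and the sketch assumes relevance scores in the 0.0-1.0 range where higher means more relevant.

```python
from typing import Iterable, List, NamedTuple


class ScoredChunk(NamedTuple):
    text: str
    score: float  # assumed relevance in [0.0, 1.0], higher is better


def select_context(
    chunks: Iterable[ScoredChunk],
    threshold: float = 0.1,
    max_chunks: int = 10,
) -> List[ScoredChunk]:
    """Drop chunks below the threshold, then keep the best max_chunks."""
    kept = [c for c in chunks if c.score >= threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:max_chunks]
```

Raising the threshold or lowering the cap both shrink the context passed to the chat model, which means fewer tokens for it to process per request.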

API Endpoints

Health & Status

Method	Endpoint	Description
GET	`/health`	Service health status

Authentication

Method	Endpoint	Description
POST	`/api/v1/auth/register`	Register new user (first = superadmin)
POST	`/api/v1/auth/login`	Login with username/password
POST	`/api/v1/auth/refresh`	Refresh access token
POST	`/api/v1/auth/logout`	Logout and revoke refresh token
GET	`/api/v1/auth/me`	Get current user profile (includes `must_change_password` flag)
PATCH	`/api/v1/auth/me`	Update current user profile
POST	`/api/v1/auth/change-password`	Change user password (validates strength policy)
GET	`/api/v1/auth/sessions`	List active sessions for current user
DELETE	`/api/v1/auth/sessions/{id}`	Revoke a specific session
DELETE	`/api/v1/auth/sessions`	Revoke all other sessions

Vaults

Method	Endpoint	Description
GET	`/api/v1/vaults`	List all vaults with counts
GET	`/api/v1/vaults/accessible`	List vault IDs user has access to
GET	`/api/v1/vaults/{id}`	Get vault details
POST	`/api/v1/vaults`	Create new vault
PUT	`/api/v1/vaults/{id}`	Update vault
DELETE	`/api/v1/vaults/{id}`	Delete vault

Chat

Method	Endpoint	Description
POST	`/api/v1/chat`	Non-streaming chat (requires vault_id with read access)
POST	`/api/v1/chat/stream`	Streaming chat (SSE, requires vault_id with read access)
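
A client consuming the streaming endpoint has to split the SSE response into events. The sketch below parses `data:` fields following the general Server-Sent Events format; the function name is hypothetical, and what each payload contains (plain text, JSON, source citations) is not specified here, so the payloads are returned as raw strings.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # One leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.


stream = ["data: Hello\n", "\n", ": keep-alive\n", "data: wor\n", "data: ld\n", "\n"]
print(list(iter_sse_data(stream)))  # ['Hello', 'wor\nld']
```

Any iterable of decoded lines works as input, for example the line iterator of an HTTP response opened against /api/v1/chat/stream with a valid access token and vault_id.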

Documents

Method	Endpoint	Description
GET	`/api/v1/documents`	List all documents
GET	`/api/v1/documents/stats`	Document statistics
POST	`/api/v1/documents/upload`	Upload file(s)
POST	`/api/v1/documents/scan`	Trigger directory scan
DELETE	`/api/v1/documents/{id}`	Delete document

Search

Method	Endpoint	Description
POST	`/api/v1/search`	Semantic search
POST	`/api/v1/search/chunks`	Search document chunks
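
Conceptually, semantic search ranks stored vectors by their similarity to the query vector. The sketch below shows that idea with cosine similarity in plain Python. It is an illustration only: in KnowledgeVault the search is performed by LanceDB, the distance metric it is configured with is not stated here, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    indexed: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

For example, with vectors `{"a": [1, 0], "b": [0, 1], "c": [1, 1]}` and query `[1, 0]`, the ranking is a, then c, then b.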

Memories

Method	Endpoint	Description
GET	`/api/v1/memories`	List all memories
GET	`/api/v1/memories/search`	Search memories
POST	`/api/v1/memories`	Create memory
PUT	`/api/v1/memories/{id}`	Update memory
DELETE	`/api/v1/memories/{id}`	Delete memory
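
Behind these endpoints, memories are stored in SQLite with FTS5. The sketch below shows how such a full-text store can work using Python's built-in sqlite3 module. The table name, its single `content` column, and the function names are assumptions made for this example and are not the schema of app.db; it also requires a SQLite build with FTS5 enabled.

```python
import sqlite3
from typing import List, Tuple


def open_memory_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and make sure the hypothetical FTS5 table exists."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memories USING fts5(content)")
    return conn


def add_memory(conn: sqlite3.Connection, content: str) -> None:
    """Store one memory as a row in the full-text index."""
    conn.execute("INSERT INTO memories (content) VALUES (?)", (content,))
    conn.commit()


def search_memories(conn: sqlite3.Connection, query: str, limit: int = 5) -> List[Tuple[int, str]]:
    """Return (rowid, content) pairs matching the FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT rowid, content FROM memories WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


conn = open_memory_store()
add_memory(conn, "The staging database runs PostgreSQL 15")
add_memory(conn, "Deploys happen on Fridays")
print(search_memories(conn, "postgresql"))
```

The query string is interpreted with FTS5 query syntax, so free-form user input containing quotes or operators would need escaping before being passed to MATCH.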

Settings

Method	Endpoint	Description
GET	`/api/v1/settings`	Get settings
PUT	`/api/v1/settings`	Update settings

API Documentation

Interactive API docs available at: http://localhost:8080/docs

OpenAPI schema: http://localhost:8080/openapi.json

Frontend Usage

Navigation

The web interface uses a navigation rail with five sections:

Chat - Ask questions about your documents using the streaming AI interface
Search - Find specific content in your knowledge base
Documents - Upload and manage documents
Memory - View and manage stored memories
Settings - Configure application settings (includes password change for forced updates)

Authentication Flow

The frontend implements secure route protection with the following features:

JWT Authentication: Access tokens with automatic refresh on expiration
Password Policy Enforcement: Users with must_change_password=true are automatically redirected to change their password
Role-Based Access: Admin-only routes are protected via AdminRoute component
Session Management: Users can view and revoke active sessions from Settings

Chat Interface

Type your question in the input field
Press Enter or click Send
Watch the AI response stream in real-time (powered by SSE)
Click "Sources" to see which documents were referenced
Say "Remember that..." to save information to memory

Streaming: The chat interface uses Server-Sent Events (SSE) for real-time response streaming with automatic token refresh on 401 errors.
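
The refresh-on-401 behavior can be summarized as "try once, refresh, try once more". The real client is the TypeScript code in frontend/src/lib/api.ts; the Python sketch below only illustrates the control flow, with hypothetical `send` and `refresh` callables standing in for the HTTP request and the call to /api/v1/auth/refresh.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def request_with_refresh(
    send: Callable[[str], Response],
    refresh: Callable[[], str],
    token: str,
) -> Tuple[int, str, str]:
    """Send a request; on 401, refresh the token and retry exactly once.

    Returns (status, body, token_in_use) so the caller can store a new token.
    """
    status, body = send(token)
    if status != 401:
        return status, body, token
    new_token = refresh()
    status, body = send(new_token)
    # A second 401 is returned as-is; the caller should then log the user out.
    return status, body, new_token
```

Limiting the retry to a single attempt avoids an endless loop when the refresh token itself has expired or been revoked.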

Document Upload

Method 1: Web Upload

Go to Documents page
Click "Upload" or drag files onto the drop zone
Files are automatically processed and indexed

Method 2: Direct File Placement

Place files in data/knowledgevault/vaults/{vault_id}/uploads/ (e.g., data/knowledgevault/vaults/1/uploads/)
Click "Scan Directory" on Documents page
Or wait for auto-scan (if enabled)

Search

Go to Search page
Enter search query
Use filters to narrow results:
- File type
- Date range
- Relevance threshold
Click results to view source context

Memory Management

Go to Memory page to view all memories
Use search to find specific memories
Click edit icon to modify
Click delete icon to remove
Memories are automatically used in chat context

Development

Backend Development

# Run with hot-reload (includes frontend dev service)
docker compose -f docker-compose.yml -f docker-compose.override.yml up -d

# View logs
docker compose logs -f backend

# Run tests
docker compose exec backend pytest tests/

Frontend Development

cd frontend
npm install
npm run dev

Frontend API Library

The frontend includes a streaming-capable API client (frontend/src/lib/api.ts):

apiRequest<T>(method, path, body?) - Standard REST requests with JWT auth
apiStream(path, body, callbacks) - SSE streaming with automatic token refresh

The auth store (frontend/src/stores/authStore.ts) tracks:

User session state
must_change_password flag from /api/v1/auth/me endpoint
Password change enforcement via useRequirePasswordChange hook

Building Production Images

docker compose -f docker-compose.yml build
docker compose -f docker-compose.yml up -d

Documentation

Feature Guides

Email Ingestion - Ingest documents via email with IMAP polling and automatic vault routing

Administration

Admin Guide - Administrative tasks and configuration
Release Process - Deployment and release procedures
Non-Technical Setup - Setup guide for non-technical users

License

No license file is present in the repository. Add a LICENSE file or update this section as needed.

Support

Documentation: See docs/ directory
Issues: Create an issue in the repository
Admin Guide: See docs/admin-guide.md
Non-Technical Setup: See docs/non-technical-setup.md
