A framework consisting of a lightweight Retrieval-Augmented Generation (RAG) system and a flexible benchmark infrastructure for systematic, use-case-specific optimization of RAG systems for individual knowledge management scenarios.
This code supplements my master's thesis, titled:
🇩🇪 "Intelligentes Wissensmanagement in KMU: Konzeption und Evaluation eines lokalen, RAG-basierten Wissensmanagementsystems"
🇬🇧 "Intelligent Knowledge Management in SMEs: Design and Evaluation of a Local, RAG-Based Knowledge Management System"
💡 Abstract
Small and medium-sized enterprises (SMEs) engaged in custom development accumulate substantial domain-specific knowledge that is frequently distributed across heterogeneous internal systems, limiting its accessibility and interpretability. Retrieval-Augmented Generation (RAG) presents a viable approach to address this challenge by enabling structured retrieval of fragmented information and facilitating interactive knowledge exploration. This thesis investigates the critical performance factors of RAG systems designed for knowledge management in resource-constrained SME environments with strict data sovereignty requirements, and proposes methods for their systematic quantification and optimization. A modular framework comprising a baseline RAG system and a flexible benchmarking infrastructure was developed to support use-case-specific, differentiated evaluation. An empirical optimization study conducted within this framework demonstrated statistically significant overall performance gains, with notable improvements in trustworthiness, response groundedness and context precision. The findings confirm that RAG systems can deliver immediate utility under simple configurations using publicly available components, while also establishing that no universally optimal configuration exists; optimization strategies must be evaluated contextually, as their efficacy is not transferable across arbitrary application scenarios or knowledge bases.
This repository contains:
- the RAG module (baseline and final configuration after the optimization study) in /rag
- the customizable benchmark infrastructure in /benchmark
- the notebooks for the evaluation of the benchmark results in /eval

Not included:
- the raw benchmark results from the thesis (they contain internal company data)
🔧 Configuration
- Choose and install language and embedding models locally (see the Ollama docs)
- Install the required packages (see requirements.txt for rag / benchmark / eval)
  - Note: for benchmarking the RAG module, install both rag/requirements.txt and benchmark/requirements.txt
- Prepare Q&A samples for evaluation (use the cleaned data preparation scripts from the thesis or define the samples manually)
- Define a config file for the RAG module (see example)
  - Note: the sources from the example are supported by the loader defined in the thesis (see loader.py). You can add arbitrary data sources as long as you provide your own implementation of the base_loader interface to handle them.
- Configure the benchmark: specify the path to your config and to your Q&A samples in the CONFIG dict of the benchmark pipeline
- Customize metrics: add or remove metrics in the metrics_provider. Use metrics from Ragas or define your own (see custom metrics for custom judge LLMs)
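
To illustrate the note on custom data sources, here is a minimal sketch of what a custom loader could look like. It is not the repository's actual interface: the names BaseLoader, Document, load and TextFolderLoader are assumptions made for this example, so check loader.py and the base_loader interface for the real method names and return types before adapting it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Hypothetical container for one loaded document."""
    text: str
    metadata: dict = field(default_factory=dict)


class BaseLoader(ABC):
    """Hypothetical stand-in for the repository's base_loader interface."""

    @abstractmethod
    def load(self, source: str) -> list[Document]:
        """Return all documents found at the given source."""


class TextFolderLoader(BaseLoader):
    """Example data source: every .txt file below a folder."""

    def load(self, source: str) -> list[Document]:
        docs = []
        # Sort the paths so the document order is reproducible between runs.
        for path in sorted(Path(source).rglob("*.txt")):
            text = path.read_text(encoding="utf-8")
            docs.append(Document(text=text, metadata={"source": str(path)}))
        return docs
```

Keeping the source path in the metadata is a design choice of this sketch. It makes it possible to trace a retrieved chunk back to its origin, which helps when checking answers against their context.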
🧪 Start benchmark
Run pipeline.py to start the benchmark.
- Note: the benchmark initializes the RAG module under test. If the module cannot find an index in the specified directory (see config) during initialization, it creates one automatically.
- Note: when the benchmark has finished, it stores the results as a JSON file in the specified OUTPUT_DIR (see the CONFIG dict of the benchmark pipeline)
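
The index behaviour described in the first note follows a common load-or-create pattern, sketched below. This is an illustration only, not the module's implementation: the file name index.json, the toy dictionary index and both function names are assumptions, and a real RAG index would hold embeddings rather than raw text.

```python
import json
from pathlib import Path


def build_index(documents: list[str]) -> dict:
    """Toy index mapping a document id to its text (placeholder for a vector index)."""
    return {str(i): text for i, text in enumerate(documents)}


def load_or_create_index(index_dir: str, documents: list[str]) -> tuple[dict, bool]:
    """Return (index, created): reuse a stored index if present, else build and persist one."""
    index_file = Path(index_dir) / "index.json"  # hypothetical file name
    if index_file.exists():
        return json.loads(index_file.read_text(encoding="utf-8")), False
    index = build_index(documents)
    index_file.parent.mkdir(parents=True, exist_ok=True)
    index_file.write_text(json.dumps(index), encoding="utf-8")
    return index, True
```

If the RAG module works along these lines, an existing index is presumably reused as-is. After changing the data sources or the embedding model, it is therefore worth pointing the config to a fresh index directory or deleting the old index.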
🔬 Evaluate results
See /eval/base for basic analysis and comparison. Extend for custom analysis. Check eval/optimization for inspiration on deeper analysis from the thesis.
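
As a starting point for a custom comparison, the following sketch computes the per-metric difference between two benchmark runs. The result schema is an assumption made for this example (a JSON list of per-sample records, each with a "metrics" dict), as are the function names. Inspect a real result file from OUTPUT_DIR and adapt the field access accordingly.

```python
import json
from pathlib import Path
from statistics import mean


def load_results(path: str) -> list[dict]:
    """Load one result file; assumed to be a JSON list of per-sample records."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def mean_scores(records: list[dict]) -> dict[str, float]:
    """Average every metric over all samples that report it."""
    names = sorted({name for record in records for name in record["metrics"]})
    return {
        name: mean(r["metrics"][name] for r in records if name in r["metrics"])
        for name in names
    }


def compare(baseline: list[dict], candidate: list[dict]) -> dict[str, float]:
    """Per-metric difference of means (candidate minus baseline) for shared metrics."""
    base, cand = mean_scores(baseline), mean_scores(candidate)
    return {name: cand[name] - base[name] for name in base if name in cand}
```

A positive value means the candidate configuration scored higher on average for that metric. A difference of means says nothing about statistical significance on its own, so treat it as a first look before a proper test.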
🚀 Optimize & Create!
This allows you to optimize and evaluate the RAG module iteratively. Once you have finished optimizing, you can integrate the RAG module into the desired target environment (e.g. as the backend of a web application or as an agent in a multi-agent environment).
I recommend using the following scheme for the iterative optimization:
