README.md
Deployments are based on released Docker images by default; check the docker image list for details.
#### Deploy Examples
| Use Case | Docker Compose<br/>Deployment on Xeon | Docker Compose<br/>Deployment on Gaudi | Kubernetes with Manifests | Kubernetes with Helm Charts | Kubernetes with GMC |
| -------- | ------------------------------------- | -------------------------------------- | ------------------------- | --------------------------- | ------------------- |
| ChatQnA |[Xeon Instructions](ChatQnA/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](ChatQnA/docker_compose/intel/hpu/gaudi/README.md)|[ChatQnA with Manifests](ChatQnA/kubernetes/intel/README.md)|[ChatQnA with Helm Charts](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/chatqna/README.md)|[ChatQnA with GMC](ChatQnA/kubernetes/intel/README_gmc.md)|
| CodeGen |[Xeon Instructions](CodeGen/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](CodeGen/docker_compose/intel/hpu/gaudi/README.md)|[CodeGen with Manifests](CodeGen/kubernetes/intel/README.md)|[CodeGen with Helm Charts](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/codegen/README.md)|[CodeGen with GMC](CodeGen/kubernetes/intel/README_gmc.md)|
| CodeTrans |[Xeon Instructions](CodeTrans/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](CodeTrans/docker_compose/intel/hpu/gaudi/README.md)|[CodeTrans with Manifests](CodeTrans/kubernetes/intel/README.md)|[CodeTrans with Helm Charts](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/codetrans/README.md)|[CodeTrans with GMC](CodeTrans/kubernetes/intel/README_gmc.md)|
| DocSum |[Xeon Instructions](DocSum/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](DocSum/docker_compose/intel/hpu/gaudi/README.md)|[DocSum with Manifests](DocSum/kubernetes/intel/README.md)|[DocSum with Helm Charts](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/docsum/README.md)|[DocSum with GMC](DocSum/kubernetes/intel/README_gmc.md)|
| SearchQnA |[Xeon Instructions](SearchQnA/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](SearchQnA/docker_compose/intel/hpu/gaudi/README.md)| Not Supported | Not Supported |[SearchQnA with GMC](SearchQnA/kubernetes/intel/README_gmc.md)|
| FaqGen |[Xeon Instructions](FaqGen/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](FaqGen/docker_compose/intel/hpu/gaudi/README.md)|[FaqGen with Manifests](FaqGen/kubernetes/intel/README.md)| Not Supported |[FaqGen with GMC](FaqGen/kubernetes/intel/README_gmc.md)|
| Translation |[Xeon Instructions](Translation/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](Translation/docker_compose/intel/hpu/gaudi/README.md)| Not Supported | Not Supported |[Translation with GMC](Translation/kubernetes/intel/README_gmc.md)|
| AudioQnA |[Xeon Instructions](AudioQnA/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](AudioQnA/docker_compose/intel/hpu/gaudi/README.md)|[AudioQnA with Manifests](AudioQnA/kubernetes/intel/README.md)| Not Supported |[AudioQnA with GMC](AudioQnA/kubernetes/intel/README_gmc.md)|
| VisualQnA |[Xeon Instructions](VisualQnA/docker_compose/intel/cpu/xeon/README.md)|[Gaudi Instructions](VisualQnA/docker_compose/intel/hpu/gaudi/README.md)|[VisualQnA with Manifests](VisualQnA/kubernetes/intel/README.md)| Not Supported |[VisualQnA with GMC](VisualQnA/kubernetes/intel/README_gmc.md)|
| ProductivitySuite |[Xeon Instructions](ProductivitySuite/docker_compose/intel/cpu/xeon/README.md)| Not Supported |[ProductivitySuite with Manifests](ProductivitySuite/kubernetes/intel/README.md)| Not Supported | Not Supported |
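As an illustrative sketch only (the exact environment variables and any extra setup steps come from each use case's linked README, not from this table), a Docker Compose deployment on Xeon typically looks like the following, here using ChatQnA as the example:

```shell
# Hypothetical walkthrough: clone the examples repository and bring up the
# ChatQnA stack on Xeon with Docker Compose. The authoritative variable list
# is in ChatQnA/docker_compose/intel/cpu/xeon/README.md.
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon

# Environment the services expect (values below are placeholders).
export host_ip=$(hostname -I | awk '{print $1}')
export HUGGINGFACEHUB_API_TOKEN="<your-hf-token>"

docker compose up -d   # pull the released images and start the microservices
docker compose ps      # verify that all containers are running
```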
VideoQnA/docker_compose/intel/cpu/xeon/README.md

# Build Mega Service of VideoQnA on Xeon
This document outlines the deployment process for a VideoQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `lvm`. We will publish the Docker images to Docker Hub soon, which will simplify the deployment process for this service.
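The megaservice flow described above can be sketched as a chain of stages. This is only an illustration: each function below is a stub standing in for a GenAIComps microservice that is really reached over HTTP, and the signatures are assumptions, not the actual API.

```python
# Illustrative sketch of the embedding -> retriever -> rerank -> lvm flow.
# Every function here is a stand-in, not a real GenAIComps call.

def embed(query: str) -> list[float]:
    # Stand-in for the `embedding` microservice.
    return [float(ord(c) % 7) for c in query[:4]]

def retrieve(vector: list[float]) -> list[str]:
    # Stand-in for the `retriever` microservice: return candidate documents.
    return ["doc-a", "doc-b", "doc-c"]

def rerank(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    # Stand-in for the `rerank` microservice: keep only the best candidates.
    return docs[:top_k]

def generate(query: str, context: list[str]) -> str:
    # Stand-in for the `lvm` microservice that produces the final answer.
    return f"answer to {query!r} using {len(context)} docs"

def pipeline(query: str) -> str:
    # The megaservice wires the four stages together in sequence.
    return generate(query, rerank(query, retrieve(embed(query))))

print(pipeline("what is shown in the video?"))
```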
> **_NOTE:_** The `Language Translation`, `SearchQnA`, `VisualQnA` and other use cases not listed here are in active development. The code structure of these use cases is subject to change.
### VideoQnA
[VideoQnA](./VideoQnA/README.md) is an example of a chatbot for question answering over videos. It retrieves videos based on the provided user prompt. It uses only the video embeddings to perform vector similarity search in Intel's VDMS vector database and performs all operations on an Intel Xeon CPU. The pipeline supports long-form videos and time-based search.
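The similarity-search idea can be shown in a few lines. This is a toy sketch of cosine-similarity ranking over made-up clip embeddings; a real deployment delegates this to Intel's VDMS vector database, not to Python code like this.

```python
# Toy illustration of the vector-similarity search VideoQnA performs over
# video embeddings. Clip names and embedding values are invented.
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Pretend embeddings for three indexed video clips.
clips = {
    "intro.mp4": [0.9, 0.1, 0.0],
    "demo.mp4":  [0.1, 0.9, 0.2],
    "outro.mp4": [0.0, 0.2, 0.9],
}

def search(query_vec: list[float]) -> str:
    # Return the clip whose embedding is closest to the query embedding.
    return max(clips, key=lambda name: cosine(query_vec, clips[name]))

print(search([0.2, 0.8, 0.1]))  # → demo.mp4
```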
By default, the embedding and LVM models are set to the values listed below:
| Rerank Finetuning |[BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)| Xeon | Rerank model finetuning service |
### InstructionTuning
The Instruction Tuning example is designed to further train large language models (LLMs) on a dataset consisting of (instruction, output) pairs using supervised learning. This process bridges the gap between the LLM's original objective of next-word prediction and the user’s objective of having the model follow human instructions accurately. By leveraging Instruction Tuning, this example enhances the LLM's ability to better understand and execute specific tasks, improving the model's alignment with user instructions and its overall performance.
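The supervised setup above boils down to turning each (instruction, output) pair into one training example. The sketch below illustrates that data preparation step; the prompt template and sample pairs are assumptions for illustration, not the ones this example actually uses.

```python
# Sketch of instruction-tuning data preparation: each (instruction, output)
# pair becomes one supervised training text. The template is illustrative.
pairs = [
    ("Summarize: OPEA provides GenAI examples.", "OPEA offers GenAI examples."),
    ("Translate to French: hello", "bonjour"),
]

def to_training_text(instruction: str, output: str) -> str:
    # Join instruction and expected output into a single training string.
    return f"### Instruction:\n{instruction}\n### Response:\n{output}"

dataset = [to_training_text(i, o) for i, o in pairs]
print(dataset[1])
```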
By default, the base model is set to the value listed below:
| InstructionTuning |[meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)| Xeon/Gaudi | LLM model Instruction Tuning service |
### DocIndexRetriever
The DocRetriever example demonstrates how to match user queries with free-text records using various retrieval methods. It plays a key role in Retrieval-Augmented Generation (RAG) systems by dynamically fetching relevant information from external sources, ensuring responses are factual and up-to-date. Powered by vector databases, DocRetriever enables efficient, semantic retrieval by storing data as vectors and quickly identifying the most relevant documents based on similarity.
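The "store as vectors, rank by similarity" mechanism can be sketched as a toy top-k retrieval. Real deployments use an embedding model such as BGE and a vector database such as Redis; the dot-product scoring and document names below are illustrative assumptions only.

```python
# Toy top-k semantic retrieval mirroring the DocRetriever idea: documents are
# stored as vectors and ranked by similarity to the query vector.
docs = {
    "rag-overview":  [0.8, 0.1, 0.1],
    "setup-guide":   [0.1, 0.7, 0.3],
    "release-notes": [0.2, 0.2, 0.9],
}

def top_k(query_vec: list[float], k: int = 2) -> list[str]:
    # Score each document by dot product and return the k best names.
    def score(name: str) -> float:
        return sum(q * d for q, d in zip(query_vec, docs[name]))
    return sorted(docs, key=score, reverse=True)[:k]

print(top_k([0.9, 0.0, 0.2]))
```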
|[LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai)|[BGE-Base](https://huggingface.co/BAAI/bge-base-en)|[Redis](https://redis.io/)|[TEI](https://github.com/huggingface/text-embeddings-inference)| Xeon/Gaudi2 | Document Retrieval Service |
### AgentQnA
The AgentQnA example demonstrates a hierarchical, multi-agent system designed for question-answering tasks. A supervisor agent interacts directly with the user, delegating tasks to a worker agent and utilizing various tools to gather information and generate answers. The worker agent primarily uses a retrieval tool to respond to the supervisor's queries. Additionally, the supervisor can access other tools, such as APIs to query knowledge graphs, SQL databases, or external knowledge bases, to enhance the accuracy and relevance of its responses.
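The hierarchical flow above can be sketched minimally: a supervisor routes the question to a worker whose only tool is retrieval, then composes the user-facing answer. The routing logic and tool behavior here are illustrative assumptions, not the actual AgentQnA implementation.

```python
# Minimal sketch of the supervisor/worker delegation described above.
# All functions are stand-ins for real agents and tools.

def retrieval_tool(query: str) -> str:
    # Stand-in for the worker agent's retrieval tool.
    return f"facts about {query}"

def worker_agent(task: str) -> str:
    # The worker primarily answers via its retrieval tool.
    return retrieval_tool(task)

def supervisor_agent(question: str) -> str:
    # Delegate to the worker, then produce the final answer for the user.
    evidence = worker_agent(question)
    return f"Answer based on: {evidence}"

print(supervisor_agent("OPEA"))
```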
The worker agent uses an open-source web search tool (DuckDuckGo), and the agents use OpenAI GPT-4o-mini as the LLM backend.
> **_NOTE:_** This example is in active development. The code structure of this use case is subject to change.