This project demonstrates a lightweight implementation of a Retrieval-Augmented Generation (RAG) application. The goal is to achieve effective results with minimal dependencies and concise code.
- Embedding Model: Uses SBERT (e.g., all-MiniLM-L6-v2) for generating vector representations of text.
- Language Model: Calls a Large Language Model (LLM) from Hugging Face's API (e.g., Llama-3.2-1B-Instruct) to generate answers.
- Vector Indexing: Leverages Meta's Faiss library for efficient vector similarity search.
The example dataset consists of 18 paragraphs. Based on the user's question, 1-3 relevant paragraphs are retrieved to provide context to the LLM for generating a response. You can modify the dataset and prompts to experiment with different embeddings and results.
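The retrieval step described above can be sketched without any dependencies. This is only an illustration, not the project's code: a toy bag-of-words vector stands in for the SBERT embedding, a brute-force cosine scan stands in for the Faiss index, and the names `embed`, `retrieve`, `build_prompt`, and the `min_score` cutoff are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for an SBERT embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, paragraphs: list[str], k: int = 3,
             min_score: float = 0.1) -> list[str]:
    """Return up to k paragraphs, best first, dropping weak matches.

    The min_score cutoff is a hypothetical way to get "1-3" results
    instead of always exactly k.
    """
    q = embed(question)
    scored = [(cosine(q, embed(p)), p) for p in paragraphs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score >= min_score]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine the retrieved paragraphs and the question into one prompt."""
    joined = "\n".join(f"- {p}" for p in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


paragraphs = [
    "Paris is the capital of France.",
    "Faiss performs fast vector similarity search.",
    "Croissants are a popular French pastry.",
]
question = "What is the capital of France?"
print(build_prompt(question, retrieve(question, paragraphs)))
```

In the real application the scan over all paragraphs would be replaced by a Faiss index lookup, which matters once the dataset grows well beyond 18 paragraphs.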
Create .env and .env.docker files.
- Use .env.example and .env.docker.example as templates.
- Ensure you add your Hugging Face API Key.
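As a rough illustration of how such a file can be checked before starting the app, the sketch below parses simple KEY=VALUE lines and fails early when the key is missing. The variable name `HF_API_KEY` is hypothetical; use whatever name .env.example actually defines. The helper names are also made up for this example.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(env: dict[str, str], name: str = "HF_API_KEY") -> str:
    """Return the API key or raise a clear error (name is an assumption)."""
    key = env.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is not set; copy .env.example to .env and fill it in")
    return key


sample = '# Hugging Face credentials\nHF_API_KEY="hf_example_token"\n'
print(require_api_key(parse_env(sample)))
```

The same `require_api_key` check also works on `os.environ` if the variables are injected by Docker rather than read from a file.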
- Install required dependencies:
pip install -r requirements.txt
- Run the main application:
python app/main.py
- Use the start.sh script to create and run the container in Docker:
./start.sh
Once the system is running (locally or in Docker), you can test the solution by sending a POST request.
- Run the following command in a terminal:
curl -X POST -H "Content-Type: application/json" -d '{"question": "What is the capital of France?"}' http://127.0.0.1:8000/ask
- The application will return a JSON response like this:
{
  "answer": "The capital of France is Paris.",
  "relevant snippets": [
    "As the capital of France, Paris is the epicenter of French language and culture.",
    "Paris has been a cradle of artistic and intellectual movements, inspiring figures like Victor Hugo, Voltaire, and Edith Piaf.",
    "Paris is also known for its culinary delights, offering everything from croissants and baguettes to haute cuisine."
  ]
}
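The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl command above; the function names `build_ask_request` and `ask` are made up for this example, and the default URL assumes the server is listening on 127.0.0.1:8000 as in the curl call.

```python
import json
import urllib.request


def build_ask_request(question: str,
                      base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build the POST /ask request without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://127.0.0.1:8000",
        timeout: float = 30.0) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request locally; nothing is sent until ask() is called.
req = build_ask_request("What is the capital of France?")
print(req.get_method(), req.full_url)
```

With the server running, `ask("What is the capital of France?")` should return a dictionary with the `answer` and `relevant snippets` keys shown in the sample response.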
To deploy this container to AWS, I would use Amazon Elastic Container Service (ECS). I would start by building the Docker image and pushing it to Amazon Elastic Container Registry (ECR). Then I would create an ECS cluster and define a task that uses the pushed image. After that, I would configure the necessary security groups, set environment variables (such as the Hugging Face API key), and expose the service.
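To make the task-definition step more concrete, here is a sketch that builds one as a Python dictionary and prints it as JSON. Several things here are assumptions rather than part of this project: the Fargate launch type, the CPU and memory sizes, the `rag-app` name, the container port 8000 (taken from the local curl example), and the `HF_API_KEY` variable name. It also passes the API key through the `secrets` field instead of a plain environment variable, which is a substitution for the approach described above.

```python
import json


def build_task_definition(image_uri: str, api_key_secret_arn: str,
                          execution_role_arn: str) -> dict:
    """Return an ECS task definition for a single-container service."""
    return {
        "family": "rag-app",                     # hypothetical name
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],  # assumption: Fargate launch type
        "cpu": "512",
        "memory": "1024",
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": "rag-app",
                "image": image_uri,              # the image pushed to ECR
                "essential": True,
                "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
                # Hypothetical variable name; the value is resolved from
                # the referenced secret when the task starts.
                "secrets": [{"name": "HF_API_KEY", "valueFrom": api_key_secret_arn}],
            }
        ],
    }


# Placeholder values only; replace them with real ARNs and the real image URI.
task_definition = build_task_definition(
    image_uri="<account-id>.dkr.ecr.<region>.amazonaws.com/rag-app:latest",
    api_key_secret_arn="<secret-arn>",
    execution_role_arn="<execution-role-arn>",
)
print(json.dumps(task_definition, indent=2))
```

The printed JSON could be saved to a file and registered with `aws ecs register-task-definition --cli-input-json file://taskdef.json`, after which a service in the cluster can reference it.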