RAG Application: Minimal Dependency Example

This project demonstrates a lightweight implementation of a Retrieval-Augmented Generation (RAG) application. The goal is to achieve effective results with minimal dependencies and concise code.

Overview

Embedding Model: Uses SBERT (e.g., all-MiniLM-L6-v2) for generating vector representations of text.
Language Model: Calls a Large Language Model (LLM) through Hugging Face's API (e.g., Llama-3.2-1B-Instruct) to generate answers.
Vector Indexing: Leverages Meta's Faiss library for efficient vector similarity search.

The example dataset consists of 18 paragraphs. Based on the user's question, 1-3 relevant paragraphs are retrieved to provide context to the LLM for generating a response. You can modify the dataset and prompts to experiment with different embeddings and results.
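The retrieve-then-prompt flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: to stay dependency-free it swaps in a toy bag-of-words vector and a brute-force cosine search in place of SBERT embeddings and a Faiss index, and the function names (`embed`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for an SBERT embedding."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, paragraphs, k=3):
    """Return the k paragraphs most similar to the question (brute force)."""
    q = embed(question)
    ranked = sorted(paragraphs, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, snippets):
    """Assemble a prompt that gives the LLM the retrieved context (hypothetical wording)."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


paragraphs = [
    "As the capital of France, Paris is the epicenter of French language and culture.",
    "Faiss is a library for efficient similarity search.",
    "Paris is also known for its culinary delights, offering everything from croissants and baguettes to haute cuisine.",
]
question = "What is the capital of France?"
snippets = retrieve(question, paragraphs, k=2)
prompt = build_prompt(question, snippets)
print(prompt)
```

In the real application the `embed` step would be an SBERT model call and the ranking would be a Faiss nearest-neighbour search, but the shape of the pipeline (embed, rank, take the top k, build the prompt) stays the same.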

Instructions

Create .env and .env.docker files.

Use .env.example and .env.docker.example as templates.
Ensure you add your Hugging Face API Key.
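As a rough illustration of what such a file contains and how its values might be read, here is a small standard-library sketch. The variable name `HUGGINGFACE_API_KEY` is an assumption made for this example; check .env.example for the name the application actually expects. The `parse_env` helper is likewise hypothetical and is not part of the project.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Example file contents; the variable name is hypothetical.
example = """
# Hugging Face credentials
HUGGINGFACE_API_KEY=hf_your_token_here
"""

settings = parse_env(example)

# A value already set in the process environment takes precedence over the file.
api_key = os.environ.get("HUGGINGFACE_API_KEY", settings.get("HUGGINGFACE_API_KEY"))
```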

Run Locally

Install required dependencies:
```
pip install -r requirements.txt
```
Run the main application:
```
python app/main.py
```

Run using Docker

Use the start.sh script to create and run the container in Docker:
```
./start.sh
```

Testing the Application

Once the application is running (locally or in Docker), you can test it by sending a POST request to the /ask endpoint.

Example Request

Run the following command in a terminal:

```
curl -X POST -H "Content-Type: application/json" -d '{"question": "What is the capital of France?"}' http://127.0.0.1:8000/ask
```
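The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl command above; the helper names `build_ask_request` and `ask` are hypothetical, and the base URL assumes the service is listening on 127.0.0.1:8000 as in the example.

```python
import json
import urllib.request


def build_ask_request(question, base_url="http://127.0.0.1:8000"):
    """Build a POST request for the /ask endpoint with a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, base_url="http://127.0.0.1:8000", timeout=60):
    """Send the question to a running instance and return the decoded JSON reply."""
    request = build_ask_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the server.
request = build_ask_request("What is the capital of France?")
```

With the application running, `ask("What is the capital of France?")` returns the parsed response as a dictionary.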

Example Response

The application will return a JSON response like this:

```
{
  "answer": "The capital of France is Paris.",
  "relevant snippets": [
    "As the capital of France, Paris is the epicenter of French language and culture.",
    "Paris has been a cradle of artistic and intellectual movements, inspiring figures like Victor Hugo, Voltaire, and Edith Piaf.",
    "Paris is also known for its culinary delights, offering everything from croissants and baguettes to haute cuisine."
  ]
}
```
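When consuming this response in code, note that the second key, "relevant snippets", contains a space, so it has to be read with dictionary-style access. A short sketch, using a shortened copy of the example response as input:

```python
import json

# Shortened copy of the example response shown above.
raw = """
{
  "answer": "The capital of France is Paris.",
  "relevant snippets": [
    "As the capital of France, Paris is the epicenter of French language and culture."
  ]
}
"""

data = json.loads(raw)
answer = data["answer"]
# .get() with a default guards against a reply that has no snippets.
snippets = data.get("relevant snippets", [])

print(answer)
for snippet in snippets:
    print("-", snippet)
```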

Deploying to AWS

To deploy this container to AWS, I would use Amazon Elastic Container Service (ECS). I would start by building the Docker image and pushing it to Amazon Elastic Container Registry (ECR), then create an ECS cluster and define a task that uses the pushed image. After that, I would configure the necessary security groups, set environment variables (such as the Hugging Face API key), and expose the service.
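As an illustration of the task-definition step, the sketch below builds an ECS task definition as a Python dictionary and prints it as JSON. Everything specific in it is an assumption made for the example: the Fargate launch type, the CPU and memory sizes, the family and container names, the container port 8000 (taken from the local test URL), the environment variable name, and the `<ACCOUNT_ID>`, `<REGION>`, and `<YOUR_KEY>` placeholders, which must be replaced with real values.

```python
import json

# Hypothetical ECS task definition for the container image pushed to ECR.
task_definition = {
    "family": "rag-application",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "512",
    "memory": "1024",
    "executionRoleArn": "arn:aws:iam::<ACCOUNT_ID>:role/ecsTaskExecutionRole",
    "containerDefinitions": [
        {
            "name": "rag-app",
            "image": "<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/rag-application:latest",
            "essential": True,
            # Port assumed from the local example (127.0.0.1:8000).
            "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
            # Variable name is an assumption; use the one from .env.example.
            "environment": [{"name": "HUGGINGFACE_API_KEY", "value": "<YOUR_KEY>"}],
        }
    ],
}

task_definition_json = json.dumps(task_definition, indent=2)
print(task_definition_json)
```

The printed JSON could be saved to a file and registered with ECS. For a real deployment, storing the API key in a secrets store and referencing it from the task definition is generally preferable to a plain environment value.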
