Batch upload of documents #881

Merged
merged 2 commits into from
Aug 19, 2024

Contributor

@dave90 dave90 commented Aug 7, 2024

Description

This pull request adds a new endpoint for batch uploading files. Users can upload multiple files in a single request, and the response is a dictionary containing the status of each uploaded file. Metadata for the files is passed as a single JSON string, keyed by file name, alongside the form data. The new endpoint does not affect the existing endpoint or any current functionality.

Example usage:

import json

import requests

# Map each file name to the MIME type sent with its multipart part
files_to_upload = {"sample.pdf": "application/pdf", "sample.txt": "text/plain"}

# Every part uses the same form field name, "files"
files = []
for file_name, content_type in files_to_upload.items():
    file_path = f"tests/mocks/{file_name}"
    files.append(("files", (file_name, open(file_path, "rb"), content_type)))


metadata = {
    "sample.pdf":{
        "source": "sample.pdf",
        "title": "Test title",
        "author": "Test author",
        "year": 2020
    },
    "sample.txt":{
        "source": "sample.txt",
        "title": "Test title",
        "author": "Test author",
        "year": 2021
    }
}
    
# upload file endpoint only accepts form-encoded data
payload = {
    "chunk_size": 128,
    "metadata": json.dumps(metadata)
}

response = requests.post(
    "http://localhost:1865/rabbithole/batch",
    files=files,
    data=payload
)
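
The example above leaves its file handles open after the request. A minimal sketch of one way to build the same `files` and `data` arguments while closing every handle afterwards is shown here. The helper name `build_batch_request` and its parameters are hypothetical and not part of this pull request; only the `files` field name and the `chunk_size` and `metadata` form keys come from the example.

```python
import json
from contextlib import ExitStack
from pathlib import Path


def build_batch_request(paths, metadata, chunk_size, stack):
    """Build the (files, data) pair for a batch upload request.

    Hypothetical helper, not part of the PR.
    `paths` maps a file path to the MIME type sent with that part.
    `stack` is an ExitStack that owns the open handles, so they are
    closed when the caller leaves its `with` block.
    """
    files = []
    for path, content_type in paths.items():
        path = Path(path)
        handle = stack.enter_context(path.open("rb"))
        # Every part reuses the form field name "files", as in the example.
        files.append(("files", (path.name, handle, content_type)))

    # Metadata is keyed by file name and sent as one JSON string.
    data = {"chunk_size": chunk_size, "metadata": json.dumps(metadata)}
    return files, data
```

Inside a `with ExitStack() as stack:` block, the returned pair can be passed directly as `requests.post(url, files=files, data=data)`; the handles are released when the block exits.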

Related to issue #871

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas

Member

pieroit commented Aug 19, 2024

I really appreciate that you also wrote the tests. Awesome addition, thanks!

@pieroit pieroit merged commit 9d6960e into cheshire-cat-ai:develop Aug 19, 2024
2 checks passed