
AutoGGUFEmbeddings with nomic-embed-text-v1.5.Q8_0.gguf not able to achieve high context length #14530

Answered by DevinTDHa
pwyang123 asked this question in Q&A

Hi @pwyang123,

I was able to reproduce it and I'm working on a fix. There are some issues with the error handling; it shouldn't fail silently. I'll update this discussion when the fix is ready. Thanks for reporting!

In the meantime, can you try the following:

from sparknlp.annotator import AutoGGUFEmbeddings

autoGGUFModel = (
    AutoGGUFEmbeddings.loadSavedModel("path_to/nomic-embed-text-v1.5.Q8_0.gguf", spark)
    .setInputCols("document")
    .setOutputCol("embeddings")
    .setBatchSize(4)
    .setNGpuLayers(99)
    .setNCtx(8192)       # context length
    .setNBatch(2048)     # logical batch size
    .setNUbatch(2048)    # physical batch size
)

Explanation (for reference, see this discussion):

llama.cpp allows setting the (1) logical and (2) physical batch sizes…
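
For context, here is a minimal end-to-end sketch of how the configured annotator could be plugged into a Spark NLP pipeline. This is an illustration, not a verbatim part of the answer: the GGUF file path and sample text are placeholders, and it assumes a recent Spark NLP version with AutoGGUFEmbeddings available; the setter names and values are taken from the snippet above.

import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import AutoGGUFEmbeddings

# Start a Spark session with Spark NLP loaded
spark = sparknlp.start()

documentAssembler = (
    DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")
)

autoGGUFModel = (
    AutoGGUFEmbeddings.loadSavedModel("path_to/nomic-embed-text-v1.5.Q8_0.gguf", spark)
    .setInputCols("document")
    .setOutputCol("embeddings")
    .setNCtx(8192)       # context length
    .setNBatch(2048)     # logical batch size
    .setNUbatch(2048)    # physical batch size
)

pipeline = Pipeline().setStages([documentAssembler, autoGGUFModel])

# Placeholder input; replace with a long document to exercise the larger context
data = spark.createDataFrame([["A long document ..."]]).toDF("text")
result = pipeline.fit(data).transform(data)

# Each annotation in the output column carries an "embeddings" field
result.select("embeddings.embeddings").show(truncate=False)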

Answer selected by DevinTDHa