[Feature] Split remote inference text list if its number exceeds user configured limitation #2455
Conversation
Signed-off-by: Liyun Xiu <[email protected]>
} else {
    // if the originalOrder is not empty, reorder based on it
    List<ModelTensors> sortedModelTensors = Arrays.asList(new ModelTensors[modelTensorsList.size()]);
    assert (originalOrderIndexes.size() == modelTensors.length);
I think the originalOrderIndexes size is the number of input documents, while the modelTensors size is the number of requests, so this assertion would fail. A simple way to reproduce: run predict with a four-item input and set the connector max batch size to 2. We may need to fetch all tensors and re-create the ModelTensors again.
One more small detail: which modelTensors list size do we want to keep here? If we keep it equal to the request count, it may confuse users, because the sorted modelTensors is no longer the actual remote response (we re-sort it). So keeping all tensors in a single modelTensors could be an option.
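The re-ordering the comment above describes can be sketched as follows. This is an illustrative helper, not the PR's actual code: it assumes one result per input document (the invariant the reviewer says the asserted code was missing), and the names `restoreOriginalOrder` and `originalOrderIndexes` mirror the discussion rather than the real implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ResultReorderer {
    /**
     * results.get(i) is the output for the input that was originally at
     * position originalOrderIndexes.get(i); scatter each result back to
     * its original slot so callers see responses in request order.
     */
    public static <T> List<T> restoreOriginalOrder(List<T> results, List<Integer> originalOrderIndexes) {
        if (results.size() != originalOrderIndexes.size()) {
            // This is the size mismatch the review comment points out:
            // doc count and request count must agree before re-ordering.
            throw new IllegalArgumentException("results and originalOrderIndexes must be the same size");
        }
        List<T> ordered = new ArrayList<>(Collections.nCopies(results.size(), (T) null));
        for (int i = 0; i < results.size(); i++) {
            ordered.set(originalOrderIndexes.get(i), results.get(i));
        }
        return ordered;
    }
}
```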
Thanks for catching this; it is a bug and will be fixed in the next revision.
Signed-off-by: Liyun Xiu <[email protected]>
@@ -41,6 +45,9 @@
import org.opensearch.script.ScriptService;

public interface RemoteConnectorExecutor {
    int DEFAULT_BATCH_SIZE = -1;
    String MAX_BATCH_SIZE_KEY = "max_batch_size";
    String STEP_SIZE_KEY = "input_docs_processed_step_size";
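For context on how constants like these are typically used, here is a small sketch of resolving a batch size from a connector's parameter map. The constant names come from the diff above, but the lookup itself is illustrative and not the PR's actual implementation.

```java
import java.util.Map;

public class BatchSizeConfig {
    // Constants mirroring the RemoteConnectorExecutor diff above.
    static final int DEFAULT_BATCH_SIZE = -1;
    static final String MAX_BATCH_SIZE_KEY = "max_batch_size";

    /**
     * Reads the user-configured max batch size from the connector's
     * parameter map, falling back to -1 (no splitting) when absent.
     */
    public static int resolveMaxBatchSize(Map<String, String> connectorParams) {
        String raw = connectorParams.get(MAX_BATCH_SIZE_KEY);
        if (raw == null) {
            return DEFAULT_BATCH_SIZE; // -1 means "do not split"
        }
        return Integer.parseInt(raw.trim());
    }
}
```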
This one already controls the batch size, so why do we need to add another variable, max_batch_size?
Will implement the logic on the neural-search side. Please refer to #2428 (comment). Closing now.
Description
We'd like a solution that splits a text list into smaller batches when the total number of texts in a batch request exceeds the user-configured limit.
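The splitting described above can be sketched roughly as below. This is a minimal illustration of the idea, assuming a `maxBatchSize` of -1 (the diff's DEFAULT_BATCH_SIZE) means "do not split"; the helper name and signature are hypothetical, not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {
    /**
     * Splits a text list into sub-lists of at most maxBatchSize entries,
     * so each sub-list can be sent as a separate remote inference request.
     */
    public static List<List<String>> splitByBatchSize(List<String> texts, int maxBatchSize) {
        List<List<String>> batches = new ArrayList<>();
        if (maxBatchSize <= 0) {
            // -1 (DEFAULT_BATCH_SIZE) or any non-positive value: no splitting
            batches.add(texts);
            return batches;
        }
        for (int start = 0; start < texts.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, texts.size());
            batches.add(new ArrayList<>(texts.subList(start, end)));
        }
        return batches;
    }
}
```

With a five-item input and `max_batch_size` of 2, this yields three requests of sizes 2, 2, and 1.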
Issues Resolved
#2428
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.