Integrate OpenSearch Ml-Commons into Data Prepper #5509

Open
Zhangxunmt opened this issue Mar 7, 2025 · 0 comments

Zhangxunmt commented Mar 7, 2025

Is your feature request related to a problem? Please describe.
ML Commons is an OpenSearch plugin that manages Machine Learning models to enhance search relevance through semantic understanding. You can deploy models directly within your OpenSearch cluster or connect to externally hosted models.

For neural search, a language model converts text into vector embeddings. During ingestion, OpenSearch generates vector embeddings for text fields in incoming requests. At search time, the same model transforms query text into vector embeddings, enabling vector similarity search. It is crucial to use the same ML model for both ingestion and search to ensure consistency.
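
To make the consistency requirement concrete, here is a minimal sketch of the ingest/search symmetry. The `embed` function is a toy, hash-based stand-in for a real language model (an assumption for illustration only, not anything ML Commons provides); the point is only that the same function is applied to documents at ingestion and to the query at search time.

```python
import math
import re
from collections import Counter

def embed(text, dim=64):
    """Toy stand-in for an ML model: hashes words into a fixed-size unit vector."""
    vec = [0.0] * dim
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    for word, count in words.items():
        # Deterministic bucket per word; a real model would return learned embeddings.
        bucket = sum(ord(c) * (i + 1) for i, c in enumerate(word)) % dim
        vec[bucket] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))

# Ingestion: every document is embedded once and stored with its vector.
documents = [
    "Data Prepper transforms text into vector embeddings",
    "OpenSearch clusters can host machine learning models",
    "Pipelines read records from a source and write to a sink",
]
index = [(doc, embed(doc)) for doc in documents]

def search(query, k=1):
    """Search: embed the query with the SAME function, then rank by similarity."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

If the query were embedded with a different model than the documents, the two sets of vectors would live in unrelated spaces and the similarity scores would be meaningless.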

To support offline batch ingestion, this issue proposes Data Prepper as the ingestion engine for transforming text into vector embeddings. The processor proposed below will also support streaming-mode data transformation.

Describe the solution you'd like
Build a new processor that integrates the ml-commons model Predict/batch_predict APIs into Data Prepper pipelines.
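
As a rough sketch of what such a processor might do (not a proposed implementation), the snippet below groups records, builds a predict request per group, and attaches the returned vectors to the records. The names `ml_processor`, `invoke`, `source_key`, and `target_key` are hypothetical, and the request body shape is an assumption, since the real payload depends on the model or connector. The endpoint paths follow the ml-commons REST layout as I understand it. `invoke` stands in for an HTTP client so the sketch stays self-contained.

```python
from typing import Callable, Dict, Iterable, Iterator, List

ML_BASE = "/_plugins/_ml/models"  # ml-commons REST prefix (as I understand it)

def predict_path(model_id: str, batch: bool = False) -> str:
    """Build the path for the Predict or batch_predict API of a given model."""
    action = "_batch_predict" if batch else "_predict"
    return f"{ML_BASE}/{model_id}/{action}"

def chunk(records: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Group incoming records into lists of at most `size` items."""
    group: List[Dict] = []
    for record in records:
        group.append(record)
        if len(group) == size:
            yield group
            group = []
    if group:
        yield group

def ml_processor(
    records: Iterable[Dict],
    model_id: str,
    source_key: str,
    target_key: str,
    invoke: Callable[[str, Dict], List[List[float]]],
    batch_size: int = 2,
) -> List[Dict]:
    """Hypothetical processor: send text to the model, attach vectors to records."""
    path = predict_path(model_id)
    output: List[Dict] = []
    for group in chunk(records, batch_size):
        # ASSUMPTION: this body shape is illustrative; real payloads vary by model.
        body = {"parameters": {"inputs": [r[source_key] for r in group]}}
        vectors = invoke(path, body)  # one vector per input, in the same order
        for record, vector in zip(group, vectors):
            output.append({**record, target_key: vector})
    return output
```

In streaming mode the same loop could run on each small group of events as it arrives, while offline batch ingestion would presumably hand a reference to a whole input file to the `_batch_predict` path instead of inlining the text.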

Describe alternatives you've considered (Optional)
The model management and predict/batch_predict APIs have already been launched in ml-commons. This feature only integrates them into Data Prepper.

Additional context
#5433

Zhangxunmt changed the title from "New ML processor to interact with Ml-Commons in OpenSearch" to "Integrate OpenSearch Ml-Commons into Data Prepper" on Mar 7, 2025