batch processing as "mlserver infer ..." CLI #720

Merged: 18 commits into SeldonIO:master on Sep 29, 2022

Conversation

@RafalSkolasinski (Contributor) commented on Sep 7, 2022

Closes #753

@RafalSkolasinski marked this pull request as a draft on September 7, 2022 00:42
@adriangonz (Contributor) left a comment:

This is looking great, @RafalSkolasinski!

@RafalSkolasinski marked this pull request as ready for review on September 28, 2022 09:35
@RafalSkolasinski changed the title from 'WIP: batch processing as "mlserver infer ..." CLI' to 'batch processing as "mlserver infer ..." CLI' on Sep 28, 2022
@RafalSkolasinski (Contributor, Author) commented:

Example input file:

{"id": "custom-id", "inputs":[{"name":"predict","data":[0.31177403400087855,0.06077599200388184,0.1222722464604058,0.8507384376975163,0.08642635296618717,0.16270193586222859,0.8280425306726377,0.4534262246592593,0.3300012832025696,0.2526234446985879,0.23293464785223938,0.21788839963226136,0.8776370718523296,0.20664709375873636,0.9859762435646847,0.8663579909519274,0.03244862210354882,0.2874217794917452,0.05416325734773375,0.5624864259288173,0.2160973996879475,0.9066489214480319,0.6509069479866062,0.7688311886026783,0.41852501142154397,0.2810573226084383,0.3681228287090319,0.34444729219914305,0.9110756321461084,0.39927911700574203,0.46424268669452895,0.35830838405086185,0.38371086125181775,0.46982201036606885,0.9476962525321978,0.12091981513104466,0.23146807430890848,0.8612386242912793,0.6106097404155927,0.38764016359823394,0.79274871638451,0.15928211405749892,0.7100371444219623,0.9938178150083873,0.8122341549349049,0.7823611321383721,0.08187338920302922,0.9024453017145617,0.6289873796734957,0.8272189848883423,0.052110859546364074,0.19640187409999754,0.0855972199211621,0.6645421444439578,0.5367209344887608,0.3415219042209401,0.9713036749388746,0.9251368770435912,0.06612251655759804,0.7518484726707216,0.044556141462089194,0.0433367590585817,0.12387907321238179,0.3923179544574654],"datatype":"FP32","shape":[1,8,8]}]}

Example output file:

{"model_name":"mnist-svm","model_version":"v0.1.0","id":"custom-id","parameters":{"content_type":null,"headers":null,"batch_index":0},"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[9]}]}

The above can be used, for example, against mnist-svm (I believe I have exactly the same model there as the one in the MLServer docs) with:

mlserver infer -u localhost:8080 -m mnist-svm -i ~/work/tmp/single.txt -o /work/tmp/output.txt --workers 10 -v
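
For reference, a rough sketch of how such an input file could be generated, assuming the batch input is newline-delimited JSON with one V2 inference request per line (the file name, request count, and 1x8x8 shape below are illustrative only):

import json
import numpy as np

# Sketch only: write a file of V2 inference requests shaped like the
# single-request example above, one JSON document per line, random values.
with open("input.txt", "w") as f:
    for i in range(10):
        data = np.random.rand(1, 8, 8)  # dummy 8x8 "digit"
        request = {
            "id": f"custom-id-{i}",
            "inputs": [
                {
                    "name": "predict",
                    "data": data.flatten().tolist(),
                    "datatype": "FP32",
                    "shape": list(data.shape),
                }
            ],
        }
        f.write(json.dumps(request) + "\n")  # one request per line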

@RafalSkolasinski (Contributor, Author) commented on Sep 28, 2022

Currently only HTTP support is included. Follow-up work will include:

  • micro-batching to align with V1 batch processing
  • gRPC support
  • retry mechanism + logging of failed requests to the output file
  • binary data support (already works for Triton; requires modifications for MLServer)
  • documentation (marking the feature as experimental; the API may change with regard to CLI flags and env vars)

Some notes from the in-person code review with @adriangonz:

  • worth considering an OOP-based refactor now that the structure of the batch processing component is emerging
  • gather all input arguments into a pydantic class (find a way to keep it in sync with the click arguments); a rough sketch of this idea follows below
  • command-line input and output, so that mlserver infer ... can be used instead of curl and grpcurl
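
On the pydantic point, a rough sketch of the idea (class, field, and option names below are hypothetical, not necessarily the ones used in this PR):

import click
from pydantic import BaseModel


# Hypothetical sketch: gather the CLI inputs into a single pydantic model built
# from the click-parsed options, so defaults and types live in one place.
class BatchSettings(BaseModel):
    url: str
    model_name: str
    input_path: str
    output_path: str
    workers: int = 10
    verbose: bool = False


@click.command()
@click.option("-u", "--url", required=True)
@click.option("-m", "--model-name", required=True)
@click.option("-i", "--input-path", required=True)
@click.option("-o", "--output-path", required=True)
@click.option("--workers", type=int, default=10)
@click.option("-v", "--verbose", is_flag=True)
def infer(**kwargs):
    # click passes the parsed options as keyword arguments;
    # pydantic then validates them and applies defaults
    settings = BatchSettings(**kwargs)
    click.echo(settings)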

In-line review comment on the following diff context:

queue.task_done()


async def process_batch(

@adriangonz (Contributor) commented:

One for the design doc, but there may be some value in providing a process_batch_sync function to simplify interaction if someone wants to call the function from outside an asyncio loop in Python; otherwise they'll have to handle the asyncio logic themselves. Then again, that's perhaps not so bad, and maybe it could just be documentation that explains how people could do it if needed.
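
For illustration, a minimal sketch of what such a wrapper could look like (the import path and signature below are assumptions, not something this PR adds):

import asyncio

from mlserver.batch_processing import process_batch  # assumed import path


# Hypothetical sketch, not part of this PR: a thin synchronous wrapper so that
# callers outside an asyncio event loop don't have to manage one themselves.
def process_batch_sync(*args, **kwargs):
    # asyncio.run creates an event loop for this single call and tears it down
    return asyncio.run(process_batch(*args, **kwargs))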

@RafalSkolasinski (Contributor, Author) commented on Sep 28, 2022:

Actually, for now this is meant mostly as the CLI entrypoint and is not yet intended to be used directly anywhere else. I am going to add a docstring.

@RafalSkolasinski (Contributor, Author) commented:

I added small docstrings just now

@RafalSkolasinski (Contributor, Author) commented:

Exposing it as from mlserver import ... functionality is definitely worth following up on though 👍

@adriangonz merged commit 1fd3ce6 into SeldonIO:master on Sep 29, 2022

Successfully merging this pull request may close these issues.

Compatibility between Seldon Batch Processor and V2