Adds PyTorch model management #394
Conversation
PS. I'm happy to just move this work into a branch.
This looks great to me.
But I am no Pythonista, so it would be good to get a review from @sethmlarson.
@joshdevins I like your idea of developing this on a branch; I think we can create a branch for it.
This is very exciting! I left a whole bunch of review comments. A high-level review comment: you can run `nox -rs format` (or `black eland/`, which will do a lot too) to automatically format code so it's compliant with the linter. You'll get type-checking as well.
bin/upload_hub_model.py
Outdated
@@ -0,0 +1,80 @@
#!venv/bin/python
I'm not sure what the intent of this file is. If it's intended to be "installed" on the user's system after running `python -m pip install eland`, then we'll need to structure and configure this much differently. Once I know what the intent is, I can review this more carefully.
It's an example of how to download a model from Hugging Face and then deploy it. It's a useful script that the user can run.
I don't think there is a plan to have it installed on the user's system.
Sorry, this was a copy-paste from the old script, so it's incorrect.
I discussed this a couple of months ago with @sethmlarson, but to repeat for a wider audience: we should be providing scripts for the most common use cases so people don't need to write a notebook or their own script. The original intention is indeed that this is installed (on the system/in a venv) with `pip install eland[pytorch]`. Happy to restructure as needed. Also open to any other ideas for how this script should run, though.
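For context, here is a rough sketch of the workflow such a script would wrap; this is not the script from this PR. The `PyTorchModel` class and the `upload()` signature are taken from the diff below, while the Elasticsearch endpoint, constructor arguments, and file paths are placeholders.

```python
# Hedged sketch only: how a user might push a locally saved TorchScript model
# to Elasticsearch using the class added in this PR.
from elasticsearch import Elasticsearch

from eland.ml.pytorch.pytorch_model import PyTorchModel  # module added in this PR

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Constructor arguments are assumed; only upload() below matches the diff.
pt_model = PyTorchModel(es, "my-hugging-face-model")

# The model/config/vocab file paths are placeholders for files produced from a
# Hugging Face model (e.g. a torch.jit-traced model plus its vocabulary).
pt_model.upload(
    model_path="traced_model.pt",
    config_path="config.json",
    vocab_path="vocab.json",
)
```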
eland/ml/pytorch/pytorch_model.py
Outdated
        return True

    def upload(self, model_path: str, config_path: str, vocab_path: str, chunk_size: int = DEFAULT_CHUNK_SIZE) -> bool:
Is there any cleanup we need to do if one part of the upload process passes and the next fails?
@dimitris-athanasiou ^^ ?
@joshdevins @sethmlarson, calling `DELETE _ml/trained_models/<model_id>` will clean up the bad state. The config will have to be pushed again.
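To make that concrete, here is a minimal sketch of the suggested cleanup, assuming the elasticsearch-py client (whose `ml.delete_trained_model()` issues `DELETE _ml/trained_models/<model_id>`); the model ID, file paths, and constructor arguments are illustrative, not code from this PR.

```python
# Hedged sketch of the cleanup suggested above.
from elasticsearch import Elasticsearch

from eland.ml.pytorch.pytorch_model import PyTorchModel  # module added in this PR

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint
model_id = "my-hugging-face-model"           # placeholder model ID
pt_model = PyTorchModel(es, model_id)        # constructor arguments are assumed

try:
    # upload() signature as shown in the diff above.
    pt_model.upload("traced_model.pt", "config.json", "vocab.json")
except Exception:
    # If one step fails part-way, remove the partially created trained model
    # (DELETE _ml/trained_models/<model_id>) so the config can be pushed again.
    es.ml.delete_trained_model(model_id=model_id)
    raise
```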
        else:
            ignorables = ()

        return self._client.transport.perform_request(
Is there any way to distinguish between the "model deployment is already stopped" and "model isn't found" cases? If so, it might be good to use that here: ignore the "already stopped" case unconditionally and always re-raise the 404 model-not-found case.
@benwtrent ^^ ?
Looking at the code, `_stop` model deployment doesn't throw when the model is missing or when the deployment is already stopped.
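For illustration only, the distinction being suggested could look roughly like the sketch below, assuming elasticsearch-py raises `NotFoundError` on a 404 and that the response body lets the two cases be told apart. The endpoint path and the error-message check are assumptions rather than code from this PR, and per the comment above, `_stop` may not need this at all.

```python
# Hedged sketch: ignore "deployment already stopped", re-raise "model not found".
from elasticsearch import Elasticsearch
from elasticsearch.exceptions import NotFoundError


def stop_deployment(client: Elasticsearch, model_id: str) -> None:
    try:
        client.transport.perform_request(
            "POST", f"/_ml/trained_models/{model_id}/deployment/_stop"
        )
    except NotFoundError as err:
        # Hypothetical check on the error text: only swallow the error when the
        # deployment is reported as already stopped, not when the model is unknown.
        if "already stopped" not in str(err):
            raise
```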
OK, let me go through the comments here and I will get this into a cleaner state. @sethmlarson, can you create a new, empty branch?
@joshdevins The branch has been created and I've updated the PR base.
Will continue to iterate in more PRs once this work is on the new branch.
Adding PyTorch model support through two main mechanisms: (1) generic PyTorch model management in Elasticsearch; (2) interoperability with HuggingFace transformers and the model hub. This change aims to provide the foundations of this support and all of the basic functionality that one would need to get up and running. Note that this functionality is available in Elasticsearch 8.x builds only.
This is a rough (and working) draft of porting PyTorch code into eland, and the general shape of what I'm thinking of. It splits concerns into two: (1) generic PyTorch model management in Elasticsearch, and (2) interoperability with HuggingFace transformers and the model hub.
Help and comments are welcome. There's surely a lot of formatting and general Python-ification that still needs to happen, so help is welcome there too.