Hub model import script improvements #461

dolaru · 2022-04-21T15:31:06Z

Changes

Better logging

Switched from print statements to logging for cleaner, more informative output: timestamps and log levels are now shown. The logging is a bit more verbose, which should help users understand what the script is doing.
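A minimal sketch of the kind of logging setup described above, not the script's actual code. The format string, the logger name and the `make_logger` helper are assumptions chosen for illustration; the PR only states that timestamps and log levels are shown.

```python
import logging
import sys

# Assumed format: timestamp, level, then the message.
LOG_FORMAT = "%(asctime)s %(levelname)s : %(message)s"


def make_logger(name, stream=None):
    """Hypothetical helper: return a logger that prefixes each record
    with a timestamp and its log level."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


if __name__ == "__main__":
    log = make_logger("eland_import_hub_model")  # logger name is an assumption
    log.info("Checking for an existing trained model")
```

Compared with `print`, each line carries its own timestamp and severity, which makes long-running imports easier to follow.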

Add support for ES authentication using username/password or api key

Instead of being limited to passing credentials in the URL, there are now 2 additional methods:

username/password using --es-username and --es-password
API key using --es-api-key

Credentials can also be supplied through the environment variables ES_USERNAME/ES_PASSWORD or ES_API_KEY.
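One way the flags and environment variables above could be wired together is sketched below. The helper names `parse_auth_args` and `auth_kwargs` are hypothetical, the precedence of the API key over username/password is an assumption, and the `api_key` / `basic_auth` keyword names assume the elasticsearch-py 8.x client.

```python
import argparse
import os


def parse_auth_args(argv=None, env=None):
    """Parse the authentication flags, using environment variables as defaults."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("--es-username", default=env.get("ES_USERNAME"))
    parser.add_argument("--es-password", default=env.get("ES_PASSWORD"))
    parser.add_argument("--es-api-key", default=env.get("ES_API_KEY"))
    return parser.parse_args(argv)


def auth_kwargs(args):
    """Hypothetical helper: map parsed arguments to client keyword arguments.

    Assumption: an API key takes precedence over username/password.
    """
    if args.es_api_key:
        return {"api_key": args.es_api_key}
    if args.es_username and args.es_password:
        return {"basic_auth": (args.es_username, args.es_password)}
    return {}  # fall back to credentials embedded in the URL, if any
```

Because the environment variables only provide defaults, a flag given on the command line overrides the corresponding variable.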

Graceful handling of missing PyTorch requirements

To use the eland_import_hub_model script, the PyTorch extras must be installed. If the user does not have the required packages installed, a helpful message is logged with a hint to install eland[pytorch] with pip.
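The pattern described above can be sketched roughly as follows. The `load_optional` helper and the exact wording of the message are assumptions for illustration; a real script would more likely exit with a non-zero status than return `None`.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_optional(module_name, extra="pytorch"):
    """Hypothetical helper: import a module, or log an install hint and
    return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as e:
        # e.name is the name of the module that could not be found.
        logger.error(
            "Could not import module '%s'. Install the required extras with: "
            "python -m pip install 'eland[%s]'",
            e.name,
            extra,
        )
        return None
```

The user sees a single actionable log line instead of an import traceback.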

Graceful handling of already existing trained model

If a trained model with the same ID as the one we're trying to import already exists, and --clear-previous was not specified, we now log a clearer message about why the script can't proceed along with a hint to use the --clear-previous flag.

Prior to this change, the API exception propagated unhandled and the user was faced with a stack trace.
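A rough sketch of the decision described above, kept independent of any Elasticsearch client. The `may_import` function and the message text are hypothetical; how the script actually determines that the model exists is not shown in this PR description, so the result of that check is passed in as a boolean.

```python
import logging

logger = logging.getLogger(__name__)


def may_import(model_id, model_exists, clear_previous):
    """Hypothetical helper: decide whether the import can proceed.

    model_exists   -- result of a prior existence check against the cluster
    clear_previous -- value of the --clear-previous flag
    """
    if not model_exists:
        return True
    if clear_previous:
        logger.info("Trained model '%s' exists and will be replaced", model_id)
        return True
    logger.error(
        "Trained model with id '%s' already exists. "
        "Use --clear-previous to delete it before importing.",
        model_id,
    )
    return False
```

Checking up front lets the script log a clear message and stop, rather than relying on the API call to fail.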

`tqdm` added to main dependencies

If the user doesn't have the eland[pytorch] extras installed, the first module to be reported as missing is tqdm. Since this module is used directly in the eland codebase, it makes sense to me to have it as part of the main set of requirements.

Nit: Set tqdm unit to `parts` in `_pytorch_model.put_model`

The default unit is `it`, but `parts` better describes what the progress bar is tracking: the upload of trained model definition parts.

benwtrent

lgtm

sethmlarson

Looks good, one comment on ignore= being deprecated:

bin/eland_import_hub_model

- Move bulky code out of `__main__`
- Add support for authentication using username/password or api key
- Graceful handling of missing PyTorch requirements
- Graceful handling of already existing trained model
- Use logging instead of print statements
- Make logging a bit more verbose

dolaru changed the title ~~Hub import improvements~~ Hub model import script improvements Apr 21, 2022

dolaru self-assigned this Apr 21, 2022

dolaru added the enhancement label Apr 21, 2022

dolaru force-pushed the hub_import_improvements branch from 05e99c3 to 0a69eb8 April 21, 2022 15:45

dolaru requested review from benwtrent and sethmlarson April 21, 2022 15:49

benwtrent approved these changes Apr 21, 2022

sethmlarson reviewed Apr 27, 2022

bin/eland_import_hub_model (outdated, resolved)

dolaru added 3 commits April 27, 2022 13:30

Use 'parts' as unit for tqdm in _pytorch_model.put_model

7f653c7

Resolve deprecated use of ignore

987b759

dolaru force-pushed the hub_import_improvements branch from 0a69eb8 to 987b759 April 27, 2022 12:32

sethmlarson approved these changes Apr 27, 2022

dolaru merged commit fe34221 into elastic:main Apr 27, 2022

dolaru deleted the hub_import_improvements branch April 27, 2022 14:14

szabosteve mentioned this pull request May 9, 2022

[DOCS] Adds authentication methods to NLP import guide elastic/stack-docs#2132

Merged

lcawl mentioned this pull request May 17, 2022

Add authentication methods for import model script #466

Merged
