
# [hf inference] Implement HuggingFaceImage2TextRemoteInference #1018

Merged 1 commit into main on Jan 25, 2024

## Conversation

**@rholinshead** (Member) commented Jan 25, 2024

### [hf inference] Implement HuggingFaceImage2TextRemoteInference

Implement `HuggingFaceImage2TextRemoteInference` for running HuggingFace image-to-text models via the inference API for Gradio. The API takes in an image of various supported types (`Union[str, Path, bytes, BinaryIO]`). For now, only path and URI support is implemented, since the other input types aren't needed for Gradio.

<img width="1323" alt="Screenshot 2024-01-24 at 7 31 23 PM" src="https://github.com/lastmile-ai/aiconfig/assets/5060851/3581191f-3295-4dc0-b455-d2e613179639">

## Testing:

Build/install the local HuggingFace package with these changes:

```
(hf) ryanholinshead@Ryans-MacBook-Pro aiconfig % cd extensions/HuggingFace
pip3 install build && cd python && python -m build && pip3 install dist/*.whl
pip3 install -e .
```

Register these parsers in `/Users/ryanholinshead/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py`, then run the aiconfig editor with the local parsers and Gradio config:

```
aiconfig_path=/Users/ryanholinshead/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json

(hf) ryanholinshead@Ryans-MacBook-Pro aiconfig % parsers_path=/Users/ryanholinshead/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py

(hf) ryanholinshead@Ryans-MacBook-Pro aiconfig % aiconfig edit --aiconfig-path=$aiconfig_path --server-mode=debug_servers --parsers-module-path=$parsers_path
```
- Ensure the model works in the config. Ensure changing settings works and is persisted to aiconfig.
- Test that setting a fake model ('test') propagates the expected error.
- Test that setting an invalid api_token in the InferenceOptions in run server-side propagates the expected error.
- Set to my own key and ensure execution works.

Stack created with Sapling. Best reviewed with ReviewStack.

@rholinshead rholinshead marked this pull request as ready for review January 25, 2024 00:45
Ankush-lastmile pushed a commit that referenced this pull request Jan 25, 2024
Implementation of the HuggingFaceAutomaticSpeechRecognition model parser, using the inference endpoint to run inference. The Python API takes in bytes as well as a path; skipping BinaryIO for now.

Very similar to #1018

## Testplan
<img width="1000" alt="Screenshot 2024-01-24 at 10 37 05 PM" src="https://github.com/lastmile-ai/aiconfig/assets/141073967/808956ce-e3be-4528-9f34-c8d31d704ddb">

1. Temporarily add model parser to Gradio Cookbook model parser registry.
```
    asr = HuggingFaceAutomaticSpeechRecognitionRemoteInference()
    AIConfigRuntime.register_model_parser(
        asr, asr.id()
    )
```

2. Run AIConfig Edit on the Gradio example:

`python3 -m 'aiconfig.scripts.aiconfig_cli' edit --aiconfig-path=cookbooks/Gradio/huggingface.aiconfig.json --parsers-module-path=cookbooks/Gradio/hf_model_parsers.py --server-mode=debug_servers`
**Member** commented on:

```
# Translation api doesn't support stream
```

nit: can remove


**@Ankush-lastmile** (Member) commented Jan 25, 2024 on:

```
# HuggingFace image_to_text outputs should only ever be string
# format so shouldn't get here, but just being safe
return json.dumps(output_data, indent=2)
```

If it's not a string, it might make more sense to raise an exception here instead.
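The reviewer's suggestion could look like the following hypothetical sketch; the function name and surrounding parser code are assumptions for illustration, not the PR's actual implementation:

```python
def get_output_text(output_data):
    # HuggingFace image_to_text outputs should only ever be strings,
    # so anything else indicates a bug upstream. Per the review
    # suggestion, raise instead of silently serializing to JSON.
    if isinstance(output_data, str):
        return output_data
    raise ValueError(
        f"Expected image_to_text output to be a string, got {type(output_data)}"
    )
```

This fails loudly on unexpected output types instead of masking them, which is the behavior the follow-up commit adopts.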

**@Ankush-lastmile** (Member) left a comment:

lgtm, surprising that the api only takes in one image input.

**@rholinshead** (Member, Author):

Changes from review:

  • remove copy-pasted comment about translation api
  • raise ValueError if output data type isn't string, instead of silently failing by dumping to json

rholinshead added a commit that referenced this pull request Jan 25, 2024
# HuggingFaceImage2TextRemoteInferencePromptSchema

From
https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/inference/_client.py#L731,
only 'model' is supported in settings. For input, multiple image types
are supported (but only one image), so the `image/*` mimetype is used to
support all subtypes.

<img width="1323" alt="Screenshot 2024-01-24 at 7 31 23 PM"
src="https://github.com/lastmile-ai/aiconfig/assets/5060851/73891022-1270-45f5-a888-69afca3651cc">
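As a rough illustration of the shape described above, the schema might look something like this hedged sketch; aside from the `model` setting and the `image/*` mime type, the key names here are assumptions, not the actual HuggingFaceImage2TextRemoteInferencePromptSchema:

```python
# Hypothetical sketch only: the real schema lives in the aiconfig
# HuggingFace extension, and its exact structure may differ.
HuggingFaceImage2TextRemoteInferencePromptSchema = {
    # Input: a single image attachment; any image subtype is accepted,
    # hence the wildcard "image/*" mime type.
    "input": {
        "attachments": {
            "mime_types": ["image/*"],
            "max_items": 1,
        },
    },
    # Settings: per the linked huggingface_hub client, only "model"
    # is supported.
    "model_settings": {
        "model": {"type": "string"},
    },
}
```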

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/lastmile-ai/aiconfig/pull/1019).
* __->__ #1019
* #1018
**@rholinshead** merged commit b9ae489 into main on Jan 25, 2024 (1 check passed)
Ankush-lastmile added a commit that referenced this pull request Jan 25, 2024

[hf inference] ASR remote inference model parser impl