Async HTTP Python Client not working properly #6887

Closed
mutkach opened this issue Feb 15, 2024 · 2 comments · Fixed by #6975
Comments


mutkach commented Feb 15, 2024

Description

Requesting a model config with the async HTTP client fails with UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa6 in position 5: invalid start byte.

When inspecting the HTTP response directly with tritonclient.http.aio (without the wrapper), I noticed that the async client does not decompress the HTTP response by itself, so the issue seems fixable by setting auto_decompress=True on the aiohttp.ClientSession. I believe that in my case the compression is imposed elsewhere (by nginx?).

If that was intended (since auto_decompress=True is the default setting), then some additional logic is required to process the compressed response; calling brotli.decompress() fixed it in my case, roughly as sketched below.
In any case I'll be happy to provide the necessary fixes.
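
For reference, here is roughly the workaround I ended up with (a minimal sketch, not the exact code: it assumes a brotli-compressed body from the proxy and queries Triton's HTTP config endpoint directly with aiohttp):

import json

import aiohttp
import brotli  # pip install Brotli


async def fetch_model_config(host: str, model_name: str) -> dict:
    # Triton's HTTP/REST endpoint for the model configuration.
    url = f"http://{host}/v2/models/{model_name}/config"
    # auto_decompress=False reproduces what I see through the async client:
    # the body arrives still compressed.
    async with aiohttp.ClientSession(auto_decompress=False) as session:
        async with session.get(url) as resp:
            body = await resp.read()
            if resp.headers.get("Content-Encoding") == "br":
                body = brotli.decompress(body)
            return json.loads(body)

# e.g. asyncio.run(fetch_model_config("HOST:80", "intent_classifier"))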

Triton Information

  • Triton server version 2.38.0
  • Triton container 23.09
  • tritonclient==2.33.0, 2.42.0
  • nvidia-pytriton==0.2.5, 0.3.0 and 0.5.1 when built from source

To Reproduce

import asyncio
import numpy
from pytriton.client import AsyncioModelClient, ModelClient
from pytriton.client.utils import get_model_config

HOST = "HOST:80"
model_name = "intent_classifier"  # config request fails via the async path below
message = "Simple and correct query for testing"

async def run_classification(inferer_server_endpoint: str, clf_model_name: str, message: str, **_):
    async with AsyncioModelClient(inferer_server_endpoint, clf_model_name) as client:
        inference_client = client.create_client_from_url(inferer_server_endpoint)
        config = await inference_client.get_model_config(clf_model_name)
        print(config)

def sync_classification(inferer_server_endpoint: str, clf_model_name: str, message: str, **_):
    with ModelClient(inferer_server_endpoint, clf_model_name) as client:
        inference_client = client.create_client_from_url(inferer_server_endpoint)
        config = inference_client.get_model_config(clf_model_name)
        print(config)

if __name__ == "__main__":
    sync_classification(HOST, model_name, message)  # Works
    # asyncio.run(run_classification(HOST, model_name, message))  # Does not

The model config is attached:
classifier_config.txt

Expected behavior
The model configuration should be returned as valid JSON that can be parsed into a Python dict.

kthui (Contributor) commented Feb 15, 2024

Thanks for reporting the issue. I have filed a ticket for us to investigate further.

> In any case I'll be happy to provide the necessary fixes.

Any contribution is welcome!

kthui (Contributor) commented Mar 9, 2024

Hi @mutkach, I took a deeper look into the Python AsyncIO client, and it seems like we already have decompression built in. When calling the async infer(), it will (sketched below):

  1. read the Content-Encoding header from the response headers,
  2. pass the Content-Encoding header to the InferResult class that reads the response body,
  3. auto-decompress the response body in InferResult based on the Content-Encoding header set by the server.
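
Roughly, the flow looks like this (a simplified, hypothetical sketch, not the actual tritonclient source; the class and function names are illustrative):

import gzip
import json

import brotli  # used when Content-Encoding is "br"


class InferResultSketch:
    """Illustrative stand-in for the InferResult handling described above."""

    def __init__(self, body: bytes, content_encoding):
        # Auto-decompress the response body based on the Content-Encoding
        # header set by the server.
        if content_encoding == "gzip":
            body = gzip.decompress(body)
        elif content_encoding == "br":
            body = brotli.decompress(body)
        self.response = json.loads(body)


def handle_infer_response(headers, body: bytes) -> InferResultSketch:
    # Read the Content-Encoding header from the response headers and pass it
    # to the result class, which reads and decompresses the body.
    content_encoding = headers.get("Content-Encoding")
    return InferResultSketch(body, content_encoding)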

> I believe in my case the compression is imposed elsewhere (by nginx?).

Would you be able to share the response headers received, when encountering this issue?
