[BUG] Server can't handle two streaming connections at the same time #897
Comments
I found this server parameter: llama-cpp-python/llama_cpp/server/app.py, lines 165 to 168 (commit 8207280).
Changing it to false resolved the issue — streaming now works from two terminals.
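For reference, here is a minimal sketch of the kind of setting being discussed. The field name `interrupt_requests` is an assumption (the quoted snippet from app.py is not reproduced in this thread), and a plain dataclass stands in for the server's actual pydantic settings model:

```python
from dataclasses import dataclass


@dataclass
class ServerSettings:
    # Sketch only: the real server declares this with a pydantic Field.
    # When True, a new incoming request interrupts any in-flight streaming
    # response; when False, the new request waits for the stream to finish.
    interrupt_requests: bool = True


# The workaround described in this thread: disable interruption.
settings = ServerSettings(interrupt_requests=False)
```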
@ArtyomZemlyak That should be better documented, or maybe not be the default behaviour. I'm currently working on #771, which will improve this by allowing multiple requests to be processed efficiently in parallel.
Agreed — with the current behaviour, no response can be sent until the streaming request has finished.
@abetlen Is there a way to enable concurrent stream responses? This parameter only makes the other client idle until the first one finishes.
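The "idle until the first one finishes" behaviour can be illustrated with a generic sketch (not llama-cpp-python's actual code): when all requests share a single model lock, the second streaming client receives nothing until the first stream is fully drained.

```python
import threading

model_lock = threading.Lock()  # stand-in for the single shared llama context
events = []                    # records the order in which chunks are produced


def stream_response(client: str, n_chunks: int = 3) -> None:
    # The whole generation holds the lock, so streams are serialized:
    # a second client blocks here until the first stream completes.
    with model_lock:
        for i in range(n_chunks):
            events.append((client, i))


t1 = threading.Thread(target=stream_response, args=("client-1",))
t2 = threading.Thread(target=stream_response, args=("client-2",))
t1.start(); t2.start()
t1.join(); t2.join()

# All chunks from whichever client acquired the lock first appear
# before any chunk from the other client.
first = events[0][0]
assert all(client == first for client, _ in events[:3])
```

True concurrency (as proposed in #771) would instead interleave chunks from both clients, e.g. via continuous batching inside the model loop rather than a coarse per-request lock.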
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
Current Behavior
ValueError: invalid literal for int() with base 16: b''
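This client-side error can be reproduced in isolation. Python's `http.client` parses each chunk-size line of a chunked HTTP response as hexadecimal; a stream that is cut off mid-transfer yields an empty line, which produces exactly this `ValueError`:

```python
# Minimal reproduction of the client-side error: an abruptly terminated
# chunked response leaves an empty chunk-size line, and parsing it as
# base-16 raises the ValueError seen in the traceback above.
try:
    int(b"", 16)
except ValueError as e:
    print(e)  # invalid literal for int() with base 16: b''
```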
Environment and Context
Failure Information (for bugs)
Error on client side (first terminal):
Error on server side: