Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AI可观测插件未显示首个token的rt(流式请求)的AI监控面板 #1835

Open
5309609 opened this issue Feb 28, 2025 · 7 comments
Assignees

Comments

@5309609
Copy link

5309609 commented Feb 28, 2025

在k8s环境中,使用OLLAMA发布deepseek,配置higress路由,并在策略中开启AI统计插件。
由jmeter向deepseek路由发送流式请求压测,在AI监控面板上未找到首个token的rt(流式请求)的相关信息,并且在发送流式请求的时候token per second,input token per second ,output token per second 也无数据展示

@rinfx
Copy link
Collaborator

rinfx commented Feb 28, 2025

重新获取下最新的proxy插件再试试,或者手动在请求body里面添加一下以下内容试试

"stream_options": {
    "include_usage": true
}

@5309609
Copy link
Author

5309609 commented Feb 28, 2025

@rinfx 加了配置后token per second,input token per second ,output token per second 三个监控有数据了。
但是依然没找到首个token的rt(流式请求)的相关信息。

Image

@5309609
Copy link
Author

5309609 commented Feb 28, 2025

higress是helm重新安装的higress-2.0.5,插件应再重新安装后会重新拉取吧。

Image

@johnlanni
Copy link
Collaborator

@rinfx console 里的 dashboard json文件要更新下

@5309609
Copy link
Author

5309609 commented Feb 28, 2025

OLLAMA发的32b的模型,流式请求里面加了usage配置后请求返回有了usage,但是监控面板tps还是0。

Image

Image

@5309609
Copy link
Author

5309609 commented Feb 28, 2025

使用vllm部署的deepseekcode模型,流式请求就有数据。

Image

Image

@5309609
Copy link
Author

5309609 commented Feb 28, 2025

@rinfx console 里的 dashboard json文件要更新下

最新的文件能发出来,然后我本地直接替换吗

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants