
[Bug] The chat history content is truncated #2455

Open
TraceIvan opened this issue Mar 3, 2025 · 6 comments

Comments

@TraceIvan

TraceIvan commented Mar 3, 2025

Contact Information

No response

MaxKB Version

v1.10.1

Problem Description

In multi-turn Q&A, if the previous turn's output is very long (more than 100,000 tokens), the content stored in the chat history is truncated and only roughly the first 30,000 characters are retained. It would be helpful if the maximum length of the history record could be adjusted manually.

Steps to Reproduce

After using the DeepSeek model and generating an output of roughly 100,000 tokens, we continued with another round of Q&A and used the Specified Reply component to output the chat history variable. We found that the history content had been truncated.

The expected correct result

No response

Related log output

Additional Information

No response

@Shenguobin0102

Hello, is your environment running a DeepSeek model deployed with Ollama? If so, the next version will support setting the num_ctx parameter on the model, which controls the context length. Please upgrade and try again once the next version is released.
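
For reference, a minimal sketch of how num_ctx is typically raised when calling an Ollama-hosted model through its HTTP API; the host, model name, and token value below are illustrative assumptions, not taken from this issue:

```python
# Hypothetical sketch: requesting a larger context window from Ollama.
# The host, model name, and 32768-token value are example assumptions.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1",
        "prompt": "Continue the previous conversation...",
        "stream": False,
        "options": {"num_ctx": 32768},  # context length in tokens
    },
    timeout=600,
)
print(resp.json()["response"])
```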


@TraceIvan
Author

No, it is deployed with vLLM, and the model's context length is set to 128k. The chat history in the workflow seems to be truncated by the MaxKB backend directly on the component's output (the content shown in the front end is complete), so it should have little to do with the model's context length:

Image
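
For context, a minimal sketch of how a 128k context window is typically configured when serving a model with vLLM's Python API; the model identifier and length are placeholder assumptions, not details from this issue:

```python
# Hypothetical sketch: setting vLLM's context window via max_model_len.
# The model path and 131072-token limit are example values only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1",   # placeholder model identifier
    max_model_len=131072,              # ~128k-token context window
)

params = SamplingParams(max_tokens=4096)
outputs = llm.generate(["Summarize the previous answer."], params)
print(outputs[0].outputs[0].text)
```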


@z514987467

My deployment uses Docker. When will the next version be released? The input tokens keep getting truncated, which really affects the accuracy of the answers.

