Eval bug: Jinja parser not working with QwQ-32B #12231

Closed
Edremon opened this issue Mar 6, 2025 · 4 comments · Fixed by #12235
Labels
bug Something isn't working

Comments

Edremon commented Mar 6, 2025

Name and Version

version: 4842 (3d652bf)
built with cc (GCC) 14.2.1 20250207 for x86_64-pc-linux-gnu

Operating systems

Linux

GGML backends

CPU

Hardware

Irrelevant

Models

bartowski/Qwen_QwQ-32B-GGUF

Problem description & steps to reproduce

I would like to use --jinja to have the thinking separated, but it doesn't work with this model. I initially thought the problem was specific to that GGUF, but I tried copying the official Jinja template and got the same error. The template is very similar to the bundled Qwen-Qwen2.5-7B-Instruct.jinja (which works); the differences are:

diff --git a/Qwen-Qwen2.5-7B-Instruct.jinja b/QwQ-official.jinja
index bdf7919a..8844e349 100644
--- a/Qwen-Qwen2.5-7B-Instruct.jinja
+++ b/QwQ-official.jinja
@@ -3,7 +3,7 @@
     {%- if messages[0]['role'] == 'system' %}
         {{- messages[0]['content'] }}
     {%- else %}
-        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
+        {{- '' }}
     {%- endif %}
     {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
     {%- for tool in tools %}
@@ -14,14 +14,16 @@
 {%- else %}
     {%- if messages[0]['role'] == 'system' %}
         {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
-    {%- else %}
-        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
-    {%- endif %}
+  {%- endif %}
 {%- endif %}
 {%- for message in messages %}
-    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
         {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" and not message.tool_calls %}
+        {%- set content = message.content.split('</think>')[-1].lstrip('\n') %}
+        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
     {%- elif message.role == "assistant" %}
+        {%- set content = message.content.split('</think>')[-1].lstrip('\n') %}
         {{- '<|im_start|>' + message.role }}
         {%- if message.content %}
             {{- '\n' + message.content }}
@@ -50,5 +52,5 @@
     {%- endif %}
 {%- endfor %}
 {%- if add_generation_prompt %}
-    {{- '<|im_start|>assistant\n' }}
+    {{- '<|im_start|>assistant\n<think>\n' }}
 {%- endif %}

I have tried tweaking it a bit. I initially thought that lstrip wasn't supported, since no other bundled template uses it, but even after removing those calls I get the same error.
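
For reference, the expression the QwQ template adds, message.content.split('</think>')[-1].lstrip('\n'), relies on two string methods. Below is a minimal Python sketch of what it is expected to compute, assuming the template engine follows Python's str semantics. The function name strip_reasoning and the sample messages are made up for illustration and are not part of the template or of llama.cpp.

```python
def strip_reasoning(content: str) -> str:
    """Mirror the template expression:
    content.split('</think>')[-1].lstrip('\n')
    """
    # Keep only the text after the last closing think tag,
    # then drop any newlines that directly follow it.
    return content.split("</think>")[-1].lstrip("\n")


# Hypothetical assistant messages, with and without a reasoning block.
with_thinking = "<think>\nLet me add the numbers.\n</think>\n\nThe answer is 4."
without_thinking = "The answer is 4."

print(repr(strip_reasoning(with_thinking)))
print(repr(strip_reasoning(without_thinking)))
```

When the message contains no </think> tag, split returns a single-element list, so the content passes through unchanged apart from leading newlines.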

First Bad Commit

No response

Relevant log output

srv    load_model: load_model: The chat template that comes with this model is not yet supported, falling back to chatml. This may cause the model to output suboptimal responses

Contributor

CISC commented Mar 6, 2025

I think the problem is that minja doesn't support str.split (nor str.lstrip I guess).

cc @ochafik

Author

Edremon commented Mar 6, 2025

The R1 Jinja templates have set content = content.split('</think>')[-1] in them. I haven't tested whether they work, but why would they be included if they don't?

Contributor

CISC commented Mar 6, 2025

I guess it would be easier to tell what the problem is if the server also output the original runtime error...

Collaborator

ochafik commented Mar 6, 2025

@Edremon thanks for the report & @CISC for the ping :-)

Adding the missing str methods to Minja in google/minja#56
