Anthropic's "Thinking Mode" API #663

lispy-ai · 2025-02-25T04:03:20Z

lispy-ai
Feb 25, 2025

In light of @mwolson's helpful note today #662 I note also that Claude Sonnet 3.7 makes a critical addition to the anthropic API to enable thinking. It seems to be fully backward compatible.

I am not that knowledgeable of the GPTel code base yet, and I note the general preference to maintain the KISS principle and ensure minimal impact of new changes on existing functionality. With this in mind @karthink, I've started thinking (pun intended) about how one might integrate this new capability into GPTel without actually embarking on trying to do it yet.

Anthropic have done something new here by creating a model with a single API that allows individual calls to scale between a single step inference and a user managed multi-step reasoning response. It is anyone's guess at this point as to whether this idea will catch on to become more universal. On the assumption that it will, this suggestion captures my initial thoughts about how this capability could be integrated into GPTel. I'm interested in know if you think I'm on the right track. If so it might worth a PR.

This capability will likely require:

new request parameters to enable thinking mode
a parameter for controlling the thinking budget and
support for the extended output tokens.

So perhaps something like:

three new model parameters are needed, perhaps
- :thinking-capable flag
- :max-thinking-tokens limit
- :max-output-tokens limit
Another approach is just the second two parameters where thinking capable is implicitly nil when the max thinking tokens is set to zero. That would be my personal preference. This is especially the case if other model vendors choose to implement this form of API to their models. In the contrary perhaps for other reasons it would be important that GPTel has a flag it can test to see whether a model is thinking capable or not.
The gptel--request-data method will need to be modified to add the thinking configuration (when it's enabled).

Perhaps two new customization variables.

(defcustom gptel-thinking-mode nil
  "Whether to enable extended thinking mode for capable models.
Only supported by Claude 3.7 Sonnet and later models at the moment.")

(defcustom gptel-thinking-budget 32000
  "Token budget for extended thinking mode.
Only applicable when `gptel-thinking-mode' is enabled.")

Again, perhaps we only need the second one, where a value zero, implies thinking mode is switched off or unavailable.

add some additions to the gptel-menu menu
- Toggle for thinking mode (maybe, see above)
- An input to specify the thinking budget
- Display current thinking mode status
Changes to how the response is handled for thinking events. This speaks to some related issues and discussions such as Removing <think>....</think> blocks from Deepseek responses #579, Explicit Tool Turns #626, Add response color-coding & role setting for gptel buffer #343, and Include DeepSeek's reasoning_content when present #592. I do not express a strong opinion on this yet.

These changes would enable GPTel users to take advantage of Claude 3.7's extended thinking capabilities while maintaining the existing workflow and interface familiarity.

My goal here is to present a starting point for discussion rather than a complete solution, which I htink is appropriate given GPTel's development philosophy and the novelty of Anthropic's thinking mode approach.

Thoughts?

tshu-w · 2025-02-25T04:58:00Z

tshu-w
Feb 25, 2025

Let me add a small point, I hope gptel can support OpenRouter's "thinking" mode too.

3 replies

karthink Feb 25, 2025
Maintainer

Yes, there are a few different ways that thinking is indicated across the various APIs. I'll try to cover all of them. It'll be a couple of weeks from now.

tshu-w Feb 25, 2025

Take your time ;) I would like to help and submit PRs when I have time (probably in April).

karthink Mar 9, 2025
Maintainer

Done. Openrouter should support gptel-include-reasoning out of the box, i.e. with the usual gptel-make-openai backend definition.

karthink · 2025-02-25T05:08:30Z

karthink
Feb 25, 2025
Maintainer

Thanks for the news and suggestions. I am going to be too busy until the second half of March to work on this. For now:

I added claude-3-7-sonnet-20250219 to the list of Anthropic models in gptel.
Until there is official support for specifying the thinking budget dynamically in gptel, you can set it "statically" by making a new backend.

(gptel-make-anthropic "Claude-thinking"
  :key "your-API-key"
  :stream t
  :models '(claude-3-7-sonnet-20250219)
  :header (lambda () (when-let* ((key (gptel--get-api-key)))
                  `(("x-api-key" . ,key)
                    ("anthropic-version" . "2023-06-01")
                    ("anthropic-beta" . "pdfs-2024-09-25")
                    ("anthropic-beta" . "output-128k-2025-02-19")
                    ("anthropic-beta" . "prompt-caching-2024-07-31"))))
  :request-params '(:thinking (:type "enabled" :budget_tokens 32000)))

Select this backend to use Claude 3.7 Sonnet in "thinking" mode.

The work of including the thinking output in the buffer (or to see it elsewhere) is half done. I should be able to finish it in the first half of March.

9 replies

wlauppe Feb 25, 2025

I was pointed from
#662
to this discussions, thanks!

with the suggestion from Karthik i was able to get Claude 3.7 working thinking mode.
thats the config i added:

(gptel-make-anthropic "Claude Thinking"
  :key gptel-api-key
  :stream t
  :models '(claude-3-7-sonnet-20250219)
  :header (lambda () (when-let* ((key (gptel--get-api-key)))
                       `(("x-api-key" . ,key)
                          ("anthropic-version" . "2023-06-01")
                          ("anthropic-beta" . "pdfs-2024-09-25")
                          ("anthropic-beta" . "output-128k-2025-02-19")
                          ("anthropic-beta" . "prompt-caching-2024-07-31"))))
  :request-params '(:thinking (:type "enabled" :budget_tokens 1024)))

@claytharrison
this is an additional block of configuration in your emacs-config, after that you can either chose "Claude" or "Claude Thinking" in "gptel-menu", If you access sonnet 3.7 via the old "claude" config, you still have the option to get the new model, with extended thinking feature turned off.

Two things i realized: :

Thoughts should be optionally visible
there ar now new "thinking blocks"

you can view them in the gptel-log but they are not exposed to the ui.
I think they should be (optionally) be exposed. Maybe a simple solution is good for starters, like the

<think> Thought1 Thought2 ... </think>

solution with the Deepseek-R1-Model
as i come accustomed to using Deepseek-R1 I realized, that often the toughts are more important than the final response.

extended_thinking_budget_tokens should have a reasonable default in percentage of the max tokens (maybe 30%?)
otherwise you get a lot of error messages, if you change one of these parameters, that they exceed your max tokens, and apparently the minimum value is 1024.

lispy-ai Feb 25, 2025
Author

Have a look at PR #592

wlauppe Feb 25, 2025

thanks for the link

ekenberg Feb 25, 2025

@karthink thank you!

How would one specify a separate max_tokens within gptel-make-anthropic? Tried :max_tokens but that didn't work. Not sure I want to globally setq max_tokens to something high for all defined models just to give room for Claude Thinking?

karthink Feb 25, 2025
Maintainer

@ekenberg

:request-params '(:thinking (:type "enabled" :budget_tokens 1024)
                  :max_tokens 4096)

karthink · 2025-02-25T08:24:40Z

karthink
Feb 25, 2025
Maintainer

So to be clear, if I don't want to use thinking mode, just swap out my model to `claude-3-7-sonnet-20250219` and use as usual?

Yes, there's no change to how gptel or the Anthropic API works out of the box.

0 replies

karthink · 2025-03-08T19:51:56Z

karthink
Mar 8, 2025
Maintainer

Including the thinking block with the response of claude-3-7-sonnet-20250219 is now implemented.

The user option is gptel-include-reasoning, and it should work for all backends/models that include reasoning blocks (except Openrouter, I think). There are many supported behaviors for this block. As with most options you can set it from the transient menu with whatever scope you need.

Please update and test.

0 replies

karthink · 2025-03-09T01:00:50Z

karthink
Mar 9, 2025
Maintainer

I've added support for handling reasoning blocks for Deepseek and OpenRouter as well.

If you use any API that produces reasoning content that you can't see or handle in gptel, please let me know.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Anthropic's "Thinking Mode" API #663

{{title}}

Replies: 5 comments 12 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Anthropic's "Thinking Mode" API #663

lispy-ai Feb 25, 2025

Replies: 5 comments · 12 replies

tshu-w Feb 25, 2025

karthink Feb 25, 2025 Maintainer

tshu-w Feb 25, 2025

karthink Mar 9, 2025 Maintainer

karthink Feb 25, 2025 Maintainer

wlauppe Feb 25, 2025

lispy-ai Feb 25, 2025 Author

wlauppe Feb 25, 2025

ekenberg Feb 25, 2025

karthink Feb 25, 2025 Maintainer

karthink Feb 25, 2025 Maintainer

karthink Mar 8, 2025 Maintainer

karthink Mar 9, 2025 Maintainer

lispy-ai
Feb 25, 2025

Replies: 5 comments 12 replies

tshu-w
Feb 25, 2025

karthink Feb 25, 2025
Maintainer

karthink Mar 9, 2025
Maintainer

karthink
Feb 25, 2025
Maintainer

lispy-ai Feb 25, 2025
Author

karthink Feb 25, 2025
Maintainer

karthink
Feb 25, 2025
Maintainer

karthink
Mar 8, 2025
Maintainer

karthink
Mar 9, 2025
Maintainer