feat(google-common): Support get num tokens #7818

hans00 · 2025-03-09T18:29:28Z

Support getNumTokens for Google LLM / chat LLM

vercel · 2025-03-09T18:29:32Z

The latest updates on your projects.

Name	Status	Preview	Comments	Updated (UTC)
langchainjs-docs	✅ Ready (Inspect)	Visit Preview		Mar 9, 2025 6:42pm

1 Skipped Deployment

Name	Status	Preview	Comments	Updated (UTC)
langchainjs-api-refs	⬜️ Ignored (Inspect)			Mar 9, 2025 6:42pm

afirstenberg · 2025-03-10T00:53:18Z

We didn't support this before? Ooof. Good catch!

A few thoughts, however.

This only has an implementation on the connection object.
- This object is lower level and typically not used by developers.
- The connection object is somewhat abstract. It doesn't have specific methods or implementations attached to it.

getNumTokens() is declared on BaseLanguageModelInterface and defined in BaseLanguageModel, so we should implement it in ChatGoogleBase and GoogleBaseLLM.
- It would make sense for both of these to create and call a GoogleNumTokensConnection, or something along those lines, that subclasses GoogleAIConnection (or maybe AbstractGoogleLLMConnection) and overrides buildUrlMethod() to return "countTokens". I'm not sure what, if anything, can be done for Claude or any future API.
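A minimal sketch of the shape described above, assuming stand-in classes rather than the real @langchain/google-common ones. Every `Sketch*` name, the `Transport` type, and the constructor signatures are hypothetical and exist only for illustration; the real buildUrlMethod() signature may differ. The URL pattern and the `totalTokens` response field follow the public Gemini REST countTokens endpoint.

```typescript
// Hypothetical stand-ins: none of these are the library's actual classes.
// Transport is injected so the sketch can run without network access.
type Transport = (url: string, body: unknown) => Promise<unknown>;

abstract class SketchConnection {
  protected model: string;
  protected transport: Transport;

  constructor(model: string, transport: Transport) {
    this.model = model;
    this.transport = transport;
  }

  // Subclasses decide which REST method is appended to the model path.
  protected abstract buildUrlMethod(): string;

  buildUrl(): string {
    const base = "https://generativelanguage.googleapis.com/v1beta/models";
    return `${base}/${this.model}:${this.buildUrlMethod()}`;
  }
}

// Dedicated connection whose only job is to call countTokens.
class SketchNumTokensConnection extends SketchConnection {
  protected buildUrlMethod(): string {
    return "countTokens";
  }

  async countTokens(text: string): Promise<number> {
    const body = { contents: [{ role: "user", parts: [{ text }] }] };
    const response = (await this.transport(this.buildUrl(), body)) as {
      totalTokens?: number;
    };
    if (typeof response.totalTokens !== "number") {
      throw new Error("countTokens response did not include totalTokens");
    }
    return response.totalTokens;
  }
}

// Model-level entry point: getNumTokens() delegates to the connection,
// so callers never touch the connection object directly.
class SketchChatModel {
  private connection: SketchNumTokensConnection;

  constructor(connection: SketchNumTokensConnection) {
    this.connection = connection;
  }

  async getNumTokens(text: string): Promise<number> {
    return this.connection.countTokens(text);
  }
}
```

Keeping the HTTP call behind an injected transport is a choice made for this sketch only; in the real package the connection classes own their request handling and authentication.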

Does this make sense @hans00?
Thoughts @benjamincburns?

benjamincburns · 2025-03-10T01:44:58Z

> We didn't support this before? Ooof. Good catch!

Agreed - thanks for this @hans00!

@afirstenberg I defer to your expertise on the design here. Just for my own education, though: it looks like countTokens is only relevant for LLMs (not text embeddings), so it makes sense to me to move it out of GoogleConnection, as that's used for both embeddings and LLMs. The bit I can't answer for myself, however, is why it wouldn't be appropriate to add a method that calls countTokens directly to AbstractGoogleLLMConnection. That would be reachable via the connection classes that service both ChatGoogleBase and GoogleBaseLLM, but wouldn't expose it to BaseGoogleEmbeddings.

Probably a heavier lift, but Google's recommendation appears to be to use the tokenizer that ships with the Python SDK. I don't suppose there's a JS port of that, is there? If so, it'd be relatively straightforward to set it (or some function that wraps it) as the _encoder field of ChatGoogleBase and GoogleBaseLLM. In theory that would work with the existing machinery in BaseLanguageModel and require even less change.
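To illustrate the local-tokenizer alternative, here is a rough sketch of a model that counts tokens with an encoder when one is set and otherwise falls back to a character-based estimate. `SketchEncoder`, `SketchLanguageModel`, and the toy whitespace encoder are hypothetical; this is not BaseLanguageModel's actual code, and the four-characters-per-token fallback is only an assumed approximation.

```typescript
// Hypothetical encoder shape: anything that turns text into token ids.
interface SketchEncoder {
  encode(text: string): number[];
}

class SketchLanguageModel {
  private encoder?: SketchEncoder;

  constructor(encoder?: SketchEncoder) {
    this.encoder = encoder;
  }

  async getNumTokens(text: string): Promise<number> {
    if (this.encoder) {
      try {
        // Preferred path: count the ids produced by the local tokenizer.
        return this.encoder.encode(text).length;
      } catch {
        // Fall through to the estimate if the tokenizer fails.
      }
    }
    // Assumed fallback: roughly four characters per token.
    return Math.ceil(text.length / 4);
  }
}

// Toy encoder for demonstration only: one "token" per whitespace-separated word.
const sketchWhitespaceEncoder: SketchEncoder = {
  encode: (text: string) =>
    text
      .split(/\s+/)
      .filter((word) => word.length > 0)
      .map((_, index) => index),
};
```

A real port of Google's tokenizer would replace the toy encoder; the point is only that the model class needs no network call on this path.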

benjamincburns · 2025-03-10T01:53:54Z

This came up in a quick search. It doesn't support gemini-1.0-pro-vision, nor does it list support for any of the gemini-2.0 models (although I'm not sure whether that's because those models use a new tokenizer or because the docs haven't been updated).

https://www.npmjs.com/package/@lenml/tokenizer-gemini

jacoblee93

.

benjamincburns

Sorry for the thrashing, but this needs some revision before it can be merged. See discussion on the PR.

feat(google-common): Support get num tokens

7820824

dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Mar 9, 2025

dosubot bot added the auto:improvement label (Medium size change to existing code to handle new use-cases) Mar 9, 2025

vercel bot deployed to Preview – langchainjs-docs March 9, 2025 18:42

jacoblee93 approved these changes Mar 11, 2025

dosubot bot added the lgtm label (PRs that are ready to be merged as-is) Mar 11, 2025

jacoblee93 added the hold label (On hold) and removed the lgtm label (PRs that are ready to be merged as-is) Mar 11, 2025

benjamincburns requested changes Mar 12, 2025
