langchain: Total token count of openai callback does not count embedding usage

When using embeddings, the `total_tokens` count of the OpenAI callback is wrong; e.g. the following example currently prints 0 even though tokens were consumed:


from langchain.callbacks import get_openai_callback
from langchain.embeddings import OpenAIEmbeddings

with get_openai_callback() as cb:
    embeddings = OpenAIEmbeddings()
    embeddings.embed_query("hello")
    print(cb.total_tokens)  # prints 0, even though the query was tokenized and embedded

IMO this is confusing (and there is no way to get the cost from the embeddings class at the moment).

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 16
  • Comments: 23 (6 by maintainers)


Most upvoted comments

For those looking for a quick workaround to count tokens while waiting for the PR to be merged:

import tiktoken

def num_tokens_from_string(string: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding("cl100k_base")
    num_tokens = len(encoding.encode(string))
    return num_tokens

num_tokens_from_string("tiktoken is great!")

Note that tiktoken is a library from OpenAI.
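Building on that workaround, a cost estimate can be sketched by multiplying the token count by a per-token rate. The helper below is hypothetical and the price constant is an assumption (check OpenAI's pricing page for the current figure); the token counter is passed in so any counter, such as the tiktoken-based `num_tokens_from_string` above, can be plugged in:

```python
from typing import Callable, Iterable

# Assumed rate (USD per 1K tokens) - check OpenAI's pricing page
# for the current figure for your embedding model.
PRICE_PER_1K_TOKENS = 0.0001

def estimate_embedding_cost(
    texts: Iterable[str],
    count_tokens: Callable[[str], int],
) -> tuple[int, float]:
    """Return (total_tokens, estimated_cost_usd) for a batch of texts.

    `count_tokens` is any per-string token counter, e.g. the
    tiktoken-based helper above.
    """
    total = sum(count_tokens(t) for t in texts)
    return total, total / 1000 * PRICE_PER_1K_TOKENS
```

Note this only estimates the cost; the actual billed count comes back in the API response, which is exactly what the callback should be surfacing.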

currently, counting tokens for embeddings is not supported; this is a good feature request though

Definitely a vote up. For our use cases, OpenAI embeddings are more expensive than regular prompts/chats, so this visibility is needed.

Voting up; this feature would be really valuable for tracking the cost of computing embeddings and executing vector store operations.

@thaiminhpv appreciate the workaround! Noob question:

If I’m doing something like…

db = Chroma.from_documents(texts, embeddings)

What would I pass into your function num_tokens_from_string? Iterate through texts and pass each one in? And those would just be the prompt tokens, right? How could I estimate the completion tokens used in the above call?

Thanks!
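A sketch of the iterate-and-sum approach the question describes. The helper below is hypothetical: it assumes `texts` holds either plain strings or Document-like objects with a `page_content` attribute (as LangChain loaders return), and takes the token counter as a parameter. Note that embedding calls only consume input tokens, so there are no completion tokens to estimate; the prompt-token total is the whole usage.

```python
def total_prompt_tokens(texts, count_tokens):
    """Sum prompt tokens over the documents passed to Chroma.from_documents.

    `texts` may contain plain strings or Document-like objects with a
    .page_content attribute; `count_tokens` is any per-string counter,
    e.g. the tiktoken-based helper above.
    """
    total = 0
    for doc in texts:
        # Fall back to the item itself when it is already a string.
        content = getattr(doc, "page_content", doc)
        total += count_tokens(content)
    return total
```

Passing each document's text through the counter and summing gives the prompt-token estimate for the whole `from_documents` call.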