langchain: Total token count of openai callback does not count embedding usage
When using embeddings, the `total_tokens` count of a callback is wrong; e.g., the following example currently returns 0 even though it shouldn't:
```python
from langchain.callbacks import get_openai_callback
from langchain.embeddings import OpenAIEmbeddings

with get_openai_callback() as cb:
    embeddings = OpenAIEmbeddings()
    embeddings.embed_query("helo")
print(cb.total_tokens)
```
IMO this is confusing, and there is currently no way to get the cost from the embeddings class at all.
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 16
- Comments: 23 (6 by maintainers)
Commits related to this issue
- #945 implemented total_tokens for OpenAI embeddings — committed to benheckmann/langchain by benheckmann a year ago
- #945 added tests for callback handler token count for embeddings and ChatOpenAI (non-streaming) — committed to benheckmann/langchain by benheckmann a year ago
- #945 base class wrapper around embed_documents and embed_query to implement caching callbacks in — committed to benheckmann/langchain by benheckmann a year ago
This is a quick workaround for those who want to count tokens while waiting for the PR to be merged. Note that `tiktoken` is a library from OpenAI.

Currently, counting tokens via embeddings is not supported; this is a good feature request, though.
Definitely a vote up. Embeddings on OpenAI are more expensive than regular prompts/chats for our use cases, so this visibility is needed.
Voting up; this feature would be really valuable for tracking the cost of computing embeddings and executing vector store operations.
@thaiminhpv appreciate the workaround! Noob question: if I'm doing something like…

What would I pass into your function `num_tokens_from_string`? Iterate through `texts` and pass each one in? And those would just be the prompt tokens, right? How could I estimate the completion tokens used in the above call? Thanks!