langchain: ConversationalRetrievalChain doesn't return score with sources

System Info

LangChain v0.0.171, ChromaDB v0.3.22, Python v3.10.11

Who can help?

No response

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

This is my code:

def askQuestion(self, collection_id, question):
    collection_name = "collection-" + str(collection_id)
    self.llm = ChatOpenAI(
        model_name=self.model_name,
        temperature=self.temperature,
        openai_api_key=os.environ.get('OPENAI_API_KEY'),
        streaming=True,
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
    )
    self.memory = ConversationBufferMemory(
        memory_key="chat_history", return_messages=True, output_key='answer'
    )

    chroma_Vectorstore = Chroma(
        collection_name=collection_name,
        embedding_function=self.embeddingsOpenAi,
        client=self.chroma_client,
    )

    self.chain = ConversationalRetrievalChain.from_llm(
        self.llm,
        chroma_Vectorstore.as_retriever(),
        return_source_documents=True,
        verbose=VERBOSE,
        memory=self.memory,
    )

    result = self.chain({"question": question})
    print(result)

    res_dict = {
        "answer": result["answer"],
    }

    res_dict["source_documents"] = []

    for source in result["source_documents"]:
        res_dict["source_documents"].append({
            "page_content": source.page_content,
            "metadata": source.metadata,
        })

    return res_dict

Expected behavior

When I print the result directly after result = self.chain({"question": question}), I can see the sources, metadata, kwargs, question, and chat_history.
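
Roughly, the printed result has this shape (a sketch, with values elided):

{
    "question": "...",
    "chat_history": [...],
    "answer": "...",
    "source_documents": [Document(page_content="...", metadata={...}), ...],
}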

I see here, https://github.com/hwchase17/langchain/blob/0c3de0a0b32fadb8caf3e6d803287229409f9da9/langchain/vectorstores/chroma.py#L165, and at line 182 in the official source code, that similarity_search_with_score() is being called by default.
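
For reference, a quick sketch of calling that method directly on the vector store from my code above, which returns a list of (Document, score) tuples:

docs_and_scores = chroma_Vectorstore.similarity_search_with_score(question, k=4)
for doc, score in docs_and_scores:
    # score here is the raw distance reported by Chroma (lower is more similar)
    print(score, doc.metadata)

The chain, however, only hands the documents back to me, without the scores.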

How can I also display the score, then?

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 21 (3 by maintainers)

Most upvoted comments

@hwchase17 @agola11 @vowelparrot

Should we add a flag to ConversationalRetrievalChain, like similarity = True, to get similarity scores back from the chain? I can raise a PR; can you assign this issue to me?

Here’s my hacky solution. It’s a retriever that saves the score as part of each document’s metadata. I just took the original _get_relevant_documents and rewrote the part @khaledalarja pointed us at in his comment.

from typing import List

from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.schema.document import Document
from langchain.schema.vectorstore import VectorStoreRetriever

class MyVectorStoreRetriever(VectorStoreRetriever):
    # See https://github.com/langchain-ai/langchain/blob/61dd92f8215daef3d9cf1734b0d1f8c70c1571c3/libs/langchain/langchain/vectorstores/base.py#L500
    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        docs_and_similarities = (
            self.vectorstore.similarity_search_with_relevance_scores(
                query, **self.search_kwargs
            )
        )

        # Make the score part of the document metadata
        for doc, similarity in docs_and_similarities:
            doc.metadata["score"] = similarity

        # Return only the documents; the underscore is where the score is
        # normally discarded, but it now survives inside the metadata.
        docs = [doc for doc, _ in docs_and_similarities]
        return docs

Instead of calling vectordb.as_retriever(), I instantiate the retriever directly:

retriever = MyVectorStoreRetriever(
    vectorstore=vectordb,
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": similarity_threshold, "k": 3},
)
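
To tie this back to the original question, here is a minimal sketch of plugging the custom retriever into the chain (assuming the same llm and memory setup as in the reproduction code above):

self.chain = ConversationalRetrievalChain.from_llm(
    self.llm,
    retriever,
    return_source_documents=True,
    memory=self.memory,
)

# Each source document the chain returns now carries its
# relevance score in doc.metadata["score"].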

Good luck 😃

This issue should be reopened; it is still not solved. PR #5655 is not related to the behavior of ConversationalRetrievalChain, it is only specific to Chroma. The problem here is that the scores are discarded with the _ in the list comprehension that returns the documents (docs = [doc for doc, _ in ...]).

@KeshavSingh29 great question - the short answer is we’re working with the maintainers of the vector stores towards that goal.

The longer answer is that each of the vector stores uses a different distance or similarity function to compute scores (and these are also frequently sensitive to the embeddings you’re using). Sometimes it’s a distance metric (a lower number means more similar), while less often it’s a similarity metric (a higher number means more similar), and the results are usually unbounded.

We could add a default implementation that tries to capture the most common use cases, but we have refrained from putting a ‘one size fits all’ normalization function in the base class to avoid providing erroneous results.
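
To make the problem concrete, here is an illustrative sketch (my own helper names, not LangChain API) of how two common metrics need different mappings into a comparable 0-to-1 relevance range, with assumptions about the embeddings baked into each formula:

def relevance_from_cosine_similarity(sim: float) -> float:
    # Cosine similarity is bounded in [-1, 1]; higher means more similar.
    return (sim + 1.0) / 2.0

def relevance_from_euclidean_distance(dist: float) -> float:
    # Euclidean distance between unit-normalized vectors is bounded in
    # [0, 2]; lower means more similar. With un-normalized embeddings the
    # distance is unbounded and this mapping silently produces wrong
    # (even negative) scores.
    return 1.0 - dist / 2.0

A single default normalization in the base class would have to guess which of these situations it is in.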

Probably a dumb question, but what version of which package do I need to have for this to work? I’ve just done a pip install yesterday and it’s still not returning scores.

Took me a bit to find the correct imports:

from langchain.schema.vectorstore import VectorStoreRetriever
from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.schema.document import Document
from typing import List
...

Is this resolved? I am trying to get a score when using ConversationalRetrievalChain with the FAISS vector store.

@vowelparrot, I am using ConversationalRetrievalChain to chat over multiple files (some of them PDF, some docx, txt, …). So it is very important for me to also return the source, so I can easily tell which file the answer came from. This works great when I ask a question related to the content of one of the documents. But if the question is very different, like ‘What is the distance from the Moon to Mars?’ or ‘Write me a song about …’, the model still gives me a source, and that source is simply the first embedding in the database. Technically this is not ‘wrong’, because it is still a source, but its distance from the query vector is very, very large.

This is where the score works wonders: I could then easily filter out sources with a low score before displaying them.
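
For instance, with the score stored in the metadata as in MyVectorStoreRetriever above, such a filter could look like this (MIN_SCORE is a hypothetical threshold that depends on your embeddings and distance metric):

MIN_SCORE = 0.75  # hypothetical; tune for your embeddings and metric

filtered_sources = [
    source for source in res_dict["source_documents"]
    if source["metadata"].get("score", 0.0) >= MIN_SCORE
]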

I hope this will be a good enough answer for you to have @AvikantSrivastava raise the PR and integrate this functionality.

I wish all of you a nice day.