langchain: Pinecone retriever throwing: KeyError: 'text'
My query code is below:
# Imports assumed for langchain 0.0.147 and pinecone-client 2.2.1
import os
import pinecone
from langchain.agents import Tool, initialize_agent
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.memory import ConversationBufferWindowMemory
from langchain.vectorstores import Pinecone

pinecone.init(
    api_key=os.environ.get('PINECONE_API_KEY'),  # app.pinecone.io
    environment=os.environ.get('PINECONE_ENV')  # next to API key in console
)
index = pinecone.Index(index_name)
embeddings = OpenAIEmbeddings(openai_api_key=os.environ.get('OPENAI_API_KEY'))
vectordb = Pinecone(
    index=index,
    embedding_function=embeddings.embed_query,
    text_key="text",  # metadata key the retriever reads the document text from
)
llm = ChatOpenAI(
    openai_api_key=os.environ.get('OPENAI_API_KEY'),
    temperature=0,
    model_name='gpt-3.5-turbo'
)
retriever = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectordb.as_retriever()
)
tools = [Tool(
    func=retriever.run,
    description=tool_desc,
    name='Product DB'
)]
memory = ConversationBufferWindowMemory(
    memory_key="chat_history",  # important to align with agent prompt (below)
    k=5,
    return_messages=True
)
agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method="generate",
    memory=memory,
)
If I run:
agent({'chat_history':[], 'input':'What is a product?'})
It throws:
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\vectorstores\pinecone.py", line 160, in similarity_search
    text = metadata.pop(self._text_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'text'
This is the offending block in site-packages/langchain/vectorstores/pinecone.py:
for res in results["matches"]:
    # print('metadata.pop(self._text_key) = ' + metadata.pop(self._text_key))
    metadata = res["metadata"]
    text = metadata.pop(self._text_key)
    docs.append(Document(page_content=text, metadata=metadata))
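The loop above fails as soon as one match has metadata without the configured text key. A minimal sketch for checking this without touching Pinecone follows; it assumes the query response has the same shape as in the loop above (a "matches" list whose items carry "id" and "metadata"). The helper name find_matches_missing_key and the sample data are hypothetical, for illustration only.

```python
def find_matches_missing_key(results, text_key="text"):
    """Return the ids of matches whose metadata has no entry for text_key."""
    missing = []
    for res in results["matches"]:
        # Treat absent or None metadata as an empty dict.
        metadata = res.get("metadata") or {}
        if text_key not in metadata:
            missing.append(res["id"])
    return missing


# Hypothetical response: match "b" was stored without a "text" entry,
# so metadata.pop("text") would raise KeyError for it.
sample_results = {
    "matches": [
        {"id": "a", "metadata": {"text": "hello"}},
        {"id": "b", "metadata": {"title": "no text here"}},
    ]
}

print(find_matches_missing_key(sample_results))  # prints ['b']
```

If every id comes back as missing, the vectors were most likely upserted without the document text in their metadata.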
If I remove my tool, as in the line below, everything runs (just without my tool):
tools = []
Can anyone help me fix this KeyError: ‘text’ issue? My versions of langchain, pinecone-client and python are 0.0.147, 2.2.1 and 3.11.3 respectively.
About this issue
- State: closed
- Created a year ago
- Comments: 15
Pinecone doesn’t store documents explicitly; it only stores ids, embeddings, and metadata. So if you’d like access to the documents themselves when querying Pinecone, you should add them to the metadata, as illustrated here. If you add them under a text key in the metadata, the KeyError should resolve.
Alternatively, you can specify your own key name when you initialize Pinecone, via text_key. Code reference is here.
Your alternative is to store the document text elsewhere and then store a reference / URL to it in the pinecone metadata. This is what I do with Tana notes, for example. Pinecone is the search index - Tana is the content repository.
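A minimal sketch of the first suggestion above (keeping the document text in the metadata under the text key), assuming records are upserted as (id, vector, metadata) tuples. The helper build_records, the doc-N ids, the "source" field, and the placeholder vector are all hypothetical and only serve the illustration; real vectors would come from the embedding model.

```python
from typing import Dict, List, Tuple

# One upsert record: (id, embedding vector, metadata)
Record = Tuple[str, List[float], Dict[str, str]]


def build_records(
    texts: List[str],
    vectors: List[List[float]],
    text_key: str = "text",
    source: str = "unknown",
) -> List[Record]:
    """Pair each vector with metadata that carries the original text."""
    records: List[Record] = []
    for i, (text, vector) in enumerate(zip(texts, vectors)):
        # Storing the text under text_key is what lets the retriever
        # rebuild the document from the query response later.
        metadata = {text_key: text, "source": source}
        records.append((f"doc-{i}", vector, metadata))
    return records


records = build_records(
    ["A product is an item offered for sale."],
    [[0.1, 0.2, 0.3]],  # placeholder vector, not a real embedding
)
```

With a real index, these records would be passed to index.upsert(vectors=records). If a key other than text is used here, the same name has to be given as text_key when constructing the Pinecone vector store, otherwise the lookup fails in the same way.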
On Thu, Jun 15, 2023 at 3:13 PM Peak Prosperity @.***> wrote:
+1 here. Subscribing.
Same problem here!! I’m using from_documents