langchain: Pinecone.from_documents() does not create the index if it does not exist

System Info

langchain == 0.0.190, running on Google Colab

Reproduction

docsearch = Pinecone.from_documents(chunks, embeddings, index_name=index_name) calls from_texts, which raises "ValueError: No active indexes found in your Pinecone project, are you sure you're using the right API key and environment?" if the index does not exist. The relevant check:

if index_name in indexes:
    index = pinecone.Index(index_name)
elif len(indexes) == 0:
    raise ValueError(
        "No active indexes found in your Pinecone project, "
        "are you sure you're using the right API key and environment?"
    )

Expected behavior

The method should create the index if it does not exist. This is the expected behavior from the documentation and the official examples.

Creating the index when it does not exist is also the safest behavior, because the index can then be created with a dimension that always matches the dimension of the embedding model in use.
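Until the method does this itself, the missing step can be done by hand before calling from_documents(). Below is a minimal sketch, not the library's implementation. It assumes the module-level API of pinecone-client 2.x (list_indexes() returning a list of names, and create_index(name, dimension=..., metric=...)), which matches the pinecone.Index(index_name) call quoted above. The helper names ensure_index and embedding_dimension are made up for illustration, and the dimension is obtained by embedding a throwaway string with embed_query().

```python
def embedding_dimension(embeddings):
    """Return the vector size of the embedding model by embedding a probe string."""
    return len(embeddings.embed_query("dimension probe"))


def ensure_index(client, index_name, embeddings, metric="cosine"):
    """Create index_name if it is missing; return True if it was created.

    `client` is assumed to expose list_indexes() and create_index() the way
    the pinecone-client 2.x module does (hypothetical helper, not part of
    langchain or pinecone).
    """
    if index_name in client.list_indexes():
        return False
    # Derive the dimension from the model so index and embeddings cannot disagree.
    client.create_index(
        index_name,
        dimension=embedding_dimension(embeddings),
        metric=metric,
    )
    return True


# Possible usage (assumes pinecone-client 2.x and valid credentials):
#   import pinecone
#   pinecone.init(api_key=..., environment=...)
#   ensure_index(pinecone, index_name, embeddings)
#   docsearch = Pinecone.from_documents(chunks, embeddings, index_name=index_name)
```

Passing the client in as an argument keeps the helper independent of the installed Pinecone client, so it can be adapted if a newer client exposes different call signatures.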

Thanks

Best regards

Jerome

About this issue

  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 16 (1 by maintainers)

Most upvoted comments

@takashi57 Since the Pinecone vector store in langchain_community is deprecated, you can follow the steps below:

!pip install langchain-pinecone

import os

from langchain_pinecone import Pinecone

# pinecone_api_key, index_name, chunks and embedding must already be defined
os.environ['PINECONE_API_KEY'] = pinecone_api_key

vector_database_index = Pinecone.from_documents(
    index_name=index_name,
    documents=chunks,
    embedding=embedding,
)

This should resolve the issue.

From docs/modules/indexes/vectorstores/examples/pinecone.ipynb:

docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name)
# if you already have an index, you can load it like this
# docsearch = Pinecone.from_existing_index(index_name, embeddings)

From my understanding, the example here is highlighting that, if you have an existing index, you should use from_existing_index() rather than from_documents(). This might be due to concerns over accidentally overwriting existing indexes. Hope this helps!
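To make that distinction concrete, here is a small sketch of how the two constructors could be chosen between. It only relies on from_documents(docs, embeddings, index_name=...) and from_existing_index(index_name, embeddings) as they appear in the notebook excerpt above. The function name open_docsearch and the store_cls parameter are hypothetical, and the sketch assumes the index already exists in both cases.

```python
def open_docsearch(store_cls, index_name, embeddings, new_docs=None):
    """Return a vector store for index_name.

    With new_docs, the documents are embedded and added via from_documents();
    without them, the existing index is only loaded via from_existing_index().
    `store_cls` is assumed to be a class such as langchain's Pinecone wrapper.
    """
    if new_docs:
        # There is something to upload: embed and add it to the index.
        return store_cls.from_documents(new_docs, embeddings, index_name=index_name)
    # Nothing to upload: just connect to what is already stored.
    return store_cls.from_existing_index(index_name, embeddings)


# Possible usage:
#   docsearch = open_docsearch(Pinecone, index_name, embeddings)                 # load only
#   docsearch = open_docsearch(Pinecone, index_name, embeddings, new_docs=docs)  # add docs
```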