langchain: AzureOpenAI InvalidRequestError: Too many inputs. The max number of inputs is 1.
System Info
- Langchain version == 0.0.166
- Embeddings = OpenAIEmbeddings - model: text-embedding-ada-002 version 2
- LLM = AzureOpenAI
Who can help?
Information
- The official example notebooks/scripts
- My own modified scripts
Related Components
- LLMs/Chat Models
- Embedding Models
- Prompts / Prompt Templates / Prompt Selectors
- Output Parsers
- Document Loaders
- Vector Stores / Retrievers
- Memory
- Agents / Agent Executors
- Tools / Toolkits
- Chains
- Callbacks/Tracing
- Async
Reproduction
Steps to reproduce:
- Set up Azure OpenAI embeddings by providing key, version, etc.
- Load a document with a loader
- Set up a text splitter so you get more than 2 documents
- Add them to ChromaDB with `.add_documents(List<Document>)`
This is some example code:
```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

pdf = PyPDFLoader(url)
documents = pdf.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
# vectordb is a Chroma vector store created with the Azure OpenAI embeddings from step 1
vectordb.add_documents(texts)
vectordb.persist()
```
Expected behavior
Embeddings should be added to the database. Instead, it returns the error: `openai.error.InvalidRequestError: Too many inputs. The max number of inputs is 1. We hope to increase the number of inputs per request soon. Please contact us through an Azure support request at: https://go.microsoft.com/fwlink/?linkid=2213926 for further questions.`
This happens because Azure OpenAI only accepts one input per embeddings request, while the script tries to embed all of the documents in a single call.
The following code is where the issue comes up (I think): https://github.com/hwchase17/langchain/blob/258c3198559da5844be3f78680f42b2930e5b64b/langchain/embeddings/openai.py#L205-L214
The input should be a one-dimensional array, not a multi-dimensional one.
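For illustration, a minimal sketch of the kind of call Azure rejects, made with the pre-1.0 `openai` SDK directly (the endpoint, key, API version, and deployment name below are placeholders, not values from this issue):

```python
import openai

# Assumed Azure configuration; replace with your own resource values
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<your-key>"

# Sending a list of inputs in one request is what triggers
# "Too many inputs. The max number of inputs is 1." on Azure.
response = openai.Embedding.create(
    engine="text-embedding-ada-002",  # Azure deployment name
    input=["first chunk of text", "second chunk of text"],
)
```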
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 28 (1 by maintainers)
Commits related to this issue
- remove unsupported params in AzureOpenAIEmbedding — committed to FlowiseAI/Flowise by chungyau97 a year ago
- fix(embeddings): number of texts in Azure OpenAIEmbeddings batch (#10707) This PR addresses the limitation of Azure OpenAI embeddings, which can handle at maximum 16 texts in a batch. This can be so... — committed to langchain-ai/langchain by mspronesti 9 months ago
- fix https://github.com/langchain-ai/langchain/issues/4575 — committed to exdatic/langsync by zeitderforschung 8 months ago
- fix https://github.com/langchain-ai/langchain/issues/4575 — committed to exdatic/langsync by zeitderforschung 8 months ago
I might have mitigated the issue by adding the chunk size to the embeddings:
```python
embedding = OpenAIEmbeddings(deployment="embeddings", model="text-embedding-ada-002", chunk_size=1)
```
In the JavaScript version of langchain, the parameter `chunk_size` is named `batchSize`.
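For context, a minimal sketch of that mitigation with the Azure-specific settings spelled out (the endpoint, API version, key, and deployment name are assumptions, not values from this thread):

```python
import os
from langchain.embeddings import OpenAIEmbeddings

# Azure settings via environment variables (values are placeholders)
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://<your-resource>.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
os.environ["OPENAI_API_KEY"] = "<your-key>"

# chunk_size=1 keeps each embeddings request to a single input,
# matching the Azure limit described in this issue
embedding = OpenAIEmbeddings(
    deployment="embeddings",  # assumed Azure deployment name
    model="text-embedding-ada-002",
    chunk_size=1,
)
```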
@huislaw here is my solution: use a `chunkify` function to customize the max number of inputs (max = 16).

I was able to pass in 16 documents at a time without the max-number-of-inputs error too, but I had quite a few documents. I used a for loop, which worked for me; however, I had to add `time.sleep(2)` between batches, otherwise I got a rate limit warning (see this thread).

Example code:
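The original example code isn't preserved here; a minimal sketch of that batched loop, assuming a 16-document batch size and the 2-second pause mentioned above:

```python
import time

batch_size = 16  # max number of inputs Azure currently accepts per embeddings request

# Add documents to the vector store in small batches, pausing between
# requests to stay under the rate limit
for i in range(0, len(texts), batch_size):
    batch = texts[i:i + batch_size]
    vectordb.add_documents(batch)
    time.sleep(2)

vectordb.persist()
```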
It looks like the team increased the limit; a `chunk_size` of 16 works for me (deployed text-embedding-ada-002). I've deployed my instance of Azure OpenAI to eastus (maybe quotas differ per Azure region).

`chunk_size` in the Azure `OpenAIEmbeddings()` refers to the number of embeddings it creates per request, as opposed to the text splitter's `chunk_size`, which sets the size of each text chunk. `chunk_size = 16` worked at this time.
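A short sketch to make the distinction between the two unrelated `chunk_size` parameters concrete (the deployment name is an assumption):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter

# Splitter chunk_size: how many characters go into each text chunk
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

# Embeddings chunk_size: how many chunks are sent per Azure embeddings request
embedding = OpenAIEmbeddings(deployment="embeddings", chunk_size=16)
```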
As of 8/5/23, the easiest fix is to pass in `chunk_size=16` when creating `OpenAIEmbeddings` for an Azure deployment. Some of the other solutions here are more complicated than using this built-in functionality. As some have noted, the limit has been increased from 1 to 16. Confusingly, this value is distinct from the chunk size for text splitting. Here, the configuration tells the `OpenAIEmbeddings` object to create 16 embeddings at a time, which conforms to the Azure limit. In the TypeScript version of langchain, the name of this configuration is `batchSize`.

Not sure if @sunyq1995 means this, but this worked for me, and I think it was faster than doing `from_texts`.
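The exact code being discussed isn't shown here; one plausible sketch of the overall approach, assuming the `embedding` object above (created with `chunk_size=16`) and Chroma as the store:

```python
from langchain.vectorstores import Chroma

# Build the store directly from Document objects; embedding requests are
# sent in batches of 16, which respects the Azure limit
vectordb = Chroma.from_documents(texts, embedding, persist_directory="db")
vectordb.persist()
```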
How does the following code work?