haystack: Setting refresh_type=false on the Elasticsearch document store with Embedding Retriever results in an error at query time
Describe the bug
Unable to run query_by_embedding after indexing documents with the Elasticsearch document store set to refresh_type=false and an Embedding Retriever.
Error message
A document doesn't have a value for a vector field.
{"error":{"root_cause":[{"type":"script_exception","reason":"runtime error","script_stack":["org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$DenseVectorFunction.getEncodedVector(ScoreScriptUtils.java:100)","org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$CosineSimilarity.cosineSimilarity(ScoreScriptUtils.java:179)","cosineSimilarity(params.query_vector,'embedding') + 1000"," ^---- HERE"],"script":"cosineSimilarity(params.query_vector,'embedding') + 1000","lang":"painless","position":{"offset":37,"start":0,"end":56}}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"1july_test4","node":"j7lOBAVgT625FixOyow97Q","reason":{"type":"script_exception","reason":"runtime error","script_stack":["org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$DenseVectorFunction.getEncodedVector(ScoreScriptUtils.java:100)","org.elasticsearch.xpack.vectors.query.ScoreScriptUtils$CosineSimilarity.cosineSimilarity(ScoreScriptUtils.java:179)","cosineSimilarity(params.query_vector,'embedding') + 1000"," ^---- HERE"],"script":"cosineSimilarity(params.query_vector,'embedding') + 1000","lang":"painless","position":{"offset":37,"start":0,"end":56},"caused_by":{"type":"illegal_argument_exception","reason":"A document doesn't have a value for a vector field!"}}}]},"status":400}
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error')
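The "caused_by" entry in the response above says the script hit at least one document with no value in the embedding field. As a rough sketch (not the query Haystack itself sends), a script_score query can be restricted to documents that have the field by adding an exists filter. The function name build_embedding_query and its parameters are hypothetical; the script string is copied from the error above.

```python
def build_embedding_query(query_vector, embedding_field="embedding", top_k=10):
    """Build an Elasticsearch script_score request body that skips
    documents lacking the embedding field (hypothetical helper)."""
    return {
        "size": top_k,
        "query": {
            "script_score": {
                # Only score documents that actually have a vector stored.
                "query": {
                    "bool": {
                        "filter": [{"exists": {"field": embedding_field}}]
                    }
                },
                "script": {
                    # Same scoring expression as in the error message above.
                    "source": (
                        f"cosineSimilarity(params.query_vector,'{embedding_field}')"
                        " + 1000"
                    ),
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_embedding_query([0.1, 0.2, 0.3], top_k=5)
    print(body["query"]["script_score"]["query"])
```

This only hides documents without embeddings from the search; it does not explain why they are missing, so it is a way to confirm the diagnosis rather than a fix.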
About this issue
- State: closed
- Created 3 years ago
- Comments: 16 (12 by maintainers)
Thanks @tholor. Adding a 10-second delay between write_documents() and update_embeddings() has made it work as expected for 1k documents.
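A minimal sketch of that workaround, assuming a document store object that exposes write_documents() and update_embeddings(retriever) as named in this thread. The helper name index_then_embed and the delay_s and sleep parameters are made up for illustration, and the 10-second default is simply the value reported above, not a recommended setting.

```python
import time


def index_then_embed(document_store, retriever, docs, delay_s=10.0, sleep=time.sleep):
    """Write documents, pause, then compute embeddings (hypothetical helper).

    With refresh_type=false the newly written documents may not be
    searchable yet, so update_embeddings() could miss them. The pause
    gives Elasticsearch time to refresh the index on its own schedule.
    """
    document_store.write_documents(docs)
    sleep(delay_s)  # wait before reading the documents back
    document_store.update_embeddings(retriever)


if __name__ == "__main__":
    # Stand-in store so the sketch runs without an Elasticsearch instance.
    class FakeStore:
        def write_documents(self, docs):
            print(f"wrote {len(docs)} documents")

        def update_embeddings(self, retriever):
            print("updated embeddings")

    index_then_embed(FakeStore(), retriever=None, docs=[{"id": 1}], delay_s=0.0)
```

A fixed sleep is fragile for larger batches; it was only reported to work here for about 1k documents.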
Hey, when I tried it, I was able to make it work. In the case above, it looks like the embedding field was not indexed in Elasticsearch. Was update_embeddings() called? When I tried it in an independent Colab notebook, I was able to make it work with refresh_type='False'.