llama_index: [Bug]: VectorStoreIndex.from_documents KeyError

Bug Description

Getting a KeyError from VectorStoreIndex.from_documents when using a MLFloatGateway Langchain embedding model.

The trace shows:

    |_embedding ->  0.970113 seconds
    |_embedding ->  0.973638 seconds
    |_embedding ->  0.97082 seconds
    |_embedding ->  0.969955 seconds
    |_embedding ->  0.494139 seconds
    |_exception ->  0.0 seconds
      |_exception ->  0.0 seconds

Version

0.9.1

Steps to Reproduce

I loaded a folder of .pdf files and then used VectorStoreIndex

Could this be caused by the embedding model? If so, how can I debug and check? I tried using logging:
https://docs.llamaindex.ai/en/stable/getting_started/starter_example.html#viewing-queries-and-events-using-logging

but that was too verbose and didn’t show any 404 or other errors.
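One way to check the embedding model on its own, without the index or debug logging, is to send it one chunk at a time and report the first failure. The following is a minimal sketch, not a confirmed diagnosis: it assumes the model exposes the Langchain `embed_documents` method, and `check_embedding_model`, `FakeEmbeddings`, and `BrokenEmbeddings` are hypothetical names introduced only for illustration.

```python
import math
from typing import Any, List, Sequence


def check_embedding_model(embed_model: Any, samples: Sequence[str]) -> List[int]:
    """Embed each sample on its own and return the vector length for each.

    Raises RuntimeError naming the first sample that fails, so a bad chunk
    or a gateway error is visible without turning on debug logging.
    """
    dims: List[int] = []
    for i, text in enumerate(samples):
        try:
            # embed_documents(list[str]) -> list[list[float]] is the
            # method defined by the Langchain Embeddings interface.
            result = embed_model.embed_documents([text])
            vector = [float(x) for x in result[0]]
        except Exception as exc:
            raise RuntimeError(
                f"sample {i} ({text[:40]!r}) failed: {type(exc).__name__}: {exc}"
            ) from exc
        if not vector or any(math.isnan(x) for x in vector):
            raise RuntimeError(f"sample {i} returned an empty or NaN vector")
        dims.append(len(vector))
    if len(set(dims)) > 1:
        raise RuntimeError(f"inconsistent vector sizes: {dims}")
    return dims


class FakeEmbeddings:
    """Hypothetical stand-in so the sketch runs without a gateway."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t)), 1.0, 0.0] for t in texts]


class BrokenEmbeddings:
    """Hypothetical model that mimics a response with a missing key."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        response = {"error": "not found"}
        return [response["embedding"] for _ in texts]  # raises KeyError
```

Passing the real Langchain embedding object and the text of a few loaded documents in place of the fakes should show whether the KeyError comes from the embedding call itself or from a later step.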

Relevant Logs/Tracebacks

No response

About this issue

  • State: closed
  • Created 7 months ago
  • Comments: 15

Most upvoted comments

I think it’s an error in VectorStoreIndex. I had working code, and running it now throws a related error:

index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, service_context=service_context)

throws:

----> 1 index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, service_context=service_context)

File ~/opt/anaconda3/envs/test/lib/python3.11/site-packages/llama_index/indices/base.py:97, in BaseIndex.from_documents(cls, documents, storage_context, service_context, show_progress, **kwargs)
     95 with service_context.callback_manager.as_trace("index_construction"):
     96     for doc in documents:
---> 97         docstore.set_document_hash(doc.get_doc_id(), doc.hash)
     99     nodes = run_transformations(
    100         documents,  # type: ignore
    101         service_context.transformations,
    102         show_progress=show_progress,
    103         **kwargs,
    104     )
    106     return cls(
    107         nodes=nodes,
    108         storage_context=storage_context,
   (...)
    111         **kwargs,
    112     )

AttributeError: 'str' object has no attribute 'set_document_hash'
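The traceback shows that `docstore` is a plain `str` at line 97 rather than an object with a `set_document_hash` method. A quick guard like the one below can confirm that before building the index. This is only a sketch: it assumes the docstore is read from `storage_context.docstore`, which the excerpt does not show, and `require_docstore` and `FakeDocstore` are hypothetical names used for illustration.

```python
from typing import Any, Dict


def require_docstore(candidate: Any) -> Any:
    """Return candidate unchanged if it looks like a docstore, else raise TypeError."""
    method = getattr(candidate, "set_document_hash", None)
    if isinstance(candidate, str) or not callable(method):
        raise TypeError(
            f"expected a docstore object, got {type(candidate).__name__}: {candidate!r}"
        )
    return candidate


class FakeDocstore:
    """Hypothetical stand-in exposing the method named in the traceback."""

    def __init__(self) -> None:
        self.hashes: Dict[str, str] = {}

    def set_document_hash(self, doc_id: str, doc_hash: str) -> None:
        self.hashes[doc_id] = doc_hash
```

Under that assumption, calling `require_docstore(storage_context.docstore)` just before `VectorStoreIndex.from_documents(...)` would fail early with the offending value in the message, which helps tell a path or name passed by mistake from a problem inside the library.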