meilisearch: Input/output error (os error 5) with an 800GB index

Describe the bug

I’m currently encountering an issue with MeiliSearch where I’m unable to add or update any documents in my index. The error message I’m receiving is:

{
  "indexUid": "tweets",
  "status": "failed",
  "type": "documentAdditionOrUpdate",
  "uid": 411,
  "details": {
    "receivedDocuments": 300000,
    "indexedDocuments": 0
  },
  "canceledBy": null,
  "error": {
    "message": "Maximum database size has been reached.",
    "code": "database_size_limit_reached",
    "type": "internal",
    "link": "https://docs.meilisearch.com/errors#database_size_limit_reached"
  },
  "duration": "PT724.524772376S",
  "startedAt": "2023-04-09T18:44:16.101Z",
  "enqueuedAt": "2023-04-09T17:26:50.353Z",
  "finishedAt": "2023-04-09T18:56:20.626Z"
}

My database size is only 700GB, so I’m not sure why I’m hitting the limit. I’ve tried deleting some documents to free up space, but I’m still receiving the same error message.

To Reproduce

Steps to reproduce the behavior:

  • Attempt to add or update documents to an existing index
  • Receive error message “Maximum database size has been reached.”

Expected behavior

I should be able to add or update documents to my index without encountering a database size limit error because I haven’t reached the documented limit https://docs.meilisearch.com/learn/advanced/known_limitations.html#maximum-database-size.

Meilisearch version: 1.0.2

Additional context

  • Official Docker image (getmeili/meilisearch - e9738e170728)

About this issue

  • State: open
  • Created a year ago
  • Comments: 22 (11 by maintainers)

Most upvoted comments

Hello @omarmhaimdat 👋, some news from me,

I was able to reproduce the issue on the machine you provided to me. Thank you for being so helpful 😊

By adding debug statements to LMDB, the database Meilisearch depends on, I was able to track down the issue 🎉. It seems to be related to us sending a value larger than what LMDB supports in practice on Linux. I opened an issue upstream: https://bugs.openldap.org/show_bug.cgi?id=10054.

I want to do some further investigation to understand why we are sending such a large value to store in LMDB in this case. I can’t think of a value we would send that should be bigger than 600MB, but here we’re trying to store a value bigger than 2GB. Even setting aside the LMDB issue that prevents us from storing these 2GB, the documented maximum size for values in LMDB is around 4GB, and 2GB is getting dangerously close to that.

Meanwhile, you could probably mitigate the issue by tuning the settings to shorten the list of searchableAttributes. Please note that changing your settings will trigger a reindexing process that might take a very long time given the current size of your database.
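
For anyone wanting to try that mitigation, here is a minimal sketch of the settings change over the HTTP API, assuming a local instance at http://localhost:7700, a master key exported as MEILI_MASTER_KEY, and placeholder attribute names ("text", "author") that you would replace with your own:

  # Minimal sketch: restrict searchableAttributes on the "tweets" index.
  # Host, key location, and attribute names below are assumptions.
  import os
  import requests

  MEILI_URL = "http://localhost:7700"          # assumed host
  API_KEY = os.environ["MEILI_MASTER_KEY"]     # assumed key location

  # PUT .../settings/searchable-attributes replaces the whole list;
  # keeping it short reduces what gets indexed per document.
  resp = requests.put(
      f"{MEILI_URL}/indexes/tweets/settings/searchable-attributes",
      headers={"Authorization": f"Bearer {API_KEY}"},
      json=["text", "author"],                 # placeholder attributes
  )
  resp.raise_for_status()
  print(resp.json())  # returns a task; the change triggers reindexing

The call only enqueues a task, and as noted above the resulting reindexing can take a long time on an index of this size.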

Hey @dureuill,

I wanted to express my gratitude for your prompt response and for keeping me informed about the progress of the investigation into the EIO error.

Following your advice, I have completely revamped the indexes by carefully selecting the most relevant searchable attributes and even added some filterable attributes. The results have been remarkable. Initially, I was stuck at 68 million indexed items, but now I have surpassed that number. The indexing process is still ongoing and may take another week or so to complete. Currently, it’s close to reaching 100 million items, and I haven’t encountered any issues whatsoever. The only drawback is that the indexing time is gradually increasing as the index size grows, but I suppose that’s to be expected.

Additionally, I was curious if there might be a bug affecting the number of cores used for indexing. To expedite the process, I have adjusted the configuration file to increase the max_indexing_threads to 28. Surprisingly, the CPU usage has remained quite low thus far.

Thank you 👍, I hope we’ll get to the bottom of this!

Hey @omarmhaimdat,

I misread your issue. Yours is unrelated to the tasks queue; rather, your tweets index is full. In v1.0, we have a limit on the index size. What you should do is dump your database and reimport it in v1.1, which no longer has index size limits ✈️

I hope this helps!
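
Building on that suggestion, a rough sketch of triggering the dump over the HTTP API, assuming a local instance and a master key in MEILI_MASTER_KEY; once the task succeeds, the generated .dump file would then be imported by starting the v1.1 binary on a fresh data directory with its --import-dump option:

  # Minimal sketch: ask Meilisearch to create a dump, then poll the task.
  # Host and key location below are assumptions; adjust to your deployment.
  import os
  import time
  import requests

  MEILI_URL = "http://localhost:7700"          # assumed host
  HEADERS = {"Authorization": f"Bearer {os.environ['MEILI_MASTER_KEY']}"}

  # POST /dumps enqueues a dump-creation task.
  task = requests.post(f"{MEILI_URL}/dumps", headers=HEADERS).json()

  # Poll the task until the dump file has been written to the dumps directory.
  while True:
      status = requests.get(
          f"{MEILI_URL}/tasks/{task['taskUid']}", headers=HEADERS
      ).json()
      if status["status"] in ("succeeded", "failed"):
          print(status)
          break
      time.sleep(10)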

Hey @omarmhaimdat, @sam-ulrich1,

I am sorry to hear that you found a bug; we also found it on our side. The issue here is not the size of an index but the size of the tasks queue, which is full because it has reached 10 GiB. We do not have a proper fix for now and would love your feedback in this thread so we can implement the best solution.

I have implemented a quick fix in this PR, but we are still discussing it with the team, as we are not sure that increasing the size of the tasks queue is the right approach: it would just make the problem happen later.

In the meantime, you should reindex and make sure to clear the tasks you don’t care about anymore, i.e., the successful and canceled ones.

I am sorry about this bug 🌨️
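
For reference, a minimal sketch of clearing finished tasks through the task deletion route, assuming a local instance and that only succeeded and canceled tasks should be removed (the status names here are the ones used by the tasks API; adjust the filter as needed):

  # Minimal sketch: delete finished tasks to shrink the tasks queue.
  # Host and key location below are assumptions; a statuses (or similar)
  # filter is required by the route.
  import os
  import requests

  MEILI_URL = "http://localhost:7700"          # assumed host
  HEADERS = {"Authorization": f"Bearer {os.environ['MEILI_MASTER_KEY']}"}

  # DELETE /tasks removes matching tasks; the deletion itself is enqueued as a task.
  resp = requests.delete(
      f"{MEILI_URL}/tasks",
      headers=HEADERS,
      params={"statuses": "succeeded,canceled"},
  )
  resp.raise_for_status()
  print(resp.json())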

Currently experiencing the same issue repeatedly with as few as 9M docs and ~46GB.