nmslib: Optimized index raises RuntimeError on load when saved with `negdotprod` space

This is what I am trying to do:

import nmslib

space = 'negdotprod'

vectors = [[1, 2], [3, 4], [5, 6]]

index = nmslib.init(space=space, method='hnsw')
index.addDataPointBatch(vectors)
index.createIndex(
    {'M': 15, 'efConstruction': 200, 'skip_optimized_index': 0, 'post': 0}
)
index.saveIndex('test.index')

new_index = nmslib.init(space=space, method='hnsw')
new_index.loadIndex('test.index')

and the `loadIndex` call raises:

Check failed: totalElementsStored_ == this->data_.size() The number of stored elements 3 doesn't match the number of data points ! Did you forget to re-load data?
Traceback (most recent call last):
  File "8.py", line 15, in <module>
    new_index.loadIndex('test.index')
RuntimeError: Check failed: The number of stored elements 3 doesn't match the number of data points ! Did you forget to re-load data?

If I change the `space` variable to `cosinesimil`, it works fine. It seems that the data points are not stored in the index file for `negdotprod`, even though the `hnsw` method is used with `skip_optimized_index=0`.

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 22 (15 by maintainers)

Most upvoted comments

@chomechome this makes sense: so the default of False (not loading the data) is probably OK? The reason I want to keep it as the default is that many people use HNSW only to query data, i.e., they don’t compute distances between arbitrary points. In these common cases, HNSW saves the data on its own, so saving and loading an additional copy of the data doesn’t make sense for them.

@ibeltagy if you are willing to help, please drop me a line in a couple of weeks (leo at boytsov dot info). For now, I won’t be able to do anything (not even give instructions). Thank you!

@earboxer we need some generic code to save/load data, which can be invoked when the method doesn’t save data on its own. Plus, I would need help shipping this (including testing, etc.).
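
Until something like that exists in the library, a user-side workaround can be sketched in plain Python. This is only an illustration, not the maintainers' design: the helper names (`data_path_for`, `save_data_points`, `load_data_points`, `load_index_with_data`) and the `.data.json` sidecar file are hypothetical. It also assumes, as the "Did you forget to re-load data?" message hints, that re-adding the original points with `addDataPointBatch` before calling `loadIndex` satisfies the element-count check.

```python
import json
from pathlib import Path


def data_path_for(index_path):
    """Hypothetical convention: keep the raw vectors next to the index file."""
    return Path(str(index_path) + ".data.json")


def save_data_points(index_path, vectors):
    """Write the raw data points to a JSON sidecar file."""
    rows = [[float(x) for x in row] for row in vectors]
    with open(data_path_for(index_path), "w", encoding="utf-8") as fh:
        json.dump(rows, fh)


def load_data_points(index_path):
    """Read the data points back from the sidecar file."""
    with open(data_path_for(index_path), encoding="utf-8") as fh:
        return json.load(fh)


def load_index_with_data(index, index_path):
    """Re-add the stored points, then load the index structure.

    `index` is any object exposing addDataPointBatch() and loadIndex(),
    e.g. the result of nmslib.init(space=..., method='hnsw').
    """
    index.addDataPointBatch(load_data_points(index_path))
    index.loadIndex(str(index_path))
    return index
```

With the script from the report, this would mean calling `save_data_points('test.index', vectors)` right after `index.saveIndex('test.index')`, and replacing `new_index.loadIndex('test.index')` with `load_index_with_data(new_index, 'test.index')`. JSON is used only to keep the sketch dependency-free; a binary format would be a better fit for large collections.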