annoy: Memory Leakage
Hi there,
annoy (great library!) is our go-to solution for NN search. I’m creating multiple annoy indices, adding multiple samples to each, then saving and unloading each one:
import memory
from annoy import AnnoyIndex
import numpy as np

dim = 49152
nsamples = 209

def vectors_to_add():
    v = nsamples * [None]
    for i in range(len(v)):
        v[i] = np.zeros(dim)
    return v

def build_index(count):
    knn = AnnoyIndex(dim, 'euclidean')
    v = vectors_to_add()
    for i, vector in enumerate(v):
        knn.add_item(i, vector)
    knn.save('knn-%d.aix' % count)
    knn.unload()
    del knn  # to make sure (?)

for count in range(100):
    build_index(count)
    print 'Added index no.', count, ': total memory -> ', memory.memory()/(1024**3), 'GB'
When running on a (virtualized) Debian 9, I experience massive memory leakage:
Added index no. 0 : total memory -> 0.459384918213 GB
Added index no. 1 : total memory -> 0.769920349121 GB
Added index no. 2 : total memory -> 1.003074646 GB
Added index no. 3 : total memory -> 1.2364730835 GB
Added index no. 4 : total memory -> 1.46962738037 GB
…
When running on an Ubuntu 16.04, things are fine:
Added index no. 0 : total memory -> 0.232261657715 GB
Added index no. 1 : total memory -> 0.231666564941 GB
Added index no. 2 : total memory -> 0.231666564941 GB
Added index no. 3 : total memory -> 0.231666564941 GB
Added index no. 4 : total memory -> 0.231666564941 GB
Added index no. 5 : total memory -> 0.231666564941 GB
…
Remarks:
- I’m not even running build() on the index structure, just adding samples.
- I looked at the C-code in annoylib.h and cannot find any leak.
- I wonder if memory mapping (mmap/munmap) is the issue (?) or anything with the C-Python-interface (?)
- The memory measurement (memory.py) is self-implemented and drawn from here (reads from /proc/pid/status, should be OK, I can share if it helps).
- I tried with annoy installed from pip (1.9.3) and from source (same issue).
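For reference, a helper like the memory.py mentioned in the remarks could look roughly like the sketch below. This is not the original file: only the function name memory() is taken from the script above, while the choice of the VmRSS field, the optional arguments, and the fallback value of 0.0 are assumptions made for illustration.

```python
import os


def memory(pid=None, field='VmRSS'):
    """Return one memory field from /proc/<pid>/status in bytes (Linux only).

    Assumption: the resident set size (VmRSS) is the value of interest.
    Returns 0.0 if the file or the field is not available.
    """
    if pid is None:
        pid = os.getpid()
    try:
        with open('/proc/%d/status' % pid) as f:
            for line in f:
                # Lines look like "VmRSS:     123456 kB"
                if line.startswith(field + ':'):
                    value_kb = float(line.split()[1])
                    # The kernel reports these values in units of 1024 bytes
                    return value_kb * 1024.0
    except IOError:
        # No /proc filesystem (non-Linux) or the process has gone away
        pass
    return 0.0
```

Dividing the result by 1024**3 gives a figure in GB, as in the print statement of the script. Note that VmRSS only counts resident pages, so it can differ from what tools reporting virtual size show.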
My questions:
- Is there any hint as to what’s going on? (I’m aware this may be a platform issue rather than an issue with annoy itself. Still, any pointers are greatly appreciated 😉)
- What system libraries can I check/update?
- Should I check if MAP_POPULATE (annoylib.h) is set? (and if yes, how do I do that?)
Kind regards (and thank you so much!), Adrian
About this issue
- State: closed
- Created 7 years ago
- Comments: 15
Thanks for all the help identifying the bug and nailing down the problematic version!