SPTAG: When the number of vectors is too large, index build fails to complete
When there are too many vectors, such as about 5 million (n = 1024×1024×5), the index build fails to complete. The program stops at the step that prints “Save Data To xxxxx\vectors.bin”. I also noticed that the file size of vectors.bin reaches 300G! That is not normal; the correct file size should be 10G.
The code that caused the error is as follows:
import SPTAG
import numpy as np

n = 1024 * 1024 * 5  # this size causes the file size of vectors.bin to reach 300G!
k = 3
r = 3
Dimension = 512  # expected size of vectors.bin is n×Dimension×4 bytes, e.g. 1024×1024×512×4 = 2G for n = 1024×1024

def testBuild(algo, distmethod, x, out):
    i = SPTAG.AnnIndex(algo, 'Float', x.shape[1])
    i.SetBuildParam("NumberOfThreads", '4')
    i.SetBuildParam("DistCalcMethod", distmethod)
    ret = i.Build(x.tobytes(), x.shape[0])
    i.Save(out)

def Test(algo, distmethod):
    # row j of x is filled with the value j; q holds r query vectors
    x = np.ones((n, Dimension), dtype=np.float32) * np.reshape(np.arange(n, dtype=np.float32), (n, 1))
    q = np.ones((r, Dimension), dtype=np.float32) * np.reshape(np.arange(r, dtype=np.float32), (r, 1)) * 2
    print("Build.............................")
    testBuild(algo, distmethod, x, 'testindices')

if __name__ == '__main__':
    Test('BKT', 'L2')
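For reference, the expected size comes from n × Dimension × 4 bytes for float32 data. The following is a minimal sketch of that arithmetic, not part of SPTAG: the helper names are hypothetical, and it counts only the raw payload, ignoring any header the file format may add. The 32-bit range check is included only as something worth ruling out, not as a confirmed cause of the problem.

```python
BYTES_PER_FLOAT32 = 4
INT32_MAX = 2**31 - 1


def expected_vectors_bytes(num_vectors, dimension, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw payload size in bytes for num_vectors x dimension values (hypothetical helper)."""
    return num_vectors * dimension * bytes_per_value


def exceeds_int32(value):
    """True if value does not fit in a signed 32-bit integer.

    Assumption: this is only a hint to investigate, not a known SPTAG limit.
    """
    return value > INT32_MAX


if __name__ == "__main__":
    n = 1024 * 1024 * 5
    dimension = 512
    payload = expected_vectors_bytes(n, dimension)
    print(f"expected payload: {payload} bytes ({payload / 2**30:.1f} GiB)")
    print("element count exceeds int32:", exceeds_int32(n * dimension))
    print("byte count exceeds int32:", exceeds_int32(payload))
```

Comparing this number with the size reported by the file system for vectors.bin shows how far the written file is from the expected 10G.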
How can I solve this error? Please help me!
About this issue
- State: open
- Created 5 years ago
- Comments: 15 (3 by maintainers)
I have tried your Python code, and the index build finishes successfully. What is your runtime environment?
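If it helps, the basic details can be collected with a short script. This is a minimal sketch using only the standard library plus an optional NumPy import; the function name is hypothetical, and it does not query SPTAG itself, so the SPTAG version or commit and the compiler used would need to be added by hand.

```python
import platform
import struct
import sys


def runtime_environment():
    """Collect basic details worth quoting in a bug report (hypothetical helper)."""
    info = {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
        # 32 on a 32-bit interpreter, 64 on a 64-bit one
        "pointer_bits": struct.calcsize("P") * 8,
    }
    try:
        import numpy
        info["numpy"] = numpy.__version__
    except ImportError:
        info["numpy"] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in runtime_environment().items():
        print(f"{key}: {value}")
```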