faiss: Faiss assertion 'err == CUBLAS_STATUS_SUCCESS'
Platform
OS: Debian 9
Faiss version: 1.6.3
Running on:
- CPU
- GPU
Interface:
- C++
- Python
Reproduction instructions
I’m trying to train a Faiss IVF index on NVIDIA A100 instances. A100 instances currently support CUDA 11.0, but I installed Faiss with CUDA toolkit 10.0.
Code
>>> import numpy as np
>>> import faiss
>>>
>>> d = 256
>>> quantizer = faiss.IndexBinaryFlat(d)
>>> index = faiss.IndexBinaryIVF(quantizer, d, 4096)
>>> xt = faiss.randint((100000, 256 // 8)).astype('uint8')
>>>
>>> index.train(xt)
WARNING clustering 100000 points to 4096 centroids: please provide at least 159744 training points
>>> quantizer2 = faiss.IndexBinaryFlat(d)
>>> index2 = faiss.IndexBinaryIVF(quantizer2, d, 4096)
>>> clustering_index = faiss.index_cpu_to_all_gpus(faiss.IndexFlatL2(d))
>>>
>>> index2.clustering_index = clustering_index
>>> index2.train(xt)
Exception log
Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = float; BT = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at ./faiss/gpu/utils/MatrixMult-inl.cuh:133; details: cublas failed (13): (512, 256) x (4096, 256)' = (512, 4096)
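The log reports only a numeric status, `cublas failed (13)`. As a reading aid, here is a minimal sketch that maps that number back to its name in the `cublasStatus_t` enum. The helper name `decode_cublas_failure` is hypothetical and is not part of Faiss. The regular expression assumes the message format shown in the log above.

```python
import re

# Numeric values of the cublasStatus_t enum from the cuBLAS headers.
CUBLAS_STATUS_NAMES = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def decode_cublas_failure(log_text):
    """Hypothetical helper: name the status in a 'cublas failed (N)' message.

    Returns None when the text contains no such message.
    """
    match = re.search(r"cublas failed \((\d+)\)", log_text)
    if match is None:
        return None
    code = int(match.group(1))
    return CUBLAS_STATUS_NAMES.get(code, f"unknown cuBLAS status {code}")


log = "details: cublas failed (13): (512, 256) x (4096, 256)' = (512, 4096)"
print(decode_cublas_failure(log))  # CUBLAS_STATUS_EXECUTION_FAILED
```

Status 13 means the matrix multiply kernel failed to execute on the device. That fits a binary not built for the GPU's architecture, but it does not prove it.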
I guess this is due to a CUDA version mismatch. Is there a plan to support CUDA 11.0?
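The mismatch guess can be checked with a small table. The A100 has compute capability 8.0 (sm_80), and CUDA 11.0 is the first toolkit that can target it. The sketch below encodes the first toolkit release for a few architectures. The function name `toolkit_can_target` is hypothetical. The check is a simplification: it ignores PTX JIT forward compatibility and says nothing about how a particular Faiss binary was built.

```python
# First CUDA toolkit release able to compile for each compute capability.
# Keys are (major, minor) compute capabilities, values are (major, minor) toolkit versions.
MIN_CUDA_FOR_COMPUTE_CAPABILITY = {
    (6, 0): (8, 0),    # Pascal, e.g. P100
    (7, 0): (9, 0),    # Volta, e.g. V100
    (7, 5): (10, 0),   # Turing, e.g. T4
    (8, 0): (11, 0),   # Ampere, e.g. A100
    (8, 6): (11, 1),   # Ampere, e.g. RTX 30xx
}


def toolkit_can_target(cuda_version, compute_capability):
    """Hypothetical helper: True if the toolkit is new enough for the GPU.

    Raises KeyError for compute capabilities missing from the table
    instead of guessing.
    """
    return cuda_version >= MIN_CUDA_FOR_COMPUTE_CAPABILITY[compute_capability]


# The setup from this report: CUDA toolkit 10.0 on an A100 (sm_80).
print(toolkit_can_target((10, 0), (8, 0)))  # False
# The same GPU with a CUDA 11.0 build.
print(toolkit_can_target((11, 0), (8, 0)))  # True
```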
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (4 by maintainers)
Installing a specific wheel, faiss-gpu==1.7.3, fixed it for me.
I’m having this issue too.
A100, faiss-gpu==1.7.1, cuda==11.1, Ubuntu 20.04
A failed 'err == CUBLAS_STATUS_SUCCESS' assertion is a relatively generic error. If you want help, open a new issue and give more context.
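For anyone opening a new issue, here is a rough sketch of the context that could go into it. The helper name `collect_issue_context` is hypothetical. It assumes `nvidia-smi` is on the PATH when a GPU driver is installed, and it guards `faiss.get_num_gpus` with `getattr` on the assumption that the function may be absent in CPU-only builds.

```python
import platform
import shutil
import subprocess


def collect_issue_context():
    """Hypothetical helper: gather version details worth pasting into a bug report."""
    info = {"os": platform.platform(), "python": platform.python_version()}

    try:
        import faiss
        info["faiss"] = getattr(faiss, "__version__", "unknown")
        # Assumption: get_num_gpus may be missing in CPU-only builds.
        get_num_gpus = getattr(faiss, "get_num_gpus", None)
        info["faiss_gpus"] = get_num_gpus() if get_num_gpus else "CPU-only build"
    except ImportError:
        info["faiss"] = "not installed"

    # Ask the driver for GPU model and driver version, if nvidia-smi is available.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        info["gpus"] = result.stdout.strip().splitlines()

    return info


for key, value in collect_issue_context().items():
    print(f"{key}: {value}")
```

Along with this output, include how Faiss was installed (conda or pip, and which CUDA toolkit build), since that is what the reports in this thread differ on.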
@naveenkumarmarri Were you able to solve your issue? I’m currently having the same problem with an A100 and CUDA 11.