dlib: Error while calling cudaGetDevice(&the_device_id) in file /home/ubuntu/src/dlib-19.4/dlib/dnn/gpu_data.cpp:178

The error happens only when I try to use compute_face_descriptor; all other Python bindings are working fine as far as I can tell.

The initialization is:

RESNET_MODEL = '/mnt/d1/faces_1/models/dlib_face_recognition_resnet_model_v1.dat'
facerec = dlib.face_recognition_model_v1(RESNET_MODEL)

Here is the result:

  File "alignmentCheck.py", line 26, in avgEuclideanDistanceCalculation
    fps.append(facerec.compute_face_descriptor(img, shape, 10))
RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file /home/ubuntu/src/dlib-19.4/dlib/dnn/gpu_data.cpp:178. code: 3, reason: initialization error
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "alignmentCheck.py", line 51, in <module>
    for x in p.imap_unordered(avgEuclideanDistanceCalculation, df['starId'].unique()):
  File "/usr/lib/python3.5/multiprocessing/pool.py", line 695, in next
    raise value
RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file /home/ubuntu/src/dlib-19.4/dlib/dnn/gpu_data.cpp:178. code: 3, reason: initialization error

This is an EC2 p2.xlarge instance with a Tesla K80 GPU:

$ nvidia-smi
Wed May 31 14:55:17 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.51                 Driver Version: 375.51                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   37C    P0    55W / 149W |      0MiB / 11439MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

$ cat /usr/local/cuda/version.txt
CUDA Version 8.0.61

$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 10

$ uname -a
Linux ip-172-31-23-151 4.4.0-1017-aws #26-Ubuntu SMP Fri Apr 28 19:48:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 16 (5 by maintainers)

Most upvoted comments

@AMilkov @terencezl @shang-vikas @OptimusPrimeCao @axmadjon If you are using Python 3.4+ on Unix-like platforms, you can use ‘spawn’ instead of ‘fork’ to start processes; that will not cause the problem. Example code from https://docs.python.org/3/library/multiprocessing.html:

import multiprocessing as mp

def foo(q):
    q.put('hello')

if __name__ == '__main__':
    # Request the 'spawn' start method explicitly; the default on Linux is
    # 'fork', which copies the parent's already-initialized state (including
    # any CUDA context) into the child.
    ctx = mp.get_context('spawn')
    q = ctx.Queue()
    p = ctx.Process(target=foo, args=(q,))
    p.start()
    print(q.get())
    p.join()
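Applied to the script from the question, a minimal sketch could look like the following. The model path, function name, and the compute_face_descriptor(img, shape, 10) call are taken from the question; the init_worker helper, the pool size, and the placeholder ID list are illustrative assumptions.

import multiprocessing as mp
import dlib

RESNET_MODEL = '/mnt/d1/faces_1/models/dlib_face_recognition_resnet_model_v1.dat'
facerec = None  # loaded per worker, not in the parent before the pool starts

def init_worker():
    # Each spawned worker loads the model itself, so any CUDA context is
    # created inside that process rather than inherited from the parent.
    global facerec
    facerec = dlib.face_recognition_model_v1(RESNET_MODEL)

def avgEuclideanDistanceCalculation(star_id):
    # ... load img/shape for this id and call
    # facerec.compute_face_descriptor(img, shape, 10) here ...
    return star_id

if __name__ == '__main__':
    star_ids = [1, 2, 3]  # placeholder for df['starId'].unique()
    ctx = mp.get_context('spawn')
    with ctx.Pool(processes=2, initializer=init_worker) as p:
        for x in p.imap_unordered(avgEuclideanDistanceCalculation, star_ids):
            print(x)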

That’s not a Python problem, that’s just how subprocesses and forking work in general. You can’t allocate arbitrary resources in a process, then fork, and assume those resources are still valid in the child. This is especially true for hardware resources like a GPU context.

Anyway, this isn’t a dlib problem. You just have to read the manual carefully for the Python multiprocessing features you are using and understand their limitations and how to use them.
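To see the difference between the two start methods without a GPU, here is a small self-contained sketch. It is not dlib-specific; the module-level 'resource' string is just a stand-in for state such as a CUDA context.

import multiprocessing as mp
import os

# Module-level "resource": under 'fork' the worker inherits the value the
# parent created; under 'spawn' the main module is re-imported in the worker
# and the resource is initialized there. A CUDA context behaves like the
# 'fork' case, except that the inherited copy is unusable after the fork.
resource = 'initialized in pid {}'.format(os.getpid())

def show(_):
    return 'worker pid {} sees: {}'.format(os.getpid(), resource)

if __name__ == '__main__':
    for method in ('fork', 'spawn'):
        ctx = mp.get_context(method)
        with ctx.Pool(1) as pool:
            print(method, '->', pool.map(show, [0])[0])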