tensorflow: CUDA_ERROR_MISALIGNED_ADDRESS on MNIST example
Summary
What might be causing this error when running python tensorflow/models/image/mnist/convolutional.py?
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS
Environment info
Operating System: Linux Lounge 4.5.6-200.fc23.x86_64 #1 SMP Wed Jun 1 21:28:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):
ls -l /usr/local/cuda-7.5/lib64/libcud*
-rw-r--r--. 1 root root 322936 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudadevrt.a
lrwxrwxrwx. 1 root root 16 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart.so -> libcudart.so.7.5
lrwxrwxrwx. 1 root root 19 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart.so.7.5 -> libcudart.so.7.5.18
-rwxr-xr-x. 1 root root 383336 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart.so.7.5.18
-rw-r--r--. 1 root root 720192 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart_static.a
-rwxr-xr-x. 1 root root 61453024 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so
-rwxr-xr-x. 1 root root 61453024 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.4
-rwxr-xr-x. 1 root root 61453024 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.4.0.7
-rwxr-xr-x. 1 root root 59909104 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.5
-rwxr-xr-x. 1 root root 59909104 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.5.0.5
-rw-r--r--. 1 root root 62025862 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn_static.a
If installed from binary pip package, provide:
1. Which pip package you installed.
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL
2. The output from python -c "import tensorflow; print(tensorflow.__version__)".
python -c "import tensorflow; print(tensorflow.__version__)"
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
If installed from sources, provide the commit hash:
Steps to reproduce
1. python tensorflow/models/image/mnist/convolutional.py
2. Observe error CUDA_ERROR_MISALIGNED_ADDRESS
3. Scratch head
What have you tried?
- Searching the internet for clues, none found
Logs or other output that would be helpful
(If logs are large, please upload as attachment.) Results of cuda-memcheck and dmesg: error.txt
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 21 (7 by maintainers)
From an offline conversation, we can confirm that this problem goes away with the latest driver. So it does seem like a JIT compiler issue that the latest driver fixes.
@johnfrombluff, @tsitsilin, @acowlikeobject, @kalleknast, @dzupin, @floringogianu, @MartianWearables, sorry that we cannot reproduce this problem on our side. I will try to guess where the problem is and see whether it could be fixed.
Among the folks who encountered this problem, what is common is that all used GM107- and GM108-based GPUs, that is, compute capability 5.0. The TensorFlow binary by default carries compute capabilities 3.5 and 5.2. The CUDA driver will extract the compute 3.5 PTX and JIT-compile it into compute 5.0 SASS on the first run. Given that the error message is “Invalid local read of size 16”, my current guess is that the JIT compiler in the CUDA driver is generating wrong code for tf.nn.softmax on GPUs with compute capability 5.0.
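For anyone who wants to isolate this, here is a minimal sketch (assuming the 0.9-era Python API; the shapes are only illustrative) that exercises tf.nn.softmax on the GPU by itself, so the suspected JIT-compiled kernel can be tested outside the full convolutional model:

# Minimal sketch: pin tf.nn.softmax alone to /gpu:0 to see whether the
# suspected miscompiled kernel faults outside the full MNIST model.
import numpy as np
import tensorflow as tf

logits = tf.placeholder(tf.float32, shape=[64, 10])  # MNIST-sized logits
with tf.device('/gpu:0'):
    probs = tf.nn.softmax(logits)                    # the op under suspicion

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    feed = {logits: np.random.randn(64, 10).astype(np.float32)}
    print(sess.run(probs, feed_dict=feed).sum(axis=1))  # each row should sum to ~1.0

If this tiny graph reproduces CUDA_ERROR_MISALIGNED_ADDRESS, that would support the softmax-JIT theory; if it runs cleanly, the problem is likely elsewhere in the model.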
Here are a number of things to try:
If #1 still fails, we can dump the SASS code from your binary and see what goes wrong.
I’ve run into exactly the same problem as described by floringogianu, except with Ubuntu 16.04 and gcc 4.9. Also, I used the --override flag when installing the CUDA toolkit via the .run script, which may or may not be relevant. cifar10 runs fine.
To expand on @zheng-xq's fix:
- Check your driver version with nvidia-smi | grep "Driver Version" (or by using the GUI: nvidia-settings); nvidia-smi might also be handy to find your GPU.
- sudo apt-get install gcc-4.9 g++-4.9
edit: Updating the driver seems not to be that easy (see ask.SE question). @zheng-xq Could you please add some details on how to build TensorFlow with the compute capability set explicitly to 5.0? Is it possible to build TensorFlow when CUDA was installed via apt-get (and thus there is no single cuda folder)?

I tried running the MNIST example after I installed TensorFlow in a virtualenv and I got the same error: Ubuntu 16, gcc 5.3.1, Python 3.5.1, driver version 361.42, CUDA 7.5, this time with a GTX 960 with 4 GiB, which should be more than enough for this network model:
python -m tensorflow.models.image.mnist.convolutional
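Since the discussion above hinges on which compute capability the runtime actually registers, here is a hedged way to check from Python; device_lib is a private helper and may not be present in every build, and if it is missing, the Session constructor's startup log also prints "major: X minor: Y" for each GPU it finds:

# Sketch: list the devices TensorFlow registered; the GPU entry's description
# includes the device name, and the session startup log reports major/minor
# (i.e. the compute capability) for each GPU.
from tensorflow.python.client import device_lib

for dev in device_lib.list_local_devices():
    print(dev.name, dev.physical_device_desc)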
edit: Running the cifar10 model seems to be working just fine…

@zheng-xq I see the same error when running the MNIST test.
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS
Also on Ubuntu 14.04, CUDA v7.5, cuDNN v4, using nvidia-docker with this image.
This is using a GTX 960M (I use it for sanity checks before spinning up servers).
I’m calling it via the Keras MNIST example. The same example works fine using the Theano backend (via the Keras configuration).
Attachments: cuda-memcheck.txt, environment.txt
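For anyone comparing the two backends the same way, a small sketch of how the switch can be made per run (Keras reads the KERAS_BACKEND environment variable at import time, falling back to ~/.keras/keras.json):

# Sketch: choose the Keras backend before importing keras, so the same MNIST
# script can be run once against Theano and once against TensorFlow.
import os
os.environ['KERAS_BACKEND'] = 'theano'   # or 'tensorflow'

import keras
print(keras.backend.backend())           # confirms which backend was actually loaded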
Sorry, I’m still confused. What block size are you referring to? Which file(s) should I look at to find what you’re talking about?
I’m trying to run example code that comes with the TensorFlow distribution. Shouldn’t that code run on all supported architectures? Maybe GNU/Linux or my GPU is not supported, but I haven’t seen that noted in the documentation.
And thank you for your attempt to help me!