tensorflow: TensorFlow binary crashes on Apple M1 in x86_64 Docker container


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version (use command below): TensorFlow 2.6.0, tf-nightly 2.8.0.dev20211028
  • Python version: 3.6.9, 3.7.x, 3.8.x
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior

dwyatte-macbookpro:~ dwyatte$ docker run tensorflow/tensorflow:latest python -c "import tensorflow as tf"    
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
2021-10-28 22:50:41.481158: F tensorflow/core/lib/monitoring/sampler.cc:42] Check failed: bucket_limits_[i] > bucket_limits_[i - 1] (0 vs. 10)
qemu: uncaught target signal 6 (Aborted) - core dumped

Describe the expected behavior

Clean exit

Standalone code to reproduce the issue

Requires an Apple M1 (arm64) host OS:

docker run tensorflow/tensorflow:latest python -c "import tensorflow as tf"

This was previously mentioned in https://github.com/tensorflow/tensorflow/issues/42387 but unfortunately closed. When importing TensorFlow in an x86_64 docker container on an Apple M1, TensorFlow crashes. As far as I can tell, this should work as I can import and use other Python packages in the same container without problems (including things like numpy).

It’s unclear whether this is something that can be avoided at the TensorFlow level or an unavoidable bug in qemu ([1], [2]), but I wanted to reraise the issue.

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 79
  • Comments: 49 (16 by maintainers)

Most upvoted comments

Hi,

I have the exact same issue, and it is hindering my development process. My app is deployed on an x86 server, but I need to develop code locally on my M1 Mac under emulation before pushing it to production.

All other major data science packages work correctly under x86 rosetta emulation: pandas, scikit-learn, torch, transformers, spacy, xgboost, lightgbm.

I appreciate the great work you are doing with TensorFlow. I would be really grateful if you could take the time to help the data scientists / ML engineers out there who are using ARM-based development laptops.

Thanks a lot,

Alex

PS: I am not interested in forks like tensorflow-macos etc as I need my work to be cross-platform.

I am taking a class where we use TensorFlow inside Docker containers, and everybody in that class with an M1 Mac has hit this exact same issue, including me. Unfortunately nobody has found a fix, so I am going to subscribe to this issue as well. I hope there is some kind of workaround/solution!

While this issue was originally opened around emulating x86_64 TensorFlow in Docker, it does look like there are now TensorFlow aarch64 binaries that can be used in linux/arm64/v8 Docker containers. More info here: https://blog.tensorflow.org/2022/09/announcing-tensorflow-official-build-collaborators.html

Dockerfile:

FROM python:3.7-slim

RUN pip install tensorflow==2.10.0 tensorflow-io==0.27.0
CMD python -c "import tensorflow as tf; print(tf.constant(42) / 2 + 2)"

Build and run:

docker build --platform=linux/arm64/v8 . -t tensorflow
docker run --platform=linux/arm64/v8 tensorflow

Output:

tf.Tensor(23.0, shape=(), dtype=float64)

@coreation

So, an update since I last wrote. Last time I said I got it working using emulation (so running on x86). That is not ideal because it's slow; it's much faster if you can get it running on aarch64. It all depends on what other libraries you are installing and what your base Docker image looks like.

What Docker image are you using? When you type uname -m inside the running Docker container, do you get aarch64 or x86_64?

If you are doing something like pip install tensorflow==2.3.1, it's going to use one of the official TensorFlow wheels, and which one it uses depends on whether the Docker container is running aarch64 or x86_64. But both will have issues.

If you are running x86_64, it's going to be slow. To get it working you need to install an unofficial community wheel built without AVX. You can see a list of such options here. For example, when I was trying the x86_64 emulation route, I first tried "TensorFlow 2.7.0, no AVX, no GPU, Python 3.7/3.8/3.9, Ubuntu 18.04, multiple archs", which is this link. I tried the barcelona one, and all of its builds are here. You then have to choose the one for your TF version and Python version. I tried TF 2.7 and Python 3.8, so I used tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl, which is this link: https://tf.novaal.de/barcelona/tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl. To install that wheel with pip, you do pip install https://tf.novaal.de/barcelona/tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl (pointing pip directly at the wheel URL rather than passing it to -f, which expects an index page).
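
For illustration, a minimal Dockerfile sketch of that x86_64 route (the base image and the TF 2.7 / Python 3.8 wheel are the ones from this comment; the image tag and the smoke-test command are made up):

FROM python:3.8-slim
# Community wheel built without AVX (cp38, linux_x86_64)
RUN pip install https://tf.novaal.de/barcelona/tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl

docker build --platform=linux/amd64 -t tf-noavx .
docker run --platform=linux/amd64 tf-noavx python -c "import tensorflow as tf; print(tf.__version__)"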

I don't recommend that route ^^ though, because emulation on the M1 chip is not ideal. It's better to get your Docker container running on aarch64. For example, I use the Docker image FROM python:3.8-slim and an aarch64 community wheel for TensorFlow: in the Dockerfile, RUN pip install tensorflow -f https://tf.kmtea.eu/whl/stable.html. I got the wheel from here.
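
A minimal sketch of that aarch64 route (the base image and wheel index are the ones mentioned above; the image tag and the CMD are just an illustrative smoke test):

FROM python:3.8-slim
# aarch64 community wheels resolved from the index below
RUN pip install tensorflow -f https://tf.kmtea.eu/whl/stable.html
CMD python -c "import tensorflow as tf; print(tf.__version__)"

docker build --platform=linux/arm64/v8 -t tf-aarch64 .
docker run --platform=linux/arm64/v8 tf-aarch64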

Now, in our dev/prod environments we actually manage all dependencies with poetry and run everything on x86, so we just use the official TensorFlow wheels; our apps don't run on aarch64 in production. But since I use a MacBook with an M1 chip for local development, I needed to get my local dev environment running on the M1, and it's great/fast if I use aarch64. So I had to uninstall the official TensorFlow from my container and use the above-mentioned hack of installing the aarch64 TensorFlow build. It's working great.

Let me know if you need more details.

So I do think this is due to AVX instructions. If I install an unofficial wheel (e.g., from https://github.com/yaroslavvb/tensorflow-community-wheels/issues/198) and run a variant of the docker run command above, I do not get a crash on import.

dwyatte-macbookpro:~ dwyatte$ docker run -it tensorflow/tensorflow:latest bash -c 'pip uninstall -y tensorflow-cpu && pip install -U https://tf.novaal.de/barcelona/tensorflow-2.6.0-cp38-cp38-linux_x86_64.whl && python -c "import tensorflow as tf; tf.print(\"hello world\")"'
...
2021-11-15 23:44:35.660302: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
hello world

Thanks for the lead @bhack. I agree, the solutions you mention are:

  1. Publishing non-AVX wheels (or having non-AVX code paths available within a single wheel)
  2. Correctly handling the AVX instructions in qemu via emulation/TCG/etc.

any update?

For performance you need to use tensorflow-macos

@sachinprasadhs Will Google release prebuilt ARM64 Docker images to Docker Hub? I’m especially interested in an ARM64 tensorflow/serving image.

Thanks @DrChrisLevy!

It also turns out TensorFlow has Dockerfiles for arm64v8, but they don't push the images, so you need to build them yourself. Useful for keeping development Docker images as close as possible to the official ones.


Edit: Two gotchas:

  1. Remove the enum34 python package from the Dockerfile
2. Remove the pinned numpy version in the Dockerfile (the version can just be the latest)

For posterity:

git clone https://github.com/tensorflow/tensorflow.git --depth 1
cd tensorflow/tensorflow/tools/dockerfiles

# edit arm64v8/devel-cpu-arm64v8-jupyter.Dockerfile and remove enum34 / remove the version from numpy

docker build -f ./arm64v8/devel-cpu-arm64v8-jupyter.Dockerfile -t tf-devel-cpu-arm64v8-jupyter .

Then make a new Dockerfile, and use this:

FROM tf-devel-cpu-arm64v8-jupyter

# Or use this version: RUN pip install tensorflow-aarch64 -f https://tf.kmtea.eu/whl/stable.html

RUN pip install tensorflow -f https://tf.kmtea.eu/whl/stable.html

@learning-to-play It would be great for the community if we had prebuilt images for all architectures that we support. 🙏

This is still an issue. It is just sad that we (M-chip users) do not have official Docker support.

@dwyatte, thanks for confirming. If your issue is resolved, could you please close this issue?

Sure, I think we can close this now. QEMU also appears to have merged AVX instruction support, so once that is pulled into Docker, it might also be possible to run via emulation.

https://gitlab.com/qemu-project/qemu/-/issues/164#note_1140802183

No update on this?

@janvdp have you made any more progress in the last 2 months? I added the --platform argument to my Dockerfile to run linux/amd64 and installed an unofficial non-AVX TensorFlow 2.7. I finally got the image built, but damn it's slow; the unit tests are so much slower. Many of the tests also fail because, for some reason, some of the rounded-precision assertions come out differently, etc.

If you have made any progress, let me know!

https://github.com/apple/tensorflow_macos/issues/164#issuecomment-776785984

https://github.com/ARM-software/Tool-Solutions/tree/master/docker/tensorflow-aarch64

But since someone may still need to use this under emulation, I suppose it could be a qemu bug with DBL_MAX under emulation.

Are there any updates here? Not being able to run TensorFlow in Docker on our new, powerful M1-based machines is both a big downer and quite an annoyance to "solve". At the moment our best workaround is using an Intel-based VM for development, which feels quite unnecessary.

Thanks @mohantym

The links just reference the warning above, which I believe is innocuous since Docker can emulate the image's platform. TensorFlow doesn't publish official linux/arm64/v8 images (that would require an aarch64 TensorFlow build), but I would think such images would remove the warning. Note that the problem is specifically with TensorFlow's assumptions about the emulated platform, not with the image or other libraries, which run fine when emulating linux/amd64:

dwyatte-macbookpro:~ dwyatte$ docker run tensorflow/tensorflow:latest python -c "import numpy as np; print(np.random.rand(10))"   
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
[0.86125896 0.40657583 0.76832123 0.77205272 0.99326573 0.513298
 0.64218547 0.15977918 0.37553315 0.56692333]

I suspect Check failed: bucket_limits_[i] > bucket_limits_[i - 1] (0 vs. 10) is a sanity check that TensorFlow runs on startup that fails under emulation. IMO this issue is about whether there is anything that can be done on the TensorFlow side to relax or correct this check or whether this is a critical check that is violated e.g., by qemu (https://gitlab.com/qemu-project/qemu/-/issues/601 suggests it could be floating point inaccuracy, although that seems to just be a guess).
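
For illustration only (this is not TensorFlow's actual source, just the invariant described by the log message), the check requires each histogram bucket limit to be strictly greater than the previous one, so a limit of 0 appearing after a limit of 10 trips it, matching the "(0 vs. 10)" in the output:

python -c "limits = [10.0, 0.0]; assert all(b > a for a, b in zip(limits, limits[1:])), 'Check failed: bucket_limits_[i] > bucket_limits_[i - 1]'"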

The slowdown with non-AVX TensorFlow wheels is likely due mainly to emulation (rather than anything inherent to TensorFlow itself). IMO an ideal outcome would be to have official aarch64 wheels on PyPI that can use platform detection to pip install tensorflow without having to specify a community webpage. This would also seamlessly support the workflows being discussed here (running an aarch64 image locally and building/deploying an amd64 image in production)
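
For illustration, a hedged sketch of that workflow using docker buildx, assuming TensorFlow wheels resolve from PyPI on both platforms, which is exactly what this comment is asking for on the aarch64 side (the image names, tags, and CMD are made up):

FROM python:3.8-slim
RUN pip install tensorflow
CMD python -c "import tensorflow as tf; print(tf.__version__)"

# Local development image on the M1 host (aarch64):
docker buildx build --platform=linux/arm64/v8 -t myapp:dev --load .

# Production image (amd64), e.g. built in CI:
docker buildx build --platform=linux/amd64 -t myapp:prod --load .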

While this issue likely has more visibility, https://github.com/tensorflow/tensorflow/issues/52973 is probably the better place to discuss aarch64 wheels.

Here are some benchmarks for a subset of my unit tests (roughly an order of magnitude slower under emulation):

dwyatte-macbookpro:tensorflow-test dwyatte$ ./build_and_test.sh 
[+] Building 0.9s (8/8) FINISHED                                                                                                                                                                                        
 => [internal] load build definition from Dockerfile_amd64                                                                                                                                                         0.0s
 => => transferring dockerfile: 209B                                                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/python:3.7-slim                                                                                                                                                 0.8s
 => [1/4] FROM docker.io/library/python:3.7-slim@sha256:71287598b4d9fcc01fa3949035aeace14c6bde733462bb4f64fb4ee13c6b3fec                                                                                           0.0s
 => CACHED [2/4] RUN pip install https://tf.novaal.de/westmere/tensorflow-2.7.0-cp37-cp37m-linux_x86_64.whl                                                                                                        0.0s
 => CACHED [3/4] RUN pip install pandas                                                                                                                                                                            0.0s
 => CACHED [4/4] RUN pip install pytest                                                                                                                                                                            0.0s
 => exporting to image                                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                                            0.0s
 => => writing image sha256:af60a6f5973c2a8568619c233580d5046670c3dfa5633e287e20869a99a9365e                                                                                                                       0.0s
 => => naming to docker.io/library/tensorflow-test                                                                                                                                                                 0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
================================================================================================= test session starts ==================================================================================================
platform linux -- Python 3.7.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /host
collected 19 items                                                                                                                                                                                                     

tests/test_callbacks.py ...........                                                                                                                                                                              [ 57%]
tests/test_datasets.py ...                                                                                                                                                                                       [ 73%]
tests/test_layers.py .....                                                                                                                                                                                       [100%]

=================================================================================================== warnings summary ===================================================================================================
../usr/local/lib/python3.7/site-packages/flatbuffers/compat.py:19
  /usr/local/lib/python3.7/site-packages/flatbuffers/compat.py:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

-- Docs: https://docs.pytest.org/en/stable/warnings.html
============================================================================================ 19 passed, 1 warning in 3.17s =============================================================================================
[+] Building 0.5s (8/8) FINISHED                                                                                                                                                                                        
 => [internal] load build definition from Dockerfile_aarch64                                                                                                                                                       0.0s
 => => transferring dockerfile: 474B                                                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/python:3.7-slim                                                                                                                                                 0.4s
 => [1/4] FROM docker.io/library/python:3.7-slim@sha256:71287598b4d9fcc01fa3949035aeace14c6bde733462bb4f64fb4ee13c6b3fec                                                                                           0.0s
 => CACHED [2/4] RUN pip install https://snapshots.linaro.org/ldcg/python/tensorflow-io-manylinux/7/tensorflow-io-gcs-filesystem/tensorflow_io_gcs_filesystem-0.21.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2  0.0s
 => CACHED [3/4] RUN pip install pandas                                                                                                                                                                            0.0s
 => CACHED [4/4] RUN pip install pytest                                                                                                                                                                            0.0s
 => exporting to image                                                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                                                            0.0s
 => => writing image sha256:f1a848557f756bf7c99fdbe7bf666d3cb4272eec73a63bdd8abe20283e492aef                                                                                                                       0.0s
 => => naming to docker.io/library/tensorflow-test                                                                                                                                                                 0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
================================================================================================= test session starts ==================================================================================================
platform linux -- Python 3.7.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /host
collected 19 items                                                                                                                                                                                                     

tests/test_callbacks.py ...........                                                                                                                                                                              [ 57%]
tests/test_datasets.py ...                                                                                                                                                                                       [ 73%]
tests/test_layers.py .....                                                                                                                                                                                       [100%]

=================================================================================================== warnings summary ===================================================================================================
../usr/local/lib/python3.7/site-packages/flatbuffers/compat.py:19
  /usr/local/lib/python3.7/site-packages/flatbuffers/compat.py:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

-- Docs: https://docs.pytest.org/en/stable/warnings.html
============================================================================================ 19 passed, 1 warning in 0.32s =============================================================================================

If you are using an old 1.x version of TensorFlow, downgrading to at most 1.5 with Python 3.6 and running Docker targeting x86 is a quick, crappy solution, since the AVX requirement was introduced in 1.6.
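
For illustration, a hedged sketch of that stopgap (the exact version pin and base image are assumptions; TensorFlow 1.5 only ships wheels up to Python 3.6):

FROM python:3.6-slim
# Pre-AVX TensorFlow; the official wheels for 1.6+ require AVX
RUN pip install tensorflow==1.5.1

docker build --platform=linux/amd64 -t tf15 .
docker run --platform=linux/amd64 tf15 python -c "import tensorflow as tf; print(tf.__version__)"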

We don't really use TensorFlow for anything critical (it was kind of a dev's pet project a few years back), so this works for us. Sounds like either qemu needs to support AVX to move this forward, OR TensorFlow starts publishing multi-arch wheels without AVX?

@hoangmt So what do I put in my Pipfile if I want a version that I can install both in Docker on my M1 machine and build on linux/x64 to send to, for example, Vertex AI as a training job?

Locally, I don’t care about GCP access, it just needs to work so I can develop models with a tiny version of my dataset. The M1-ish GPUs are not good for training models anyway. The “Neural Engine” only accelerates inference AFAIK.

Basically this

IMO an ideal outcome would be to have official aarch64 wheels on PyPI that can use platform detection to pip install tensorflow without having to specify a community webpage. This would also seamlessly support the workflows being discussed here (running an aarch64 image locally and building/deploying an amd64 image in production)

@coreation Not sure about tensorflow-serving; I'm not using it.

Re: numpy, I hit that too; I updated the comment above.