tensorflow: TensorFlow binary crashes on Apple M1 in x86_64 Docker container
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
- TensorFlow installed from (source or binary): Binary
- TensorFlow version (use command below): TensorFlow 2.6.0, tf-nightly 2.8.0.dev20211028
- Python version: 3.6.9, 3.7.x, 3.8.x
- CUDA/cuDNN version: N/A
- GPU model and memory: N/A
Describe the current behavior

```
dwyatte-macbookpro:~ dwyatte$ docker run tensorflow/tensorflow:latest python -c "import tensorflow as tf"
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
2021-10-28 22:50:41.481158: F tensorflow/core/lib/monitoring/sampler.cc:42] Check failed: bucket_limits_[i] > bucket_limits_[i - 1] (0 vs. 10)
qemu: uncaught target signal 6 (Aborted) - core dumped
```
Describe the expected behavior
Clean exit
Standalone code to reproduce the issue
Requires an Apple M1 (arm64) host OS:

```
docker run tensorflow/tensorflow:latest python -c "import tensorflow as tf"
```
This was previously mentioned in https://github.com/tensorflow/tensorflow/issues/42387, but that issue was unfortunately closed. When importing TensorFlow in an x86_64 docker container on an Apple M1, TensorFlow crashes. As far as I can tell, this should work, since I can import and use other Python packages in the same container without problems (including things like numpy).
It’s unclear whether this is something that can be avoided at the TensorFlow level or an unavoidable bug in qemu ([1], [2]), but I wanted to reraise the issue.
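For context, the failing `CHECK` in `sampler.cc` enforces that a histogram sampler's bucket limits are strictly increasing; the `(0 vs. 10)` in the log means a later limit came out as 0 under emulation. A minimal Python sketch of that invariant (the function name is illustrative, not TensorFlow's API):

```python
def bucket_limits_are_valid(bucket_limits):
    """Each bucket limit must be strictly greater than the previous one,
    mirroring the CHECK in tensorflow/core/lib/monitoring/sampler.cc."""
    return all(b > a for a, b in zip(bucket_limits, bucket_limits[1:]))

print(bucket_limits_are_valid([1, 10, 100]))  # True: strictly increasing
print(bucket_limits_are_valid([10, 0]))       # False: reproduces "0 vs. 10"
```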
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 79
- Comments: 49 (16 by maintainers)
Commits related to this issue
- Use community version of Tensorflow that works with M1 The TensorFlow binary downloaded from a normal TensorFlow 2.3.1 pip install (from requirements) was crashing when we used the linux/x86_64 emula... — committed to meedan/alegre by hartsick 2 years ago
- Bump ujson from 1.35 to 5.4.0 (#243) * Bump ujson from 1.35 to 5.4.0 Bumps [ujson](https://github.com/ultrajson/ultrajson) from 1.35 to 5.4.0. - [Release notes](https://github.com/ultrajson/ultra... — committed to meedan/alegre by dependabot[bot] 2 years ago
Hi,
I have the exact same issue. It is hindering my development process. While my app is deployed on an x86 server, I do need to use my M1 mac with emulation to develop code locally and to push it to production.
All other major data science packages work correctly under x86 rosetta emulation: pandas, scikit-learn, torch, transformers, spacy, xgboost, lightgbm.
I appreciate the great work you are doing with TensorFlow. I would be really grateful if you could take the time to help the data scientists / ML engineers out there who are using ARM-based development laptops.
Thanks a lot,
Alex
PS: I am not interested in forks like tensorflow-macos etc as I need my work to be cross-platform.
I am taking a class where we use tensorflow inside docker containers, and everybody with an M1 mac in that class had this exact same issue, including me. Unfortunately nobody has found a fix, so I am going to subscribe to this issue as well. I hope there exists some kind of workaround/solution!
While this issue was originally opened around emulating TensorFlow on x86_64 in Docker, it does look like there are now TensorFlow aarch64 binaries that can be used in linux/arm64/v8 Docker containers. More info here: https://blog.tensorflow.org/2022/09/announcing-tensorflow-official-build-collaborators.html
@coreation
So, an update since I last wrote. Last time I said I got it working using emulation (so running on x86). This is not ideal because it's slow; it's much faster if you can get it running on aarch64. It all depends on what other libraries you are installing and what your base docker image looks like.
What docker image are you using? When you type `uname -m` within the running docker container, do you get `aarch64` or `x86_64`?

If you are doing something like `pip install tensorflow==2.3.1`, it's going to use one of the official tensorflow wheels, and which one it uses depends on whether the docker container is running `aarch64` or `x86_64`. But both will have issues.

If you are running `x86_64`, then it's going to be slow. But to get it working you need to install an unofficial community wheel built without AVX. You can see a list of such options here. For example, when I was trying to go this `x86_64` emulation route, I first tried "TensorFlow 2.7.0 No AVX, No GPU, Python 3.7, 3.8, 3.9, Ubuntu 18.04, multiple Archs", which is this link. I tried the `barcelona` one, and all of its builds are here. You then have to choose the one for your TF version and Python version. I tried TF 2.7 and Python 3.8, so I used `tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl`, which is this link: https://tf.novaal.de/barcelona/tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl. So to install this with pip you do `pip install https://tf.novaal.de/barcelona/tensorflow-2.7.0-cp38-cp38-linux_x86_64.whl`.

I don't recommend this route ^^ though, because emulation on the M1 chip is not ideal. It's better to get your docker container running on `aarch64`. For example, I use the docker image `FROM python:3.8-slim`, with an aarch64 community wheel for TensorFlow installed in the Dockerfile via `RUN pip install tensorflow -f https://tf.kmtea.eu/whl/stable.html`. I got the wheel from here.

Now, in our DEV/PROD env we manage all dependencies with poetry and run everything on x86, so we just use the official tensorflow wheels; our apps don't run on aarch64 in production. But since I use a MacBook with an M1 chip for local development, I needed to get my local dev env running on it, and it's great/fast if I use `aarch64`. So I had to uninstall the official tensorflow from my container and then use the above-mentioned hack of installing TensorFlow for aarch64. But it's working great.

Let me know if you need more details.
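Putting the aarch64 route above together, a minimal Dockerfile sketch (assuming the community wheel index at tf.kmtea.eu mentioned above is still available):

```dockerfile
# Minimal sketch of the aarch64 route described above: a slim Python base
# image plus a community-built aarch64 TensorFlow wheel.
FROM python:3.8-slim

# -f points pip at the community wheel index hosting aarch64 builds
RUN pip install tensorflow -f https://tf.kmtea.eu/whl/stable.html
```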
So I do think this is due to AVX instructions. If I install an unofficial wheel (e.g., from https://github.com/yaroslavvb/tensorflow-community-wheels/issues/198) and run a variant of the `docker run` command above, I do not get a crash on import.

Thanks for the lead @bhack. I agree; some solutions, which you mention, are:
1. Publishing non-AVX wheels (or having non-AVX code paths available within a single wheel)
2. Correctly handling AVX in qemu via emulation/TCG/etc.
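One way to see why the official wheels abort while non-AVX builds import fine: check whether the CPU the container sees advertises AVX at all. A small sketch, parsing the kind of `flags` line found in `/proc/cpuinfo` on Linux (the helper name is ours, not TensorFlow's):

```python
def cpu_has_avx(flags_line: str) -> bool:
    # True only if the space-separated flags list contains the exact
    # token "avx" (note "avx2" is a separate flag and does not match).
    return "avx" in flags_line.split()

print(cpu_has_avx("fpu vme sse sse2 avx avx2"))  # True on typical native x86_64
print(cpu_has_avx("fpu vme sse sse2"))           # False, as under older qemu
```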
any update?
For performance you need to use `tensorflow-macos`.

@sachinprasadhs Will Google release prebuilt ARM64 Docker images to Docker Hub? I'm especially interested in an ARM64 tensorflow/serving image.
Thanks @DrChrisLevy!
It also turns out tensorflow has Dockerfiles for arm64v8, but they don’t push the images, so you need to build them yourself. Useful for trying to keep development docker images as close as possible to the official ones.
Edit: Two gotchas:
- remove the `enum34` python package from the Dockerfile
- unpin `numpy` in the Dockerfile (you can let the version be the latest)

For posterity:

Then make a new Dockerfile, and use this:
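The snippet itself wasn't preserved in this archive. As a rough, hypothetical sketch of its shape (the `tensorflow-arm64v8:latest` tag is a placeholder for whatever you tagged your local build as):

```dockerfile
# Hypothetical sketch: build on top of the arm64v8 TensorFlow image you
# built locally from TensorFlow's own Dockerfiles.
FROM tensorflow-arm64v8:latest

# Your project-specific setup goes on top of the base image.
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
```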
@learning-to-play It would be great for the community if we had prebuilt images for all architectures that we support. 🙏
This is still an issue. It is just sad that we (M-chip users) do not have official docker support.
Sure, I think we can close this now. QEMU also appears to have merged AVX instructions, so once that is pulled into Docker, it might also be possible to run via emulation.
https://gitlab.com/qemu-project/qemu/-/issues/164#note_1140802183
No update on this?
@janvdp have you made any more progress in the last 2 months? I added the --platform argument to my Dockerfile to run linux/amd64 and installed an unofficial non-AVX TensorFlow 2.7. I finally got the image built, but damn it's slow. The unit tests are so much slower. Also, many of the tests fail because, for some reason, some of the rounded precision assertions come out differently, etc.
If you have made any progress let me know !
https://github.com/apple/tensorflow_macos/issues/164#issuecomment-776785984
https://github.com/ARM-software/Tool-Solutions/tree/master/docker/tensorflow-aarch64
But as someone still needs to use this in emulation, I suppose it could be a qemu bug with `DBL_MAX` in emulation.

Are there any updates here? Not being able to run tensorflow in docker on our new, powerful M1-based machines is both a big downer and quite the annoyance to "solve". ATM our best workaround is having an intel-based VM for development, which feels quite unnecessary.
Thanks @mohantym
The links just reference the warning above which I believe is innocuous since Docker can emulate the image’s platform. TensorFlow doesn’t publish official linux/arm64/v8 images (would require an aarch64 TensorFlow build), but I would think that would remove the warning. Note that the problem is specifically with TensorFlow’s assumptions about the emulated platform and not the image or other libraries, which run fine when emulating linux/amd64:
I suspect `Check failed: bucket_limits_[i] > bucket_limits_[i - 1] (0 vs. 10)` is a sanity check that TensorFlow runs on startup that fails under emulation. IMO this issue is about whether there is anything that can be done on the TensorFlow side to relax or correct this check, or whether this is a critical check that is violated e.g. by qemu (https://gitlab.com/qemu-project/qemu/-/issues/601 suggests it could be floating point inaccuracy, although that seems to just be a guess).

The slowdown with non-AVX TensorFlow wheels is likely mainly due to emulation (as opposed to anything inherent to TensorFlow itself). IMO an ideal outcome would be to have official aarch64 wheels on PyPI, so that platform detection lets users `pip install tensorflow` without having to specify a community webpage. This would also seamlessly support the workflows being discussed here (running an aarch64 image locally and building/deploying an amd64 image in production).

While I suspect this issue might have more visibility, https://github.com/tensorflow/tensorflow/issues/52973 is probably better for discussing aarch64 wheels.
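To see the platform detection that would make a plain `pip install tensorflow` pick the right wheel, you can inspect the platform string pip derives tags from for the current interpreter (a sketch using only the standard library):

```python
import sysconfig

# pip matches wheels against tags derived from this value, e.g.
# "linux-x86_64" or "linux-aarch64"; with official aarch64 wheels on
# PyPI, an arm64 container would resolve `pip install tensorflow`
# automatically, no community index URL needed.
print(sysconfig.get_platform())
```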
Here are some benchmarks for a subset of my unit tests (roughly an order of magnitude slower under emulation):
If you are using an old 1.x version of TensorFlow, downgrading to at most 1.5 (with Python 3.6) and running docker targeting x86 is a quick, crappy solution, since AVX was introduced in 1.6.
We don't really use TensorFlow for anything critical (it was kind of a dev's pet project a few years back), so this works for us. Sounds like either qemu needs to support AVX to move this forward, OR TensorFlow starts publishing multi-arch wheels without AVX?
@hoangmt So what do I put in my Pipfile, if I want a version that I can install both in docker on my M1 machine, and build on linux/x64 to send to for example Vertex AI as a training job?
Locally, I don’t care about GCP access, it just needs to work so I can develop models with a tiny version of my dataset. The M1-ish GPUs are not good for training models anyway. The “Neural Engine” only accelerates inference AFAIK.
Basically this
@coreation not sure about tensorflow-serving, I’m not using it.
Re: numpy I hit that too, updated the comment above.