tensorflow: Illegal instruction (core dumped) in a CPU with AVX support

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution: Linux Ubuntu 20.10
  • TensorFlow installed from (source or binary): pip, tensorflow-2.4.0-cp38-cp38-manylinux2010_x86_64.whl
  • TensorFlow version (use command below): tensorflow-2.4.0-cp38-cp38-manylinux2010_x86_64.whl
  • Python version: 3.8.6
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior

I was using TensorFlow 2 without any issues until a few days ago. I installed it with pip in a new virtualenv, and now importing it fails with an "Illegal instruction" error. My CPU is an i5-3230M, which according to /proc/cpuinfo supports AVX, so this does not seem to be related to #19584.

My CPU (according to /proc/cpuinfo):

model name : Intel® Core™ i5-3230M CPU @ 2.60GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
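
Reading a long flags line by eye is error-prone, in particular for telling `avx` apart from `avx2`. The following is a minimal sketch of how the check could be scripted. The helper name `cpu_flags` is made up for this example, and it assumes the Linux /proc/cpuinfo layout shown above (a `flags` key, a colon, then space-separated names).

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Flags are whole words, so "avx" and "avx2" are distinct set members.
            return set(value.split())
    return set()


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only; assumption of this sketch
    if cpuinfo.exists():
        flags = cpu_flags(cpuinfo.read_text())
        for name in ("avx", "avx2"):
            print(f"{name}: {'yes' if name in flags else 'no'}")
```

For the flags listed above, this would report `avx` as present and `avx2` as absent, since `avx2` does not appear in the list.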

System:

$ uname -a
Linux e431 5.8.0-33-generic #36-Ubuntu SMP Wed Dec 9 09:14:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Describe the expected behavior

The import should work and not give the Illegal instruction (core dumped) error.

Standalone code to reproduce the issue

$ python3 --version
Python 3.8.6
$ python3 -m venv test
$ . test/bin/activate
$ pip install --upgrade pip
Collecting pip
  Using cached pip-20.3.3-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 20.1.1
    Uninstalling pip-20.1.1:
      Successfully uninstalled pip-20.1.1
Successfully installed pip-20.3.3
$ pip install tensorflow
Collecting tensorflow
  Using cached tensorflow-2.4.0-cp38-cp38-manylinux2010_x86_64.whl (394.8 MB)
Collecting gast==0.3.3
  Using cached gast-0.3.3-py2.py3-none-any.whl (9.7 kB)
Collecting absl-py~=0.10
  Using cached absl_py-0.11.0-py3-none-any.whl (127 kB)
Collecting astunparse~=1.6.3
  Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting flatbuffers~=1.12.0
  Using cached flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
Collecting google-pasta~=0.2
  Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Collecting grpcio~=1.32.0
  Using cached grpcio-1.32.0-cp38-cp38-manylinux2014_x86_64.whl (3.8 MB)
Collecting h5py~=2.10.0
  Using cached h5py-2.10.0-cp38-cp38-manylinux1_x86_64.whl (2.9 MB)
Collecting keras-preprocessing~=1.1.2
  Using cached Keras_Preprocessing-1.1.2-py2.py3-none-any.whl (42 kB)
Collecting numpy~=1.19.2
  Using cached numpy-1.19.4-cp38-cp38-manylinux2010_x86_64.whl (14.5 MB)
Collecting opt-einsum~=3.3.0
  Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
Collecting protobuf>=3.9.2
  Using cached protobuf-3.14.0-cp38-cp38-manylinux1_x86_64.whl (1.0 MB)
Collecting six~=1.15.0
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting tensorboard~=2.4
  Using cached tensorboard-2.4.0-py3-none-any.whl (10.6 MB)
Requirement already satisfied: setuptools>=41.0.0 in ./test/lib/python3.8/site-packages (from tensorboard~=2.4->tensorflow) (44.0.0)
Collecting google-auth<2,>=1.6.3
  Using cached google_auth-1.24.0-py2.py3-none-any.whl (114 kB)
Collecting cachetools<5.0,>=2.0.0
  Using cached cachetools-4.2.0-py3-none-any.whl (12 kB)
Collecting google-auth-oauthlib<0.5,>=0.4.1
  Using cached google_auth_oauthlib-0.4.2-py2.py3-none-any.whl (18 kB)
Collecting markdown>=2.6.8
  Using cached Markdown-3.3.3-py3-none-any.whl (96 kB)
Collecting pyasn1-modules>=0.2.1
  Using cached pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
Collecting pyasn1<0.5.0,>=0.4.6
  Using cached pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)
Collecting requests<3,>=2.21.0
  Using cached requests-2.25.0-py2.py3-none-any.whl (61 kB)
Collecting certifi>=2017.4.17
  Using cached certifi-2020.12.5-py2.py3-none-any.whl (147 kB)
Collecting chardet<4,>=3.0.2
  Using cached chardet-3.0.4-py2.py3-none-any.whl (133 kB)
Collecting idna<3,>=2.5
  Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting requests-oauthlib>=0.7.0
  Using cached requests_oauthlib-1.3.0-py2.py3-none-any.whl (23 kB)
Collecting oauthlib>=3.0.0
  Using cached oauthlib-3.1.0-py2.py3-none-any.whl (147 kB)
Collecting rsa<5,>=3.1.4
  Using cached rsa-4.6-py3-none-any.whl (47 kB)
Collecting tensorboard-plugin-wit>=1.6.0
  Using cached tensorboard_plugin_wit-1.7.0-py3-none-any.whl (779 kB)
Collecting tensorflow-estimator<2.5.0,>=2.4.0rc0
  Using cached tensorflow_estimator-2.4.0-py2.py3-none-any.whl (462 kB)
Collecting termcolor~=1.1.0
  Using cached termcolor-1.1.0.tar.gz (3.9 kB)
Collecting typing-extensions~=3.7.4
  Using cached typing_extensions-3.7.4.3-py3-none-any.whl (22 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.2-py2.py3-none-any.whl (136 kB)
Collecting werkzeug>=0.11.15
  Using cached Werkzeug-1.0.1-py2.py3-none-any.whl (298 kB)
Collecting wheel~=0.35
  Using cached wheel-0.36.2-py2.py3-none-any.whl (35 kB)
Collecting wrapt~=1.12.1
  Using cached wrapt-1.12.1.tar.gz (27 kB)
Using legacy 'setup.py install' for termcolor, since package 'wheel' is not installed.
Using legacy 'setup.py install' for wrapt, since package 'wheel' is not installed.
Installing collected packages: urllib3, pyasn1, idna, chardet, certifi, six, rsa, requests, pyasn1-modules, oauthlib, cachetools, requests-oauthlib, google-auth, wheel, werkzeug, tensorboard-plugin-wit, protobuf, numpy, markdown, grpcio, google-auth-oauthlib, absl-py, wrapt, typing-extensions, termcolor, tensorflow-estimator, tensorboard, opt-einsum, keras-preprocessing, h5py, google-pasta, gast, flatbuffers, astunparse, tensorflow
    Running setup.py install for wrapt ... done
    Running setup.py install for termcolor ... done
Successfully installed absl-py-0.11.0 astunparse-1.6.3 cachetools-4.2.0 certifi-2020.12.5 chardet-3.0.4 flatbuffers-1.12 gast-0.3.3 google-auth-1.24.0 google-auth-oauthlib-0.4.2 google-pasta-0.2.0 grpcio-1.32.0 h5py-2.10.0 idna-2.10 keras-preprocessing-1.1.2 markdown-3.3.3 numpy-1.19.4 oauthlib-3.1.0 opt-einsum-3.3.0 protobuf-3.14.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 requests-2.25.0 requests-oauthlib-1.3.0 rsa-4.6 six-1.15.0 tensorboard-2.4.0 tensorboard-plugin-wit-1.7.0 tensorflow-2.4.0 tensorflow-estimator-2.4.0 termcolor-1.1.0 typing-extensions-3.7.4.3 urllib3-1.26.2 werkzeug-1.0.1 wheel-0.36.2 wrapt-1.12.1
$ python --version
Python 3.8.6
$ python
Python 3.8.6 (default, Sep 25 2020, 09:36:53) 
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
Illegal instruction (core dumped)
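
Because the crash kills the interpreter, it cannot be caught with `try`/`except`. One way to observe it from a script is to run the import in a child process and inspect the exit status. This is only a sketch: the function names `describe_exit` and `try_import` are invented here, and the negative return code convention applies to POSIX systems.

```python
import signal
import subprocess
import sys


def describe_exit(code, python=sys.executable):
    """Run `code` in a child interpreter and describe how the child exited."""
    proc = subprocess.run(
        [python, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from that signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with status {proc.returncode}"


def try_import(module):
    """Attempt `import module` in a child process so a crash cannot kill the caller."""
    return describe_exit(f"import {module}")


if __name__ == "__main__":
    print("tensorflow:", try_import("tensorflow"))
```

An "Illegal instruction" crash like the one above would be expected to show up as `killed by SIGILL`, while a missing package would show up as a non-zero exit status instead.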

Other info / logs

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 8
  • Comments: 41 (4 by maintainers)

Most upvoted comments

We found the culprit and are working on a fix; after that we will do a patch release on r2.4. The timeline is not certain, but it will probably be at least a few weeks, as we are gathering other regressions in the 2.4 release.

Thank you for taking the time over the holidays to look into this.

Unsurprisingly, I’d like to request keeping support for older CPUs, but it appears that I may be 1 out of only 10 or so people on the internet experiencing issues. However, I’d be curious what the TensorFlow project’s stance is on the CPU backend (assuming that might be what is driving AVX2). As a user, I’ve seen it primarily as the last resort or “the thing that loads when I need TF as a dependency but not for training”. As such, I appreciate small advances in speed, but I have always assumed it would prioritize maximum compatibility first due to its unique position. To that end, it’s jarring to hear that my CPU rather than my GPU would potentially be the limiting factor; it is similarly odd that I would not be able to train small multi-layer dense neural nets because my CPU is too old.

Thank you for your time and consideration in this matter; I’ll be watching for the eventual outcome.

And it has been released. Please test it to confirm that the issue has been fixed.

We should have the patched 2.4.1 release done by the end of the week.

Personally I also vote for avoiding requiring AVX2 – models requiring a lot of computation should use an accelerator anyway, and I value the broad application of the PyPI tensorflow CPU backend more.

It seems we accidentally built the official wheels using AVX2. We are currently in the holiday period, so not many people are around, but we will revisit this in January and issue a patch that either builds using only AVX or updates the release notes to state that AVX2 is required from 2.4 onwards.

I don’t yet know which path will be taken, nor do I know why we ended up using AVX2.
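
For anyone wanting to check whether an environment matches the situation described in this thread without triggering the crash, the installed version can be read from package metadata instead of importing the package. The sketch below is illustrative only: `installed_version` and `likely_affected` are hypothetical helpers, and the rule they encode (version exactly 2.4.0 and no `avx2` CPU flag) is just a restatement of what is reported here, not an official compatibility check.

```python
from importlib import metadata


def installed_version(dist="tensorflow"):
    """Return the installed version of `dist` without importing it, or None."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def likely_affected(tf_version, cpu_flags):
    """Heuristic from this thread: the 2.4.0 wheels were built with AVX2."""
    return tf_version == "2.4.0" and "avx2" not in cpu_flags


if __name__ == "__main__":
    # Example flags only; in practice they would come from /proc/cpuinfo.
    example_flags = {"sse4_2", "avx"}
    version = installed_version()
    print("installed:", version)
    print("likely affected:", likely_affected(version, example_flags))
```

Reading the metadata avoids loading the native library at all, so it works even in an environment where `import tensorflow` crashes.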

@moorugi98 Sure, here’s my setup: https://gitlab.flux.utah.edu/alex_orange/tensorflow-2.4.0-build . That includes the Dockerfile I use for my build environment and the command history for building and testing. Not exactly perfect but you should be able to figure out how to make your own build (and the maintainers can hopefully use this to get an idea of what’s different from the official build).

Just to pile on: same problem here. 2.4.0 doesn’t work; 2.3.0 and earlier do. The CPU supports AVX:

vendor_id	: GenuineIntel
cpu family	: 6
model		: 45
model name	: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
stepping	: 7
microcode	: 0x718
cpu MHz		: 1197.138
cache size	: 20480 KB
physical id	: 1
siblings	: 16
core id		: 7
cpu cores	: 8
apicid		: 47
initial apicid	: 47
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips	: 4395.39
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management: