DeepSpeed: Cannot install DeepSpeed on Ubuntu 20.04

I tried to install DeepSpeed from a clone of the repository using the following steps:

  1. Cloned the DeepSpeed repository
  2. Created a virtual environment: python3 -m venv env
  3. Activated the virtual environment: source env/bin/activate
  4. Ran the install script: ./install.sh

Toolchain versions: gcc 8.4.0, g++ 8.4.0, nvcc 10.2

$ pip3 list
Package       Version
------------- -------
apex          0.1    
cpufeature    0.1.1  
future        0.18.2 
numpy         1.19.2 
Pillow        7.2.0  
pip           20.0.2 
pkg-resources 0.0.0  
protobuf      3.13.0 
psutil        5.7.2  
setuptools    44.0.0 
six           1.15.0 
tensorboardX  1.8    
torch         1.6.0  
torchvision   0.7.0  
tqdm          4.49.0 
wheel         0.35.1 
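
For anyone reproducing this report, the environment details above can be gathered in one step. The following is a minimal sketch, not part of DeepSpeed: the helper name `collect_versions` is hypothetical, and it assumes each tool accepts a `--version` flag (gcc, g++, and nvcc do).

```python
import shutil
import subprocess


def collect_versions(tools):
    """Map each tool name to its `--version` output, or None if it is not on PATH."""
    report = {}
    for tool in tools:
        path = shutil.which(tool)
        if path is None:
            # Tool is not installed or not visible in this environment.
            report[tool] = None
            continue
        result = subprocess.run([path, "--version"], capture_output=True, text=True)
        # Some tools write their version banner to stderr instead of stdout.
        report[tool] = (result.stdout or result.stderr).strip()
    return report


if __name__ == "__main__":
    for name, version in collect_versions(["gcc", "g++", "nvcc"]).items():
        print(name, "->", version)
```

Running it inside the activated virtual environment, next to `pip3 list`, gives a report that can be pasted directly into an issue.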

The install script fails with the output below:

removing build/bdist.linux-x86_64/wheel
/home/user/code/DeepSpeed
Installing apex locally so that deepspeed will build
Found existing installation: apex 0.1
Uninstalling apex-0.1:
  Successfully uninstalled apex-0.1
Non-user install because user site-packages disabled
Created temporary directory: /tmp/pip-ephem-wheel-cache-8wavy60a
Created temporary directory: /tmp/pip-req-tracker-a5l1t8ki
Initialized build tracking at /tmp/pip-req-tracker-a5l1t8ki
Created build tracker: /tmp/pip-req-tracker-a5l1t8ki
Entered build tracker: /tmp/pip-req-tracker-a5l1t8ki
Created temporary directory: /tmp/pip-install-tzd3lyjo
Processing ./third_party/apex/dist/apex-0.1-cp38-cp38-linux_x86_64.whl
  Added apex==0.1 from file:///home/user/code/DeepSpeed/third_party/apex/dist/apex-0.1-cp38-cp38-linux_x86_64.whl to build tracker '/tmp/pip-req-tracker-a5l1t8ki'
  Removed apex==0.1 from file:///home/user/code/DeepSpeed/third_party/apex/dist/apex-0.1-cp38-cp38-linux_x86_64.whl from build tracker '/tmp/pip-req-tracker-a5l1t8ki'
Installing collected packages: apex
  Created temporary directory: /tmp/pip-unpacked-wheel-rdepxzch

Successfully installed apex-0.1
Cleaning up...
Removed build tracker: '/tmp/pip-req-tracker-a5l1t8ki'
Building deepspeed wheel
./install.sh: line 196: 41599 Floating point exception(core dumped) python setup.py -v bdist_wheel
Error on line 195
Fail to install deepspeed

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 19 (6 by maintainers)

Most upvoted comments

Thanks @drfinkus and @rople380 for the help with diagnosis. It seems that cpufeature does not play nicely with all systems. I’m working on a fix to remove it.
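
To check whether cpufeature is the culprit on a particular machine, one option is to import it in a child process, so that a crash does not take down the shell or the install script. This is a rough sketch rather than an official diagnostic: the helper names `run_isolated` and `probe_import` are hypothetical, and it assumes the crash is triggered by importing the package, which the thread does not state explicitly.

```python
import signal
import subprocess
import sys


def run_isolated(code):
    """Run a Python snippet in a child interpreter and describe how it exited."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    rc = result.returncode
    if rc == 0:
        return "ok"
    if rc < 0:
        # On POSIX, a negative return code means the child was killed by signal -rc.
        try:
            name = signal.Signals(-rc).name
        except ValueError:
            name = str(-rc)
        return "killed by " + name
    return "exited with status " + str(rc)


def probe_import(module_name):
    """Try to import a module in isolation (assumption: the crash happens at import)."""
    return run_isolated("import " + module_name)


if __name__ == "__main__":
    print("cpufeature:", probe_import("cpufeature"))
```

A result of `killed by SIGFPE` would match the "Floating point exception" line in the log above, while `exited with status 1` most likely means the package is simply not installed in the active environment.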