apex: Install Failure on GCP Deep Learning VM
I created a simple GCP Deep Learning VM: https://cloud.google.com/deep-learning-vm/
I followed the install directions, and the install failed with errors:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
...
Command "/opt/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-j0qgf5ds/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code,
__file__, 'exec'))" --cpp_ext --cuda_ext install --record /tmp/pip-record-1yr2fag5/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-j0qgf5ds/
Exception information:
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
status = self.run(options, args)
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 366, in run
use_user_site=options.use_user_site,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
**kwargs
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 791, in install
spinner=spinner,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 705, in call_subprocess
% (command_desc, proc.returncode, cwd))
The Python-only option also failed:
pip install -v --no-cache-dir .
...
Command "/opt/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-eedemek6/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code,
__file__, 'exec'))" install --record /tmp/pip-record-ehl5a4y7/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-eedemek6/
Exception information:
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 143, in main
status = self.run(options, args)
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 366, in run
use_user_site=options.use_user_site,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
**kwargs
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/req/req_install.py", line 791, in install
spinner=spinner,
File "/opt/anaconda3/lib/python3.7/site-packages/pip/_internal/utils/misc.py", line 705, in call_subprocess
% (command_desc, proc.returncode, cwd))
It would seem like installation on a GCP Deep Learning VM would be one of the tested use cases here no?? If it doesn’t work there of all places, where is it intended to work?
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 21 (17 by maintainers)
@see-- this works! I was able to successfully install on a GCP VM with the following commands:
UPDATE 1: On running a mixed precision model with the above install I get the following warning:
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback.
Installing instead with the following line removed the warning:
Using
--user
worked for me with python3. I guess that is what you get from using 3 different python versions.I had to use Conda forge to get this working within my conda environment
conda install -c conda-forge nvidia-apex
My guess that it’s installing to a python 3.5 because it’s using’ OS’s pip3 version 3.5, rather than conda’s python 3.7, you can confirm by running
pip3 --version
andpython --version
.