xmanager: `xmanager launch` cannot resolve `'docker'` in subprocess
Hi All,
I am trying to run the example with `xmanager launch ./xmanager/examples/cifar10_tensorflow/launcher.py`.
However, I get the following error. Do you have any suggestions as to where this error may be coming from and how I could fix it?
I1020 10:57:30.250377 4561079808 build_image.py:134] Local docker: {'Platform': {'Name': 'Docker Engine - Community'}, 'Components': [{'Name': 'Engine', 'Version': '20.10.8', 'Details': {'ApiVersion': '1.41', 'Arch': 'amd64', 'BuildTime': '2021-07-30T19:52:10.000000000+00:00', 'Experimental': 'false', 'GitCommit': '75249d8', 'GoVersion': 'go1.16.6', 'KernelVersion': '5.10.47-linuxkit', 'MinAPIVersion': '1.12', 'Os': 'linux'}}, {'Name': 'containerd', 'Version': '1.4.9', 'Details': {'GitCommit': 'e25210fe30a0a703442421b0f60afac609f950a3'}}, {'Name': 'runc', 'Version': '1.0.1', 'Details': {'GitCommit': 'v1.0.1-0-g4144b63'}}, {'Name': 'docker-init', 'Version': '0.19.0', 'Details': {'GitCommit': 'de40ad0'}}], 'Version': '20.10.8', 'ApiVersion': '1.41', 'MinAPIVersion': '1.12', 'GitCommit': '75249d8', 'GoVersion': 'go1.16.6', 'Os': 'linux', 'Arch': 'amd64', 'KernelVersion': '5.10.47-linuxkit', 'BuildTime': '2021-07-30T19:52:10.000000000+00:00'}
I1020 10:57:30.250654 4561079808 docker_lib.py:64] Building Docker image
Dockerfile:
FROM gcr.io/deeplearning-platform-release/tf2-gpu.2-6
RUN if ! id 1000; then useradd -m -u 1000 clouduser; fi
ENV LANG=C.UTF-8
RUN apt-get update && apt-get install -y git netcat
RUN python -m pip install --upgrade pip setuptools
COPY cifar10_tensorflow/requirements.txt /cifar10_tensorflow/requirements.txt
RUN python -m pip install -r cifar10_tensorflow/requirements.txt
COPY cifar10_tensorflow/ /cifar10_tensorflow
RUN chown -R 1000:root /cifar10_tensorflow && chmod -R 775 /cifar10_tensorflow
WORKDIR cifar10_tensorflow
COPY entrypoint.sh ./entrypoint.sh
RUN chown -R 1000:root ./entrypoint.sh && chmod -R 775 ./entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
Size of Docker input: 7.0 kB
Building Docker image, please wait...
Traceback (most recent call last):
File "/Users/chuchu/anaconda3/envs/jax/bin/xmanager", line 8, in <module>
sys.exit(entrypoint())
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/cli/cli.py", line 65, in entrypoint
app.run(main)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/cli/cli.py", line 41, in main
app.run(m.main, argv=argv)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/Users/chuchu/Documents/gt_local/try/xmanager/examples/cifar10_tensorflow/launcher.py", line 48, in main
args={},
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm/core.py", line 484, in package
return cls._async_packager.package(packageables)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm/async_packager.py", line 104, in package
executables = self._package_batch(packageables)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm_local/packaging/router.py", line 56, in package
for packageable in packageables
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm_local/packaging/router.py", line 56, in <listcomp>
for packageable in packageables
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm/pattern_matching.py", line 113, in apply
return case.handle(*values)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm_local/packaging/router.py", line 27, in _visit_caip_spec
packageable.executable_spec)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm_local/packaging/cloud.py", line 153, in package_cloud_executable
return _CLOUD_PACKAGING_ROUTER(packageable, executable_spec)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm/pattern_matching.py", line 113, in apply
return case.handle(*values)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/xm_local/packaging/cloud.py", line 129, in _package_python_container
packageable.env_vars, push_image_tag))
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/cloud/build_image.py", line 110, in build
image_name, project, bucket)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/cloud/build_image.py", line 154, in build_by_dockerfile
show_docker_command_progress=_SHOW_DOCKER_COMMAND_PROGRESS.value)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/cloud/docker_lib.py", line 70, in build_docker_image
dockerfile, show_docker_command_progress)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/site-packages/xmanager/cloud/docker_lib.py", line 113, in _build_image_with_docker_command
subprocess.run(command, check=True, env={'DOCKER_BUILDKIT': '1'})
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/subprocess.py", line 488, in run
with Popen(*popenargs, **kwargs) as process:
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/subprocess.py", line 800, in __init__
restore_signals, start_new_session)
File "/Users/chuchu/anaconda3/envs/jax/lib/python3.7/subprocess.py", line 1551, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'docker': 'docker'
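A note on the last frames of the traceback: `docker_lib.py` calls `subprocess.run(command, check=True, env={'DOCKER_BUILDKIT': '1'})`. Passing `env=` replaces the child's whole environment rather than extending it, so `PATH` is no longer set when `'docker'` is looked up. On POSIX, Python then falls back to a short default search path, which probably does not include the directory where Docker is installed on this machine (an assumption, since the install location is not shown in the log). Below is a minimal sketch of the usual workaround, merging the extra variable into a copy of the inherited environment. The helper names `merged_env` and `run_with_extra_env` are hypothetical and are not part of xmanager.

```python
import os
import subprocess
import sys


def merged_env(extra_env):
    """Return a copy of the current environment with extra_env layered on top."""
    env = dict(os.environ)  # keeps PATH, HOME, etc.
    env.update(extra_env)   # adds or overrides only the requested variables
    return env


def run_with_extra_env(command, extra_env):
    """Run command with extra variables added to the inherited environment.

    Passing env=extra_env on its own would drop PATH, which is what appears
    to make the 'docker' lookup fail in the traceback above.
    """
    return subprocess.run(
        command,
        check=True,
        env=merged_env(extra_env),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # spelled this way to stay compatible with Python 3.7
    )


# Usage: a stand-in child process that just echoes the variable back.
probe = [sys.executable, "-c", "import os; print(os.environ['DOCKER_BUILDKIT'])"]
result = run_with_extra_env(probe, {"DOCKER_BUILDKIT": "1"})
print(result.stdout.strip())
```

With this pattern, a real call would pass the `docker build` command list instead of the probe, and the child would still see the caller's `PATH`.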
About this issue
- State: closed
- Created 3 years ago
- Comments: 23 (2 by maintainers)
In IAM Admin, could you add `xmanager@lraexp.iam.gserviceaccount.com` as a `Storage Admin`? This was supposed to have been done in this function.

Just FYI, this service account (`xmanager@lraexp.iam.gserviceaccount.com`) is owned by you and is bound to this project, which you can view in Service Accounts. Granting this account additional permissions does not give access to anyone other than the owners/editors of your project.

Regarding the `docker` bug, perhaps we should include an environment variable that users can override to point to the full path of `docker`.

Pshiko, thank you for investigating the problem. We will apply the fixes you propose with some amendments. In `run_container_subprocess` we shouldn't have been overriding the environment at all. These variables should be set inside the container, not for the `docker run` process.

I wonder if appending `shell=True` to `subprocess.run` on line 113 makes a difference…
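
The idea above of letting users point to the full path of `docker` through an environment variable could look roughly like the sketch below. This is only an illustration: the variable name `XMANAGER_DOCKER_PATH` and the helper `resolve_docker_binary` are made up for this example and do not exist in xmanager as far as this thread shows. The fallback uses the standard `shutil.which`, which searches the current `PATH`.

```python
import os
import shutil


def resolve_docker_binary(env_var="XMANAGER_DOCKER_PATH"):
    """Return the docker executable to use.

    env_var is a hypothetical override variable holding the full path to
    docker. If it is unset or empty, fall back to searching PATH.
    """
    override = os.environ.get(env_var)
    if override:
        return override
    found = shutil.which("docker")
    if found is None:
        raise FileNotFoundError(
            "docker not found on PATH; set %s to its full path" % env_var)
    return found


# Usage: with the override set, the configured path is returned unchanged.
os.environ["XMANAGER_DOCKER_PATH"] = "/usr/local/bin/docker"
print(resolve_docker_binary())
```

The returned path would then replace the bare `'docker'` as the first element of the command list, so the lookup no longer depends on the `PATH` seen by the subprocess.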