dvc pull: unexpected error - [Errno 22] Bad Request
Bug Report
dvc pull: unexpected error
Description
I have several dvc resources imported into a project… The tracking files (.dvc) are committed to the repository that uses these resources. When attempting to pull the associated tracked resources with dvc pull, I am getting an error:
An error occurred (400) when calling the HeadObject operation: Bad Request (relevant log from dvc pull -v below).
Reproduce
Example:
- dvc import a resource into the project
- later, or from a fresh checkout of the above git repo, attempt dvc pull
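A minimal transcript of the steps above; the repo URLs and resource path are placeholders, not taken from the original report:

```shell
# Import a DVC-tracked resource from another git repo (placeholder URL/path).
dvc import https://github.com/example-org/artifacts-repo data/model.pkl

# Commit the generated .dvc tracking file.
git add model.pkl.dvc .gitignore
git commit -m "Import model artifact via dvc import"

# Later, from a fresh checkout of this project, attempt to pull:
git clone https://github.com/example-org/project-repo
cd project-repo
dvc pull   # fails with: [Errno 22] Bad Request (HeadObject, 400)
```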
Expected
Expected the tracked resource to be retrieved by DVC. I am able to perform a dvc update <resource>, and that will pull the dvc resource into the folder structure; of course, this causes the .dvc file to show as modified (even though it points to the same location/revision).
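The workaround described above, as a short transcript (the .dvc filename is a placeholder, not from the original report):

```shell
dvc update model.pkl.dvc   # placeholder name; this does retrieve the data
git status                 # model.pkl.dvc now shows as modified,
                           # even though it points to the same location/revision
```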
How should I proceed here? Is something corrupted?
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.8.3 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
webhdfs (fsspec = 2021.11.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2021.11.0, boto3 = 1.17.106)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Caches: local
Remotes: None
Workspace directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Repo: dvc, git
$ dvc pull -vvv
CUT......
2021-11-23 09:03:02,390 TRACE: Assuming '/projects/shared_dvc_cache/d6/9a712149e586998fb73c9566bd7e9f' is unchanged since it is read-only
2021-11-23 09:03:02,394 DEBUG: Preparing to transfer data from 's3://dvc-inspection-ai/bit_tool_artifacts/RepoA' to '../../../../../../../shared_dvc_cache'
2021-11-23 09:03:02,394 DEBUG: Preparing to collect status from '../../../../../../../shared_dvc_cache'
2021-11-23 09:03:02,394 DEBUG: Collecting status from '../../../../../../../shared_dvc_cache'
2021-11-23 09:03:02,395 DEBUG: Preparing to collect status from 's3://dvc-inspection-ai/bit_tool_artifacts/RepoA'
2021-11-23 09:03:02,396 DEBUG: Collecting status from 's3://dvc-inspection-ai/bit_tool_artifacts/RepoA'
2021-11-23 09:03:02,396 DEBUG: Querying 1 hashes via object_exists
2021-11-23 09:03:02,510 ERROR: unexpected error - [Errno 22] Bad Request: An error occurred (400) when calling the HeadObject operation: Bad Request
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 250, in _call_s3
out = await method(**additional_kwargs)
File "/usr/local/lib/python3.8/dist-packages/aiobotocore/client.py", line 155, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (400) when calling the HeadObject operation: Bad Request
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/dvc/main.py", line 55, in main
ret = cmd.do_run()
File "/usr/local/lib/python3.8/dist-packages/dvc/command/base.py", line 45, in do_run
return self.run()
File "/usr/local/lib/python3.8/dist-packages/dvc/command/data_sync.py", line 30, in run
stats = self.repo.pull(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 50, in wrapper
return f(repo, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/pull.py", line 29, in pull
processed_files_count = self.fetch(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/__init__.py", line 50, in wrapper
return f(repo, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/fetch.py", line 67, in fetch
d, f = _fetch(
File "/usr/local/lib/python3.8/dist-packages/dvc/repo/fetch.py", line 87, in _fetch
downloaded += repo.cloud.pull(obj_ids, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/data_cloud.py", line 114, in pull
return transfer(
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/transfer.py", line 153, in transfer
status = compare_status(src, dest, obj_ids, check_deleted=False, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/status.py", line 166, in compare_status
src_exists, src_missing = status(
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/status.py", line 132, in status
odb.hashes_exist(hashes, name=str(odb.path_info), **kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/db/base.py", line 468, in hashes_exist
remote_hashes = self.list_hashes_exists(hashes, jobs, name)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/db/base.py", line 419, in list_hashes_exists
ret = list(itertools.compress(hashes, in_remote))
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
yield fs.pop().result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/dist-packages/dvc/objects/db/base.py", line 410, in exists_with_progress
ret = self.fs.exists(path_info)
File "/usr/local/lib/python3.8/dist-packages/dvc/fs/fsspec_wrapper.py", line 136, in exists
return self.fs.exists(self._with_bucket(path_info))
File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 91, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 71, in sync
raise return_result
File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 822, in _exists
await self._info(path, bucket, key, version_id=version_id)
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 1016, in _info
out = await self._call_s3(
File "/usr/local/lib/python3.8/dist-packages/s3fs/core.py", line 270, in _call_s3
raise err
OSError: [Errno 22] Bad Request
------------------------------------------------------------
2021-11-23 09:03:02,825 DEBUG: Version info for developers:
DVC version: 2.8.3 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
webhdfs (fsspec = 2021.11.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2021.11.0, boto3 = 1.17.106)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Caches: local
Remotes: None
Workspace directory: nfs on LEB1MLNAS.company.com:/leb1mlnas_projects
Repo: dvc, git
About this issue
- State: closed
- Created 3 years ago
- Comments: 17 (8 by maintainers)
Yes, that is correct… In this particular case, all of the individually imported dvc artifacts originate in the same git artifacts repository.
Yes, that works just fine… I just performed it.
Sorry, I had meant to say that
dvc pull --jobs 1
produced the same error that I originally posted.
I am in a Linux VM with only 5 CPUs; the MinIO endpoint is a reasonable size, and I am the only one currently interacting with it. (I doubt it's a capacity issue.)
I get the same error on a 12-core Windows client laptop… a colleague of mine also received the same error attempting to access the same repo.
The 'None', I am guessing, might be because the git project doesn't have its own dvc remote configured; rather, it only contains dvc-imported resources… Is that a clue?
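For what it's worth, with a MinIO (S3-compatible) endpoint, a 400 Bad Request on HeadObject is often a client/endpoint configuration mismatch rather than data corruption. A sketch of what the remote definition in the source repo's .dvc/config would need for a custom endpoint; the section name and hostname below are placeholders (only the bucket path appears in the log above):

```ini
['remote "minio"']
    url = s3://dvc-inspection-ai/bit_tool_artifacts
    endpointurl = https://minio.example.com:9000
```

Since the imports resolve their remote from the source repo, this setting would live there, not in the importing project.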