dvc: dvc import broken authentication

We can no longer import remote DVC resources from a repository that requires authentication. This worked at some point prior to 2.10.

The output of the import is as follows:

2022-06-14 12:01:24,323 ERROR: failed to import 'model.hdf5' from 'https://github.build.company.com/org/Artifacts.git'. - Failed to clone repo 'https://github.build.company.com/org/Artifacts.git' to '/var/tmp/pbs.1540626.scinfra2/tmpc9sse3s9dvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/scmrepo/git/backend/dulwich/__init__.py", line 196, in clone
    repo = clone_from()
  File "/usr/local/lib/python3.8/dist-packages/dulwich/porcelain.py", line 443, in clone
    return client.clone(
  File "/usr/local/lib/python3.8/dist-packages/dulwich/client.py", line 747, in clone
    result = self.fetch(path, target, progress=progress, depth=depth)
  File "/usr/local/lib/python3.8/dist-packages/dulwich/client.py", line 824, in fetch
    result = self.fetch_pack(
  File "/usr/local/lib/python3.8/dist-packages/dulwich/client.py", line 2079, in fetch_pack
    refs, server_capabilities, url = self._discover_references(
  File "/usr/local/lib/python3.8/dist-packages/dulwich/client.py", line 1938, in _discover_references
    resp, read = self._http_request(url, headers)
  File "/usr/local/lib/python3.8/dist-packages/dulwich/client.py", line 2219, in _http_request
    raise HTTPUnauthorized(resp.getheader("WWW-Authenticate"), url)
dulwich.client.HTTPUnauthorized: No valid credentials provided

The above exception was the direct cause of the following exception:

A previous commit that appears to fix a similar issue (#7670) was part of the 2.11 release; however, updating to 2.11 did not fix the problem.

DVC version: 2.11.0 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.66.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
	webhdfs (fsspec = 2022.5.0),
	http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
	https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
	s3 (s3fs = 2022.5.0, boto3 = 1.21.21)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.hpc.company.com:/leb1mlnas_projects
Caches: local
Remotes: None
Workspace directory: nfs on LEB1MLNAS.hpc.company.com:/leb1mlnas_projects
Repo: dvc, git

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 18 (14 by maintainers)

Most upvoted comments

Solved in 2.25.0

@dtrifiro FWIW I tried installing dulwich from your feature branch using

pip install git+https://github.com/dtrifiro/dulwich.git@feature/credential-helper

and I’m unfortunately still running into the same issue:

❯ dvc pull data/dataset.dvc
Everything is up to date.
ERROR: failed to pull data from the cloud - Failed to clone repo 'git@bitbucket.org:organisation/datasets.git' to '/var/folders/b1/d587c0_d6hb6ndr4mv24h_gc0000gq/T/tmpu75hs20xdvc-clone'

Downgrading to 2.9 fixes the issue. I’m on macOS.

Thanks @dtrifiro, that did the trick!! 😃

Hey @d-miketa, thanks for the feedback. If you test out the same branch, it should be working now

I don’t have a configuration that I can run from the same system, but here is the same issue reproduced on a Windows system. The traceback is less verbose, but the line numbers seem to match the traceback from the earlier Linux system. Does this have the info you need?

$ dvc doctor
DVC version: 2.11.0 (exe)
---------------------------------
Platform: Python 3.8.10 on Windows-10-10.0.19042-SP0
Supports:
        azure (adlfs = 2021.9.1, knack = 0.8.2, azure-identity = 1.10.0),
        gdrive (pydrive2 = 1.10.0),
        gs (gcsfs = 2021.10.1),
        hdfs (fsspec = 2021.10.1, pyarrow = 6.0.0),
        webhdfs (fsspec = 2021.10.1),
        http (aiohttp = 3.8.0, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.0, aiohttp-retry = 2.4.6),
        s3 (s3fs = 2021.10.1, boto3 = 1.17.106),
        ssh (sshfs = 2021.11.2),
        oss (ossfs = 2021.8.0),
        webdav (webdav4 = 0.9.3),
        webdavs (webdav4 = 0.9.3)
Cache types: hardlink, symlink
Cache directory: NTFS on C:\
Caches: local
Remotes: None
Workspace directory: NTFS on C:\
Repo: dvc, git
$ dvc import --verbose https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5
2022-06-16 11:22:48,163 DEBUG: Removing output 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5' of stage: 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5.dvc'.
2022-06-16 11:22:48,163 DEBUG: Removing 'C:\cygwin64\home\200003581\onwing_s1b_keyframe\onwing_s1b_keyframe\ckpts\tmp\S1FV_PS_A_20220606-092610-0558-0.0003.hdf5'
Importing 'pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5 (https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git)' -> 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5'
2022-06-16 11:22:48,173 DEBUG: Computed stage: 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5.dvc' md5: 'fb7f834659a494bd33f117228c03a0d7'
2022-06-16 11:22:48,173 DEBUG: 'md5' of stage: 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5.dvc' changed.
2022-06-16 11:22:48,173 DEBUG: Creating external repo https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git@None
2022-06-16 11:22:48,173 DEBUG: erepo: git clone 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to a temporary dir
2022-06-16 11:22:48,674 ERROR: failed to import 'pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5' from 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git'. - Failed to clone repo 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp8zsh7ht1dvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "scmrepo\git\backend\dulwich\__init__.py", line 196, in clone
  File "dulwich\porcelain.py", line 443, in clone
  File "dulwich\client.py", line 747, in clone
  File "dulwich\client.py", line 824, in fetch
  File "dulwich\client.py", line 2079, in fetch_pack
  File "dulwich\client.py", line 1938, in _discover_references
  File "dulwich\client.py", line 2219, in _http_request
dulwich.client.HTTPUnauthorized: No valid credentials provided

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "dvc\scm.py", line 145, in clone
  File "scmrepo\git\__init__.py", line 143, in clone
  File "scmrepo\git\backend\dulwich\__init__.py", line 199, in clone
scmrepo.exceptions.CloneError: Failed to clone repo 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp8zsh7ht1dvc-clone'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dvc\commands\imp.py", line 15, in run
  File "dvc\repo\imp.py", line 6, in imp
  File "dvc\repo\__init__.py", line 49, in wrapper
  File "dvc\repo\scm_context.py", line 152, in run
  File "dvc\repo\imp_url.py", line 83, in imp_url
  File "funcy\decorators.py", line 45, in wrapper
  File "dvc\stage\decorators.py", line 36, in rwlocked
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\stage\__init__.py", line 535, in run
  File "funcy\decorators.py", line 45, in wrapper
  File "dvc\stage\decorators.py", line 36, in rwlocked
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\stage\__init__.py", line 559, in _sync_import
  File "dvc\stage\imports.py", line 47, in sync_import
  File "dvc\dependency\repo.py", line 68, in download
  File "dvc\dependency\repo.py", line 97, in get_used_objs
  File "dvc\dependency\repo.py", line 111, in _get_used_and_obj
  File "contextlib.py", line 113, in __enter__
  File "dvc\external_repo.py", line 39, in external_repo
  File "dvc\external_repo.py", line 169, in _cached_clone
  File "funcy\decorators.py", line 45, in wrapper
  File "funcy\flow.py", line 274, in wrap_with
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\external_repo.py", line 239, in _clone_default_branch
  File "dvc\scm.py", line 150, in clone
dvc.scm.CloneError: Failed to clone repo 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp8zsh7ht1dvc-clone'
------------------------------------------------------------
2022-06-16 11:22:48,689 DEBUG: Analytics is enabled.
2022-06-16 11:22:48,689 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\cygwin64\\tmp\\tmp13dedzle']'
2022-06-16 11:22:48,705 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\cygwin64\\tmp\\tmp13dedzle']'

I reverted to version 2.9.5 and can now perform the dvc import of a remote resource as expected.

I would also point out that dvc update was not working either; it failed with the same authentication error.
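
For anyone trying to work out whether their install falls in the range discussed in this thread, here is a minimal sketch of a version check. The bounds are assumptions inferred only from the reports above (2.9.5 worked, 2.11.0 failed, 2.25.0 is reported as fixed); they are not taken from the DVC changelog, and the helper names `parse_version` and `is_affected` are hypothetical.

```python
import re

# Assumed bounds, inferred from the reports in this thread only:
# 2.9.5 worked, 2.11.0 failed, and 2.25.0 is reported as fixed.
FIRST_BROKEN = (2, 10, 0)  # assumption: the reporter says it worked "prior to 2.10"
FIRST_FIXED = (2, 25, 0)


def parse_version(text):
    """Extract (major, minor, patch) from text such as 'DVC version: 2.11.0 (pip)'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(text):
    """Return True if the version falls in the assumed broken range."""
    return FIRST_BROKEN <= parse_version(text) < FIRST_FIXED


if __name__ == "__main__":
    # Feed it the first line of `dvc doctor`, or a bare version string.
    for line in ("DVC version: 2.11.0 (pip)", "2.9.5", "2.25.0"):
        print(line, "->", "affected" if is_affected(line) else "not affected")
```

If the check reports an affected version, the two options mentioned in this thread are downgrading to 2.9.5 or upgrading to 2.25.0 or later.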