dvc: [dvc pull version >= 3.24] ERROR: unexpected error - failed to load directory

Bug Report

pull: ERROR, doesn’t create the cache directory and crashes.

Description

Collecting |0.00 [00:00, ?entry/s]

Fetching

ERROR: unexpected error - failed to load directory ('60', '36c8668869290419aec048f26f8deb.dir'): [Errno 2] No such file or directory: '/mnt/data2/users/myuser/project/.dvc/cache/files/md5/60/36c8668869290419aec048f26f8deb.dir'

With DVC version 3.23.0 it works OK! DVC >= 3.24 fails

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.24.0 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-5.15.0-84-generic-x86_64-with-glibc2.35

Subprojects:
        dvc_data = 2.18.1
        dvc_objects = 1.0.1
        dvc_render = 0.6.0
        dvc_task = 0.3.0
        scmrepo = 1.4.0

Supports:
        http (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.6, aiohttp-retry = 2.8.3)

Config:
        Global: /mnt/data2/users/myuser/.config/dvc
        System: /etc/xdg/dvc

Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme2n1
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/nvme2n1
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/fb1190580eb02ee00297b6211011d5af

About this issue

  • State: closed
  • Created 8 months ago
  • Reactions: 6
  • Comments: 21 (15 by maintainers)

Most upvoted comments

Thanks for the feedback @tonycusackData! Let’s close this then and we can reopen if needed.

@PythonFZ @themaikelman Mind giving 3.38.0 a try?

I was having the same issue as above. Deleting everything in Repo.site_cache_dir didn’t work, but this finally fixed it: pip install dvc[gs]==3.38.0
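For anyone who wants to check whether an environment falls into the range reported in this thread, here is a minimal sketch. The bounds are an assumption taken only from the comments here (3.23.0 reported working, 3.24 and later reported failing, 3.38.0 reported as fixing it for at least one user); they are not an official statement of affected releases. The helper names `parse_version` and `in_reported_range` are made up for illustration.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.24.0' into (3, 24, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def in_reported_range(version: str) -> bool:
    """True if the version lies in the range users in this thread reported as failing.

    Assumption: lower bound 3.24 and upper bound 3.38 come from user reports
    in this issue, not from a changelog.
    """
    v = parse_version(version)
    return (3, 24) <= v < (3, 38)
```

The installed version string can be obtained with `importlib.metadata.version("dvc")` and passed to `in_reported_range`.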

Thank you!

@efiop Should we try to bump the path for site_cache_dir so it forces a new cache for existing repos?

I’m still unclear and would like to hear from users watching this issue:

  1. Does deleting site_cache_dir fix the problem for everyone?
  2. Does the problem come back after that?

For me, this is also happening in a GitHub CI runner. The site_cache_dir should only be created upon starting the runner. The CI was using url = https://dagshub.com/PythonFZ/IPS-Examples.dvc. Furthermore, I’ve seen this on a local machine with an S3 remote, but I can’t tell you exactly how I fixed it then. For now I’ve just pinned the DVC version to 3.23.

Hm, we’ve changed the index cache key a few times since then, so it shouldn’t pick it up in newer versions. It is possible we are handling this particular case badly: directories that failed to load should not be marked as loaded, so we should attempt to load them again next time. Need to check, maybe there is another bug there somewhere…
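To illustrate the behaviour described above (a failed load must not be recorded as loaded, so the next attempt retries), here is a small sketch. It is not DVC's actual index code; `DirectoryIndex` and its loader callback are hypothetical names used only to show the idea.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, ...]


class DirectoryIndex:
    """Hypothetical in-memory index; NOT DVC's real implementation."""

    def __init__(self, loader: Callable[[Key], list]) -> None:
        self._loader = loader
        self._entries: Dict[Key, list] = {}

    def load(self, key: Key) -> list:
        # Serve from the index only if a previous load succeeded.
        if key in self._entries:
            return self._entries[key]
        # If the loader raises (e.g. FileNotFoundError), nothing is recorded,
        # so the next call for the same key tries the loader again.
        entries = self._loader(key)
        self._entries[key] = entries
        return entries
```

The point of the design is that the index is written only after the loader returns, so an exception leaves no stale "loaded" marker behind.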

Need to look into why it is happening first, otherwise it will only obscure it till the next report.

@efiop I think it’s an index issue we fixed already (maybe in https://github.com/iterative/dvc/issues/9785?) but for people who have an index that predates that change, they will still run into this issue, so forcing the index to be regenerated should resolve it. WDYT?

I’ve encountered the same issue. I tried to pull data but forgot to set the credentials for the S3 remote in config.local. After setting the credentials, I still got the same error. The suggested fix could resolve it.

EDIT: I only saw this issue and commented on it. This is my dvc doctor output, which is not from the newest version:

DVC version: 3.27.0 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-6.2.0-36-generic-x86_64-with-glibc2.35
Subprojects:
        dvc_data = 2.18.1
        dvc_objects = 1.0.1
        dvc_render = 0.6.0
        dvc_task = 0.3.0
        scmrepo = 1.4.0
Supports:
        http (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.6, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2023.10.0, boto3 = 1.28.64)
Config:
        Global: /tikhome/fzills/.config/dvc
        System: /etc/xdg/dvc

Also, could you each try to drop everything in the Repo.site_cache_dir location and see if the problem persists?

Deleting the directory of Repo.site_cache_dir does resolve the problem.
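For reference, the workaround above could be scripted roughly like this. It is only a sketch: it assumes the `dvc doctor` output contains a `Repo.site_cache_dir:` line as shown in the report, and the function names (`find_site_cache_dir`, `clear_directory`, `clear_site_cache`) are hypothetical helpers, not part of DVC. It deletes files, so check the printed path before using it.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def find_site_cache_dir(doctor_output: str) -> Optional[Path]:
    """Return the path from the 'Repo.site_cache_dir:' line, or None if absent."""
    prefix = "Repo.site_cache_dir:"
    for line in doctor_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            value = stripped[len(prefix):].strip()
            return Path(value) if value else None
    return None


def clear_directory(path: Path) -> int:
    """Delete everything inside `path` (not `path` itself); return entries removed."""
    removed = 0
    if not path.is_dir():
        return removed
    for entry in path.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


def clear_site_cache() -> Optional[Path]:
    """Run `dvc doctor` in the current repo and empty the reported site cache dir."""
    result = subprocess.run(
        ["dvc", "doctor"], capture_output=True, text=True, check=True
    )
    site_cache = find_site_cache_dir(result.stdout)
    if site_cache is not None:
        clear_directory(site_cache)
    return site_cache
```

Calling `clear_site_cache()` from the repository root and then re-running `dvc pull` mirrors the manual steps discussed here. Note that at least one user above reported that clearing the directory alone was not enough.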

@atekoa Please run dvc pull -v and post full log.