dvc pull: fails with `PermissionError: [WinError 32]`

Bug Report

(Rehash of comments from https://github.com/iterative/dvc/issues/6403#issuecomment-1642817641 and following).

Comment https://github.com/iterative/dvc/issues/9749#issuecomment-1646796239 contains instructions to reproduce the WinError 32 problems from scratch.

Description

On Windows, running dvc pull on the console inside a DVC project fails intermittently while the same project is open in VSCode:

(base) PS C:\Users\achimgaedke\Code\xxx> dvc pull
ERROR: unexpected error - [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\achimgaedke\\Code\\xxx\\train-out\\model_19.pth'

I would expect DVC to pull in my models from S3.

A successful pull looks like this:

(base) PS C:\Users\achimgaedke\Code\xxx> $Env:GIT_OPTIONAL_LOCKS=0
(base) PS C:\Users\achimgaedke\Code\xxx> dvc pull
M       train-out\
1 file modified
(base) PS C:\Users\achimgaedke\Code\xxx>

Reproduce

For me, it reproduces like this (intermittently though):

  • Start working on my project in VSCode
  • Delete the model files in train-out
  • Open the terminal
  • Run dvc pull

Each of the 20 or so files pulled is about 300 MB. Later (once I have more time) I’ll try to create a shareable project. (Yeah, it is hard to report bugs from ongoing work.)

Debug information

dvc_pull_-v.log

Note the line:

2023-07-20 14:39:20,177 DEBUG: failed to create 'C:\Users\achimgaedke\Code\xxx_c14_p113\train-out\model_4.pth' from 'C:\Users\achimgaedke\Code\xxx_c14_p113\.dvc\cache2\files\md5\c9\e8b9140bc7187529a5fcb33fe29702' - [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\achimgaedke\\Code\\xxx_c14_p113\\train-out\\.PLs2vrX8fXNan9LjzfiH3V.tmp' -> 'C:\\Users\\achimgaedke\\Code\\xxx_c14_p113\\train-out\\model_4.pth'

It looks like creating the file in the target directory train-out from the cache directory fails, specifically when the temporary file is renamed to its final name (i.e. the download into the cache itself works).
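To illustrate the failing step: the log line above shows a rename of a temporary file to the final file name being rejected because another process has the file open. A common way to tolerate such short-lived locks is to retry the rename a few times. The following is only a minimal sketch of that idea, not DVC's actual code; the function name replace_with_retry and its parameters are hypothetical.

```python
import os
import time


def replace_with_retry(src, dst, attempts=5, delay=0.2, _replace=os.replace):
    """Rename src to dst, retrying while another process holds the file.

    Hypothetical helper for illustration only; this is not how DVC
    implements its checkout. `_replace` can be swapped out for testing.
    """
    for attempt in range(1, attempts + 1):
        try:
            _replace(src, dst)
            return attempt  # number of tries that were needed
        except PermissionError:
            # On Windows, WinError 32 is raised as PermissionError.
            if attempt == attempts:
                raise  # give up and surface the original error
            time.sleep(delay * attempt)  # wait a little longer each time
```

Whether a retry is enough depends on how long the other process keeps the file open; it only helps with brief locks such as a scanner or a status check touching a freshly created file.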

  • I’ve disabled the virus scan and indexer for the Code directory (following the instructions at https://dvc.org/doc/user-guide/how-to/run-dvc-on-windows)
  • I am running VSCode with the git and dvc extensions - git status might interfere.
  • I’ve installed dvc, git and VSCode with scoop.
  • I have enabled long pathname and symlink support in windows (following the advice of scoop checkup).

Which process would be so nosy as to pick up a new file immediately? Maybe git status, if this dir/file is not listed in gitignore?

In my experience, this error does not appear when $Env:GIT_OPTIONAL_LOCKS=0 is set (see https://git-scm.com/docs/git#Documentation/git.txt-codeGITOPTIONALLOCKScode).
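For scripted use, the same workaround can be applied without changing the whole shell session. This is a minimal sketch, assuming dvc is on the PATH; the helper names env_without_optional_locks and pull_with_optional_locks_disabled are made up for this example. The variable is set only for the dvc process and its child processes.

```python
import os
import subprocess


def env_without_optional_locks(base_env=None):
    """Return a copy of the environment with GIT_OPTIONAL_LOCKS=0 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_OPTIONAL_LOCKS"] = "0"
    return env


def pull_with_optional_locks_disabled(repo_dir="."):
    """Run `dvc pull` in repo_dir with the workaround applied.

    Returns the CompletedProcess so the caller can inspect returncode.
    """
    return subprocess.run(
        ["dvc", "pull"],
        cwd=repo_dir,
        env=env_without_optional_locks(),
        check=False,  # do not raise; let the caller decide what to do
    )
```

Calling pull_with_optional_locks_disabled(r"C:\Users\achimgaedke\Code\xxx") would correspond to the successful console session shown in the description.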

Environment information

Output of dvc doctor:

dvc doctor
DVC version: 3.5.1
------------------
Platform: Python 3.10.11 on Windows-10-10.0.22621-SP0
Subprojects:

Supports:
        azure (adlfs = 2023.4.0, knack = 0.10.1, azure-identity = 1.13.0),
        gdrive (pydrive2 = 1.16.1),
        gs (gcsfs = 2023.6.0),
        hdfs (fsspec = 2023.6.0, pyarrow = 12.0.1),
        http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
        oss (ossfs = 2021.8.0),
        s3 (s3fs = 2023.6.0, boto3 = 1.26.161),
        ssh (sshfs = 2023.4.1),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8),
        webhdfs (fsspec = 2023.6.0)
Config:
        Global: C:\Users\achimgaedke\AppData\Local\iterative\dvc
        System: C:\ProgramData\iterative\dvc
Cache types: hardlink, symlink
Cache directory: NTFS on C:\
Caches: local
Remotes: s3
Workspace directory: NTFS on C:\
Repo: dvc, git
Repo.site_cache_dir: C:\ProgramData\iterative\dvc\Cache\repo\a5786f5395aa8da08287147b62b67b3b

scoop list
Installed apps:

Name       Version      Source Updated             Info
----       -------      ------ -------             ----
7zip       23.01        main   2023-06-28 15:33:15     
aws        2.13.2       main   2023-07-20 08:59:51     
aws-vault  7.2.0        main   2023-06-23 09:17:31     
dark       3.11.2       main   2023-06-12 08:34:04
dbeaver    23.1.2       extras 2023-07-10 08:48:20
dvc        3.5.1        main   2023-07-12 07:30:56
firefox    115.0.2      extras 2023-07-12 07:30:03
gh         2.32.0       main   2023-07-12 13:12:19
git        2.41.0.3     main   2023-07-18 20:50:20
innounp    0.50         main   2023-07-07 09:19:53
keepass    2.54         extras 2023-06-20 10:34:13
mambaforge 23.1.0-4     extras 2023-07-17 08:10:19
sourcetree 3.4.14       extras 2023-06-29 08:06:09
sudo       0.2020.01.26 main   2023-07-11 07:30:39
terraform  1.5.3        main   2023-07-13 15:37:19
vscode     1.80.1       extras 2023-07-17 08:10:59

About this issue

  • State: open
  • Created a year ago
  • Comments: 22 (10 by maintainers)

Most upvoted comments

Just wanted to add I’m experiencing the same issue on Windows 10 with the VSCode Extension, and running outside of VSCode (with it closed) did work. Thanks!

> dvc config --list
core.analytics=false
remote.aws-storage.url=s3://xxxx/dvc-storage/
core.remote=aws-storage
core.autostage=true
cache.type=copy