dvc: Not able to push data of dependencies to the remote
Bug Report
Description
I’m not able to push data of dependencies in the dvc.yaml to the remote.
Reproduce
…/dvc.yaml
$ dvc repro
$ dvc add ../../data/my_data.csv
$ dvc push ../../data/my_data.csv
Error: failed to push data to the cloud - ‘…/…/data/my_data.csv’ does not exist as an output or a stage name in ‘dvc.yaml’: Stage ‘…/…/data/my_data.csv’ not found inside ‘dvc.yaml’ file
Expected
my_data.csv is uploaded to the cloud successfully.
Environment information
- dvc 2.4.3
Output of dvc doctor:
DVC version: 2.4.3 (conda)
---------------------------------
Platform: Python 3.8.10 on macOS-10.15.3-x86_64-i386-64bit
Supports: http, https
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Caches: local
Remotes: local
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git
Additional Information (if any):
About this issue
- State: closed
- Created 3 years ago
- Comments: 17 (9 by maintainers)
@Christoph-1 using the rules I suggested, my_data.csv will be ignored by the first rule data/**. The subdirectory exclusion !data/**/ only applies to the subdirectory paths (which end in a trailing slash), and essentially just forces git to traverse into subdirectories (so that it can see the .dvc files). All files inside subdirectories will still be ignored due to the first rule. data/folder/my_data.csv does not match !data/**/ since my_data.csv is not a directory.
So the way the rules work together is:
Another way to think about it would be that these rules are equivalent to the following for data/folder/:
You can verify this behavior yourself using git check-ignore.
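A minimal sketch of such a check, run in a throwaway repository. The first two rules are the ones quoted in this thread; the third rule (!data/**/*.dvc) and the file names under data/ are assumptions made for illustration, not the exact listing from the original comment.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a hypothetical layout.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p data/folder
touch data/foo data/foo.dvc data/folder/bar data/folder/bar.dvc

# data/** and !data/**/ are quoted in the thread.
# !data/**/*.dvc is an ASSUMED rule that re-includes the .dvc files.
cat > .gitignore <<'EOF'
data/**
!data/**/
!data/**/*.dvc
EOF

# git check-ignore -q exits 0 if the path is ignored, 1 if it is not.
for path in data/foo data/foo.dvc data/folder/bar data/folder/bar.dvc; do
    if git check-ignore -q "$path"; then
        echo "ignored: $path"
    else
        echo "not ignored: $path"
    fi
done
```

With a recent git, the two data files should be reported as ignored and the two .dvc files as not ignored. Running git check-ignore -v on a single path additionally prints the rule that matched it.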
You can see that only .dvc files are excluded by these rules. My data file paths (foo and bar) remain ignored by the first rule.
@Christoph-1 to properly exclude your .dvc files you will need something like
The issue is that git will not traverse into subdirectories of an ignored dir unless the subdirectory itself is also explicitly excluded with a ! rule. So in your example, git won’t traverse into data/folder at all, since it is ignored by data/*, and the !data/folder/my_data.csv.dvc exclusion will never be considered.
@Christoph-1 can you post the contents of this .gitignore? Exclusions for files inside git-ignored directories have to be written in a specific way, so depending on how you are currently writing the exclusion for my_data.csv.dvc it’s possible that git is still ignoring it.
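The traversal issue described above for data/* can be reproduced in a scratch repository. The sketch below first uses the two rules named in the thread, then an assumed fix that follows the usual gitignore pattern of re-including the directory before re-including the file. The fixed rule set is an illustration, not necessarily what the reporter ended up using.

```shell
#!/bin/sh
set -eu

# Throwaway repository mirroring the paths mentioned in the thread.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p data/folder
touch data/folder/my_data.csv data/folder/my_data.csv.dvc

report() {
    # git check-ignore -q exits 0 if the path is ignored, 1 if it is not.
    if git check-ignore -q "$1"; then
        echo "ignored: $1"
    else
        echo "not ignored: $1"
    fi
}

# Rules as described in the thread: data/folder is itself ignored by
# data/*, so the later exclusion for the .dvc file is never considered.
printf '%s\n' 'data/*' '!data/folder/my_data.csv.dvc' > .gitignore
report data/folder/my_data.csv.dvc

# ASSUMED fix: re-include the directory, re-ignore its contents,
# then re-include only the .dvc file.
printf '%s\n' 'data/*' '!data/folder/' 'data/folder/*' \
    '!data/folder/my_data.csv.dvc' > .gitignore
report data/folder/my_data.csv.dvc
report data/folder/my_data.csv
```

The first report call should show the .dvc file as ignored. After the rewrite, the .dvc file should no longer be ignored while my_data.csv still is.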