dvc: checkout: consistency in handling files that are missing version info
`dvc checkout` (and `dvc pull`, since it uses it internally) will error out if version info is missing, while `dvc push` will just warn us, which creates a confusing inconsistency. For example, it makes the CML guys use `|| true` for `dvc pull` (https://github.com/DavidGOrtega/cml-dvc-test/runs/2438634469, CC @DavidGOrtega), as they don’t have `dvc.lock` on the initial run.
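As an illustration of a narrower workaround than `|| true` (which hides every failure, not just the missing-lock case), here is a minimal Python sketch. The helper name `plan_pull` and the `SKIP_DIRS` set are hypothetical and not part of DVC. It assumes that passing `.dvc` files as targets to `dvc pull` avoids touching the unlocked `dvc.yaml` stages, which has not been verified against this issue.

```python
from pathlib import Path

# Hypothetical helper, not part of DVC: internal directories we never scan.
SKIP_DIRS = {".dvc", ".git"}


def _tracked(root, pattern):
    """Yield repo-relative paths of files matching pattern, skipping internals."""
    for p in sorted(root.rglob(pattern)):
        rel = p.relative_to(root)
        if p.is_file() and not SKIP_DIRS.intersection(rel.parts[:-1]):
            yield rel


def plan_pull(repo_root):
    """Return an argv list for `dvc pull`, or None if there is nothing safe to pull.

    Assumption: a dvc.yaml without a sibling dvc.lock means its outputs
    have no version info yet (the "initial run" case from this issue).
    """
    root = Path(repo_root)
    unlocked = [
        rel for rel in _tracked(root, "dvc.yaml")
        if not (root / rel.parent / "dvc.lock").is_file()
    ]
    if not unlocked:
        return ["dvc", "pull"]  # every pipeline is locked: plain pull
    # Otherwise restrict the pull to data tracked via `dvc add` (.dvc files).
    targets = [rel.as_posix() for rel in _tracked(root, "*.dvc")]
    return ["dvc", "pull", *targets] if targets else None
```

A CI step could pass the returned list to `subprocess.run(..., check=True)`, so that real pull failures still fail the job.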
The current `dvc push` behaviour with warnings is also quite annoying and doesn’t scale well, as it might get way too noisy for a big pipeline. So we should probably agree on one common behaviour for all such cases that will make sense to our users. CC @dberenbaum
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (20 by maintainers)
related #4746
@karajan1001 This is the scenario as far as I remember:
- `dvc add` data and commit the `.dvc` file.
- `dvc stage add` and commit `dvc.yaml` (the stage does not get run).
- `git push` and `dvc push`, which warns the user that version info is missing for the outputs of the `dvc.yaml` stage.
- `git pull` and `dvc pull`, which throws an error because version info is missing for the outputs of the `dvc.yaml` stage.

EDIT: A much simpler summary is that no `dvc.lock` file exists.

I don’t have a good idea on why it deletes the file on missing version info. DVC behaves like this in a lot of places, which I’d like to fix (e.g. `dvc add` removing stage files on failure, etc). Pinging @efiop.

Isn’t this what `--quiet` is for? It would be great to have some nice summary option that would be a compromise between no output and overly verbose output, but I think that’s separate from the issue here, which is about an inconsistency between `push` and `pull`. If I have done `dvc stage add ... && dvc push`, then `dvc pull` right after should not fail.
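To make the "one common behaviour" idea concrete, here is a rough Python sketch of a shared policy that both commands could route through. It is purely illustrative: `Output` and `select_transferable` are hypothetical names, not DVC internals, and skip-with-one-summary-warning is only one possible choice of behaviour.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Output:
    """Hypothetical stand-in for a stage output; not DVC's real class."""
    path: str
    hash_value: Optional[str] = None  # None means no version info (no dvc.lock entry)


def select_transferable(outs: List[Output], command: str,
                        logger: logging.Logger) -> List[Output]:
    """Apply the same rule for push, pull and checkout.

    Outputs without version info are skipped, and a single summary
    warning is logged instead of one warning per output.
    """
    missing = [o.path for o in outs if o.hash_value is None]
    if missing:
        logger.warning(
            "%s: skipped %d output(s) without version info (first: %s)",
            command, len(missing), missing[0],
        )
    return [o for o in outs if o.hash_value is not None]
```

Because `push` and `pull` would call the same function, `dvc stage add ... && dvc push` followed by `dvc pull` ends up handling the same set of outputs on both sides, and the log grows by one line per command rather than one line per output.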