dvc.org: dvc run step in tutorial does not give expected results

Bug Report

$ dvc version
DVC version: 1.0.1
Python version: 3.7.1
Platform: Linux-4.4.0-176-generic-x86_64-with-debian-stretch-sid
Binary: True
Package: deb
Supported remotes: azure, gdrive, gs, hdfs, http, https, s3, ssh, oss
Cache: reflink - not supported, hardlink - supported, symlink - supported
Filesystem type (cache directory): ('ext4', '/dev/mapper/qwerty--vg-root')
Repo: dvc, git
Filesystem type (workspace): ('ext4', '/dev/mapper/qwerty--vg-root')

Following https://dvc.org/doc/use-cases/versioning-data-and-model-files/tutorial, the dvc run step returns an error instead of creating the Dvcfile/dvc.yaml file:

dvc run -f Dvcfile \
    -d train.py -d data \
    -M metrics.csv \
    -o model.h5 -o bottleneck_features_train.npy -o bottleneck_features_validation.npy \
    python train.py

ERROR: `-n|--name` is required

iterative/dvc#4077 states that the -f option is deprecated. Running with the -n option instead results in a different error:

dvc run \
    -d train.py \
    -d data \
    -n training \
    -M metrics.csv \
    -o model.h5 \
    -o bottleneck_features_train.npy \
    -o bottleneck_features_validation.npy \
    python train.py --verbose

ERROR: output 'model.h5' is already specified in stage: 'model.h5.dvc'.

Please update the tutorial to reflect changes in v1.0.

Thanks!

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 22 (16 by maintainers)

Most upvoted comments

@jorgeorpinel, my question was: how do you proceed when you encounter this error? I think I solved it by deleting the model.pkl.dvc file.

@sarthakforwet Yes, I followed the tutorial sequentially as written.

@jorgeorpinel Yes, I did (as instructed in the tutorial). But on reviewing it I noticed this note: "We manually added the model output here, which isn't ideal. The preferred way of capturing command outputs is with dvc run. More on this later."

Removing the model.h5.dvc file and re-running the updated tutorial runs successfully! 🎊
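
For anyone hitting the same error, here is a minimal sketch of the check implied above. It assumes the conflict comes from a standalone .dvc file left by an earlier dvc add, as in this thread. find_conflicts is a hypothetical helper written for illustration, not part of DVC; it only lists the .dvc files that sit next to the outputs passed to it.

```shell
#!/bin/sh
# Hypothetical helper (not a DVC command): print the standalone .dvc file
# for each given output, if one exists in the current directory.
# `dvc add model.h5` creates model.h5.dvc, which is what collides with
# `dvc run -o model.h5` in the error reported in this issue.
find_conflicts() {
    for out in "$@"; do
        if [ -f "$out.dvc" ]; then
            printf '%s\n' "$out.dvc"
        fi
    done
}

# Check the three outputs that the tutorial's training stage declares with -o.
find_conflicts model.h5 bottleneck_features_train.npy bottleneck_features_validation.npy

# For each file listed, delete it (e.g. rm model.h5.dvc) and then re-run the
# `dvc run -n training ...` command from the issue description.
```

If the script prints nothing, no leftover .dvc file matches those outputs and the error likely has another cause.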

Thank you all!

Perhaps it would be clearer if the documentation re-stated this before the Automating capturing section.

OK, this should be resolved then. Thanks!

ERROR: output ‘model.h5’ is already specified in stage: ‘model.h5.dvc’.

Maybe Shachi ran dvc add model.h5 on their own. Either way, the tutorial is now up to date for 1.0.

Oh, actually this was already done in https://github.com/iterative/dvc.org/pull/1526/files but we haven’t double-checked the instructions. @shachibista, if you try again, please let us know your results.

@sarthakforwet can you confirm whether you ran the whole tutorial after the update? If not, would you be able to do so and share the results? Thanks

@shachibista no huge conceptual changes! Just changing commands, altering text here and there (e.g. no Dvcfile anymore, but dvc.yaml).

To start, I think we can try removing -f something and adding -n train. That alone should probably fix the issue.

@shcheklein Would it require a lot of (conceptual) changes to update the tutorial? If it is only a single command, I could send a pull request if you could tell me the correct parameters.

@shachibista sorry about that. It’s a little bit outdated (DVC version <0.94). We changed the dvc run interface in DVC 1.0 and haven’t had time to update some tutorials.