dvclive: dvc.yaml ignored if entire dvclive folder is tracked as a stage output
Reproduction script below. dvc exp show shows nothing in the workspace once dvclive is added as a stage output.
$ git clone git@github.com:iterative/dvclive-exp-tracking.git
$ cd dvclive-exp-tracking
$ python train.py
$ dvc exp show
 ─────────────────────────────────────────────────────────────────────────────────────
  Experiment                  Created        train_loss   epoch   step   encoder_size
 ─────────────────────────────────────────────────────────────────────────────────────
  workspace                   -                0.026054       4    499             64
  main                        Feb 01, 2023            -       -      -              -
  └── 4f01b48 [leafy-wage]    01:10 PM         0.026054       4    499             64
 ─────────────────────────────────────────────────────────────────────────────────────
$ dvc stage add -n train -d train.py -O dvclive python train.py
$ dvc exp show
 ─────────────────────────────────────────────────────────────────────────────────────
  Experiment                  Created        train_loss   epoch   step   encoder_size
 ─────────────────────────────────────────────────────────────────────────────────────
  workspace                   -                       -       -      -              -
  main                        Feb 01, 2023            -       -      -              -
  └── 4f01b48 [leafy-wage]    01:10 PM         0.026054       4    499             64
 ─────────────────────────────────────────────────────────────────────────────────────
This happens because dvc.yaml is within the output, which DVC doesn’t seem to like. You can avoid it by writing the stage like:
$ dvc stage add -f -n train -d train.py -O dvclive/metrics.json -o dvclive/plots python train.py
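For reference, a stage added this way should end up in the root dvc.yaml roughly like the sketch below (paths assume the default dvclive layout; the -O output is tracked with cache: false):

stages:
  train:
    cmd: python train.py
    deps:
    - train.py
    outs:
    # -O: tracked but not cached
    - dvclive/metrics.json:
        cache: false
    # -o: cached output
    - dvclive/plots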
However, it’s probably not safe to rely on users doing this. Some ways we can remedy it:
- Write to root dvc.yaml (see discussion in #381); a sketch of what this could look like follows the list below.
- Fix on DVC side to look for dvc.yaml inside an output (not sure this makes sense since it could include pipeline stages, which would be odd to have inside an output).
- Other ideas?
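As a rough sketch of the first option, dvclive would append top-level entries to the root dvc.yaml instead of writing its own dvclive/dvc.yaml; the resulting entries might look something like this (paths and the x field are illustrative of the default dvclive layout, not a committed design):

metrics:
- dvclive/metrics.json
plots:
- dvclive/plots/metrics:
    x: step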
About this issue
- State: closed
- Created a year ago
- Comments: 35 (32 by maintainers)
OK so I may be running into something related to this (apologies if user error). I’m using pipelines. Basically I thought I should write plots and metrics to a folder specified as an output of a stage. However, DVC doesn’t like this and plots don’t show up because the dvc.yaml is in a folder that’s an output. After scratching my head for an unreasonably long time I finally saw this issue and realized that having the output folder be the same as where dvclive writes is no bueno. FWIW I found this really unintuitive…
Is it possible to warn the user (via console warning in dvclive) that they shouldn’t use dir=somedir if somedir is listed as an output in any pipeline stage?

Don’t mention the dvclive folder at all in the root dvc.yaml? What would be the downside?

Makes sense, since that’s what it was for a long time. Now it does more, which is needed outside dvclive (for example, to configure custom plots or add custom model artifacts), so we need to educate users that it’s more than a pipeline file.
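As a rough illustration of the proposed console warning, dvclive could check the root dvc.yaml when the logger is initialized; a minimal sketch only (hypothetical helper, not the actual dvclive code, assuming PyYAML is available):

import os
import warnings

import yaml  # PyYAML, assumed available for this sketch


def warn_if_dir_is_stage_output(live_dir, dvc_yaml="dvc.yaml"):
    # Hypothetical check: warn if `live_dir` is declared as an `outs` entry
    # of any stage in the root dvc.yaml.
    if not os.path.exists(dvc_yaml):
        return
    with open(dvc_yaml) as f:
        pipeline = yaml.safe_load(f) or {}
    for name, stage in (pipeline.get("stages") or {}).items():
        for out in stage.get("outs") or []:
            # outs entries are either plain strings or single-key dicts with flags
            path = out if isinstance(out, str) else next(iter(out))
            if os.path.normpath(path) == os.path.normpath(live_dir):
                warnings.warn(
                    f"'{live_dir}' is an output of stage '{name}'; "
                    "DVC will ignore the dvc.yaml that dvclive writes inside it."
                )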
Right, it’s auto-generated and not intended to be changed by a user, but it contains the same info as any other dvc.yaml file (specifying plots fields, artifact names/types/labels, etc.) and doesn’t have quite the same kind of info as a file like dvc.lock (it doesn’t contain hash, md5, etc.).

If I understand your proposal, we may not need a new file, since this is basically what dvc.lock does. We have had discussions about moving away from .dvc files and consolidating all this info into dvc.lock. This whole discussion is closely related to #648, and one of the suggestions there is to keep all info in one lock file and get rid of those .dvc files.

Note that we would still need a dvc.yaml file to specify info like what field to use as the x-axis for the plot. This is what dvc.yaml does today, and I don’t yet see a good reason to move that info outside dvc.yaml (users still need to be able to configure those fields, both within stages and for standalone plots, artifacts, etc.).

This makes sense to me, wdyt @dberenbaum?
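For context, the kind of info meant here is the top-level plots configuration in dvc.yaml, for example (an illustrative snippet; the path and fields are hypothetical):

plots:
- dvclive/plots/metrics/train_loss.tsv:
    x: step
    y: train_loss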
Yes, what I said above applies to updates or overwrites or even a new file.

What I mean is: as soon as it enters the block, it’ll truncate and create an empty file which might be visible to the other processes. Or, multiple write calls might be required to completely write to that file.

Again, it’s a general comment regarding safety. We allow other commands to run in parallel.
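For what it’s worth, the usual way to avoid that window is to write to a temporary file and atomically rename it over the target; a general sketch (not what dvclive does today):

import os
import tempfile


def atomic_write(path, text):
    # Write to a temp file in the same directory, then rename over the target,
    # so concurrent readers see either the old or the new file, never a
    # truncated one.
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic replacement on the same filesystem
    except BaseException:
        os.unlink(tmp)
        raise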
Why? So you think DVCLive should be repeatedly opening and writing to dvc.yaml?

Yep, sounds like that. That’s what we do in VS Code btw with @sroy3, at least for now.
So the default should be to suggest $ dvc stage add -f -n train -d train.py here?

I agree with @daavoo, I also don’t see a good reason to mention it at all, tbh. Except if metrics become really large, but probably solving that with a granular -O dvclive/metrics.json -o dvclive/plots should be fine?