dvclive: dvc.yaml ignored if entire dvclive folder is tracked as a stage output

Reproduction script below. dvc exp show shows no values in the workspace row once the dvclive folder is added as a stage output.

$ git clone git@github.com:iterative/dvclive-exp-tracking.git

$ cd dvclive-exp-tracking

$ python train.py

$ dvc exp show
 ────────────────────────────────────────────────────────────────────────────────────
  Experiment                 Created        train_loss   epoch   step   encoder_size
 ────────────────────────────────────────────────────────────────────────────────────
  workspace                  -                0.026054       4    499   64
  main                       Feb 01, 2023            -       -      -   -
  └── 4f01b48 [leafy-wage]   01:10 PM         0.026054       4    499   64
 ────────────────────────────────────────────────────────────────────────────────────

$ dvc stage add -n train -d train.py -O dvclive python train.py

$ dvc exp show
 ──────────────────────────────────────────────────────────────────────────────────────>
  Experiment                 Created        train_loss   epoch   step   encoder_size   >
 ──────────────────────────────────────────────────────────────────────────────────────>
  workspace                  -                       -       -      -   -              >
  main                       Feb 01, 2023            -       -      -   -              >
  └── 4f01b48 [leafy-wage]   01:10 PM         0.026054       4    499   64             >
 ──────────────────────────────────────────────────────────────────────────────────────>

This happens because dvc.yaml is within the output, which DVC doesn’t seem to like. You can avoid it by writing the stage like:

$ dvc stage add -f -n train -d train.py -O dvclive/metrics.json -o dvclive/plots python train.py

However, it’s probably not safe to rely on users doing this. Some ways we can remedy it:

  1. Write to root dvc.yaml (see discussion in #381).
  2. Fix on DVC side to look for dvc.yaml inside an output (not sure this makes sense since it could include pipeline stages, which would be odd to have inside an output).
  3. Other ideas?

About this issue

  • State: closed
  • Created a year ago
  • Comments: 35 (32 by maintainers)

Most upvoted comments

OK so I may be running into something related to this (apologies if user error). I’m using pipelines. Basically I thought I should write plots and metrics to a folder specified as an output of a stage. However, DVC doesn’t like this and plots don’t show up because the dvc.yaml is in a folder that’s an output. After scratching my head for an unreasonably long time I finally saw this issue and realized that having the output folder be the same as the one dvclive writes to is no bueno.

FWIW I found this really unintuitive…

Is it possible to warn the user (via console warning in dvclive) that they shouldn’t use dir=somedir if somedir is listed as an output in any pipeline stage?
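A rough sketch of what such a check could look like, to make the suggestion concrete. This is not DVCLive's actual implementation: the function name stages_tracking_dir is hypothetical, and it assumes the pipeline's dvc.yaml has already been parsed into a plain dict (for example with a YAML library). It only inspects the outs, metrics and plots lists of each stage, and it does not handle templated or foreach stages.

```python
from pathlib import PurePosixPath


def stages_tracking_dir(dvc_yaml: dict, live_dir: str) -> list[str]:
    """Return names of stages whose outputs are, or contain, `live_dir`.

    Hypothetical helper: `dvc_yaml` is an already-parsed dvc.yaml mapping.
    """
    live = PurePosixPath(live_dir)
    conflicts = set()
    for name, stage in (dvc_yaml.get("stages") or {}).items():
        for key in ("outs", "metrics", "plots"):
            for entry in stage.get(key) or []:
                # An entry is either "path" or {"path": {options}}.
                path = entry if isinstance(entry, str) else next(iter(entry))
                out = PurePosixPath(path)
                # Conflict if the output is the live dir or one of its parents.
                if live == out or out in live.parents:
                    conflicts.add(name)
    return sorted(conflicts)


# Whole folder as an output (the problematic case from this issue).
whole_dir = {"stages": {"train": {"cmd": "python train.py",
                                  "outs": [{"dvclive": {"cache": False}}]}}}
# Granular outputs (the workaround from this issue).
granular = {"stages": {"train": {"cmd": "python train.py",
                                 "outs": [{"dvclive/metrics.json": {"cache": False}},
                                          "dvclive/plots"]}}}

print(stages_tracking_dir(whole_dir, "dvclive"))  # ['train']
print(stages_tracking_dir(granular, "dvclive"))   # []
```

A caller could emit a console warning whenever the returned list is non-empty, naming the stages involved.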

Other ideas?

Don’t mention dvclive folder at all in the root dvc.yaml? What would be the downside?

  1. dvc.yaml for me is a config for DVC pipelines! When I see such a file, I expected to find a pipeline configuration inside.

Makes sense since that’s what it was for a long time. Now it does more, which is needed outside dvclive (for example, to configure custom plots or add custom model artifacts), so we need to educate users that it’s more than a pipeline file.

  2. The dvc.yaml file generated by DVCLive looks like a dvc.lock file generated by DVC. It contains metadata to be used by DVC tools, not intended to be changed by a user. All artifacts configurations are stored and updated in the code!

Right, it’s auto-generated and not intended to be changed by a user, but it contains the same info as any other dvc.yaml file (specifying plots fields, artifact names/types/labels, etc.) and doesn’t have quite the same kind of info as a file like dvc.lock (it doesn’t contain hash, md5, etc.).

Is it possible that all these confusions are due to wrong naming? And, we should not have a second dvc.yaml at all? Could we have a separate .lock (aka artifacts.lock) to keep all artifacts metadata, versioned by Git?

  • BONUS: no need to generate a separate .dvc file for each artifact (artifacts.lock contains hashes)

If I understand your proposal, we may not need a new file, since this is basically what dvc.lock does. We have had discussions about moving away from .dvc files and consolidating all this info into dvc.lock. This whole discussion is closely related to #648, and one of the suggestions there is to keep all info in one lock file and get rid of those .dvc files.

Note that we would still need a dvc.yaml file to specify info like what field to use as the x-axis for the plot. This is what dvc.yaml does today, and I don’t yet see a good reason to move that info outside dvc.yaml (users still need to be able to configure those fields, both within stages and for standalone plots, artifacts, etc.).

Is it possible to warn the user (via console warning in dvclive) that they shouldn’t use dir=somedir if somedir is listed as an output in any pipeline stage?

This makes sense to me, wdyt @dberenbaum ?

yes, what I said above applies to updates or overwrites or even a new file.

What I mean is:

with open("dvc.yaml", mode="w") as f:
    # Opening with mode="w" has already truncated the file at this point.
    pass

As soon as it enters the block, it’ll truncate the file (or create an empty one), and that empty file might be visible to other processes. Also, multiple writes might be required to completely write the file, so readers could see partial content.

Again, it’s a general comment regarding safety. We allow other commands to run in parallel.

We do have to remember that modifying dvc.yaml should be done as atomically as possible, which we are not doing today.
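For illustration, one common way to get closer to an atomic update is to write to a temporary file in the same directory and then rename it over the target. This is a generic sketch using only the standard library, not what DVC or DVCLive does today; the helper name write_atomic is hypothetical.

```python
import os
import tempfile


def write_atomic(path: str, text: str) -> None:
    """Write `text` to `path` so readers never observe a truncated file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".yaml")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        # os.replace swaps the file in one step (atomic on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern, another process reading dvc.yaml sees either the old content or the new content, never an empty or half-written file. One caveat: the replaced file takes on the temp file's permissions, which may differ from the original's.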

Why? So you think DVCLive should be repeatedly opening and writing to dvc.yaml?

Yep, sounds like that. That’s what we do in VS Code btw with @sroy3, at least for now.

So the default should be to suggest $ dvc stage add -f -n train -d train.py here?

I agree with @daavoo, I also don’t see a good reason to mention it at all, tbh. Except if metrics become really large, but probably solving that with a granular -O dvclive/metrics.json -o dvclive/plots should be fine?