dvc: plots: return error messages for failed plots

Description and Motivation

If plots can’t be processed, we log only a basic message and skip those plots in the JSON output:

$ dvc plots diff main workspace -o .dvc/tmp/plots --split --show-json -v --targets missclassified.jpg

DVC failed to load some plots for following revisions: 'workspace, main'.
{
  "missclassified.jpg": []
}

We need better results: granular messages about failed plots, so that we can show them properly in VS Code instead of silently ignoring them.

This is related to these issues in the VS Code repo: https://github.com/iterative/vscode-dvc/issues/2277 and https://github.com/iterative/vscode-dvc/issues/1649. At a very high level, we need to distinguish absent plots from errors and show some signal to users, rather than silently ignoring things and/or showing misleading messages (e.g. a refresh button when there is nothing to refresh in an experiment).
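To make the ambiguity concrete, here is a minimal Python sketch of what a consumer can learn from the output above. The helper name plots_without_data is hypothetical and not part of DVC or the extension; it only shows that an empty list says nothing about whether the plot is absent or failed to load.

```python
import json

# Output copied from the example above: the failed plot maps to an empty list.
raw = '{"missclassified.jpg": []}'


def plots_without_data(output: dict) -> list[str]:
    """Return plot IDs that have no entries for any revision (hypothetical helper)."""
    return [plot_id for plot_id, entries in output.items() if not entries]


output = json.loads(raw)
# The consumer sees which plots are empty, but not whether that is an error.
print(plots_without_data(output))
```

Everything beyond "this plot has no data" (which revision failed, and why) only reaches the log message, not the JSON.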

Current Output

All examples were produced with multiple revisions and the --json and --split flags.

Single image

"eval/importance.png": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/example-repos-dev/example-get-started/build/example-get-started/dvc_plots/workspace_eval_importance.png"
    },
    {
      "type": "image",
      "revisions": [
        "c475deb7448319fab434d5650264dd2dd91bad43"
      ],
      "url": "/Users/ivan/Projects/example-repos-dev/example-get-started/build/example-get-started/dvc_plots/c475deb7448319fab434d5650264dd2dd91bad43_eval_importance.png"
    },
    {
      "type": "image",
      "revisions": [
        "7e4e86ca117f1bbef288f2abebfc7c97d0a9925d"
      ],
      "url": "/Users/ivan/Projects/example-repos-dev/example-get-started/build/example-get-started/dvc_plots/7e4e86ca117f1bbef288f2abebfc7c97d0a9925d_eval_importance.png"
    }
  ]

Flexible (top-level) plot

"dvc.yaml::Precision-Recall": [
    {
      "type": "vega",
      "revisions": [
        "7e4e86ca117f1bbef288f2abebfc7c97d0a9925d",
        "c475deb7448319fab434d5650264dd2dd91bad43",
        "workspace"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>"
        },
        "title": "dvc.yaml::Precision-Recall",
        "width": 300,
        "height": 300,
        "mark": {
          "type": "line",
          "point": true,
          "tooltip": {
            "content": "data"
          }
        },
        "encoding": {
          "x": {
            "field": "recall",
            "type": "quantitative",
            "title": "recall"
          },
          "y": {
            "field": "precision",
            "type": "quantitative",
            "title": "precision",
            "scale": {
              "zero": false
            }
          },
          "color": {
            "field": "rev",
            "type": "nominal"
          }
        }
      },
      "datapoints": {
        "workspace": [
          {
            "precision": 0.30321774445485783,
            "recall": 1.0,
            "threshold": 0.0,
            "dvc_data_version_info": {
              "revision": "workspace",
              "filename": "eval/prc/train.json",
              "field": "precision"
            }
          },
         {"...."},
         {
            "precision": 0.6694635900509439,
            "recall": 0.9359028068705488,
            "threshold": 0.20869278966952978,
            "dvc_data_version_info": {
              "revision": "workspace",
              "filename": "eval/prc/test.json",
              "field": "precision"
            }
          }

Multiple images

"mispredicted/croissant/muffin-16115-13825-26827-1d8e67e0bffdfebcdb3b337787823ab6.jpeg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_croissant_muffin-16115-13825-26827-1d8e67e0bffdfebcdb3b337787823ab6.jpeg"
    }
  ],
  "mispredicted/muffin/croissant-0295ed7610487b3118febb5563bc58fd.jpg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_muffin_croissant-0295ed7610487b3118febb5563bc58fd.jpg"
    }
  ],
  "mispredicted/muffin/croissant-3f488b602f2a668e-3fd6af132b0dceafe014dfcf7809d2ff.jpg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_muffin_croissant-3f488b602f2a668e-3fd6af132b0dceafe014dfcf7809d2ff.jpg"
    }
  ],
  "mispredicted/muffin/dog-bec8602c36317744-1827c47f8ae15e6a3c4ee660035781a4.jpg": [
    {
      "type": "image",
      "revisions": [
        "workspace"
      ],
      "url": "/Users/ivan/Projects/hackathon/dvc_plots/workspace_mispredicted_muffin_dog-bec8602c36317744-1827c47f8ae15e6a3c4ee660035781a4.jpg"
    }
  ]

Stage linear plot

"dvclive/scalars/eval/loss.tsv": [
    {
      "type": "vega",
      "revisions": [
        "d82452a"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>"
        },
        "title": "dvclive/scalars/eval/loss.tsv",
        "width": 300,
        "height": 300,
        "mark": {
          "type": "line",
          "point": true,
          "tooltip": {
            "content": "data"
          }
        },
        "encoding": {
          "x": {
            "field": "step",
            "type": "quantitative",
            "title": "step"
          },
          "y": {
            "field": "eval/loss",
            "type": "quantitative",
            "title": "eval/loss",
            "scale": {
              "zero": false
            }
          },
          "color": {
            "field": "rev",
            "type": "nominal"
          }
        }
      },
      "datapoints": {
        "d82452a": [
          {
            "timestamp": "1660180711394",
            "step": "0",
            "eval/loss": "2.4602549076080322",
            "dvc_data_version_info": {
              "revision": "d82452a",
              "filename": "dvclive/scalars/eval/loss.tsv",
              "field": "eval/loss"
            }
          },
          {
            "timestamp": "1660180723400",
            "step": "1",
            "eval/loss": "1.3761318922042847",
            "dvc_data_version_info": {
              "revision": "d82452a",
              "filename": "dvclive/scalars/eval/loss.tsv",
              "field": "eval/loss"
            }
          }
        ]
      }
    }
  ],
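With output shaped like the examples above, about the only thing a consumer can work out today is which requested revisions have no entry for a given plot ID. A rough Python sketch of that check follows; the function name missing_revisions and the trimmed-down sample data are assumptions for illustration, and the URLs are placeholders.

```python
def missing_revisions(output: dict, plot_id: str, requested: list[str]) -> list[str]:
    """Return the requested revisions that have no entry for plot_id."""
    seen = set()
    for entry in output.get(plot_id, []):
        seen.update(entry.get("revisions", []))
    return [rev for rev in requested if rev not in seen]


# Trimmed-down data modelled on the single-image example; URLs are placeholders.
output = {
    "eval/importance.png": [
        {"type": "image", "revisions": ["workspace"], "url": "<path>"},
        {
            "type": "image",
            "revisions": ["c475deb7448319fab434d5650264dd2dd91bad43"],
            "url": "<path>",
        },
    ]
}
requested = ["workspace", "c475deb7448319fab434d5650264dd2dd91bad43", "main"]
print(missing_revisions(output, "eval/importance.png", requested))  # ['main']
```

This tells us that a revision is missing, but not whether the plot simply does not exist there or failed to load, which is the gap this issue is about.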

Unblocks, Related

https://github.com/iterative/vscode-dvc/issues/2277 https://github.com/iterative/vscode-dvc/issues/1649

Next Steps

A bit of research. The JSON structure looks extremely suboptimal (tons of duplication). Since we are changing it, I’d like to have a better understanding of how it’s being used. Entry point into VS Code is here.
⌛ Try to add an error for a single image
Classify and suggest how to add errors in all other cases, including directories and regular plots (e.g. linear).
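As a starting point for discussion only, one possible shape is to keep the existing per-plot payload and attach a list of errors next to it. The Python sketch below is an assumption throughout: the PlotError class, its field names (name, rev, type, msg), the "data" and "errors" keys, and the message text are all hypothetical and not an agreed DVC format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PlotError:
    # All field names here are hypothetical, not an agreed DVC format.
    name: str  # plot ID, e.g. a path or "dvc.yaml::Precision-Recall"
    rev: str   # revision the failure belongs to
    type: str  # exception class name
    msg: str   # human-readable message


def build_output(data: dict, errors: list[PlotError]) -> dict:
    """Wrap the existing per-plot payload and attach errors next to it."""
    return {"data": data, "errors": [asdict(e) for e in errors]}


output = build_output(
    data={"missclassified.jpg": []},
    errors=[
        # Placeholder error; the real type and message would come from DVC.
        PlotError("missclassified.jpg", "workspace", "FileNotFoundError", "file not found"),
    ],
)
print(json.dumps(output, indent=2))
```

Keying each error by plot ID and revision would let a consumer tell "absent in this revision" apart from "failed in this revision" for images, directories, and regular plots alike.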

About this issue

State: closed
Created 2 years ago
Comments: 29 (29 by maintainers)

Most upvoted comments

I think what we’ve learned is that it’s helpful to share drafts early and often to get feedback as you go so we know mostly what works in both products by the time we are ready to merge.

dberenbaum on Feb 21, 2023

I could not look into this during support duty, as some p0s/bugs came up.

skshetry on Feb 14, 2023

We do seem to preserve errors during plots.collect(). When we transform the internal representation to the JSON format, we lose most of that information. We could start by exposing it. What would be a good JSON format for incorporating errors for VS Code?

skshetry on Feb 21, 2023

Yes, @dberenbaum. It’s related to these issues in the VS Code repo: https://github.com/iterative/vscode-dvc/issues/2277 and https://github.com/iterative/vscode-dvc/issues/1649. At a very high level, we need to distinguish absent plots from errors and show some signal to users, rather than silently ignoring things and/or showing misleading messages (e.g. a refresh button when there is nothing to refresh in an experiment).

Can you clarify the full scope of the issue? Is it only about plot directories, or is that merely one case you are trying to solve for?

So the full scope is: show error messages for all plot definitions, not only directories / images.

shcheklein on Feb 8, 2023

Updated the description with some examples of the current output. Next: try to add an error for an image plot (not the directory-with-images case for now), e.g. example-get-started’s importance.png.

shcheklein on Jan 17, 2023

@dberenbaum do we know if anyone else besides VS Code depends on --json?

No, I don’t think so.

For the duplicated data, I’m missing something because I have different output from what @shcheklein shows above. I don’t see all the data in content.data.values. My output for dvc plots diff 504206e f586d67 workspace -o .dvc/tmp/plots --split --json looks like this:

{
  "dvc.yaml::Accuracy": [
    {
      "type": "vega",
      "revisions": [
        "504206e",
        "f586d67",
        "workspace"
      ],
      "content": {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
          "values": "<DVC_METRIC_DATA>" # Nothing else shows up in this field.
        },
...

dberenbaum on Jan 16, 2023

Use https://github.com/iterative/vscode-dvc/blob/main/extension/src/plots/model/index.ts#L108 as an entry point.

mattseddon on Jan 16, 2023

I didn’t leave any comments during the research, so: we were able to implement top-level plots based on the old data format. To support errors, we will need to change the data structure returned by dvc plots ... --json.

pared on Sep 30, 2022

No, but I believe we could include it as a part of implementing iterative/vscode-dvc#1757

pared on Jul 20, 2022