dvclive: Model Checkpoint Callback error (PyTorch Lightning)

Hi, when I train a model with dvc repro model_train, I get a warning when the model is saved. Could the exception handling be improved here? I had to trace the warning back to this spot in the source:

https://github.com/iterative/dvclive/blob/e8d008eb5f2b2632786afd3b27897e57b7867e43/src/dvclive/live.py#L493-L495

WARNING:dvclive:Failed to dvc add .../DvcLiveLogger/AE_embed_v1/checkpoints: cannot update ‘checkpoints’: not a data source
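
To make the request concrete, here is a minimal sketch of what more informative handling might look like. It is not the DVCLive implementation: `add_with_context`, the injected `add` callable, and the logger name are all hypothetical, and the hint text is only a guess at a likely cause.

```python
import logging
from pathlib import Path
from typing import Callable

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("dvclive_sketch")


def add_with_context(add: Callable[[str], object], path: str) -> bool:
    """Call `add(path)`; on failure, log what failed and a possible cause.

    `add` stands in for whatever performs `dvc add`. It is injected so the
    sketch does not depend on DVC or DVCLive internals.
    """
    try:
        add(path)
    except Exception as exc:  # broad on purpose: tracking should not stop training
        logger.warning(
            "Failed to dvc add %s (%s: %s). If this path is inside a stage "
            "output in dvc.yaml, it may already be tracked by the pipeline.",
            Path(path).as_posix(),
            type(exc).__name__,
            exc,
        )
        return False
    return True
```

Including the exception type and a hint about where to look would have saved me the trip into the source.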

from dvclive.lightning import DVCLiveLogger
...
dvclive_logger = DVCLiveLogger(
    f"{model}_embed_v1", prefix=model, dir=f"dvclive/{model}", log_model=True
)

dvc.yaml:

  - stages:
    - train.model:
      metrics:
        - dvclive/AE
      outs:
        - DvcLiveLogger/AE_embed_v1/
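
One thing that stands out in this setup is that the checkpoints directory from the warning sits under `DvcLiveLogger/AE_embed_v1/`, which is also declared as a stage output. Whether that overlap is what triggers the warning is an assumption, but the overlap itself can be checked with a small path comparison. `is_inside_output` is a hypothetical helper, not part of DVC or DVCLive.

```python
from pathlib import PurePosixPath
from typing import Iterable


def is_inside_output(path: str, outs: Iterable[str]) -> bool:
    """Return True if `path` equals or lies under any entry in `outs`.

    Paths are compared as relative POSIX paths, the way they are written
    in dvc.yaml. A trailing slash on an output is ignored.
    """
    candidate = PurePosixPath(path)
    for out in outs:
        out_path = PurePosixPath(out)
        if candidate == out_path or out_path in candidate.parents:
            return True
    return False


# Paths taken from the warning and the stage definition above.
outs = ["DvcLiveLogger/AE_embed_v1/"]
overlaps = is_inside_output("DvcLiveLogger/AE_embed_v1/checkpoints", outs)
```

A check like this before the add step would allow the warning to name the overlapping output directly.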

dvc doctor:

$ dvc doctor
DVC version: 2.58.2 (conda)
---------------------------
Platform: Python 3.10.11 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
        dvc_data = 0.51.0
        dvc_objects = 0.25.0
        dvc_render = 0.5.3
        dvc_task = 0.3.0
        scmrepo = 1.2.1
Supports:
        http (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.5, aiohttp-retry = 2.8.3)
Config:
        Global: ~/.config/dvc
        System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: 9p on D:\
Caches: local
Remotes: local
Workspace directory: 9p on D:\
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/dd3f8e5f1731970476b1faa6cc8caf7a

About this issue

  • State: closed
  • Created a year ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Regarding "I couldn’t visualise the exp’s easily, and the VScode studio was locking the git index":

This part should have improved significantly in recent versions of the VS Code DVC extension. Could you give it another try, please?

Is there a reason you use dvc repro rather than dvc exp run? dvc exp run should work as a drop-in replacement. Not all DVCLive features (including some of the warnings and errors) work as intended with dvc repro, although we will work on fixing that as much as we can.

Related: https://github.com/iterative/dvclive/issues/670