dvclive: Model Checkpoint Callback error (torch lightning)
Hi - when training a model with dvc repro model_train, I get a warning when the model is saved. Could the exception handling be improved here? I had to track it down in the source.
WARNING:dvclive:Failed to dvc add .../DvcLiveLogger/AE_embed_v1/checkpoints: cannot update ‘checkpoints’: not a data source
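For anyone hitting the same warning, here is a minimal sketch of finding where it is emitted without reading through the source. It assumes only that the message goes through a standard-library logger named dvclive (inferred from the WARNING:dvclive: prefix above). The WarningLocator class and the stand-in warning call are hypothetical, for illustration only.

```python
import logging


class WarningLocator(logging.Handler):
    """Collect (message, file, line, function) for each record it sees."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.seen = []

    def emit(self, record):
        # pathname, lineno and funcName are standard LogRecord attributes
        # that identify the call site that produced the message.
        self.seen.append(
            (record.getMessage(), record.pathname, record.lineno, record.funcName)
        )


# Assumption: "dvclive" is the logger name, taken from the log prefix.
locator = WarningLocator()
logging.getLogger("dvclive").addHandler(locator)

# Stand-in for the training run: emit a warning through the same logger name.
logging.getLogger("dvclive").warning("Failed to dvc add %s", "checkpoints")

for message, path, line, func in locator.seen:
    print(f"{message} (emitted at {path}:{line} in {func})")
```

In a real run the handler would be attached before training starts, and the printed file and line would point at the code that logged the warning.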
from dvclive.lightning import DVCLiveLogger
...
dvclive_logger = DVCLiveLogger(
    f"{model}_embed_v1", prefix=model, dir=f"dvclive/{model}", log_model=True
)
stages:
  train.model:
    metrics:
      - dvclive/AE
    outs:
      - DvcLiveLogger/AE_embed_v1/
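A likely cause, stated here as an assumption rather than something confirmed in this thread: the checkpoints directory sits inside a path already declared under outs, so DVC treats it as part of a stage output and refuses to dvc add it separately. The sketch below only checks that overlap. The is_inside helper is hypothetical, and the checkpoint path is the visible tail of the path in the warning, since the prefix is elided.

```python
from pathlib import PurePosixPath


def is_inside(path: str, parent: str) -> bool:
    """Return True if `path` equals `parent` or lies anywhere beneath it."""
    p, q = PurePosixPath(path), PurePosixPath(parent)
    return p == q or q in p.parents


# `outs` entry copied from the stage definition above.
stage_outs = ["DvcLiveLogger/AE_embed_v1/"]
# Visible tail of the path in the warning; the elided prefix is unknown.
checkpoint_dir = "DvcLiveLogger/AE_embed_v1/checkpoints"

# Every declared output that contains the checkpoints directory.
overlapping = [out for out in stage_outs if is_inside(checkpoint_dir, out)]
print(overlapping)
```

If the list is non-empty, the checkpoints directory is already covered by a stage output, which would be consistent with the "not a data source" message.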
dvc doctor:
$ dvc doctor
DVC version: 2.58.2 (conda)
---------------------------
Platform: Python 3.10.11 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
    dvc_data = 0.51.0
    dvc_objects = 0.25.0
    dvc_render = 0.5.3
    dvc_task = 0.3.0
    scmrepo = 1.2.1
Supports:
    http (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.8.5, aiohttp-retry = 2.8.3)
Config:
    Global: ~/.config/dvc
    System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: 9p on D:\
Caches: local
Remotes: local
Workspace directory: 9p on D:\
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/dd3f8e5f1731970476b1faa6cc8caf7a
About this issue
- State: closed
- Created a year ago
- Comments: 15 (7 by maintainers)
This part has been significantly improved recently in the VS Code DVC extension. Could you give it another try, please?
Any reason you use dvc repro over dvc exp run? dvc exp run should work as a drop-in replacement, and not all DVCLive features work as intended (including some of the warnings/errors) with dvc repro, although we will work on fixing that as much as we can.
Related: https://github.com/iterative/dvclive/issues/670