pytorch-lightning: WandbLogger(log_model=False) does not work, model is always saved to wandb logs

🐛 Bug

The WandbLogger log_model argument does not work as expected. The model checkpoint is always saved under the wandb log directory (project_name/run/checkpoints/checkpoint_name.ckpt). The checkpoint is not uploaded to wandb, only written to disk. With log_model=False it should be neither written to disk nor uploaded. This matters especially during sweeps, where hundreds of runs quickly exhaust the available disk space. I don’t use ModelCheckpoint here, so the weights should not be saved at all.

To Reproduce

    wandb_logger = WandbLogger(
        project='private_example',
        config={**model_args.__dict__, **data_args.__dict__, **training_args.__dict__},
        log_model=False,
    )
    trainer = pl.Trainer(…, logger=[wandb_logger])

Expected behavior

Checkpoints are saved if log_model=True and not saved otherwise.

Environment

  • PyTorch Version (e.g., 1.0): 1.7.1
  • OS (e.g., Linux): Debian
  • How you installed PyTorch (conda, pip, source): conda
  • Python version: 3.6.9
  • CUDA/cuDNN version: cuda 11
  • wandb version: 0.10.15 (same problem on 0.10.17)
  • lightning version: 1.1.4 (same problem on 1.1.8)

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (7 by maintainers)

Most upvoted comments

Hey, with the new version 1.5.9 of pytorch-lightning and wandb version 0.12.10 the same issue is arising. I have even tried log_model=False, and there is no such parameter as checkpoint_callbacks=False.

  1. Checkpointing is turned on by default. You can turn it off with checkpoint_callback=False (see the sketch after this list).

  2. It looks like wandb uploads everything that ends up in its default logging directory under wandb. If you set save_dir="./somewhere" on the logger, you can prevent this.
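
For reference, a minimal sketch of how the two suggestions above fit together, assuming the pytorch-lightning 1.1–1.4 API (where the Trainer flag is checkpoint_callback; later releases renamed it to enable_checkpointing). The project name and save_dir path are placeholders:

    import pytorch_lightning as pl
    from pytorch_lightning.loggers import WandbLogger

    # Keep wandb's own files in a dedicated directory so nothing else in the
    # working directory gets picked up and uploaded.
    wandb_logger = WandbLogger(
        project="private_example",   # placeholder project name
        save_dir="./wandb_logs",     # placeholder directory for wandb files
        log_model=False,             # do not upload checkpoints to wandb
    )

    # Turn checkpointing off entirely so no .ckpt files are written to disk.
    # On pytorch-lightning >= 1.5, use enable_checkpointing=False instead.
    trainer = pl.Trainer(
        logger=[wandb_logger],
        checkpoint_callback=False,
    )

With checkpointing disabled, nothing is written under project_name/run/checkpoints/, so there is nothing for wandb to pick up, which matches the confirmation later in this thread.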

We use wandb.save here, so it is normal that the file ends up on disk first. It should get uploaded only at the end of the run. Maybe your process gets killed and wandb with it, which would explain why artifacts are not uploaded.
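
To illustrate the mechanism being described (not the logger's exact internals): wandb.save registers files for syncing, and with the "end" policy a registered file stays on disk during training and is only uploaded when the run finishes, so a killed process never uploads it. A minimal sketch, with a placeholder project name and glob:

    import wandb

    run = wandb.init(project="private_example")  # placeholder project name

    # Register checkpoint files for syncing. With policy="end" the files stay
    # local until the run finishes; if the process is killed before that,
    # the upload never happens.
    wandb.save("checkpoints/*.ckpt", policy="end")

    run.finish()  # registered files are uploaded here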

It should be neither saved to disk nor uploaded. Is that possible with WandbLogger? I just don’t want 500 checkpoints on my SSD after a wandb sweep…

Oh, I see! I am sorry, I misunderstood what you said. Thanks! 😃

As I have already stated, the issue is confirmed for 1.3.4 and fixed in the latest version. In order to see the change, you will have to upgrade Lightning using pip install --upgrade pytorch-lightning. Yes, you can add wandb.init() and finish(), but it doesn’t matter.

Thank you, it looks like checkpoint_callback=False works correctly (though the WandbLogger documentation is misleading).