pytorch-lightning: Logging with "self.log" in training_step does not create any outputs in progress bar or external Logger when loss isn't returned
🐛 Bug
I think the newly introduced log function does not log properly when used in training_step. The same code in validation_step produces the desired results.
```python
def training_step(self, batch, batch_idx):
    output = self.layer(batch)
    loss = self.loss(batch, output)
    self.log("loss", loss, prog_bar=True, logger=True, on_step=True, on_epoch=True)
    self.log("my_metric_train", 1001, prog_bar=True, logger=True, on_step=True, on_epoch=True)
##### Doesn't Work #######

def validation_step(self, batch, batch_idx):
    output = self.layer(batch)
    loss = self.loss(batch, output)
    self.log("val_loss", loss, prog_bar=True, logger=True, on_step=True, on_epoch=True)
    self.log("my_metric_val", 1001, prog_bar=True, logger=True, on_step=True, on_epoch=True)
##### Works #######
```
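For reference, a hedged workaround sketch (not from the original report; it reuses the same hypothetical self.layer / self.loss modules as above): returning the loss from training_step is what makes the same self.log calls show up in the progress bar and logger on the affected versions.

```python
def training_step(self, batch, batch_idx):
    output = self.layer(batch)
    loss = self.loss(batch, output)
    self.log("loss", loss, prog_bar=True, logger=True, on_step=True, on_epoch=True)
    self.log("my_metric_train", 1001, prog_bar=True, logger=True, on_step=True, on_epoch=True)
    # The only change compared to the snippet above: return the loss so the
    # training step is not treated as skipped and its logs are processed.
    return loss
```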
Please reproduce using the following gist:
https://gist.github.com/tobiascz/bb2c6de83263eb38181052840062b5ac
Expected behavior
Logs created in training_step should show up in the progress bar and in the loggers (such as the TensorBoard logger). The same code in validation_step produces the desired results.
Environment
- CUDA:
  - GPU:
    - Tesla T4
  - available: True
  - version: 10.1
- Packages:
  - numpy: 1.18.5
  - pyTorch_debug: False
  - pyTorch_version: 1.6.0+cu101
  - pytorch-lightning: 0.10.0
  - tqdm: 4.41.1
- System:
  - OS: Linux
  - architecture:
    - 64bit
  - processor: x86_64
  - python: 3.6.9
  - version: #1 SMP Thu Jul 23 08:00:38 PDT 2020
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 17 (7 by maintainers)
Just a heads up for future people: there is a Trainer flag, log_every_n_steps, which defaults to 50, so if you encounter a similar issue to the one described in this thread, try modifying it.
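For illustration, a minimal sketch (the other argument values are arbitrary; log_every_n_steps is the Trainer argument referred to above):

```python
import pytorch_lightning as pl

# log_every_n_steps defaults to 50, so with only a few batches per epoch the
# step-level logs may never be written; lowering it makes them show up.
trainer = pl.Trainer(max_epochs=5, log_every_n_steps=1)
```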
Hello, I’m using pytorch-lightning 1.3.2 and facing a similar issue as well. It only logged the output of val_loss, val_label_loss and val_score_loss to TensorBoard; the training and test loss values are nowhere to be seen.
Thanks for pointing this out @itsikad. Based on the comment above I am reopening the issue.
Expected behavior: logging in training_step should be independent of optimising the model, e.g. of returning a loss. Even if the training_step has some issue and does not return a loss, the logging should work as expected.
I have the same issue. Did you find any solution or workaround?
Hey @hecoding,
You might want to have a look at this: https://github.com/PyTorchLightning/pytorch-lightning/pull/4618
Best, T.C
Sometimes there’s no unambiguous way of returning a single loss though, e.g. in GAN training. What I’m doing to bypass the bug right now is the following, hopefully with no impact on the optimization (I’m using automatic_optimization=False, by the way).
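The exact snippet from this comment is not preserved here; below is a hypothetical sketch of such a bypass, assuming manual optimization with two optimizers. Names such as self.discriminator_loss and self.generator_loss are illustrative placeholders, not from the comment.

```python
def training_step(self, batch, batch_idx):
    # Manual optimization: fetch both optimizers instead of relying on the
    # automatic loop to consume a returned loss.
    opt_g, opt_d = self.optimizers()

    # Discriminator update (loss helper is a placeholder).
    d_loss = self.discriminator_loss(batch)
    opt_d.zero_grad()
    self.manual_backward(d_loss)
    opt_d.step()

    # Generator update.
    g_loss = self.generator_loss(batch)
    opt_g.zero_grad()
    self.manual_backward(g_loss)
    opt_g.step()

    self.log("d_loss", d_loss, prog_bar=True, on_step=True, on_epoch=True)
    self.log("g_loss", g_loss, prog_bar=True, on_step=True, on_epoch=True)

    # With automatic_optimization=False the returned value is not used for
    # optimization, so returning a detached tensor here only serves to keep
    # the step-level logging path from being skipped.
    return g_loss.detach()
```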
Dear @tobiascz,
Thanks for noticing this wrong behaviour. I will look into it asap.
Best regards, Thomas Chaton.
@tobiascz I believe it isn’t a common issue, but from a design perspective, is it the desired behavior? I would expect the logs to work independently of whether the user returned the loss or not; however, a user warning or error should be raised in case the training step returns None.