pytorch-lightning: Trainer.on_gpu incorrectly set to False when specifying `gpus=0`
🐛 Bug
When creating a trainer with the argument gpus=0, the field on_gpu is always set to False, even on machines where CUDA is available.
The existing logic for on_gpu is:
self.on_gpu = True if (gpus and torch.cuda.is_available()) else False
This is buggy because 0 is falsy in Python. It should probably be:
self.on_gpu = gpus is not None and torch.cuda.is_available()
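A quick standalone sketch of the truthiness issue (not Lightning's actual code path, just the two expressions evaluated in isolation with gpus = 0):

```python
import torch

gpus = 0

# Existing logic: `0 and <anything>` short-circuits to 0, which is falsy,
# so on_gpu ends up False even when CUDA is available.
on_gpu_old = True if (gpus and torch.cuda.is_available()) else False

# Proposed logic from the report: only gpus=None is treated as "no GPUs",
# so gpus=0 no longer silently disables the GPU branch.
on_gpu_new = gpus is not None and torch.cuda.is_available()

print(f"CUDA available:   {torch.cuda.is_available()}")
print(f"existing logic -> on_gpu={on_gpu_old}")
print(f"proposed logic -> on_gpu={on_gpu_new}")
```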
To Reproduce
trainer = trainer.Trainer(gpus=0, ...)
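A slightly fuller reproduction sketch, assuming the package is imported as pytorch_lightning and that the on_gpu attribute is readable on the constructed Trainer in the affected version:

```python
import torch
import pytorch_lightning as pl

trainer = pl.Trainer(gpus=0)

# On a CUDA machine this was reported to print False for on_gpu,
# even though torch.cuda.is_available() is True.
print(f"CUDA available:  {torch.cuda.is_available()}")
print(f"trainer.on_gpu:  {trainer.on_gpu}")
```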
@rohitgr7 @pgeez how about this clarification in the docs in the section that @rohitgr7 linked:
Would that avoid the misunderstanding you had?
@eladar Just checked, looks like it's fixed on master. I get GPU available: True, used: False with Trainer(gpus=0).

When running the trainer script without any flag, this log message wrongly appears even though the training does execute on the CPU. This is because self.on_gpu is True while self.single_gpu is False. (Windows 2004, Python 3.6.9 Anaconda, PyTorch 1.3.1)
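A hypothetical way to inspect the two flags this comment refers to, assuming both attributes are readable on the constructed Trainer in that version:

```python
import pytorch_lightning as pl

# No gpus flag at all, matching the scenario described above.
trainer = pl.Trainer()

print(f"trainer.on_gpu:     {trainer.on_gpu}")      # reported as True
print(f"trainer.single_gpu: {trainer.single_gpu}")  # reported as False
```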
I don’t think this is a documentation error; I think this is a bug. In other words, specifying gpus=0 should be perfectly valid and supported, because device 0 is a valid device.
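For context, the point about device 0 seems to be that index 0 names a real CUDA device in plain PyTorch; a minimal sketch of that reading:

```python
import torch

if torch.cuda.is_available():
    dev = torch.device("cuda", 0)        # index 0 is a valid device
    x = torch.ones(2, 2, device=dev)     # tensors can be placed on it
    print(x.device)                      # e.g. cuda:0
else:
    print("No CUDA device on this machine")
```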