pytorch-lightning: Trainer.on_gpu incorrectly set to False when specifying `gpus=0`

🐛 Bug

When creating a trainer with the argument gpus=0, the field on_gpu is always set to False, even on machines where CUDA is available.

The existing logic for on_gpu is:

self.on_gpu = True if (gpus and torch.cuda.is_available()) else False

This is buggy because 0 is “falsy” in Python, so gpus=0 fails the truth test exactly as gpus=None does. It should probably be:

self.on_gpu = gpus is not None and torch.cuda.is_available()
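
A minimal standalone sketch of the two expressions side by side (the function names are illustrative, not Lightning’s; assumes a CUDA machine):

import torch

def on_gpu_buggy(gpus):
    # 0 is falsy, so `gpus and ...` short-circuits to False when gpus=0
    return True if (gpus and torch.cuda.is_available()) else False

def on_gpu_fixed(gpus):
    # an explicit None check treats gpus=0 as a deliberate value
    return gpus is not None and torch.cuda.is_available()

# On a CUDA machine:
#   on_gpu_buggy(0) -> False
#   on_gpu_fixed(0) -> True (under the semantics proposed above)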

To Reproduce

trainer = trainer.Trainer(gpus=0, ...)
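
A slightly fuller repro sketch (assumes a Lightning version from that era, where Trainer exposed on_gpu, and a machine with CUDA available):

from pytorch_lightning import Trainer

trainer = Trainer(gpus=0)
# The report expects True here on a CUDA machine; the falsy-zero
# check above leaves it False instead.
print(trainer.on_gpu)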

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 20 (10 by maintainers)

Most upvoted comments

@rohitgr7 @pgeez how about this clarification in the docs in the section that @rohitgr7 linked:

  • Number of GPUs to train on (int)
  • or which GPUs to train on (list)
  • can also handle strings (see the sketch below)
# default used by the Trainer (i.e. train on CPU)
trainer = Trainer(gpus=None)
# equivalent
trainer = Trainer(gpus=0)
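
For concreteness, a sketch of the argument forms that clarification describes (exact accepted string formats may vary across Lightning versions; assumes a machine with at least two visible GPUs):

from pytorch_lightning import Trainer

Trainer(gpus=2)        # int: train on 2 GPUs
Trainer(gpus=[0, 1])   # list: train on devices 0 and 1
Trainer(gpus="0,1")    # string: parsed to device indices
Trainer(gpus=0)        # zero GPUs, i.e. train on CPU (same as gpus=None)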

Would that avoid the misunderstanding you had?

@eladar Just checked; it looks like it’s fixed on master. I get “GPU available: True, used: False” with Trainer(gpus=0).

When running the trainer script without any flags, this log message wrongly appears:

“GPU available: True, used: True”

even though training actually executes on the CPU.

This is because self.on_gpu is True while self.single_gpu is False.

(Windows 2004, Python 3.6.9 via Anaconda, PyTorch 1.3.1)
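
A toy sketch of the mismatch being described (illustrative only, not Lightning’s actual logging code):

import torch

on_gpu = torch.cuda.is_available()  # True on this machine, even though no GPU was requested
single_gpu = False                  # nothing is actually placed on a GPU

# A log line keyed on on_gpu alone reports “used: True”:
print(f"GPU available: {torch.cuda.is_available()}, used: {on_gpu}")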

I don’t think this is a documentation error; I think this is a bug. In other words, specifying gpus=0 should be perfectly valid and supported, because device 0 is a valid device.
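
The disagreement comes down to whether an integer is read as a count or as a device index. Under the convention the maintainers describe above, an int is a count, and a specific device is selected with a list; a brief sketch (the second line assumes a CUDA machine):

from pytorch_lightning import Trainer

Trainer(gpus=0)    # int read as a count: zero GPUs, train on CPU
Trainer(gpus=[0])  # list read as indices: train on CUDA device 0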