pytorch-lightning: Trainer.on_gpu incorrectly set to False when specifying `gpus=0`

🐛 Bug

When creating a trainer with the argument gpus=0, the field on_gpu is always set to False, even on machines where CUDA is available.

The existing logic for on_gpu is:

self.on_gpu = True if (gpus and torch.cuda.is_available()) else False

This is buggy because 0 is “falsy” in Python, so gpus=0 fails the truth test exactly as gpus=None does. It should probably be:

self.on_gpu = gpus is not None and torch.cuda.is_available()
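
A minimal standalone sketch of the two expressions side by side (the function names are illustrative, not Lightning’s; assumes a CUDA machine):

import torch

def on_gpu_buggy(gpus):
    # 0 is falsy, so `gpus and ...` short-circuits to False when gpus=0
    return True if (gpus and torch.cuda.is_available()) else False

def on_gpu_fixed(gpus):
    # an explicit None check treats gpus=0 as a deliberate value
    return gpus is not None and torch.cuda.is_available()

# On a CUDA machine:
#   on_gpu_buggy(0) -> False
#   on_gpu_fixed(0) -> True (under the semantics proposed above)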

To Reproduce

trainer = trainer.Trainer(gpus=0, ...)
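
A slightly fuller repro sketch (assumes a Lightning version from that era, where Trainer exposed on_gpu, and a machine with CUDA available):

from pytorch_lightning import Trainer

trainer = Trainer(gpus=0)
# The report expects True here on a CUDA machine; the falsy-zero
# check above leaves it False instead.
print(trainer.on_gpu)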

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 20 (10 by maintainers)

Most upvoted comments

@rohitgr7 @pgeez how about this clarification in the docs in the section that @rohitgr7 linked:

  • Number of GPUs to train on (int)
  • or which GPUs to train on (list)
  • can also handle strings (see the sketch below)
# default used by the Trainer (i.e. train on CPU)
trainer = Trainer(gpus=None)
# equivalent
trainer = Trainer(gpus=0)
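
For concreteness, a sketch of the argument forms that clarification describes (exact accepted string formats may vary across Lightning versions; assumes a machine with at least two visible GPUs):

from pytorch_lightning import Trainer

Trainer(gpus=2)        # int: train on 2 GPUs
Trainer(gpus=[0, 1])   # list: train on devices 0 and 1
Trainer(gpus="0,1")    # string: parsed to device indices
Trainer(gpus=0)        # zero GPUs, i.e. train on CPU (same as gpus=None)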

Would that avoid the misunderstanding you had?

@eladar Just checked; it looks like it’s fixed on master. I get “GPU available: True, used: False” with Trainer(gpus=0).

When running the trainer script without any flags, this log message wrongly appears:

“GPU available: True, used: True”

even though training actually executes on the CPU.

This is because self.on_gpu is True while self.single_gpu is False.

(Windows 2004, Python 3.6.9 via Anaconda, PyTorch 1.3.1)
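
A toy sketch of the mismatch being described (illustrative only, not Lightning’s actual logging code):

import torch

on_gpu = torch.cuda.is_available()  # True on this machine, even though no GPU was requested
single_gpu = False                  # nothing is actually placed on a GPU

# A log line keyed on on_gpu alone reports “used: True”:
print(f"GPU available: {torch.cuda.is_available()}, used: {on_gpu}")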

I don’t think this is a documentation error; I think this is a bug. In other words, specifying gpus=0 should be perfectly valid and supported, because device 0 is a valid device.
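
The disagreement comes down to whether an integer is read as a count or as a device index. Under the convention the maintainers describe above, an int is a count, and a specific device is selected with a list; a brief sketch (the second line assumes a CUDA machine):

from pytorch_lightning import Trainer

Trainer(gpus=0)    # int read as a count: zero GPUs, train on CPU
Trainer(gpus=[0])  # list read as indices: train on CUDA device 0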