pytorch-lightning: `transfer_batch_to_device` doesn't work under DP
🐛 Bug
This is discussed under #1756 and I’m opening a separate issue here for visibility.
In the training loop, for DP/DDP/DDP2, we do not move the data to devices ourselves, but instead rely on the default scatter to transfer the data. As a result, `transfer_batch_to_device` is never called.
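For context, here is a minimal sketch of the kind of override users expect to be honored. The hook signature shown (`batch, device`) is roughly the one Lightning exposed at the time, but it has varied across versions; the class and batch layout are purely illustrative.

```python
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    # Hypothetical override of the hook discussed in this issue.
    def transfer_batch_to_device(self, batch, device):
        # Custom handling for a dict batch (illustrative example).
        # Under DP/DDP/DDP2 the default scatter moves the data instead,
        # so this method is silently never invoked -- the bug reported here.
        if isinstance(batch, dict):
            return {k: v.to(device) for k, v in batch.items()}
        # Fall back to Lightning's default handling for everything else.
        return super().transfer_batch_to_device(batch, device)
```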
Expected behavior
Ideally, `transfer_batch_to_device` should work in all settings. If it is not possible to support this under DP at all, there should at least be a runtime warning and/or a note in the documentation.
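Until the hook is honored under DP, one possible user-side workaround (an assumption on my part, not an officially documented pattern) is to move the batch manually at the top of `training_step`. The dict batch with `"x"`/`"y"` keys is a hypothetical example.

```python
import torch
import pytorch_lightning as pl


class WorkaroundModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        # Under DP the default scatter moves plain tensors, but any custom
        # logic in transfer_batch_to_device is skipped, so fields that need
        # special handling are moved here by hand.
        device = next(self.parameters()).device
        x = batch["x"].to(device)
        y = batch["y"].to(device)
        loss = torch.nn.functional.mse_loss(self(x), y)
        return {"loss": loss}
```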
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 19 (15 by maintainers)
Not sure whether the earlier label removal counts as new activity for the stale bot, so I'm commenting here to indicate that this is not stale and still needs to be addressed.
This has been updated in the recent refactors, so it won't be a problem now. You can try master. The new version will be officially released next week, I guess.
@rubencart This seems to be unrelated, and should not be a problem for DDP. This part of the code has undergone a lot of changes lately. If you don't mind, would you send us a repro example in a new issue? If you ping me there and it is fixable, I will fix it.
@edenlightning yes. Still not supported for DP AFAIK.