pytorch-lightning: Trainer is setting parameters with requires_grad=False to requires_grad=True (bug)

🐛 Bug

When training a model in which some parameters have requires_grad=False, the Trainer sets requires_grad=True on these parameters and then updates their values. The bug appears to originate in the TrainerTrainLoopMixin code.

To Reproduce

Steps to reproduce the behavior:

Create a model with some parameters which have requires_grad=False
Fit the model using the Trainer
Check whether the parameters that were set with `requires_grad=False` have changed.

Code sample (to reproduce the bug)

import numpy as np
import torch
import pytorch_lightning as pl

# Make toy dataset
features = torch.from_numpy(np.asarray([[0],[0],[0],[1],[1],[1]])).float()
targets = torch.from_numpy(np.asarray([0,0,0,1,1,1]))
train = torch.utils.data.TensorDataset(features, targets)
train_loader = torch.utils.data.DataLoader(train, batch_size=2, shuffle=True)


# Define lightning model
class CoolSystem(pl.LightningModule):

    def __init__(self):
        super(CoolSystem, self).__init__()
        self.l1 = torch.nn.Linear(1, 10)
        self.l2 = torch.nn.Linear(10, 2)
        for param in self.l2.parameters():
            param.requires_grad = False
        self.loss_func = torch.nn.CrossEntropyLoss()
   
    def forward(self, x):
        return self.l2(torch.relu(self.l1(x)))

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.forward(x)
        loss = self.loss_func(y_hat, y)
        tensorboard_logs = {'train_loss': loss}
        return {'loss': loss, 'log': tensorboard_logs}

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.02)

    @pl.data_loader
    def train_dataloader(self):
        return train_loader

# Run the lightning model (check parameter before and after training)

coolsystem = CoolSystem()
print(list(coolsystem.parameters())[3])
trainer = pl.Trainer(min_epochs=10, max_epochs=10, logger=False)    
trainer.fit(coolsystem)
print(list(coolsystem.parameters())[3])

Expected behavior

Expected

The parameters with requires_grad == False should not change during training.

Actual

The parameter printed before training has requires_grad == False. After training with the Trainer, the same parameter has requires_grad == True and its values have changed.
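Comparing printed tensors by eye is error-prone, so here is a minimal sketch of a programmatic check in plain PyTorch. The helpers `snapshot_frozen` and `check_frozen` are hypothetical names introduced for illustration; they are not part of pytorch-lightning. The stand-in `Sequential` model only mirrors the layer shapes of the sample above.

```python
import torch


def snapshot_frozen(model):
    """Clone every parameter that currently has requires_grad=False."""
    return {
        name: param.detach().clone()
        for name, param in model.named_parameters()
        if not param.requires_grad
    }


def check_frozen(model, snapshot):
    """Return names of snapshotted parameters whose flag or values changed."""
    changed = []
    for name, param in model.named_parameters():
        if name not in snapshot:
            continue
        if param.requires_grad or not torch.equal(param.detach(), snapshot[name]):
            changed.append(name)
    return changed


# Stand-in model with the same layer shapes as CoolSystem (assumption for the demo).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 10), torch.nn.ReLU(), torch.nn.Linear(10, 2)
)
for param in model[2].parameters():
    param.requires_grad = False

before = snapshot_frozen(model)
# trainer.fit(...) would run here in the real reproduction.
print(check_frozen(model, before))  # [] here, because nothing ran in between
```

In the reproduction above, the snapshot would be taken on `coolsystem` before `trainer.fit(coolsystem)` and checked afterwards; a non-empty list would indicate that frozen parameters were touched.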

Environment

PyTorch Version 1.3.1
Linux
PyTorch installed with pip
Python 3.7.1
pytorch-lightning 0.6.0

Where I think the issue is!

Here is the code snippet from training_loop.py that I think is causing the issue:

class TrainerTrainLoopMixin(ABC):
            .
            .
            .
    def run_training_batch(self, batch, batch_idx):
            .
            .
            .
            # call training_step once per optimizer
            for opt_idx, optimizer in enumerate(self.optimizers):
                # make sure only the gradients of the current optimizer's paramaters are calculated
                # in the training step to prevent dangling gradients in multiple-optimizer setup.
                for param in self.get_model().parameters():
                    param.requires_grad = False
                for group in optimizer.param_groups:
                    for param in group['params']:
                        param.requires_grad = True

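As you can see, every parameter held by the optimizer is set to param.requires_grad = True on each training batch. Because configure_optimizers passes self.parameters() to the optimizer, that includes the parameters I froze!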
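The effect of that loop can be sketched without Lightning. The function `toggle_like_quoted_loop` below is a hypothetical stand-in that copies the two loops from the snippet; it is not the library code itself. The sketch also shows one possible workaround that follows from the loop's logic: give the optimizer only the parameters that are trainable, so the loop never switches the frozen ones back on. Whether this fully avoids the problem in pytorch-lightning 0.6.0 is an assumption that has not been verified here.

```python
import torch


def toggle_like_quoted_loop(model, optimizer):
    # Mirrors the quoted loop: turn every flag off, then turn on the
    # flags of whatever parameters the optimizer holds.
    for param in model.parameters():
        param.requires_grad = False
    for group in optimizer.param_groups:
        for param in group['params']:
            param.requires_grad = True


def build_model():
    # Same layer shapes as the sample, with the last layer frozen.
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 10), torch.nn.ReLU(), torch.nn.Linear(10, 2)
    )
    for param in model[2].parameters():
        param.requires_grad = False
    return model


# Optimizer over all parameters, as in configure_optimizers in the sample.
model_all = build_model()
opt_all = torch.optim.Adam(model_all.parameters(), lr=0.02)
toggle_like_quoted_loop(model_all, opt_all)
print([p.requires_grad for p in model_all[2].parameters()])  # [True, True]

# Optimizer over trainable parameters only.
model_filtered = build_model()
trainable = [p for p in model_filtered.parameters() if p.requires_grad]
opt_filtered = torch.optim.Adam(trainable, lr=0.02)
toggle_like_quoted_loop(model_filtered, opt_filtered)
print([p.requires_grad for p in model_filtered[2].parameters()])  # [False, False]
```

In the sample model this would correspond to building the Adam optimizer in `configure_optimizers` from the filtered parameter list instead of `self.parameters()`.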

About this issue

State: closed
Created 4 years ago
Comments: 16 (11 by maintainers)

Most upvoted comments

I did this in regular Pytorch for a recent paper. You bring up some good points though, I did not freeze any parameters so that example may not apply. I do think that if this is specifically a GAN issue though, maybe there is a GAN specific solution? Maybe not though.

Anyways, I appreciate the quick response and the fix should be appropriate for my current work!

colehurwitz on Jan 21, 2020