chainer: ExponentialShift does not affect some links after the links' lr is manually modified

  • Conditions
    • Chainer version: 5.0.0b1
    • CuPy version: 5.0.0b1
    • OS/Platform: Ubuntu 16.04
    • CUDA/cuDNN version: 9.0

Problem

To reproduce Caffe's lr_mult, I multiplied the hyperparameter lr of some links, and I use ExponentialShift to scale lr across the whole model. However, the lr of the manually modified link does not change. With the code below, conv1's lr does not change and remains 0.03; I expected it to become 0.15. Is this the expected behavior of optimizers in Chainer?

import chainer


class ExampleModel(chainer.Chain):

    def __init__(self):
        super(ExampleModel, self).__init__()
        with self.init_scope():
            self.conv1 = chainer.links.Convolution2D(3, 10, 1, 1)
            self.conv2 = chainer.links.Convolution2D(10, 10, 1, 1)


def main():
    model = ExampleModel()
    optimizer = chainer.optimizers.MomentumSGD(lr=0.01)
    optimizer.setup(model)
    # Manually scale conv1's per-parameter lr (emulating Caffe's lr_mult).
    model.conv1.W.update_rule.hyperparam.lr *= 3
    print('Done: conv1 lr * 3')
    print('  conv1 lr: {}'.format(model.conv1.W.update_rule.hyperparam.lr))
    print('  conv2 lr: {}'.format(model.conv2.W.update_rule.hyperparam.lr))

    # Scale the global lr, as ExponentialShift would.
    optimizer.lr *= 5
    print('Done: optimizer.lr * 5')
    print('  conv1 lr: {}'.format(model.conv1.W.update_rule.hyperparam.lr))
    print('  conv2 lr: {}'.format(model.conv2.W.update_rule.hyperparam.lr))


if __name__ == '__main__':
    main()

Output

$ python example.py
Done: conv1 lr * 3
  conv1 lr: 0.03
  conv2 lr: 0.01
Done: optimizer.lr * 5
  conv1 lr: 0.03
  conv2 lr: 0.05

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Do not close this

Chainer currently does not have a way to emulate lr_mult. It would be good to add support for it.

As a workaround, you can use multiple ExponentialShift extensions to modify the lr of different parameter groups. For example, to use 2x lr for some parameters, you can write:

optimizer.setup(model)
...
trainer.extend(extensions.ExponentialShift('lr', rate))
twice_lr = chainer.optimizer.Hyperparameter(optimizer.hyperparam)
twice_lr.lr = optimizer.lr * 2
for param in params_for_which_you_want_to_do_lr_mult:
    param.update_rule.hyperparam = chainer.optimizer.Hyperparameter(twice_lr)
trainer.extend(extensions.ExponentialShift('lr', rate, optimizer=twice_lr),
               name='my_exponential_shift')  # name must differ from "ExponentialShift"
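
Another possible workaround is to re-derive the per-parameter lr from the shifted global lr inside a trainer extension, so that manually scaled parameters keep tracking `optimizer.lr`. A minimal sketch (assumption: `make_lr_mult_extension` and the `lr_mults` mapping are hypothetical helpers of mine, not Chainer API):

```python
def make_lr_mult_extension(optimizer, lr_mults):
    """Build a trainer extension that re-applies lr multipliers.

    lr_mults maps a parameter to its multiplier relative to optimizer.lr.
    (Hypothetical helper; not part of Chainer.)
    """
    def ext(trainer):
        for param, mult in lr_mults.items():
            # Overwrite the per-parameter lr from the current global lr,
            # so it follows ExponentialShift instead of sticking.
            param.update_rule.hyperparam.lr = optimizer.lr * mult
    return ext


# Usage sketch, assuming the usual trainer setup:
# trainer.extend(extensions.ExponentialShift('lr', rate))
# trainer.extend(make_lr_mult_extension(optimizer, {model.conv1.W: 3.0}),
#                trigger=(1, 'epoch'))
```

Note that the ordering relative to ExponentialShift matters: if this extension runs before the shift on a given trigger, the multiplied lr may lag the global lr by one trigger.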