crnn.pytorch: Problems when fine-tuning the model. RuntimeError: inconsistent tensor size

Hello,

I'm stuck with fine-tuning.

1) First of all, to fine-tune the model you have to set --nh=256, otherwise it will not work and you'll get this error:

```
loading pretrained model from /home/ahmed/Downloads/crnn.pytorch-master/data/crnn.pth
Traceback (most recent call last):
  File "crnn_main.py", line 98, in <module>
    crnn.load_state_dict(torch.load(opt.crnn))
  File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/torch/nn/modules/module.py", line 335, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
```

This is because the pretrained model was trained with --nh=256, not the 100 that is set by default. But when fine-tuning we should obviously be able to change this parameter, so I find it strange that it doesn't work.
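So the fine-tuning invocation has to spell the hidden size out explicitly. A sketch of the corrected command (the paths are placeholders, and the flags assume crnn_main.py's options as used later in this thread):

```shell
python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" \
    --cuda --adadelta --nh 256 --crnn="data/crnn.pth"
```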

  2. I tried different alphabet lengths while fine-tuning; by default nb_classes = 37 with alphabet '0123456789abcdefghijklmnopqrstuvwxyz'.

I tried the following:

A) Add one character, say Z (or another char such as , . /): '0123456789abcdefghijklmnopqrstuvwxyzZ'. I got the same error:

```
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
```

B) Remove one char and add another (remove z, add /): '0123456789abcdefghijklmnopqrstuvwxy/'. I get the same error:

```
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
```

C) Set the alphabet to digits only: '0123456789'. Same error again:

```
RuntimeError: inconsistent tensor size at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorCopy.c:51
```

  3. Training a new model from scratch, with a variable-length alphabet and any --nh, works perfectly.
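The alphabet experiments above are consistent with how the classifier is sized: the number of output classes is the alphabet length plus one (the extra slot is the CTC blank), so changing the alphabet length changes the shape of the final linear layer and the checkpoint tensor no longer fits. A small sketch of that arithmetic (the `2 * nh` input width assumes the concatenated bidirectional LSTM features used in this repository; treat it as an assumption):

```python
def classifier_shape(alphabet, nh=256):
    # nclass = len(alphabet) + 1: one output per character plus the
    # CTC blank.  The final linear layer maps the bidirectional LSTM
    # features (2 * nh) to nclass, so its weight is (nclass, 2 * nh).
    nclass = len(alphabet) + 1
    return (nclass, 2 * nh)

print(classifier_shape('0123456789abcdefghijklmnopqrstuvwxyz'))   # (37, 512) -- the pretrained shape
print(classifier_shape('0123456789abcdefghijklmnopqrstuvwxyzZ'))  # (38, 512) -- no longer fits
print(classifier_shape('0123456789'))                             # (11, 512) -- no longer fits
```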

Do you have any idea how to solve this fine-tuning problem, so that the alphabet length and the architecture can vary? Thanks a lot.
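One possible workaround (a sketch, not code from the repository: it assumes the checkpoint and `crnn.state_dict()` are plain name-to-tensor dicts, and uses shape tuples here to stand in for tensors) is to copy only the pretrained parameters whose shapes still match the new model, so whatever was resized for the new alphabet keeps its fresh random initialisation:

```python
def compatible_params(pretrained_shapes, model_shapes):
    # Keep only parameters whose name AND shape match the new model;
    # anything resized for the new alphabet (the classifier) is left
    # out and stays randomly initialised.
    return {k: v for k, v in pretrained_shapes.items()
            if model_shapes.get(k) == v}

# Hypothetical layer names; nclass drops from 37 (36 chars + CTC
# blank) to 11 after switching to a digits-only alphabet.
ckpt  = {'cnn.conv0.weight': (64, 1, 3, 3),
         'rnn.1.embedding.weight': (37, 512),
         'rnn.1.embedding.bias': (37,)}
model = {'cnn.conv0.weight': (64, 1, 3, 3),
         'rnn.1.embedding.weight': (11, 512),
         'rnn.1.embedding.bias': (11,)}
print(sorted(compatible_params(ckpt, model)))  # ['cnn.conv0.weight']
```

With real tensors the idea would be to `update()` the model's state dict with the surviving entries and then call `load_state_dict` on the result.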

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Comments: 16 (1 by maintainers)

Most upvoted comments

Hi, @wulivicte @meijieru

What is the difference between transfer learning with the command from @meijieru, as follows:

```shell
python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" --cuda --adadelta --experiment="sotr_model/" --crnn="data/crnn.pth"
```

and the approach of @wulivicte, where you add:

```python
pre_trainmodel = torch.load(opt.crnn)
model_dict = crnn.state_dict()
# copy every pretrained parameter except the classifier layer's
for k, v in model_dict.items():
    if not (k == 'rnn.1.embedding.weight' or k == 'rnn.1.embedding.bias'):
        model_dict[k] = pre_trainmodel[k]

crnn.load_state_dict(model_dict)
print(crnn)
```

and then:

```shell
python2 crnn_main.py --trainroot="train_data/" --valroot="valid_data/" --cuda --adadelta --experiment="sotr_model/" --crnn="data/crnn.pth"
```

?

Thank you
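As far as I can tell, the difference is this: passing the checkpoint straight to `load_state_dict` copies every tensor into the model and raises as soon as the classifier shapes disagree, while the quoted loop skips the 'rnn.1.embedding.*' keys, so those layers keep their fresh initialisation and can have any size. A stand-in sketch with shape tuples instead of tensors (layer names hypothetical):

```python
def strict_load(model_shapes, ckpt_shapes):
    # Mimic plain load_state_dict(): every checkpoint tensor is copied
    # into the model, and any size mismatch raises -- this is the
    # "inconsistent tensor size" error from the traceback above.
    for name, shape in ckpt_shapes.items():
        if model_shapes[name] != shape:
            raise RuntimeError('inconsistent tensor size: ' + name)
        model_shapes[name] = shape

def selective_load(model_shapes, ckpt_shapes, skip):
    # Mimic the quoted loop: parameters in `skip` keep their fresh
    # initialisation, everything else comes from the checkpoint.
    for name, shape in ckpt_shapes.items():
        if name not in skip:
            model_shapes[name] = shape

model = {'cnn.conv0.weight': (64, 1, 3, 3),
         'rnn.1.embedding.weight': (11, 512)}   # digits-only alphabet
ckpt  = {'cnn.conv0.weight': (64, 1, 3, 3),
         'rnn.1.embedding.weight': (37, 512)}   # pretrained, 37 classes

selective_load(model, ckpt, skip={'rnn.1.embedding.weight'})  # works
# strict_load(model, ckpt) would raise RuntimeError here
```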