ai-imu-dr: train parameters size mismatch

I followed the testing steps but hit the following error. It seems that the saved model parameters do not match the model definition.

/data/github_code/ai-imu-dr/src/main_kitti.py in launch(args)
     29 
     30     if args.test_filter:
---> 31         test_filter(args, dataset)
     32 
     33     if args.results_filter:

/data/github_code/ai-imu-dr/src/main_kitti.py in test_filter(args, dataset)
    427     from IPython import embed; embed()
    428 
--> 429     torch_iekf.load(args, dataset)
    430     iekf.set_learned_covariance(torch_iekf)
    431 

/data/github_code/ai-imu-dr/src/utils_torch_filter.py in load(self, args, dataset)
    461         if os.path.isfile(path_iekf):
    462             mondict = torch.load(path_iekf)
--> 463             self.load_state_dict(mondict)
    464             cprint("IEKF nets loaded", 'green')
    465         else:

~/miniconda3/envs/dfvo/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    775         if len(error_msgs) > 0:
    776             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 777                                self.__class__.__name__, "\n\t".join(error_msgs)))
    778         return _IncompatibleKeys(missing_keys, unexpected_keys)
    779 

RuntimeError: Error(s) in loading state_dict for TORCHIEKF:
        Unexpected key(s) in state_dict: "mes_net.cov_net.8.weight", "mes_net.cov_net.8.bias", "mes_net.cov_net.12.weight", "mes_net.cov_net.12.bias", "mes_net.cov_net.16.weight", "mes_net.cov_net.16.bias". 
        size mismatch for mes_net.cov_net.4.weight: copying a param with shape torch.Size([64, 32, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 5]).
        size mismatch for mes_net.cov_net.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).

torch version

torch                              1.1.0     
torchvision                        0.3.0
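
The mismatch reported above can also be listed programmatically before calling load_state_dict. Below is a minimal sketch, not code from the repository: `diff_state_dicts` is a hypothetical helper that accepts either tensors or plain shape tuples as values, and the usage comment assumes the `path_iekf` and `torch_iekf` names seen in the traceback.

```python
def shape_of(value):
    """Return a shape tuple for a tensor-like object, or the value itself if it is already a shape."""
    return tuple(getattr(value, "shape", value))


def diff_state_dicts(checkpoint, model_state):
    """Compare two name -> tensor (or name -> shape) mappings.

    Returns (unexpected, missing, mismatched), mirroring the three kinds of
    problems that a strict load_state_dict call reports.
    """
    unexpected = sorted(k for k in checkpoint if k not in model_state)
    missing = sorted(k for k in model_state if k not in checkpoint)
    mismatched = {
        k: (shape_of(checkpoint[k]), shape_of(model_state[k]))
        for k in checkpoint
        if k in model_state and shape_of(checkpoint[k]) != shape_of(model_state[k])
    }
    return unexpected, missing, mismatched


# Hypothetical usage with the objects from the traceback (requires torch):
#   checkpoint = torch.load(path_iekf, map_location="cpu")
#   unexpected, missing, mismatched = diff_state_dicts(checkpoint, torch_iekf.state_dict())
#   for name, (saved, current) in mismatched.items():
#       print(name, "checkpoint:", saved, "model:", current)
```

Running this once against the shipped file and once against a freshly trained one shows whether the two files describe the same network.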

About this issue

State: open
Created 3 years ago
Reactions: 2
Comments: 37

Most upvoted comments

Hi @scott81321, could you please describe in more detail what modifications you made in main_kitti.py to get the best results? Also, you mentioned "the data given to me": do you mean the dataset provided by the author or your own dataset?

The data is proprietary and I cannot tell you where it came from. It is not OXTS data; that much I can tell you. The IMU sensor is not as high quality. As I said, to get the best results I had to change the noise covariances, i.e. the variables starting with cov_ in the Python files I mentioned. I cannot and will not tell you what settings I used, only point out that I had to increase them. To find the best results, I ran many simulations on the same data until I found a range that worked well.

scott81321 on Nov 22, 2022

Hi guys. As far as I can tell, there is a format mismatch between the file iekfnets.p and what the CNN definition expects. Notice that Brossard's default is test mode, not train mode. I saw discrepancies between the noise covariance values in his thesis and the ones he encoded for the OXTS data files of his test data. This suggests to me that he hardwired these numbers to get the best results on his test cases and pragmatically set the training aspect aside. These noise covariances are the initial ones in main_kitti.py and, less importantly, in utils_numpy_filter.py. I had to modify the ones in main_kitti.py to get the best results for the data given to me.

So I would like to ask all of you: what does iekfnets.p contain? Is it only noise covariances? If so, which ones?
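
One way to answer this is to open the file and look. The traceback shows that iekfnets.p is read with torch.load and passed to load_state_dict, so it should be a mapping from parameter names to tensors. Below is a minimal sketch that groups the entries by their top-level module name and counts the stored values; `summarize_checkpoint` is a hypothetical helper, not part of the repository, and it also accepts plain shape tuples in place of tensors.

```python
from collections import defaultdict


def numel(shape):
    """Number of scalar values in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def summarize_checkpoint(state):
    """Group name -> tensor (or name -> shape) entries by their first name component."""
    groups = defaultdict(list)
    for name, value in state.items():
        shape = tuple(getattr(value, "shape", value))
        groups[name.split(".")[0]].append((name, shape))
    return {
        prefix: {
            "entries": entries,
            "num_values": sum(numel(shape) for _, shape in entries),
        }
        for prefix, entries in groups.items()
    }


# Hypothetical usage (requires torch; the path is whatever path_iekf points to):
#   state = torch.load(path_iekf, map_location="cpu")
#   for prefix, info in summarize_checkpoint(state).items():
#       print(prefix, info["num_values"])
#       for name, shape in info["entries"]:
#           print("   ", name, shape)
```

Any group other than mes_net in the output would point to parameters beyond the measurement covariance network, which is the only part named in the error message.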

scott81321 on Nov 20, 2022

@nothing371442 didn't you get any errors while training, as mentioned in #72? Did you make any changes to get the train option (set to 1) working on the existing dataset provided by the author? Could you help me out with it?

Did you delete the iekfnets.p file first? I deleted the iekfnets.p file first and then ran with the train option set to 1, which generates a new .p file.

Yes, I deleted this file and set the train option to 1, but it gives me an error similar to #72.

Hi, did you download the provided delta_p.p file first?
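
For the step of removing iekfnets.p before training, renaming the file is a safer variant than deleting it, since the shipped weights stay available for comparison. A minimal sketch, assuming only that training writes a new file when none exists at that path; `set_aside_checkpoint` and the `.bak` suffix are hypothetical choices, not something the repository provides.

```python
from pathlib import Path


def set_aside_checkpoint(path, suffix=".bak"):
    """Rename an existing checkpoint so that training can write a fresh one.

    Returns the backup path, or None if there was nothing to move.
    An older backup with the same name is overwritten.
    """
    path = Path(path)
    if not path.is_file():
        return None
    backup = path.with_name(path.name + suffix)
    path.replace(backup)  # overwrites an existing backup on all platforms
    return backup


# Hypothetical usage before launching training:
#   set_aside_checkpoint(path_iekf)
```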

nothing371442 on Nov 21, 2022

Nice! What did you change? Running the model which is provided by the author does not work…

lumyus on Jan 11, 2022

I also get something very similar:

RuntimeError: Error(s) in loading state_dict for TORCHIEKF:
        Unexpected key(s) in state_dict: "mes_net.cov_net.8.weight", "mes_net.cov_net.8.bias", "mes_net.cov_net.12.weight", "mes_net.cov_net.12.bias", "mes_net.cov_net.16.weight", "mes_net.cov_net.16.bias".
        size mismatch for mes_net.cov_net.4.weight: copying a param with shape torch.Size([64, 32, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 5]).
        size mismatch for mes_net.cov_net.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).

Part of the problem goes away if you adjust the layer sizes in mesnet, but so far I have not found the size adjustments that make it go away completely.

The error occurs only if path_iekf finds the file …/temp/iekfnets.p. If the file is not there, the program carries on and I still get the beautiful plot shown on GitHub, namely the route segment of file 2011_09_30_drive_0028_extract.
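
Rather than guessing the size adjustments, the layer sizes the checkpoint expects can be read off its weight shapes. The sketch below rests on an assumption that is not confirmed in this thread: that the 3-D weights under mes_net.cov_net belong to torch.nn.Conv1d layers with groups=1, whose weight shape is (out_channels, in_channels, kernel_size). `conv1d_layers` is a hypothetical helper and accepts tensors or plain shape tuples.

```python
def conv1d_layers(state, prefix="mes_net.cov_net."):
    """List the Conv1d-like layers under `prefix`, sorted by their index in the Sequential.

    Assumes every 3-D ".weight" entry is a Conv1d weight of shape
    (out_channels, in_channels, kernel_size) with groups=1.
    """
    layers = []
    for name, value in state.items():
        shape = tuple(getattr(value, "shape", value))
        if not (name.startswith(prefix) and name.endswith(".weight")):
            continue
        index = name[len(prefix):-len(".weight")]
        if index.isdigit() and len(shape) == 3:
            out_channels, in_channels, kernel_size = shape
            layers.append({
                "index": int(index),
                "in_channels": in_channels,
                "out_channels": out_channels,
                "kernel_size": kernel_size,
            })
    return sorted(layers, key=lambda layer: layer["index"])


# Hypothetical usage (requires torch):
#   for layer in conv1d_layers(torch.load(path_iekf, map_location="cpu")):
#       print(layer)
```

Under that assumption, the shape [64, 32, 5] reported for mes_net.cov_net.4.weight reads as 32 input channels, 64 output channels and a kernel size of 5, and the unexpected keys at indices 8, 12 and 16 suggest the checkpoint was saved from a network with more layers than the current definition.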

scott81321 on Jan 10, 2022