I2L-MeshNet_RELEASE: Cannot reproduce training performance
Hi Gyeongsik,
I am working on reproducing the numbers reported in the paper. Train datasets: H36M, MuCo, COCO. Test dataset: 3DPW.
I am using PyTorch 1.8, Python 3.8, and CUDA 10.
I did two runs. Here is the performance of snapshot12.pth (the last checkpoint of the lixel stage) on the 3DPW dataset:
- Train Batch Size per GPU = 16, Number of GPUs = 4 (this is the default config)
MPJPE from lixel mesh: 96.23 mm
PA MPJPE from lixel mesh: 60.68 mm
- Train Batch Size per GPU = 24, Number of GPUs = 8 (bigger batch config)
MPJPE from lixel mesh: 96.37 mm
PA MPJPE from lixel mesh: 61.51 mm
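The two runs differ mainly in effective batch size (16 × 4 = 64 vs. 24 × 8 = 192), so one thing I am unsure about is whether the learning rate should be rescaled for the bigger run. Below is a minimal sketch of the linear LR scaling rule I would try, assuming the default schedule was tuned for an effective batch of 64; the function and parameter names, and the 1e-4 base LR, are placeholders for illustration, not the actual values/names in the repo's config.

```python
# Sketch only: linear LR scaling with effective batch size (Goyal et al., 2017).
# base_lr, ref_batch and the 1e-4 value are placeholders, not the repo's defaults.
def scaled_lr(base_lr, batch_per_gpu, num_gpus, ref_batch=64):
    effective_batch = batch_per_gpu * num_gpus
    return base_lr * effective_batch / ref_batch

print(scaled_lr(1e-4, 16, 4))  # default config: 16 x 4 = 64  -> LR unchanged
print(scaled_lr(1e-4, 24, 8))  # bigger config:  24 x 8 = 192 -> LR scaled x3
```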
I also trained the bigger batch config (run 2) for the param stage. Here is the performance of snapshot17.pth and snapshot15.pth (the best checkpoint) on the 3DPW dataset.
snapshot17.pth, param stage
MPJPE from lixel mesh: 95.85 mm
PA MPJPE from lixel mesh: 61.21 mm
MPJPE from param mesh: 98.11 mm
PA MPJPE from param mesh: 61.64 mm
snapshot15.pth, param stage
MPJPE from lixel mesh: 95.65 mm
PA MPJPE from lixel mesh: 60.97 mm
MPJPE from param mesh: 97.22 mm
PA MPJPE from param mesh: 60.82 mm
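In case part of the gap comes from the evaluation rather than the training, here is a minimal NumPy sketch of how MPJPE and PA-MPJPE are commonly computed (similarity/Procrustes alignment of the predicted joints onto the GT before measuring joint error). This is a generic implementation for sanity checking, not the repo's eval code.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error; pred/gt are (J, 3) arrays in mm."""
    return np.mean(np.linalg.norm(pred - gt, axis=-1))

def pa_mpjpe(pred, gt):
    """MPJPE after similarity (Procrustes) alignment of pred onto gt."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    x, y = pred - mu_p, gt - mu_g              # centered joints, (J, 3)
    U, s, Vt = np.linalg.svd(x.T @ y)          # SVD of 3x3 cross-covariance
    Z = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflection
    R = Vt.T @ Z @ U.T                         # optimal rotation
    scale = np.trace(Z @ np.diag(s)) / np.sum(x ** 2)        # optimal scale
    pred_aligned = scale * x @ R.T + mu_g      # rotate, scale, translate
    return mpjpe(pred_aligned, gt)

# Usage on dummy data (replace with real predicted / GT joints in mm)
pred = np.random.rand(17, 3) * 100
gt = np.random.rand(17, 3) * 100
print(mpjpe(pred, gt), pa_mpjpe(pred, gt))
```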
I am still waiting on the param stage of the default config and will edit this post once it finishes. However, the reported lixel MPJPE is 93.2 mm, and it looks unlikely that my runs will converge there. Any suggestions? Should I train longer?
Thank you, I would greatly appreciate your help.
About this issue
- State: open
- Created 3 years ago
- Comments: 16 (8 by maintainers)
We found that longer training is not necessary.
Sorry, I changed common/base.py. Now it will work.