I2L-MeshNet_RELEASE: Cannot reproduce training performance
Hi Gyeongsik,
I am working on reproducing the numbers reported in the paper. Train datasets: H36M, MuCo, COCO. Test dataset: 3DPW.
I am using PyTorch 1.8, Python 3.8, and CUDA 10.
I did two runs. Here is the performance of snapshot12.pth (the last checkpoint of the lixel stage) on the 3DPW dataset:
- Train Batch Size per GPU = 16, Number of GPUs = 4 (this is the default config)
MPJPE from lixel mesh: 96.23 mm
PA MPJPE from lixel mesh: 60.68 mm
- Train Batch Size per GPU = 24, Number of GPUs = 8 (bigger batch config)
MPJPE from lixel mesh: 96.37 mm
PA MPJPE from lixel mesh: 61.51 mm
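The two runs differ mainly in effective batch size (16 × 4 = 64 vs. 24 × 8 = 192), so one thing I am unsure about is whether the learning rate should be rescaled for the bigger run. Below is a minimal sketch of the linear LR scaling rule I would try, assuming the default schedule was tuned for an effective batch of 64; the function and parameter names, and the 1e-4 base LR, are placeholders for illustration, not the actual values/names in the repo's config.

```python
# Sketch only: linear LR scaling with effective batch size (Goyal et al., 2017).
# base_lr, ref_batch and the 1e-4 value are placeholders, not the repo's defaults.
def scaled_lr(base_lr, batch_per_gpu, num_gpus, ref_batch=64):
    effective_batch = batch_per_gpu * num_gpus
    return base_lr * effective_batch / ref_batch

print(scaled_lr(1e-4, 16, 4))  # default config: 16 x 4 = 64  -> LR unchanged
print(scaled_lr(1e-4, 24, 8))  # bigger config:  24 x 8 = 192 -> LR scaled x3
```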
I also trained the bigger batch config (run 2) for the param stage. Here is the performance of snapshot17.pth and snapshot15.pth (the best checkpoint) on the 3DPW dataset.
snapshot17.pth, param stage
MPJPE from lixel mesh: 95.85 mm
PA MPJPE from lixel mesh: 61.21 mm
MPJPE from param mesh: 98.11 mm
PA MPJPE from param mesh: 61.64 mm
snapshot15.pth, param stage
MPJPE from lixel mesh: 95.65 mm
PA MPJPE from lixel mesh: 60.97 mm
MPJPE from param mesh: 97.22 mm
PA MPJPE from param mesh: 60.82 mm
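In case part of the gap comes from the evaluation rather than the training, here is a minimal NumPy sketch of how MPJPE and PA-MPJPE are commonly computed (similarity/Procrustes alignment of the predicted joints onto the GT before measuring joint error). This is a generic implementation for sanity checking, not the repo's eval code.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error; pred/gt are (J, 3) arrays in mm."""
    return np.mean(np.linalg.norm(pred - gt, axis=-1))

def pa_mpjpe(pred, gt):
    """MPJPE after similarity (Procrustes) alignment of pred onto gt."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    x, y = pred - mu_p, gt - mu_g              # centered joints, (J, 3)
    U, s, Vt = np.linalg.svd(x.T @ y)          # SVD of 3x3 cross-covariance
    Z = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflection
    R = Vt.T @ Z @ U.T                         # optimal rotation
    scale = np.trace(Z @ np.diag(s)) / np.sum(x ** 2)        # optimal scale
    pred_aligned = scale * x @ R.T + mu_g      # rotate, scale, translate
    return mpjpe(pred_aligned, gt)

# Usage on dummy data (replace with real predicted / GT joints in mm)
pred = np.random.rand(17, 3) * 100
gt = np.random.rand(17, 3) * 100
print(mpjpe(pred, gt), pa_mpjpe(pred, gt))
```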
I am still waiting on the param stage of the default config and will edit this post once it finishes. However, the reported lixel MPJPE is 93.2 mm, and it looks unlikely that my runs will converge there. Any suggestions? Should I train longer?
Thank you, I would greatly appreciate your help.
About this issue
- State: open
- Created 3 years ago
- Comments: 16 (8 by maintainers)
We found that longer training is not necessary.
Sorry, I changed common/base.py. Now it will work.