pytorch-ts: unable to reproduce results from notebook

I am unable to reproduce results from TimeGrad Notebook. I am getting diverging loss into NaN loss.

predictor = estimator.train(dataset_train, num_workers=8)

99it [00:22, 4.39it/s, avg_epoch_loss=0.945, epoch=0] 99it [00:22, 4.40it/s, avg_epoch_loss=0.495, epoch=1] 99it [00:22, 4.39it/s, avg_epoch_loss=0.466, epoch=2] 99it [00:22, 4.35it/s, avg_epoch_loss=0.795, epoch=3] 99it [00:22, 4.33it/s, avg_epoch_loss=0.852, epoch=4] 99it [00:22, 4.32it/s, avg_epoch_loss=nan, epoch=5]
99it [00:22, 4.33it/s, avg_epoch_loss=nan, epoch=6] 99it [00:22, 4.30it/s, avg_epoch_loss=nan, epoch=7] 99it [00:23, 4.30it/s, avg_epoch_loss=nan, epoch=8] 99it [00:22, 4.34it/s, avg_epoch_loss=nan, epoch=9] 99it [00:23, 4.29it/s, avg_epoch_loss=nan, epoch=10] 99it [00:23, 4.28it/s, avg_epoch_loss=nan, epoch=11] 99it [00:22, 4.33it/s, avg_epoch_loss=nan, epoch=12] 99it [00:23, 4.21it/s, avg_epoch_loss=nan, epoch=13] 99it [00:23, 4.30it/s, avg_epoch_loss=nan, epoch=14] 99it [00:23, 4.30it/s, avg_epoch_loss=nan, epoch=15] 99it [00:22, 4.34it/s, avg_epoch_loss=nan, epoch=16] 99it [00:22, 4.34it/s, avg_epoch_loss=nan, epoch=17] 99it [00:22, 4.34it/s, avg_epoch_loss=nan, epoch=18] 99it [00:23, 4.20it/s, avg_epoch_loss=nan, epoch=19]

About this issue

Most upvoted comments

sorry to hear that… i will try to reproduce on a clean env as well!

pytorch 1.7.1+cu110

thanks! I just re-ran it again on my machine and it all worked out… very strange… 🤔

thanks for letting me know… perhaps i screwed something up while refactoring, I’ll check and get back to you