efficient_densenet_pytorch: test failed on v0.2
The `test_backward_computes_backward_pass` test in `efficient_densenet_bottleneck_test.py` fails:
```
>       assert(almost_equal(layer.conv.weight.grad.data, layer_efficient.conv_weight.grad.data))
E       assert False
E        +  where False = almost_equal(
E               [0.3746, 70.7402, 68.3647, 5.2501, ... 101.7459, ... 10.9038, 0.0000
E                torch.cuda.FloatTensor of size 4x8x1x1 (GPU 0)],
E               [0.0000e+00, -2.0594e+24, -9.6653e+20, 2.1138e+21, ... -1.5375e+00, -7.0127e-03, 0.0000e+00
E                torch.cuda.FloatTensor of size 4x8x1x1 (GPU 0)])
E        +  where the first tensor = Conv2d(8, 4, kernel_size=(1, 1), stride=(1, 1), bias=False).weight.grad.data
E               and Conv2d(8, 4, kernel_size=(1, 1), stride=(1, 1), bias=False) = Sequential(
E                   (norm): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True)
E                   (relu): ReLU(inplace)
E                   (conv): Conv2d(8, 4, kernel_size=(1, 1), stride=(1, 1), bias=False)
E               ).conv
E        +  and the second tensor = _EfficientDensenetBottleneck().conv_weight.grad.data
```
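For reference, the test is an elementwise tolerance comparison between the reference layer's gradient and the efficient implementation's gradient. A pure-NumPy sketch of that kind of check (the real `almost_equal` helper lives in the test file; its exact tolerances are assumed here):

```python
import numpy as np

def almost_equal(a, b, atol=1e-4, rtol=1e-3):
    # Elementwise tolerance check, analogous to the test's helper
    # (assumed tolerances -- the real helper is defined in the test file).
    return bool(np.allclose(a, b, atol=atol, rtol=rtol))

# Gradients that agree up to numerical noise pass:
ref = np.array([0.3746, 70.7402, 68.3647])
print(almost_equal(ref, ref + 1e-6))  # True

# Gradients with huge, unrelated magnitudes (as in the failure above) do not:
bad = np.array([0.0, -2.0594e24, -9.6653e20])
print(almost_equal(ref, bad))  # False
```

The enormous magnitudes in the failing tensor (on the order of 1e24) suggest the efficient path is comparing against garbage values rather than a slightly-off gradient, which is why a tolerance bump would not help here.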
I uncommented the line `self.efficient_batch_norm.training = False` in `densenet_efficient.py`, but the issue persists.
About this issue
- State: closed
- Created 7 years ago
- Comments: 18 (4 by maintainers)
We will be catching this repo up soon! I’ll try to get to it later today.