tensorflow: ResNet models in tf.keras.applications contain a bias term which should not be there.
The pretrained ResNet models available in tf.keras.applications include a bias weight in every convolutional layer, which is weird. What is even weirder is that the pretrained values of those bias weights are all zeros, which is definitely a problem.
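ResNet models do not use a bias term in their convolutions because each convolution is followed by batch normalization, which subtracts the mean and adds its own learned offset (beta), making a convolution bias redundant. Even in the TensorFlow models repository, the ResNet construction code does not add a bias term to convolutions.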
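The redundancy can be illustrated with a small numeric sketch. It is not taken from any ResNet implementation: `batch_norm`, `conv_out` and `bias` are made-up names and values, the normalization uses batch statistics only (as in training), and gamma/beta are left out for brevity.

```python
import math
import statistics


def batch_norm(values, eps=1e-5):
    """Normalize a batch of scalars to zero mean and unit variance (no gamma/beta)."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Hypothetical outputs of one convolution channel across a batch.
conv_out = [0.3, -1.2, 2.5, 0.7]
# Arbitrary constant bias added to that channel.
bias = 4.0

without_bias = batch_norm(conv_out)
with_bias = batch_norm([v + bias for v in conv_out])

# The constant shift is removed by the mean subtraction, so both results
# agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff)
```

Because the bias only shifts the per-channel mean, any offset the network needs can be learned through the batch normalization beta instead.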
The following code prints the weights of the ResNet50 model in tf.keras.applications. As can be seen, the bias terms are all zeros.
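import tensorflow as tf
# Load ResNet50 without the classification head, using the ImageNet weights.
model = tf.keras.applications.resnet50.ResNet50(include_top=False, weights='imagenet')
# Dump every trainable variable, including the convolution biases.
print(model.trainable_variables)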
A small portion of the output shows the bias terms as all zeros.
array([[[[ 1.76406968e-02, 2.18379945e-02, 6.38491847e-03, ...,
-1.56918354e-02, 1.33828130e-02, -7.58931879e-03],
[ 6.57748384e-03, -1.13832625e-02, -1.44122150e-02, ...,
1.07535999e-02, 1.99317057e-02, -5.90330362e-03],
[ 1.96981058e-02, 6.84878789e-03, -1.30715151e-03, ...,
-8.99719913e-03, 1.00973761e-02, -1.09837623e-02],
...,
[ 4.02560830e-03, -2.51277094e-03, -1.91410668e-02, ...,
1.84022412e-02, -1.05592925e-02, 3.84159223e-03],
[-1.21582337e-02, -2.44973949e-03, -8.21000524e-03, ...,
-3.52650182e-03, 9.62345582e-03, -1.55217517e-02],
[-1.57500952e-02, -5.96316298e-03, 4.53999359e-03, ...,
4.88574570e-03, 4.60040662e-03, 8.99072620e-05]]]],
dtype=float32)>, <tf.Variable 'conv5_block3_3_conv/bias:0' shape=(2048,) dtype=float32, numpy=array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)>, <tf.Variable 'conv5_block3_3_bn/gamma:0' shape=(2048,) dtype=float32, numpy=
array([1.3451786, 1.3965728, 1.4453218, ..., 1.2092956, 1.5969722,
1.4321095], dtype=float32)>, <tf.Variable 'conv5_block3_3_bn/beta:0' shape=(2048,) dtype=float32, numpy=
array([-1.4512578, -1.6519743, -1.6319023, ..., -1.6061822, -1.7218091,
-1.9375533], dtype=float32)>]
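Rather than reading through the full dump, the bias variables can be checked directly. The following is a minimal sketch, not an official utility: `summarize_biases` and `check_resnet50` are hypothetical helper names, and the filter assumes variable names of the form `conv5_block3_3_conv/bias:0` as in the output above (other Keras versions may name variables differently).

```python
def summarize_biases(named_values):
    """Map each bias variable name to True if every entry is exactly zero."""
    report = {}
    for name, values in named_values:
        # Drop a trailing ":0" suffix, if present, before checking the name.
        if name.split(":")[0].endswith("/bias"):
            report[name] = all(float(x) == 0.0 for x in values)
    return report


def check_resnet50():
    """Load the pretrained ResNet50 (downloads weights) and report its biases."""
    # Imported lazily so summarize_biases itself has no TensorFlow dependency.
    import tensorflow as tf

    model = tf.keras.applications.resnet50.ResNet50(
        include_top=False, weights="imagenet"
    )
    # Flatten each variable so it can be scanned element by element.
    pairs = [(v.name, v.numpy().ravel()) for v in model.trainable_variables]
    report = summarize_biases(pairs)
    zero = sum(report.values())
    print(f"{zero} of {len(report)} bias variables are all zeros")
    return report
```

Calling `check_resnet50()` in an environment with TensorFlow installed and network access prints how many bias variables are entirely zero and returns the per-variable result.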
This is definitely an issue that needs to be cleared up, as a lot of people depend on tf.keras.applications.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 15 (8 by maintainers)
This is a rather serious issue requiring an urgent fix because too many people rely upon tf.keras.applications for pretrained models. As of yet there are no V2 versions of slim models available and hence tf.keras.applications is the only resort. Hence, unless this issue is resolved, usage of TF V2 for code porting and new code writing is going to be a big burden.