tensorflow: ValueError from invalid weights while loading older .h5 model
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- TensorFlow installed from (source or binary): binary
- TensorFlow version: 1.13.1 and v1.12.1-10753-g1c2ae57 (2.0.0-dev20190910)
- Python version: 3.7.4
- CUDA/cuDNN version: 10.1
- GPU model and memory:
Describe the current behavior I created a tf.keras model and saved it in the .h5 format using TF version 1.13.1. The model loads and runs inference just fine in 1.13.1. After upgrading to TF 2.0 (nightly build), loading the model raises a ValueError (see traceback below). I am using tf.compat.v1.disable_v2_behavior(), in case that makes a difference.
Describe the expected behavior TF 1.x models should load correctly, or the user should be offered a way to migrate the model to a compatible format.
Code to reproduce the issue
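A minimal sketch of the setup described above, assuming any architecture containing a BatchNormalization layer with renorm=True triggers the mismatch; the layer names, sizes, and the file path "model.h5" are placeholders, not the actual model from the report.

```python
# Minimal sketch, not the actual model: a BatchNormalization layer with
# renorm=True saved under TF 1.13 and loaded under TF 2.0.
import tensorflow as tf

def build_model():
    inputs = tf.keras.Input(shape=(96,))
    x = tf.keras.layers.Dense(96, name="encoder_dense_0")(inputs)
    x = tf.keras.layers.BatchNormalization(renorm=True, name="encoder_bn_0")(x)
    return tf.keras.Model(inputs, x)

# Under TF 1.13.1: build, (train,) and save to HDF5.
# model = build_model()
# model.save("model.h5")

# Under TF 2.0 nightly: loading the same file raises the ValueError below.
# tf.compat.v1.disable_v2_behavior()
# model = tf.keras.models.load_model("model.h5")
```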
Other info / logs
Relevant traceback info:
File "*/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/save.py", line 146, in load_mode
l
return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
File "*/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 171, in lo
ad_model_from_hdf5
load_weights_from_hdf5_group(f['model_weights'], model.layers)
File "*/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 697, in lo
ad_weights_from_hdf5_group
str(len(weight_values)) + ' elements.')
ValueError: Layer #1 (named "encoder_bn_0" in the current model) was found to correspond to layer encoder_bn_0 in the save file. However t
he new layer encoder_bn_0 expects 7 weights, but the saved weights have 8 elements.
I added some print statements to get info on the weights for this batch normalization (with renorm) layer. The names of the weights in the original model are:
['encoder_bn_0/gamma:0', 'encoder_bn_0/beta:0', 'encoder_bn_0/moving_mean:0', 'encoder_bn_0/moving_variance:0', 'encoder_bn_0/renorm_mean:0', 'encoder_bn_0/renorm_mean_weight:0', 'encoder_bn_0/renorm_stddev:0', 'encoder_bn_0/renorm_stddev_weight:0']
The expected weight placeholders are:
[<tf.Variable 'encoder_bn_0/gamma:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/beta:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/moving_mean:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/moving_variance:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/moving_stddev:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/renorm_mean:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/renorm_stddev:0' shape=(96,) dtype=float32>]
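For reference, a short sketch of how the two lists above can be printed, assuming the save file is at a hypothetical path "model.h5" and the architecture has been rebuilt under TF 2.0 (build_model() is the placeholder from the sketch above):

```python
# Sketch for inspecting the mismatch: weight names stored in the HDF5 file
# vs. the variables the rebuilt layer expects.
import h5py

with h5py.File("model.h5", "r") as f:
    g = f["model_weights"]["encoder_bn_0"]
    saved_names = [n.decode("utf8") if isinstance(n, bytes) else n
                   for n in g.attrs["weight_names"]]
print(saved_names)                               # 8 entries in the TF 1.13 file

model = build_model()                            # same architecture, defined in TF 2.0
print(model.get_layer("encoder_bn_0").weights)   # 7 expected variables
```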
I'm guessing the format for saving batch normalization (with renorm) parameters changed at some point? Is there a way to make this backwards compatible, or perhaps a way to migrate the save file?
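One possible (unofficial) workaround sketch for migrating the save file: rebuild the architecture under TF 2.0, read the old 8-element renorm layout directly with h5py, drop the two *_weight scalars, and fill the new moving_stddev slot. The reconstruction of moving_stddev as sqrt(moving_variance + epsilon) is an assumption on my part, as are the names build_model, migrate_bn_renorm_weights, and the epsilon default.

```python
# Unofficial migration sketch: remap the old 8-element batchnorm-renorm layout
# to the new 7-element one and set the weights on a freshly built TF 2.0 model.
import h5py
import numpy as np

def migrate_bn_renorm_weights(h5_path, model, bn_layer_names, epsilon=1e-3):
    with h5py.File(h5_path, "r") as f:
        for name in bn_layer_names:
            g = f["model_weights"][name]
            names = [n.decode("utf8") if isinstance(n, bytes) else n
                     for n in g.attrs["weight_names"]]
            # Key the saved arrays by the short variable name, e.g. "gamma:0".
            saved = {n.split("/")[-1]: np.asarray(g[n]) for n in names}
            moving_variance = saved["moving_variance:0"]
            # The old renorm_mean_weight / renorm_stddev_weight scalars are
            # dropped; the new moving_stddev slot is filled from the variance
            # (assumption: moving_stddev ~= sqrt(moving_variance + epsilon)).
            moving_stddev = np.sqrt(moving_variance + epsilon)
            # Order matches the expected placeholders listed above.
            model.get_layer(name).set_weights([
                saved["gamma:0"], saved["beta:0"],
                saved["moving_mean:0"], moving_variance,
                moving_stddev,
                saved["renorm_mean:0"], saved["renorm_stddev:0"],
            ])

# model = build_model()   # same architecture, defined under TF 2.0
# migrate_bn_renorm_weights("model.h5", model, ["encoder_bn_0"])
# model.save("model_migrated.h5")
```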
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 17 (7 by maintainers)
So far it’s looking correct…