tensorflow: ValueError from invalid weights while loading older .h5 model
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- TensorFlow installed from (source or binary): binary
- TensorFlow version: 1.13.1 and v1.12.1-10753-g1c2ae57 (2.0.0-dev20190910)
- Python version: 3.7.4
- CUDA/cuDNN version: 10.1
- GPU model and memory:
Describe the current behavior I created a tf.keras model and saved it in the .h5 format using TF version 1.13.1. The model loads and runs inference just fine in 1.13.1. After upgrading to TF 2.0 (nightly build), loading the model raises a ValueError (see traceback below). I am using tf.compat.v1.disable_v2_behavior(), in case that makes a difference.
Describe the expected behavior TF 1.x models should load correctly, or the user should be offered a way to migrate the model to a compatible format.
Code to reproduce the issue
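A minimal sketch of the setup described above, assuming any architecture containing a BatchNormalization layer with renorm=True triggers the mismatch; the layer names, sizes, and the file path "model.h5" are placeholders, not the actual model from the report.

```python
# Minimal sketch, not the actual model: a BatchNormalization layer with
# renorm=True saved under TF 1.13 and loaded under TF 2.0.
import tensorflow as tf

def build_model():
    inputs = tf.keras.Input(shape=(96,))
    x = tf.keras.layers.Dense(96, name="encoder_dense_0")(inputs)
    x = tf.keras.layers.BatchNormalization(renorm=True, name="encoder_bn_0")(x)
    return tf.keras.Model(inputs, x)

# Under TF 1.13.1: build, (train,) and save to HDF5.
# model = build_model()
# model.save("model.h5")

# Under TF 2.0 nightly: loading the same file raises the ValueError below.
# tf.compat.v1.disable_v2_behavior()
# model = tf.keras.models.load_model("model.h5")
```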
Other info / logs
Relevant traceback info:
File "*/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/save.py", line 146, in load_mode
l
return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
File "*/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 171, in lo
ad_model_from_hdf5
load_weights_from_hdf5_group(f['model_weights'], model.layers)
File "*/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 697, in lo
ad_weights_from_hdf5_group
str(len(weight_values)) + ' elements.')
ValueError: Layer #1 (named "encoder_bn_0" in the current model) was found to correspond to layer encoder_bn_0 in the save file. However t
he new layer encoder_bn_0 expects 7 weights, but the saved weights have 8 elements.
I added some print statements to get info on the weights for this batch normalization (with renorm) layer. The names of the weights in the original model are:
['encoder_bn_0/gamma:0', 'encoder_bn_0/beta:0', 'encoder_bn_0/moving_mean:0', 'encoder_bn_0/moving_variance:0', 'encoder_bn_0/renorm_mean:0', 'encoder_bn_0/renorm_mean_weight:0', 'encoder_bn_0/renorm_stddev:0', 'encoder_bn_0/renorm_stddev_weight:0']
The expected weight placeholders are:
[<tf.Variable 'encoder_bn_0/gamma:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/beta:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/moving_mean:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/moving_variance:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/moving_stddev:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/renorm_mean:0' shape=(96,) dtype=float32>, <tf.Variable 'encoder_bn_0/renorm_stddev:0' shape=(96,) dtype=float32>]
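For reference, a short sketch of how the two lists above can be printed, assuming the save file is at a hypothetical path "model.h5" and the architecture has been rebuilt under TF 2.0 (build_model() is the placeholder from the sketch above):

```python
# Sketch for inspecting the mismatch: weight names stored in the HDF5 file
# vs. the variables the rebuilt layer expects.
import h5py

with h5py.File("model.h5", "r") as f:
    g = f["model_weights"]["encoder_bn_0"]
    saved_names = [n.decode("utf8") if isinstance(n, bytes) else n
                   for n in g.attrs["weight_names"]]
print(saved_names)                               # 8 entries in the TF 1.13 file

model = build_model()                            # same architecture, defined in TF 2.0
print(model.get_layer("encoder_bn_0").weights)   # 7 expected variables
```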
I'm guessing the format for saving batch normalization (with renorm) parameters changed at some point? Is there a way to make this backwards compatible, or perhaps a way to migrate the save file?
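One possible (unofficial) workaround sketch for migrating the save file: rebuild the architecture under TF 2.0, read the old 8-element renorm layout directly with h5py, drop the two *_weight scalars, and fill the new moving_stddev slot. The reconstruction of moving_stddev as sqrt(moving_variance + epsilon) is an assumption on my part, as are the names build_model, migrate_bn_renorm_weights, and the epsilon default.

```python
# Unofficial migration sketch: remap the old 8-element batchnorm-renorm layout
# to the new 7-element one and set the weights on a freshly built TF 2.0 model.
import h5py
import numpy as np

def migrate_bn_renorm_weights(h5_path, model, bn_layer_names, epsilon=1e-3):
    with h5py.File(h5_path, "r") as f:
        for name in bn_layer_names:
            g = f["model_weights"][name]
            names = [n.decode("utf8") if isinstance(n, bytes) else n
                     for n in g.attrs["weight_names"]]
            # Key the saved arrays by the short variable name, e.g. "gamma:0".
            saved = {n.split("/")[-1]: np.asarray(g[n]) for n in names}
            moving_variance = saved["moving_variance:0"]
            # The old renorm_mean_weight / renorm_stddev_weight scalars are
            # dropped; the new moving_stddev slot is filled from the variance
            # (assumption: moving_stddev ~= sqrt(moving_variance + epsilon)).
            moving_stddev = np.sqrt(moving_variance + epsilon)
            # Order matches the expected placeholders listed above.
            model.get_layer(name).set_weights([
                saved["gamma:0"], saved["beta:0"],
                saved["moving_mean:0"], moving_variance,
                moving_stddev,
                saved["renorm_mean:0"], saved["renorm_stddev:0"],
            ])

# model = build_model()   # same architecture, defined under TF 2.0
# migrate_bn_renorm_weights("model.h5", model, ["encoder_bn_0"])
# model.save("model_migrated.h5")
```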
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 17 (7 by maintainers)
So far it’s looking correct…