tfjs: Bug in deserializing TF2.0 GRU layer's bias vector in tf.loadLayersModel()
Dear tfjs-team,
I recently ran into an issue when trying to import an existing Keras .h5 model with GRU layers into tfjs, see https://github.com/tensorflow/tfjs/issues/2437
tl;dr: this error popped up:

```
Uncaught (in promise) Error: Shape mismatch: [384] vs. [2,384]
    at variables.ts:135
    at t.write (variables.ts:98)
    at variables.ts:339
    at Array.forEach (<anonymous>)
    at sf (variables.ts:337)
    at e.loadWeights (container.ts:598)
    at models.ts:315
    at common.ts:14
    at Object.next (common.ts:14)
    at a (common.ts:14)
```
On further investigation, I found that something goes wrong in models.ts:300, in the `deserialize()` function. It turns out that the model object returned by `deserialize()` sets faulty shapes for the bias vectors of the GRU layers: those should be of shape [2, x], but are set to [x] (here, [384]).
By contrast, the weights loaded in models.ts:313 by `io.decodeWeights()` have the correct shape (here, [2, 384]).
So something must be wrong in `deserialize()` or one of the functions it calls. I did try to dig further, but I'm basically a stranger to JS/TS, so it's hard for me to get any deeper.
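For context (my own observation, not taken from the tfjs source): in TF 2.0, Keras GRU layers default to `reset_after=True`, which stores the bias as two stacked vectors of shape `(2, 3 * units)` instead of a single flat `(3 * units,)` vector. With `units=128`, that is exactly the `[2, 384]` vs. `[384]` mismatch from the error above. A minimal sketch (the feature count 40 is arbitrary):

```python
import tensorflow as tf

# TF 2.0's default reset_after=True keeps separate input and recurrent
# biases, stacked into shape (2, 3 * units).
gru_new = tf.keras.layers.GRU(128, reset_after=True)
gru_new.build(input_shape=(None, None, 40))
print(gru_new.cell.bias.shape)  # (2, 384)

# With reset_after=False the bias is a single flat vector.
gru_old = tf.keras.layers.GRU(128, reset_after=False)
gru_old.build(input_shape=(None, None, 40))
print(gru_old.cell.bias.shape)  # (384,)
```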
This bug should be easy to reproduce: just create a model with GRU layers in Keras, like this:
```python
model = keras.models.Sequential([
    # keras.layers.GRU(128, return_sequences=True, batch_input_shape=[batch_size, None, max_id+1]),
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id+1]),
    keras.layers.GRU(128, return_sequences=True),
    keras.layers.GRU(128),
    keras.layers.Flatten(),
    keras.layers.Dense(output_size, activation="softmax")
])
```
I guess you don't even need to train it; just initializing it should be fine. Then convert it with the tfjs-converter and load it with `tf.loadLayersModel()`.
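As a possible quick workaround (my suggestion, not verified against every tfjs version): building the Keras model with `reset_after=False` keeps each GRU bias as a flat `(3 * units,)` vector, which matches the shape tfjs expects here. A sketch, with placeholder values for `max_id` and `output_size` (the issue leaves both unspecified):

```python
import tensorflow as tf

max_id = 38       # placeholder value, not from the issue
output_size = 39  # placeholder value, not from the issue

# reset_after=False stores each GRU bias as a flat (3 * units,) vector,
# avoiding the [2, 384] vs. [384] mismatch when tfjs reads the weights.
model = tf.keras.models.Sequential([
    tf.keras.layers.GRU(128, return_sequences=True, reset_after=False,
                        input_shape=[None, max_id + 1]),
    tf.keras.layers.GRU(128, return_sequences=True, reset_after=False),
    tf.keras.layers.GRU(128, reset_after=False),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(output_size, activation="softmax"),
])
model.save("gru_model.h5")
# then convert for the browser, e.g.:
#   tensorflowjs_converter --input_format keras gru_model.h5 web_model/
```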
I’d be really grateful for any fixes or quick workarounds. Thank you in advance!
About this issue
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 17
`reset_after=True` doesn't work for me (Version 1.6.0).