keras: Unable to load custom initializer from the saved model, passing custom_objects is not working

I have a simple custom initializer in my model. When I try to load the saved model, I get an "Invalid initialization" error. I saw similar issues where the suggested solution was to pass the custom_objects argument to the load_model function, but that did not work for me. This is the code to reproduce the problem:

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD
from keras.models import model_from_json
from keras.models import load_model
from keras import initializations

def my_init(shape, name=None):
    return initializations.normal(shape, scale=0.01, name=name)

model = Sequential()
model.add(Dense(output_dim=64, input_dim=100, init=my_init))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("relu"))

# compile the model 
model.compile(loss='categorical_crossentropy', optimizer='sgd')
print("Compilation OK!")

# save the model, then reload it with custom_objects
model.save('model.h5')
del model
model = load_model('model.h5', custom_objects={'my_init':my_init})
print("Load OK!")

Running it throws an error at the load_model line:

Exception: Invalid initialization: my_init

I also tried saving and loading with json using the model_from_json function, but the same issue appears.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 10
  • Comments: 16 (2 by maintainers)

Most upvoted comments

For anyone interested: the solution @joeyearsley presented works as long as you are working with a (callable) class. In the case of a plain initializer function I get TypeError: custom_initializer() missing 1 required positional argument: 'shape'
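The asymmetry can be reproduced without Keras. During deserialization, a registered class is first instantiated with no arguments and only later called with a shape, while a registered plain function is invoked with no arguments directly, hence the TypeError (a minimal sketch of the mechanism, not Keras's actual code):

```python
def custom_initializer(shape, dtype=None):
    # toy stand-in for a real initializer: a zero matrix of the requested shape
    return [[0.0] * shape[1] for _ in range(shape[0])]

class CustomInitializer:
    def __call__(self, shape, dtype=None):
        return custom_initializer(shape, dtype=dtype)

# A class registered as a custom object is first instantiated without arguments...
init = CustomInitializer()
# ...and only later called with a shape, which works fine:
w = init((2, 3))

# A plain function registered the same way gets called without arguments:
try:
    custom_initializer()  # no 'shape' supplied
except TypeError as err:
    print(err)
```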

My workaround was to introduce a callable class, which finally worked:

from keras.models import load_model
from keras.utils.generic_utils import get_custom_objects

class CustomInitializer:
    def __call__(self, shape, dtype=None):
        # delegate to the original initializer function
        return custom_initializer(shape, dtype=dtype)

get_custom_objects().update({'custom_initializer': CustomInitializer})

model = load_model("../weights/custom_init_test.hdf5")

Still, I think this should be considered a bug, because the most natural interface/solution would be to use model = load_model('model.h5', custom_objects={'my_init': my_init})

I have seen this error posted in several places on the internet; it has been fixed in TensorFlow.js but not in Keras or TF Python.

My model is saved by the early-stopping callback, I'm not saving it manually. This appears to be common.

Traceback (most recent call last):
  File "/home/jim/mlcc-exercises/rejuvepredictor/stage4.py", line 175, in <module>
    custom_objects={'kernel_initializer': GlorotUniform}
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 225, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 458, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/sequential.py", line 300, in from_config
    custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
    return cls.from_config(config['config'])
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/recurrent.py", line 2298, in from_config
    return cls(**config)
  File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/recurrent.py", line 2178, in __init__
    implementation=implementation)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/recurrent.py", line 1841, in __init__
    self.kernel_initializer = initializers.get(kernel_initializer)
  File "/usr/local/lib/python3.6/dist-packages/keras/initializers.py", line 508, in get
    return deserialize(identifier)
  File "/usr/local/lib/python3.6/dist-packages/keras/initializers.py", line 503, in deserialize
    printable_module_name='initializer')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/generic_utils.py", line 138, in deserialize_keras_object
    ': ' + class_name)
ValueError: Unknown initializer: GlorotUniform

I solved my issue. I noticed that I had these imports:

import os, sys
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorboard as tb
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras import initializers
from keras.models import load_model

By digging into the __file__ and/or __path__ attributes of the keras modules, I finally noticed that the last line of my imports was actually loading the standalone keras package, not the one embedded in TensorFlow. I had naively assumed that the from tensorflow import keras a few lines earlier would take priority over standalone keras, but it does not (this is just how Python resolves its module paths)!
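The same check can be scripted: importlib.util.find_spec reports which file a module name will resolve to without importing it (a small sketch shown with a stdlib package, since tensorflow may not be installed everywhere):

```python
import importlib.util

def module_origin(name):
    # resolve the file a module name would be loaded from, without importing it
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None

# e.g. compare module_origin("keras") with module_origin("tensorflow")
print(module_origin("json"))
```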

By changing my keras imports to TensorFlow ones, and TensorFlow ones only:

import os, sys
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorboard as tb
from tensorflow import keras
# always load keras stuff from tf...
from tensorflow.keras import Sequential
from tensorflow.keras import initializers
from tensorflow.keras.models import load_model

(notice the last line changes)

I no longer get the error!

/!\ So, to me, it seems that some of the standalone keras submodules do not behave exactly the same as the ones embedded in tf.keras (I first thought changes were automatically pushed from keras to tf.keras, so I would not have to care about the difference, but it turns out I must!). For example, here the glorot_uniform identifier was correctly mapped to GlorotUniform within tensorflow but not in standalone keras.

Hope it helps.

I agree this should be treated as a bug. It is important and a fairly common use case!

I found a workaround to this problem. When you add layers to your model, the weights parameter can be initialized with a NumPy array, so you can do the random initialization yourself, for example:

import numpy

def weights_initialization(inputDim, outputDim, scale=0.1):
    # returns [kernel, bias] drawn from a scaled Gaussian distribution
    return [numpy.sqrt(scale) * numpy.random.randn(inputDim, outputDim),
            numpy.sqrt(scale) * numpy.random.randn(outputDim)]
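A quick shape check of that workaround in pure NumPy (the commented layer call is hypothetical and assumes the Keras 1 weights= argument used in the question):

```python
import numpy as np

def weights_initialization(input_dim, output_dim, scale=0.1):
    # Gaussian-initialized kernel and bias, shaped as a Dense layer expects
    return [np.sqrt(scale) * np.random.randn(input_dim, output_dim),
            np.sqrt(scale) * np.random.randn(output_dim)]

kernel, bias = weights_initialization(100, 64)
print(kernel.shape, bias.shape)  # (100, 64) (64,)

# Hypothetical usage with the Keras 1 API from the question:
# model.add(Dense(output_dim=64, input_dim=100, weights=weights_initialization(100, 64)))
```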