keras: RuntimeError: "Cannot clone object ..." when using sklearn.base.clone with BaseEstimator

Description

A RuntimeError is thrown when using sklearn.base.clone because get_params of KerasClassifier returns copies of its parameters instead of references.

The sanity check at the end of the clone function fails whenever the estimator's get_params implementation returns copies of the parameters rather than the original objects.

Scikit-learn issue 15722 discusses whether the behaviour of their estimator-cloning functionality is incorrect or just poorly documented.

This bug may also affect `KerasRegressor`.
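
For illustration, here is a minimal sketch of the failure mode. CopyingEstimator is a hypothetical stand-in, not the actual wrapper; its get_params deep-copies its parameters the way KerasClassifier deep-copies its sk_params:

import copy
from sklearn.base import BaseEstimator, clone

class CopyingEstimator(BaseEstimator):
    """Hypothetical estimator whose get_params returns copies, mimicking the Keras wrapper."""

    def __init__(self, class_weight=None):
        self.class_weight = class_weight

    def get_params(self, deep=True):
        # deep-copying here mimics how the Keras wrapper treats its sk_params
        return {"class_weight": copy.deepcopy(self.class_weight)}

est = CopyingEstimator(class_weight={0: 1, 1: 3})
clone(est)  # RuntimeError: clone compares params with "is", and two deep
            # copies of the same dict are never the same object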

Steps/Code to Reproduce

from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense
 
from sklearn.base import clone
 
def create_keras_classifier_model(n_classes):
    """Keras multinomial logistic regression creation model
 
    Args:
        n_classes(int): Number of classes to be classified
 
    Returns:
        Compiled keras model
 
    """
    # create model
    model = Sequential()
    model.add(Dense(n_classes, activation="softmax"))
    # Compile model
    model.compile(
        loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"]
    )
    return model
 
estimator = KerasClassifier(build_fn=create_keras_classifier_model, n_classes=2, class_weight={0: 1, 1: 3})
 
clone(estimator)

Expected Results

No error is thrown.

Actual Results

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-520f4ee6e745> in <module>
     26 estimator = KerasClassifier(build_fn=create_keras_classifier_model, n_classes=2, class_weight={0: 1, 1:3})
     27
---> 28 clone(estimator)
 
/usr/local/anaconda/envs/ivan/lib/python3.6/site-packages/sklearn/base.py in clone(estimator, safe)
     73             raise RuntimeError('Cannot clone object %s, as the constructor '
     74                                'either does not set or modifies parameter %s' %
---> 75                                (estimator, name))
     76     return new_object
     77
 
RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x7f7504148f28>, as the constructor either does not set or modifies parameter class_weight

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 16

Most upvoted comments

I had the same error when trying to pass a nested 2-D array as a parameter. I changed it to a tuple and it works. E.g.: change param = [[1, 2], [3, 4]] to param = [(1, 2), (3, 4)]
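
That workaround is consistent with how copy.deepcopy, which the cloning machinery relies on, treats the two types: it always allocates a new list but returns a tuple of immutables unchanged, so only the tuple survives clone's identity check. A quick sketch:

import copy

as_list = [1, 2]
as_tuple = (1, 2)

print(copy.deepcopy(as_list) is as_list)    # False: a fresh list is allocated
print(copy.deepcopy(as_tuple) is as_tuple)  # True: a tuple of immutables is
                                            # returned as-is, so "is" holds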

Hi, I have the same issue; it seems to come from initialising parameter ranges using numpy arrays. Versions:

Python 3.8.1
numpy==1.18.5
scikit-learn==0.23.1
tensorflow==2.2.0

Full code for reproducibility:

#!/usr/bin/env python3
import numpy as np

from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

def load_dataset():
    """ Load all the data """
    # Load pima indians dataset
    # https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv
    dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
    # Split into input (X) and output (Y) variables
    X = dataset[:,:8]
    Y = dataset[:,8]
    return X, Y

def create_model():
    """ Creates a Keras model """
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation="relu"))
    model.add(Dense(8, activation="relu"))
    model.add(Dense(1, activation="sigmoid"))

    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model

def main():
    """Main function"""
    # Load dataset
    X, Y = load_dataset()
    # Create model
    model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)
    pipeline = Pipeline([
        ('scale', StandardScaler()),
        ('mlp', model)])
    param_grid_working = dict(
            mlp__epochs=[150, 200, 250, 300],           # plain Python ints
            mlp__batch_size=[16, 32, 64, 128]
        )
    param_grid_breaking = dict(
            mlp__epochs=list(np.arange(150, 301, 50)),  # elements are np.int64
            mlp__batch_size=list(2**np.arange(4, 8))
        )
    print("Works ", param_grid_working)
    print("Breaks", param_grid_breaking)
    param_grid = param_grid_breaking
    grid = GridSearchCV(pipeline, param_grid=param_grid, cv=2, n_jobs=8, verbose=2, scoring='accuracy')
    results = grid.fit(X, Y)

if __name__ == "__main__":
    main()

In this simple example, switching between param_grid = param_grid_breaking and param_grid = param_grid_working makes the cloning process break or succeed, respectively. I believe the issue stems from the list members’ types being different in the two cases (Python’s int vs NumPy’s np.int64).

You can work around this issue by casting numpy arrays to lists using the .tolist() method.

This should also work for multidimensional parameters, since .tolist() preserves the nested list structure, but it discards numpy’s dtypes in favour of Python’s builtin types and probably uses quite a bit more memory.
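
A quick sketch of the difference (the identity results are what I’d expect on CPython; copy.deepcopy is what both clone and the wrapper’s get_params rely on):

import copy
import numpy as np

breaking = list(np.arange(150, 301, 50))    # elements are np.int64
working = np.arange(150, 301, 50).tolist()  # elements are plain Python int

x, y = breaking[0], working[0]
print(type(x), type(y))       # <class 'numpy.int64'> <class 'int'>
print(copy.deepcopy(x) is x)  # False: the NumPy scalar comes back as a new
                              # object, so clone's "is" comparison fails
print(copy.deepcopy(y) is y)  # True: CPython caches small ints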

This is a sklearn bug. You can work around it by downgrading sklearn:

conda install scikit-learn==0.21.2

In sklearn/model_selection/_search.py (sklearn version 0.22.1), line 735, a nested clone call seems to be responsible:

self.best_estimator_ = clone(clone(base_estimator).set_params(**self.best_params_))

This changed from the following in a previous version:

self.best_estimator_ = clone(base_estimator).set_params(**self.best_params_)

Changing this back to the previous version seems to fix the error (Cannot clone object…), though I’m not sure that is the correct solution, based on the discussion in this thread.

Thoughts??

Casting numpy arrays to list using the .tolist() method worked for me. Great help @sam1902

Downgrading the version of Scikit-learn from 0.22.2 to 0.21.2 works for me.

sudo python3 -m pip install scikit-learn==0.21.2 --upgrade

There is a problem if a param is an array (list). Two lists with equal contents are still distinct objects, so the identity comparison fails:

param1 [150, 50]
param2 [150, 50]
param1 is not param2 True

This is the quick sanity check of the parameters at the end of clone (from sklearn/base.py):

for name in new_object_params:
    param1 = new_object_params[name]  # value handed to the clone's constructor
    param2 = params_set[name]         # value the clone's get_params() returns
    if param1 is not param2:          # identity check, not equality
        raise RuntimeError('Cannot clone object %s, as the constructor '
                           'either does not set or modifies parameter %s' %
                           (estimator, name))

I’m having the same issue, but none of the above solutions fixed it for me. Hoping someone can help. My environment looks like this:

Python 3.6
Windows 10
tensorflow==2.3.0
Keras==2.4.3
scikit-learn==0.21.2
pandas==0.25.3
numpy==1.18.5

Admittedly, I’m trying to rig up my own deep-learning grid-search script using a skeleton from auto_ml (no longer supported) and the logic from this post. The only thing I’m changing is that I’m also trying to pass in different hidden_layers configurations to build the model architecture with. Most of the articles I’ve found define a pre-set number of hidden layers, which I want to tune as well.

  1. I’ve tried downgrading my scikit-learn version to 0.21.2, didn’t work.
  2. I’ve tried changing my list of hidden_layers to a list of tuples, didn’t work.
  3. I’ve tried downgrading to TensorFlow 2.0, didn’t work.

What else can I try? What am I missing? To reproduce, first download pima-indians-diabetes.csv from this link. If you get an issue importing the .csv file, edit it so that the values are separated by commas rather than split across columns.

Then set up the same environment I’m using with pip install --upgrade tensorflow==2.3.0 Keras==2.4.3 scikit-learn==0.21.2 pandas==0.25.3 numpy==1.18.5 (substitute pip3 for pip if needed).

Then run this code:

# Use scikit-learn to grid search over Keras model hyperparams
import numpy
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
from keras import regularizers  # needed for the kernel_regularizer calls below
from keras.wrappers.scikit_learn import KerasClassifier
# Note: get_activation_layer and get_optimizer used below are helpers borrowed
# from auto_ml's utils_models; their definitions are omitted here

# Define some classification model hyper params to tune
hidden_layers = [
    (1,),    # note the trailing commas: (1) is just the int 1, not a 1-tuple
    (0.5,),
    (2,),
    (1, 1),
    (0.5, 0.5),
    (2, 2),
    (1, 1, 1),
    (1, 0.5, 0.5),
    (0.5, 1, 1),
    (1, 0.5, 0.25),
    (1, 2, 1),
    (1, 1, 1, 1),
    (1, 0.66, 0.33, 0.1),
    (1, 2, 2, 1)
]

batch_size = [16, 32, 64, 128, 256, 512]
optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']
activation = ['tanh', 'softmax', 'elu', 'softplus', 'softsign', 'relu', 'sigmoid', 'hard_sigmoid', 'linear', 'LeakyReLU', 'PReLU', 'ELU', 'ThresholdedReLU']
epochs = [100, 200, 500, 1000]
lr = [0.01, 0.001, 0.0001, 0.00001]
momentum = [0.0, 0.3, 0.6, 0.8, 0.9]
init_mode = ['uniform', 'lecun_uniform', 'normal', 'zero', 'glorot_normal', 'glorot_uniform', 'he_normal', 'he_uniform']
weight_constraint = [1, 3, 5]
dropout_rate = [0.0, 0.2, 0.4, 0.6, 0.8]

# Define these into a dictionary
model_params = dict(hidden_layers=hidden_layers, batch_size=batch_size, optimizer=optimizer, activation=activation, epochs=epochs, learning_rate=lr,
                    momentum=momentum, init_mode=init_mode, weight_constraint=weight_constraint, dropout_rate=dropout_rate)



# Define a function that creates our model structure for the current param set
def make_deep_learning_classifier(hidden_layers=None, batch_size=32, num_cols=None, learning_rate=0.001, optimizer='Adadelta', momentum=0.0, dropout_rate=0.2, 
                                  weight_constraint=0, final_activation='sigmoid', feature_learning=False, activation='elu', init_mode='normal',
                                  epochs=epochs):

    if hidden_layers is None:
        hidden_layers = [1, 0.75, 0.25]

    # The hidden_layers passed to us simply describes a shape. It does not know the num_cols
    # we are dealing with; it is simply values of 0.5, 1, and 2, which need to be multiplied by num_cols
    scaled_layers = []
    for layer in hidden_layers:
        scaled_layers.append(min(int(num_cols * layer), 10))  # scale by num_cols, cap each layer at 10 units

    model = Sequential()

    # There are times we will want the output from our penultimate layer, not the final layer,
    # so give the penultimate layer a name that makes it easy to find later (this has to do with
    # feature_learning, which I'm omitting in this example)
    model.add(Dense(scaled_layers[0], input_dim=num_cols, kernel_initializer=init_mode, kernel_regularizer=regularizers.l2(0.01)))
    model.add(get_activation_layer(activation))

    for layer_size in scaled_layers[1:-1]:
        model.add(Dense(layer_size, kernel_initializer=init_mode, kernel_regularizer=regularizers.l2(0.01)))
        model.add(get_activation_layer(activation))

    model.add(Dense(scaled_layers[-1], kernel_initializer=init_mode, name='penultimate_layer', kernel_regularizer=regularizers.l2(0.01)))
    model.add(get_activation_layer(activation))

    model.add(Dense(1, kernel_initializer=init_mode, activation=final_activation))
    model.compile(loss='binary_crossentropy', optimizer=get_optimizer(optimizer), metrics=['accuracy', 'poisson'])
    print(model.summary())
    return model



# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# How many columns (nodes) to pass to our first layer
num_cols = X.shape[1]

model = KerasClassifier(build_fn=make_deep_learning_classifier, num_cols=num_cols,  **model_params) # feature_learning=self.feature_learning,

grid = GridSearchCV(estimator=model, param_grid=model_params, n_jobs=-2, cv=3)
grid_result = grid.fit(X, Y)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Hope we can get this figured out for my case? Thanks!!

UPDATE:

I guess I should also add my full traceback:

Traceback (most recent call last):
  File "keras_grid_search.py", line 92, in <module>
    grid_result = grid.fit(X, Y)
  File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\model_selection\_search.py", line 632, in fit
    base_estimator = clone(self.estimator)
  File "C:\Users\windowshopr\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\base.py", line 75, in clone
    (estimator, name))
RuntimeError: Cannot clone object <tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier object at 0x000002384870B390>, as the constructor either does not set or modifies parameter hidden_layers
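
Given the discussion above, this last traceback is consistent with the same root cause: the search lists passed into the KerasClassifier constructor via **model_params come back from get_params as copies, so clone’s identity check fails on hidden_layers. A possible workaround, sketched below and untested in that exact environment, is to give the constructor only scalar defaults and keep the search lists solely in param_grid:

# Sketch of a workaround: pass only scalar defaults to the constructor and keep
# the search space in param_grid, so clone never has to compare a list
# parameter by identity. Variable names are from the script above.
model = KerasClassifier(build_fn=make_deep_learning_classifier,
                        num_cols=num_cols, verbose=0)

grid = GridSearchCV(estimator=model, param_grid=model_params, n_jobs=-2, cv=3)
grid_result = grid.fit(X, Y)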