keras: RuntimeError: "Cannot clone object ..." when using sklearn.base.clone with BaseEstimator
Description
A RuntimeError is thrown when using sklearn.base.clone because get_params of KerasClassifier returns copies instead of references. The sanity check at the end of the clone function fails when the estimator's implementation copies parameters in its get_params method.
The scikit-learn issue 15722 opens the discussion of whether the behaviour of their estimator cloning functionality is incorrect or just poorly documented.
This bug may also affect KerasRegressor.
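To make the failure mode concrete, here is a minimal sketch in pure Python. It is not scikit-learn's or Keras's actual code: simplified_clone, ToyEstimator, ReferenceEstimator and CopyingEstimator are hypothetical stand-ins that only mimic the identity check described above.

```python
import copy


class ToyEstimator:
    """Hypothetical estimator that stores constructor kwargs as its parameters."""

    def __init__(self, **params):
        self.params = params


class ReferenceEstimator(ToyEstimator):
    def get_params(self):
        # New dict, but the values are the very same objects that were stored.
        return dict(self.params)


class CopyingEstimator(ToyEstimator):
    def get_params(self):
        # Deep copy: mutable values such as dicts come back as new objects.
        return copy.deepcopy(self.params)


def simplified_clone(estimator):
    """Simplified stand-in for a clone function with an identity sanity check."""
    params = {k: copy.deepcopy(v) for k, v in estimator.get_params().items()}
    new_object = type(estimator)(**params)
    params_set = new_object.get_params()
    for name in params:
        # The check compares identity, not equality.
        if params[name] is not params_set[name]:
            raise RuntimeError(
                "Cannot clone object, as the constructor either does not "
                "set or modifies parameter %s" % name
            )
    return new_object


if __name__ == "__main__":
    simplified_clone(ReferenceEstimator(n_classes=2, class_weight={0: 1, 1: 3}))
    try:
        simplified_clone(CopyingEstimator(n_classes=2, class_weight={0: 1, 1: 3}))
    except RuntimeError as exc:
        print(exc)
```

In this sketch the integer parameter passes the check even for the copying estimator, because copy.deepcopy returns immutable values such as ints unchanged, while the dict passed as class_weight is a fresh object on every call and fails it.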
Steps/Code to Reproduce
from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense
from sklearn.base import clone


def create_keras_classifier_model(n_classes):
    """Keras multinomial logistic regression creation model

    Args:
        n_classes(int): Number of classes to be classified

    Returns:
        Compiled keras model
    """
    # create model
    model = Sequential()
    model.add(Dense(n_classes, activation="softmax"))
    # Compile model
    model.compile(
        loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"]
    )
    return model


estimator = KerasClassifier(build_fn=create_keras_classifier_model, n_classes=2, class_weight={0: 1, 1: 3})
clone(estimator)
Expected Results
No error is thrown.
Actual Results
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-1-520f4ee6e745> in <module>
26 estimator = KerasClassifier(build_fn=create_keras_classifier_model, n_classes=2, class_weight={0: 1, 1:3})
27
---> 28 clone(estimator)
/usr/local/anaconda/envs/ivan/lib/python3.6/site-packages/sklearn/base.py in clone(estimator, safe)
73 raise RuntimeError('Cannot clone object %s, as the constructor '
74 'either does not set or modifies parameter %s' %
---> 75 (estimator, name))
76 return new_object
77
RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x7f7504148f28>, as the constructor either does not set or modifies parameter class_weight
About this issue
- State: closed
- Created 5 years ago
- Comments: 16
I had the same error when trying to pass a nested 2D array as parameters. I changed it to a list of tuples and it works. E.g., change
param = [[1,2], [3,4]]
to
param = [(1,2), (3,4)]
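A likely explanation for why tuples help, assuming the parameter values pass through copy.deepcopy and are then compared by identity: deep-copying a tuple of immutable values returns the same object, while deep-copying a list always creates a new one. The helper name survives_deepcopy below is hypothetical and only illustrates that difference.

```python
import copy


def survives_deepcopy(value):
    """Return True if copy.deepcopy hands back the very same object."""
    return copy.deepcopy(value) is value


as_list = [1, 2]
as_tuple = (1, 2)

print(survives_deepcopy(as_list))    # False: lists are always copied
print(survives_deepcopy(as_tuple))   # True: tuple of immutables is reused
```

A tuple that contains a mutable element, such as ([1, 2],), is still copied, so the workaround only applies when every nested element is immutable.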
Hi, I have the same issue, it seems to come from initialising parameter ranges using numpy arrays. Versions:
Full code for reproducibility:
In this simple example, switching between
param_grid = param_grid_breaking
and
param_grid = param_grid_working
respectively breaks or does not break the cloning process. I believe the issue stems from the list members' types being different in each case (Python's int vs NumPy's np.int64).
You can work around this issue by casting numpy arrays to lists using the .tolist() method.
This would probably work for multidimensional parameters too, as it preserves the nested list structure, but it discards numpy's dtype in favour of Python's builtin types and probably uses far more memory.
This is a scikit-learn bug. You should downgrade scikit-learn:
conda install scikit-learn==0.21.2
In sklearn/model_selection/_search.py (sklearn version 0.22.1), the nested clone call on line 735 seems to be the culprit:
self.best_estimator_ = clone(clone(base_estimator).set_params(**self.best_params_))
It changed from the following in a previous version:
self.best_estimator_ = clone(base_estimator).set_params(**self.best_params_)
Changing this back to the previous version seems to fix the error (Cannot clone object…), though I am not sure that is the correct solution, based upon the discussion in this thread.
Thoughts?
Casting numpy arrays to list using the .tolist() method worked for me. Great help @sam1902
Downgrading Scikit-learn from 0.22.2 to 0.21.2 works for me.
There is a problem if param is an array:
param1 [150, 50]
param2 [150, 50]
param1 is not param2 True
quick sanity check of the parameters of the clone
I’m having the same issue, but none of the above solutions fixed the issue for me. Hoping someone can help. My environment looks like this:
Admittedly, I’m trying to rig up my own deep learning grid search script using a skeleton from auto_ml (no longer supported) and the logic from this post. The only thing I’m changing is that I’m also trying to input some different hidden_layer configurations to create the model architecture with. Most of the articles I’ve found already define a pre-set number of hidden layers, which I want to tune as well.
What else can I try? What am I missing? To reproduce, first download pima-indians-diabetes.csv from this link. If you get an issue when importing the .csv file, edit it so that each row is separated by columns, not commas.
Then set up the same environment I’m using with the command
pip install --upgrade tensorflow==2.3.0 Keras==2.4.3 scikit-learn==0.21.2 pandas==0.25.3 numpy==1.18.5
or
pip3 install --upgrade tensorflow==2.3.0 Keras==2.4.3 scikit-learn==0.21.2 pandas==0.25.3 numpy==1.18.5
Then run this code:
Hope we can get this figured out for my case? Thanks!!
UPDATE:
I guess I should also add my full traceback: