scikit-lego: [BUG] Stacking classifier cannot use Thresholder function - no .predict_proba
Description:
I’m able to use the Thresholder on sklearn’s voting classifier, but not on the stacking classifier. It raises the error below, which I believe is wrong: StackingClassifier does have predict_proba. Maybe I’m misunderstanding the use case, but this seems to fit.
ValueError: The Thresholder meta model only works on classifcation models with .predict_proba.
Code for reproduction (using the sklearn sample data for StackingClassifier):
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import StackingClassifier
from sklego.meta import Thresholder  # Thresholder lives in sklego.meta

X, y = load_iris(return_X_y=True)
estimators = [
    ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
    ('svr', make_pipeline(StandardScaler(), LinearSVC(random_state=42))),
]
clf = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression())
clf.fit(X, y)
a = Thresholder(clf, threshold=0.2)
a.fit(X, y)  # raises the ValueError shown above
a.predict(X)
Full trace:
ValueError Traceback (most recent call last)
<ipython-input-26-1b89dbfa16b8> in <module>
16
17 a = Thresholder(clf, threshold=0.2)
---> 18 a.fit(X_train_std, np.ceil(y_train[targets[2]]))
19 a.predict(X_train_std)
~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\sklego\meta\thresholder.py in fit(self, X, y, sample_weight)
54 self.estimator_ = clone(self.model)
55 if not isinstance(self.estimator_, ProbabilisticClassifier):
---> 56 raise ValueError(
57 "The Thresholder meta model only works on classifcation models with .predict_proba."
58 )
ValueError: The Thresholder meta model only works on classifcation models with .predict_proba.
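The trace shows the check running on a freshly cloned, and therefore unfitted, copy of the model. A minimal diagnostic sketch follows, assuming the ProbabilisticClassifier check amounts to looking for a predict_proba attribute (an assumption, not confirmed in this thread). It only reports whether predict_proba is visible before and after fitting; the result before fitting may differ between scikit-learn versions.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
estimators = [
    ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ("svr", make_pipeline(StandardScaler(), LinearSVC(random_state=42))),
]
clf = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(max_iter=1000),
)

# Thresholder.fit clones the model (line 54 of the trace), so the check
# on line 55 inspects an unfitted copy. Reproduce that situation here.
unfitted = clone(clf)
has_proba_before_fit = hasattr(unfitted, "predict_proba")

# Fit a second clone and look again.
fitted = clone(clf).fit(X, y)
has_proba_after_fit = hasattr(fitted, "predict_proba")

print("predict_proba visible before fit:", has_proba_before_fit)
print("predict_proba visible after fit: ", has_proba_after_fit)
```

If the first value is False and the second is True, the error would come from when the check runs rather than from StackingClassifier lacking predict_proba.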
About this issue
- State: closed
- Created 2 years ago
- Comments: 16 (11 by maintainers)
Commits related to this issue
- Added unit test as described in issue #501. Added solution to not clone the model — committed to MarkusDegen/scikit-lego by MarkusDegen 2 years ago
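The commit message mentions not cloning the model. A related idea is to check for predict_proba only after the wrapped model has been fitted. The sketch below illustrates that idea with the standard library only. Both LateProbaModel and PostFitThresholder are hypothetical names made up for illustration; this is not the code from the commit and not sklego's Thresholder.

```python
class LateProbaModel:
    """Hypothetical stand-in for an estimator whose predict_proba
    only becomes visible after fit (not a real scikit-learn class)."""

    def fit(self, X, y):
        # "Learn" the share of positive labels; enough for illustration.
        self.positive_rate_ = sum(y) / len(y)
        return self

    @property
    def predict_proba(self):
        # Raising AttributeError makes hasattr() return False before fit.
        if "positive_rate_" not in self.__dict__:
            raise AttributeError("predict_proba is only available after fit")
        rate = self.positive_rate_
        return lambda X: [[1.0 - rate, rate] for _ in X]


class PostFitThresholder:
    """Hypothetical wrapper: fits the given model as-is (no clone)
    and checks for predict_proba afterwards."""

    def __init__(self, model, threshold):
        self.model = model
        self.threshold = threshold

    def fit(self, X, y):
        # Fit first, then check, so late-bound methods are visible.
        self.estimator_ = self.model.fit(X, y)
        if not hasattr(self.estimator_, "predict_proba"):
            raise ValueError("model has no predict_proba after fitting")
        return self

    def predict(self, X):
        proba = self.estimator_.predict_proba(X)
        # Binary case: label 1 when P(class 1) reaches the threshold.
        return [int(row[1] >= self.threshold) for row in proba]


X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 0, 1]

print(hasattr(LateProbaModel(), "predict_proba"))  # False: not fitted yet
wrapped = PostFitThresholder(LateProbaModel(), threshold=0.2).fit(X, y)
print(wrapped.predict(X))  # [1, 1, 1, 1]: positive rate 0.25 >= 0.2
```

The trade-off in this sketch is that the caller's model object is fitted in place, which differs from the usual scikit-learn convention of cloning inside fit.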
The PR is now merged into the main branch. I’d like to give it another week of waiting to see if other PRs come in. If so, we may be able to batch together a few fixes into new releases on PyPI.
@MarkusDegen, thanks for the PR!
The unit test shows the same error. I guess I need to slim it down a bit and add some asserts. I also need to learn what the stacking classifier actually does.
No worries. But you may appreciate calmcode.io
I’ll be honest, I’m a very amateur programmer. I’m out of my depth writing the tests for it.