scikit-learn: Incorrect documentation for `warm_start` behavior on BaseForest-derived classes

Describe the issue linked to the documentation

The RandomForestRegressor documentation for the warm_start parameter states:

When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

The same wording appears in the documentation of all classes that derive from BaseForest, such as RandomForestClassifier and RandomTreesEmbedding.

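However, the source code does not reflect this behavior. When warm_start is True and n_more_estimators == 0 (that is, n_estimators has not been increased since the previous call to fit), fit does not build a new forest and instead just recomputes the OOB score if applicable.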
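A minimal way to observe this is sketched below, assuming a recent scikit-learn release. The synthetic dataset, the variable names, and the choice of 5 and 8 trees are illustrative only and are not part of the original report.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Small synthetic dataset, used only for illustration.
X, y = make_regression(n_samples=100, n_features=4, random_state=0)

forest = RandomForestRegressor(n_estimators=5, warm_start=True, random_state=0)
forest.fit(X, y)
trees_before = list(forest.estimators_)

# Second fit with warm_start=True and n_estimators unchanged:
# the existing trees are kept, no new forest is built.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    forest.fit(X, y)
same_trees = all(a is b for a, b in zip(trees_before, forest.estimators_))

# Increasing n_estimators adds trees on top of the existing ones.
forest.set_params(n_estimators=8)
forest.fit(X, y)
kept_old = all(a is b for a, b in zip(trees_before, forest.estimators_[:5]))
n_after_growth = len(forest.estimators_)

# For comparison: with warm_start=False every call to fit builds new trees.
cold = RandomForestRegressor(n_estimators=5, warm_start=False, random_state=0)
cold.fit(X, y)
cold_before = list(cold.estimators_)
cold.fit(X, y)
cold_refit = all(a is not b for a, b in zip(cold_before, cold.estimators_))

print(same_trees, kept_old, n_after_growth, cold_refit)
print([str(w.message) for w in caught])
```

The list printed last shows any warnings emitted during the repeated warm-start fit.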

Suggest a potential alternative/fix

There are two potential fixes:

1. Reword the documentation to state:

   When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, reuse the existing ensemble.

2. Modify the actual behavior of this method to fit a new forest in the case where n_more_estimators == 0 to reflect the existing documentation.
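To make the difference between the current behavior and the second option concrete, here is a toy sketch of the branching involved. It is not scikit-learn's source: `ToyForest`, the `refit_when_unchanged` flag, and the string "trees" are hypothetical names invented for this illustration, and only `warm_start`, `n_estimators`, `estimators_`, and `n_more_estimators` mirror names used above.

```python
class ToyForest:
    """Hypothetical stand-in for a BaseForest-like estimator (not scikit-learn code)."""

    def __init__(self, n_estimators=3, warm_start=False, refit_when_unchanged=False):
        self.n_estimators = n_estimators
        self.warm_start = warm_start
        # Hypothetical switch that turns on the behavior proposed in the second fix.
        self.refit_when_unchanged = refit_when_unchanged
        self._next_id = 0

    def _new_trees(self, count):
        # "Trees" are just labelled strings so that rebuilds are easy to see.
        trees = [f"tree-{self._next_id + i}" for i in range(count)]
        self._next_id += count
        return trees

    def fit(self):
        # Without warm start (or on the first call) start from an empty ensemble.
        if not self.warm_start or not hasattr(self, "estimators_"):
            self.estimators_ = []

        n_more_estimators = self.n_estimators - len(self.estimators_)
        if n_more_estimators < 0:
            raise ValueError("n_estimators must not decrease when warm_start=True")

        if n_more_estimators == 0 and self.refit_when_unchanged:
            # Second fix: discard the ensemble and fit a whole new forest.
            self.estimators_ = []
            n_more_estimators = self.n_estimators

        # With n_more_estimators == 0 this adds nothing: the ensemble is reused.
        self.estimators_.extend(self._new_trees(n_more_estimators))
        return self


current = ToyForest(warm_start=True).fit()
first = list(current.estimators_)
current.fit()
reused = current.estimators_ == first

proposed = ToyForest(warm_start=True, refit_when_unchanged=True).fit()
first_proposed = list(proposed.estimators_)
proposed.fit()
rebuilt = proposed.estimators_ != first_proposed

print(reused, rebuilt)
```

Under the first fix only the wording changes, so the `current` path stays as it is; the second fix corresponds to the `refit_when_unchanged=True` path.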

About this issue

State: closed
Created 3 years ago
Comments: 19 (15 by maintainers)

Most upvoted comments

@amay1212 This is still an open issue as stated in different messages. #26318 is apparently trying to solve the issue.

We still need this issue to be open.

glemaitre on May 4, 2023

@yashasvimisra2798 sure, thanks!

NicolasHug on Jul 17, 2021