scikit-learn: GridSearchCV with xgboost estimator hangs when n_jobs!=1
I don’t know if this is related to #6147. I am running scikit-learn 0.18.dev0 and, unlike in that issue, I get no exception, so this looks like a different problem.
In any case, this is my code (the data I am using is the same as the data for the Santander kaggle competition, too big to attach).
from xgboost import XGBClassifier
from sklearn.grid_search import GridSearchCV  # module that contains grid_search.py in 0.18.dev0

# train_data is assumed to be a pandas DataFrame; predictors and target select its columns.
alg = XGBClassifier(max_depth=4, min_child_weight=1, n_estimators=1000, learning_rate=0.0202, gamma=0, nthread=4, subsample=0.6815, colsample_bytree=0.701, seed=1, silent=False)
param_test1 = {
    'max_depth': range(3, 10, 2),
    'min_child_weight': range(1, 10, 2)
}
gsearch1 = GridSearchCV(estimator=alg, param_grid=param_test1, scoring='roc_auc', iid=False, n_jobs=4, cv=5)
gsearch1.fit(train_data[predictors].as_matrix(), train_data[target].as_matrix())
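For scale, here is a rough sketch of how much work this search requests. It only counts grid points and threads with the standard library and does not reproduce the hang. The variable names `candidates`, `n_fits` and `max_threads` are mine, not part of either library.

```python
from itertools import product

# Grid and settings copied from the report above.
param_test1 = {
    "max_depth": range(3, 10, 2),         # 3, 5, 7, 9
    "min_child_weight": range(1, 10, 2),  # 1, 3, 5, 7, 9
}
cv = 5       # folds passed to GridSearchCV
n_jobs = 4   # worker processes requested from GridSearchCV
nthread = 4  # threads requested by each XGBClassifier

# Every combination of the two parameters is one candidate.
candidates = list(product(*param_test1.values()))
n_fits = len(candidates) * cv   # one fit per candidate per fold
max_threads = n_jobs * nthread  # if all workers train at the same time

print(len(candidates), "parameter combinations")
print(n_fits, "cross-validation fits")
print(max_threads, "XGBoost threads requested at once")
```

So the call asks for 20 candidates, 100 cross-validation fits of a 1000-tree model, and up to 16 threads spread over 4 worker processes.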
The program does not crash and does not throw an exception, but it also does nothing (Activity Monitor shows no activity). Quick debugging shows that the program enters _fit in grid_search.py but never reaches line 564. I did not debug further. A quick search brought me to issue #6147, so I tried removing the n_jobs argument. Removing n_jobs from the GridSearchCV call solves the issue.
About this issue
- State: closed
- Created 8 years ago
- Reactions: 4
- Comments: 16 (10 by maintainers)
Thank you lesteve, I removed n_jobs from the GridSearchCV call and that fixed the issue. I might try what you suggest in the future but for now I am happy as it is. I just wanted to make you aware of this issue, and possibly help others who may encounter the same problem, suggesting a possible temporary solution. Thank you.
Of course, I’d be happy to help. Yes, setting nthread=1 solves the issue as well. I don’t have Python 3, and at this point I’d rather not mess with my Python installation. By the way, I am on macOS and found issue #5115. I don’t know enough about the system to tell: could it be related?
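For anyone landing here later, the two workarounds reported in this thread can be summarised as keyword arguments. This is only a sketch: the dictionaries and the `nested_parallelism` helper are hypothetical names used for illustration, and the observation that both workarounds leave a single level of parallelism active is a description of the settings, not a confirmed diagnosis of the hang.

```python
# Settings from the original report: parallel search AND threaded XGBoost.
original = {"search": {"n_jobs": 4}, "xgb": {"nthread": 4}}

# Workaround A: drop n_jobs from GridSearchCV (equivalent to n_jobs=1),
# keep XGBoost multi-threaded.
option_a = {"search": {"n_jobs": 1}, "xgb": {"nthread": 4}}

# Workaround B: keep the parallel search, make XGBoost single-threaded.
option_b = {"search": {"n_jobs": 4}, "xgb": {"nthread": 1}}


def nested_parallelism(option):
    """Hypothetical helper: True if both levels ask for more than one worker."""
    return option["search"]["n_jobs"] != 1 and option["xgb"]["nthread"] != 1


for name, option in [("original", original), ("A", option_a), ("B", option_b)]:
    print(name, "nested parallelism:", nested_parallelism(option))
```

The values would be passed along as `XGBClassifier(..., **option_b["xgb"])` and `GridSearchCV(..., **option_b["search"])`, with the remaining arguments unchanged from the report.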