scikit-learn: GridSearchCV with xgboost estimator hangs when n_jobs!=1

I don’t know if this is related to #6147. I am on scikit-learn 0.18.dev0 and I get no exception, though, so this looks like a different problem.

In any case, this is my code (the data is the same as for the Santander Kaggle competition and is too big to attach).

from sklearn.grid_search import GridSearchCV
from xgboost import XGBClassifier

# train_data, predictors and target come from the Santander data (not attached).
alg = XGBClassifier(max_depth=4, min_child_weight=1, n_estimators=1000, learning_rate=0.0202, gamma=0, nthread=4, subsample=0.6815, colsample_bytree=0.701, seed=1, silent=False)

param_test1 = {
    'max_depth': range(3, 10, 2),
    'min_child_weight': range(1, 10, 2),
}

# n_jobs=4 is the setting that triggers the hang.
gsearch1 = GridSearchCV(estimator=alg, param_grid=param_test1, scoring='roc_auc', iid=False, n_jobs=4, cv=5)
gsearch1.fit(train_data[predictors].as_matrix(), train_data[target].as_matrix())
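
For scale, here is a rough sketch of how much work the search above requests. It uses only the values from the report. The variable names cv_folds, candidates, total_fits and peak_threads are made up for illustration, and the thread count is only an upper bound that assumes every worker process lets xgboost use its full nthread. It does not by itself explain the hang.

```python
from itertools import product

# Values copied from the report.
param_test1 = {
    "max_depth": range(3, 10, 2),         # 3, 5, 7, 9
    "min_child_weight": range(1, 10, 2),  # 1, 3, 5, 7, 9
}
cv_folds = 5   # cv=5
n_jobs = 4     # worker processes started by GridSearchCV
nthread = 4    # threads requested by each XGBClassifier

# Every combination of the two parameters is one candidate.
candidates = list(product(*param_test1.values()))

# Each candidate is fitted once per fold (the final refit on the
# full data is not counted here).
total_fits = len(candidates) * cv_folds

# Upper bound on simultaneously requested threads (an assumption,
# see the note above).
peak_threads = n_jobs * nthread

print(len(candidates), total_fits, peak_threads)
```

With both layers set to 4, the two levels of parallelism are nested, which is why turning off either one is a natural thing to try.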

The program does not crash and does not throw an exception, but it does not do anything either (Activity Monitor shows no activity). Quick debugging shows that the program enters _fit in grid_search.py but never reaches line 564. I did not debug further. A quick search brought me to issue #6147, so I tried removing the n_jobs argument.

Removing n_jobs from the GridSearchCV call solves the issue.

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 4
  • Comments: 16 (10 by maintainers)

Most upvoted comments

Thank you lesteve, I removed n_jobs from the GridSearchCV call and that fixed the issue. I might try what you suggest in the future but for now I am happy as it is. I just wanted to make you aware of this issue, and possibly help others who may encounter the same problem, suggesting a possible temporary solution. Thank you.

Of course, I’d be happy to help. Yes, setting nthread=1 solves the issue as well. I don’t have Python 3, and at this point I’d rather not mess with my Python installation. By the way, I am using macOS and found issue #5115. I don’t know enough about the system; could it be related?
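
The two workarounds reported in this thread (dropping n_jobs, or setting nthread=1) both come down to keeping only one layer parallel. A minimal sketch of that idea follows. The helper names one_level_parallel and build_search are hypothetical and not part of either library, and the imports follow the scikit-learn 0.18.dev0 and xgboost versions of the report. Newer releases differ: GridSearchCV lives in sklearn.model_selection, iid has been removed, and XGBClassifier takes n_jobs.

```python
def one_level_parallel(cores, parallel_layer):
    """Hypothetical helper: give all cores to one layer and a single
    worker to the other, mirroring the two workarounds in the thread."""
    if parallel_layer == "xgboost":
        # Workaround 1: leave GridSearchCV serial (n_jobs=1, its default).
        return {"nthread": cores, "n_jobs": 1}
    if parallel_layer == "grid_search":
        # Workaround 2: keep n_jobs, but make each xgboost fit single-threaded.
        return {"nthread": 1, "n_jobs": cores}
    raise ValueError("parallel_layer must be 'xgboost' or 'grid_search'")


def build_search(settings, param_grid):
    """Build the search from the report with the chosen thread settings."""
    # Local imports, so the helper above works without the libraries installed.
    # Module path as used in the report; an assumption for any other version.
    from sklearn.grid_search import GridSearchCV
    from xgboost import XGBClassifier

    alg = XGBClassifier(n_estimators=1000, learning_rate=0.0202,
                        nthread=settings["nthread"], seed=1)
    return GridSearchCV(estimator=alg, param_grid=param_grid,
                        scoring="roc_auc", iid=False,
                        n_jobs=settings["n_jobs"], cv=5)


settings = one_level_parallel(4, "grid_search")
print(settings)
```

Calling build_search(settings, param_test1) and then fit would run the original search with one of the two layers serialized. Which layer is faster to keep parallel depends on the data and the machine.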