lithops: Error when running grid search with RandomForestClassifier

When using the lithops joblib backend, a grid search with RandomForestClassifier fails with an error from joblib. The error complains that the system does not support forking, even though macOS does. Running a very similar example with either a different classifier or the default "loky" joblib backend completes without issue.

Please see the sklearn issue for full details; I was referred here as this appears to be the likely source of the issue.

Reproducible example:

import joblib
from lithops.util.joblib import register_lithops
from lithops.utils import setup_lithops_logger
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

digits = load_digits()
param_grid = {
    "n_estimators": [100, 50, 25],
}
model = RandomForestClassifier()
search = GridSearchCV(model, param_grid, cv=2, refit=True)


register_lithops()
setup_lithops_logger("INFO")

with joblib.parallel_backend("lithops"):
    search.fit(
        digits.data,
        digits.target,
    )
print("Best score: %0.3f" % search.best_score_)
print("Best parameters set:")
best_parameters = search.best_estimator_.get_params()
print(best_parameters)

Raises this error:


2023-10-16 13:01:43,896 [INFO] config.py:141 -- Lithops v3.0.1 - Python3.11
2023-10-16 13:01:43,896 [INFO] localhost.py:39 -- Localhost storage client created
2023-10-16 13:01:43,896 [INFO] localhost.py:58 -- Localhost compute client created
Traceback (most recent call last):
  File "/engineering/test_lithops.py", line 20, in <module>
    search.fit(
  File "/venv/lib/python3.11/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 898, in fit
    self._run_search(evaluate_candidates)
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 1422, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 875, in evaluate_candidates
    _warn_or_raise_about_fit_failures(out, self.error_score)
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 414, in _warn_or_raise_about_fit_failures
    raise ValueError(all_fits_failed_message)
ValueError: 
All the 6 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
6 fits failed with the following error:
Traceback (most recent call last):
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 729, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/venv/lib/python3.11/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/sklearn/ensemble/_forest.py", line 456, in fit
    trees = Parallel(
            ^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/sklearn/utils/parallel.py", line 65, in __call__
    return super().__call__(iterable_with_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/joblib/parallel.py", line 1854, in __call__
    n_jobs = self._initialize_backend()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/joblib/parallel.py", line 1332, in _initialize_backend
    n_jobs = self._backend.configure(n_jobs=self.n_jobs, parallel=self,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/lithops/util/joblib/lithops_backend.py", line 72, in configure
    raise ImportError(
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
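
For reference, this is what the joblib error message asks for: the same reproduction script with the entry point protected by "if __name__ == '__main__'". This is only a sketch of the suggested guard; whether it alone resolves the lithops backend error is not confirmed in this thread.

import joblib
from lithops.util.joblib import register_lithops
from lithops.utils import setup_lithops_logger
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def main():
    digits = load_digits()
    param_grid = {"n_estimators": [100, 50, 25]}
    model = RandomForestClassifier()
    search = GridSearchCV(model, param_grid, cv=2, refit=True)

    register_lithops()
    setup_lithops_logger("INFO")

    # Run the grid search through the lithops joblib backend
    with joblib.parallel_backend("lithops"):
        search.fit(digits.data, digits.target)

    print("Best score: %0.3f" % search.best_score_)
    print("Best parameters set:")
    print(search.best_estimator_.get_params())


# Protect the main entry point, as suggested by the joblib error message
if __name__ == '__main__':
    main()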

About this issue

  • Original URL
  • State: closed
  • Created 8 months ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Thanks, as I can see in the logs:

You have 10 worker processes (CPUs)

2023-10-19 11:24:49,315 [DEBUG] invokers.py:176 -- ExecutorID dc65f6-0 | JobID A000 - Worker processes: 10 - Chunksize: 10

while the output of CPU_COUNT is 1

2023-10-19 11:24:49,301 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-19 11:24:49,301 [DEBUG] config.py:103 -- Loading configuration from /Users/davidfoster/repos/data_engineering_model/.lithops_config
2023-10-19 11:24:49,311 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 11:24:49,311 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
1 **<--- CPU_COUNT**

CPU_COUNT should be 10, so my feeling is that your master branch is not updated to the latest changes.
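
A quick way to confirm which lithops release is installed in the active environment is the minimal sketch below, using only the standard library; note it shows the distribution version string, not the exact commit of master.

from importlib.metadata import version

# Print the lithops distribution version installed in this environment
print(version("lithops"))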

The clean command only deletes unnecessary lithops temporary data. Sometimes it can be good to run it if you see inconsistencies.

Can you place this script in the same directory where you have the sklearn script, run it, and copy/paste all the output here?

from lithops.multiprocessing import Pool, cpu_count
from lithops.utils import setup_lithops_logger

setup_lithops_logger("DEBUG")


def hello(name):
    return 'Hello {}!'.format(name)


if __name__ == '__main__':
    print(cpu_count())
    with Pool() as pool:
        res = pool.apply(hello, ('World', ))
        print(res)

Hi @foster999. I found that the lithops joblib backend was no longer compatible with the latest versions of joblib. I submitted a newer version of the backend. Can you test it with the master branch?