interpret: Segmentation Fault
This issue may be related to https://github.com/interpretml/interpret/issues/435.
I was trying to run model_ExpBoostReg.fit(X, y)
when I got the following segfault (full notebook available here):
---------------------------------------------------------------------------
TerminatedWorkerError Traceback (most recent call last)
Cell In[12], line 1
----> 1 model_ExpBoostReg.fit(X, y)
File /opt/conda/lib/python3.11/site-packages/interpret/glassbox/_ebm/_ebm.py:848, in EBMModel.fit(self, X, y, sample_weight, init_score)
821 early_stopping_rounds_local = 0
823 parallel_args.append(
824 (
825 dataset,
(...)
845 )
846 )
--> 848 results = provider.parallel(boost, parallel_args)
850 # let python reclaim the dataset memory via reference counting
851 del parallel_args # parallel_args holds references to dataset, so must be deleted
File /opt/conda/lib/python3.11/site-packages/interpret/provider/_compute.py:19, in JobLibProvider.parallel(self, compute_fn, compute_args_iter)
18 def parallel(self, compute_fn, compute_args_iter):
---> 19 results = Parallel(n_jobs=self.n_jobs)(
20 delayed(compute_fn)(*args) for args in compute_args_iter
21 )
22 return results
File /opt/conda/lib/python3.11/site-packages/joblib/parallel.py:1098, in Parallel.__call__(self, iterable)
1095 self._iterating = False
1097 with self._backend.retrieval_context():
-> 1098 self.retrieve()
1099 # Make sure that we get a last message telling us we are done
1100 elapsed_time = time.time() - self._start_time
File /opt/conda/lib/python3.11/site-packages/joblib/parallel.py:975, in Parallel.retrieve(self)
973 try:
974 if getattr(self._backend, 'supports_timeout', False):
--> 975 self._output.extend(job.get(timeout=self.timeout))
976 else:
977 self._output.extend(job.get())
File /opt/conda/lib/python3.11/site-packages/joblib/_parallel_backends.py:567, in LokyBackend.wrap_future_result(future, timeout)
564 """Wrapper for Future.result to implement the same behaviour as
565 AsyncResults.get from multiprocessing."""
566 try:
--> 567 return future.result(timeout=timeout)
568 except CfTimeoutError as e:
569 raise TimeoutError from e
File /opt/conda/lib/python3.11/concurrent/futures/_base.py:456, in Future.result(self, timeout)
454 raise CancelledError()
455 elif self._state == FINISHED:
--> 456 return self.__get_result()
457 else:
458 raise TimeoutError()
File /opt/conda/lib/python3.11/concurrent/futures/_base.py:401, in Future.__get_result(self)
399 if self._exception:
400 try:
--> 401 raise self._exception
402 finally:
403 # Break a reference cycle with the exception in self._exception
404 self = None
TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
The exit codes of the workers are {SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11), SIGSEGV(-11)}
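For readers wondering what SIGSEGV(-11) means: on POSIX systems a negative exit status means the process was killed by the signal with that number. The following is a minimal, standard-library-only sketch of that convention. The helper names `describe_exit` and `run_crashing_child` are made up for illustration and are not part of joblib or interpret.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Format an exit status in the NAME(code) style seen in the message above."""
    if returncode < 0:
        # A negative status means "terminated by signal number -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "UNKNOWN"
        return f"{name}({returncode})"
    return f"EXIT({returncode})"


def run_crashing_child():
    """Start a child interpreter that sends SIGSEGV to itself (POSIX only)."""
    code = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    return subprocess.run([sys.executable, "-c", code]).returncode


if __name__ == "__main__":
    print(describe_exit(run_crashing_child()))
```

This only shows how the exit code is read. It says nothing about why the EBM workers crashed.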
About this issue
- State: closed
- Created a year ago
- Comments: 20
That’s great, @paulbkoch. The SEGFAULT error is gone 😃 I have run into another problem, but I will probably raise it in a separate issue. Thanks a lot!
Hi @ricardobarroslourenco – I’m pretty sure I know what the problem is. It should only affect the conda build. I’ll put out a new release shortly fixing the issue.
Thanks for all your help!!
I can repro the error! I can take it from here and will let you know what I find.
Hi @ricardobarroslourenco – Thanks for submitting this error! Can you make 2 changes to that script and send me the new log:
- Set native_debug=True instead of False.
- Pass the parameter n_jobs=1 to the ExplainableBoostingRegressor.
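A rough sketch of what those two changes could look like. The script being discussed is not shown in this thread, so this assumes it enables logging through `interpret.develop.debug_mode`; the log file name, log level, and the helper name `fit_single_process` are placeholders chosen here, not taken from the original script.

```python
# Assumed logging settings: only native_debug=True comes from the request above.
DEBUG_KWARGS = {"log_filename": "log.txt", "log_level": "INFO", "native_debug": True}

# n_jobs=1 keeps boosting in the calling process instead of joblib workers.
EBM_KWARGS = {"n_jobs": 1}


def fit_single_process(X, y):
    """Fit an EBM regressor in-process with native debug logging turned on."""
    # Imported here so the settings above can be inspected without interpret installed.
    from interpret.develop import debug_mode  # assumed entry point for the debug script
    from interpret.glassbox import ExplainableBoostingRegressor

    debug_mode(**DEBUG_KWARGS)
    model = ExplainableBoostingRegressor(**EBM_KWARGS)
    return model.fit(X, y)
```

Calling `fit_single_process(X, y)` with the same X and y as in the report should then write the requested log. With a single process, a native crash would take down the calling interpreter rather than surface as a TerminatedWorkerError.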
Is this a public dataset that I can download?