LightGBM: Avoiding Exception "Check failed: (best_split_info.right_count) > (0) at ..." with a regression task
How are you using LightGBM?
- Python package
Environment info
- Operating System: Ubuntu 20.04.1 LTS
- Python version: 3.8.5
- GCC 7.3.0
- LightGBM version or commit hash: 3.1.1
Steps to reproduce
- In a JupyterLab notebook, prepare train and validation datasets. (They are huge and private, so I can’t share a reproducible example.)
- Train LightGBM on the data with different sets of features.
- Observe an exception looking like this:
Check failed: (best_split_info.right_count) > (0) at [...]
Sometimes it says left_count instead of right_count.
Other times it doesn’t occur at all, depending on the features I use.
Other details
Apparently this is the start of the piece of code initiating the exception: https://github.com/microsoft/LightGBM/blob/master/src/treelearner/serial_tree_learner.cpp#L652.
I tried setting min_data_in_leaf to a value greater than zero. It helps sometimes, but not reliably. Same with feature_fraction. I also tried changing min_sum_hessian_in_leaf, to no avail. Also tried setting min_data_in_leaf and min_sum_hessian_in_leaf simultaneously, no difference.
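For readers who want to try the same knobs, here is a minimal sketch of how the parameters mentioned above are passed to the Python package. The concrete values are assumptions for illustration only (the values actually tried are not stated here), and `train` is a hypothetical helper name, not part of LightGBM.

```python
# Illustrative parameter set. The values are assumptions, not the ones
# used by the author of this issue.
params = {
    "objective": "regression",
    "min_data_in_leaf": 50,           # minimum number of rows per leaf
    "min_sum_hessian_in_leaf": 1e-2,  # minimum sum of hessians per leaf
    "feature_fraction": 0.8,          # fraction of features sampled per tree
    "verbose": -1,
}


def train(params, X_train, y_train, X_valid, y_valid, num_boost_round=100):
    """Hypothetical helper: train a booster with a validation set attached."""
    import lightgbm as lgb  # imported lazily so the sketch loads without LightGBM

    train_set = lgb.Dataset(X_train, label=y_train)
    valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)
    return lgb.train(
        params,
        train_set,
        num_boost_round=num_boost_round,
        valid_sets=[valid_set],
    )
```

As described above, changing these values made the error less frequent in some runs but did not remove it reliably.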
This (or a similar) issue is mentioned a few times here:
- https://stackoverflow.com/questions/60161691/best-split-info-check-failure-encountered-while-fitting-lightgbm-classifier
- https://github.com/microsoft/LightGBM/issues/3603
- https://github.com/microsoft/LightGBM/issues/2742
None of them suggests an approach that allowed me to avoid these exceptions. Would you please share any ideas on how to fix this, or at least explain why this issue happens at all? If I understand correctly, one could simply trim the split leading to this error and stop branching further. Please correct me if I’m wrong. Thank you.
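Trimming the offending split would have to happen inside the library. From the Python side, one possible stopgap is to catch the failure and retry with a different parameter set, since the error depends on the settings used. The sketch below is an assumption-laden workaround, not a fix for the underlying problem; `train_with_fallbacks` and its arguments are hypothetical names.

```python
def train_with_fallbacks(train_fn, param_variants, error_type=Exception):
    """Try each parameter dict in turn.

    Returns (model, params) for the first variant that trains without
    raising `error_type`. Raises RuntimeError if every variant fails.
    """
    last_error = None
    for params in param_variants:
        try:
            return train_fn(params), params
        except error_type as err:
            # Remember the failure and move on to the next variant.
            last_error = err
    raise RuntimeError("all parameter variants failed") from last_error
```

With LightGBM, `train_fn` would wrap a call to `lightgbm.train`, and `error_type` would be `lightgbm.basic.LightGBMError`, the exception class that carries the "Check failed" message.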
About this issue
- State: open
- Created 4 years ago
- Comments: 56 (11 by maintainers)
@mshivers Thanks! Given that reproducible example, we should reopen this issue. I’ll investigate it further in the next few days.
I have the same issue on 3.1.1, built from source with GPU support.
lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) at G:\Projects\LightGBM\LightGBM\src\treelearner\serial_tree_learner.cpp, line 653 .
#3694 has been opened to potentially fix these errors, but it only concerns the CPU version. We will need to investigate further if the errors are not fully eliminated after this PR is merged.
A potential bug in histogram offset assignment may cause this error. I will create a PR for this.
Hi @shiyu1994, I’m using CPUs. I’ve managed to reproduce the error using only randomly generated data. I’m on a corporate network that restricts data upload; however, when I run the script below, it usually takes only a few minutes before it throws the error:
For me, updating 3.1.1 -> 3.2.1 fixes the issue (CPU, MacBook Pro 16", macOS Catalina).
@pseudotensor Can you please try version 3.0.0 to see if the same problem occurs? Is your training data private? If not, can you share it with us? Thanks.
It was on a large private dataset, which I can’t provide. The error went away with other LightGBM settings. I will try to reproduce it on some small data.
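Several reports in this thread hinge on which LightGBM version is installed (3.0.0, 3.1.1, 3.2.1, or a build from master). A small sketch for checking this before comparing results follows; the helper names are hypothetical, and the 3.2.1 default reflects only the single report above that upgrading to it made the error disappear, not a confirmed fix version.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '3.2.1' into (3, 2, 1); suffixes and extra components are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '3.2' to (3, 2, 0)
        parts.append(0)
    return tuple(parts)


def is_older_than(installed, reference="3.2.1"):
    """True if `installed` sorts before `reference`."""
    return version_tuple(installed) < version_tuple(reference)


def installed_lightgbm_version():
    """Return the installed lightgbm version string, or None if not installed."""
    try:
        return metadata.version("lightgbm")
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_older_than("3.1.1")` is true, so an environment reporting that version predates the release that one commenter found to work.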
@shiyu1994 Just for your information.
I also got a similar error, searched the web, and found this issue.
I am using LightGBM 3.1.1 (the version installed by “pip3 install lightgbm”). I run it with missing_data=True, a regression task, least-squares error, no GPU, and categorical features.
I got the following error at some point:
I saw that #3694 had been merged, so I compiled the latest version from GitHub master, and it currently works.
My data is also private and cannot be shared. Sorry about that.
The last step should be
if you’d like the Python package installation to pick up the already compiled dynamic library.
@shiyu1994 Could you please transfer your changes from your fork to this repository? I believe you have enough rights to do this as a collaborator. Then we can trigger Azure Pipelines to build a Python wheel file with your changes. After that, @ch3rn0v will be able to install the patched version with a simple
pip install ... in an isolated env without any other requirements.