LightGBM: bin size 257 cannot run on GPU

I know there are a couple of other issues that mention this problem, but they have gotten messy with suggestions that it’s related to the categorical_feature setting and other things. Here is a clean MRE.

d9a96c90cb479cef87047ba20517d97982b563eb

lgb257.pkl.zip

import pickle

# Load the pickled estimator, data, and fit kwargs from the attached archive
with open("lgb257.pkl", "rb") as f:
    model, X, y, kwargs = pickle.load(f)
model.fit(X, y, **kwargs)

FYI, model.get_params() shows:

params = {'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 0.8, 'importance_type': 'gain',
          'learning_rate': 0.5, 'max_depth': 6, 'min_child_samples': 1, 'min_child_weight': 1.0, 'min_split_gain': 0.0,
          'n_estimators': 100, 'n_jobs': 8, 'num_leaves': 64, 'objective': 'binary', 'random_state': 1234,
          'reg_alpha': 0.0, 'reg_lambda': 1.0, 'silent': True, 'subsample': 0.7, 'subsample_for_bin': 200000,
          'subsample_freq': 1, 'pred_gap': None, 'pred_periods': None, 'max_bin': 255, 'scale_pos_weight': 1.0,
          'max_delta_step': 0.0, 'min_data_in_bin': 1, 'seed': 1234, 'early_stopping_limit': None, 'device_type': 'gpu',
          'gpu_device_id': 0, 'gpu_platform_id': 0, 'gpu_use_dp': True, 'feature_fraction_seed': 1235,
          'bagging_seed': 1236, 'num_threads': 8, 'num_class': 1, 'verbose': -1, 'categorical_feature': ''}

and FYI here is kwargs:

[image: screenshot of kwargs, not preserved]

[LightGBM] [Warning] num_threads is set=8, n_jobs=8 will be ignored. Current value: num_threads=8
[LightGBM] [Warning] seed is set=1234, random_state=1234 will be ignored. Current value: seed=1234
/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/basic.py:1586: UserWarning: Using categorical_feature in Dataset.
  warnings.warn('Using categorical_feature in Dataset.')
/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/basic.py:1590: UserWarning: categorical_feature in Dataset is overridden.
New categorical_feature is []
  'New categorical_feature is {}'.format(sorted(list(categorical_feature))))
/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/basic.py:1108: UserWarning: categorical_feature keyword has been found in `params` and will be ignored.
Please use categorical_feature argument of the Dataset constructor to pass this parameter.
  .format(key))
[LightGBM] [Fatal] bin size 257 cannot run on GPU
Traceback (most recent call last):
  File "/home/jon/h2oai.fullcondatest/h2oaicore/lgb257.py", line 18, in <module>
    model.fit(X, y, **kwargs)
  File "/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/sklearn.py", line 867, in fit
    callbacks=callbacks, init_model=init_model)
  File "/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/sklearn.py", line 637, in fit
    callbacks=callbacks, init_model=init_model)
  File "/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/engine.py", line 230, in train
    booster = Booster(params=params, train_set=train_set)
  File "/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/basic.py", line 2104, in __init__
    ctypes.byref(self.handle)))
  File "/home/jon/minicondadai/lib/python3.6/site-packages/lightgbm_gpu/basic.py", line 52, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

Running

model.fit(X, y)

fails the same way, but I’m unsure whether the sklearn API is then using ‘auto’ for categorical_feature.

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 2
  • Comments: 20

Most upvoted comments

Hi, any thoughts? Seems like a clear MRE, but it’s been 5 days and no response. Thanks.

@pseudotensor The current CUDA version is doing the correct thing: it can handle categorical features normally. The only problem is that the current implementation only does histogram construction on the GPU, so GPU utilization can be low.

Support for categorical features has not yet been added in the first part of our new CUDA version (#4630), but it will be added later.

Were you able to check with the above two .csv files for X and y?

I was not. If you’re subscribed to this issue, you’ll be notified when someone picks this up or has new information to share.

Here’s another minimal repro, in case it helps:

lgb.bin257.pkl.zip

import pickle
import lightgbm as lgb
print(lgb.__version__)

from lightgbm.sklearn import LGBMRegressor
with open("lgb.bin257.pkl", "rb") as f:
    X, y = pickle.load(f)
    model = LGBMRegressor(max_bin=252, device_type='gpu')
    model.fit(X, y)
    print("OK1")

    model = LGBMRegressor(max_bin=253, device_type='gpu')
    model.fit(X, y)
    print("OK2")

The first one passes and the second one fails; I’m not sure where 257 comes from:

3.2.1.99
OK1
[LightGBM] [Fatal] bin size 257 cannot run on GPU
Traceback (most recent call last):
  File "/nfs4/lgb_prefit_1c95733f-58d6-4a61-969f-b2331e03e895.py", line 13, in <module>
    model.fit(X, y)
  File "/home/arno/minicondadai_py38/lib/python3.8/site-packages/lightgbm/sklearn.py", line 851, in fit
    super().fit(X, y, sample_weight=sample_weight, init_score=init_score,
  File "/home/arno/minicondadai_py38/lib/python3.8/site-packages/lightgbm/sklearn.py", line 714, in fit
    self._Booster = train(params, train_set,
  File "/home/arno/minicondadai_py38/lib/python3.8/site-packages/lightgbm/engine.py", line 260, in train
    booster = Booster(params=params, train_set=train_set)
  File "/home/arno/minicondadai_py38/lib/python3.8/site-packages/lightgbm/basic.py", line 2537, in __init__
    _safe_call(_LIB.LGBM_BoosterCreate(
  File "/home/arno/minicondadai_py38/lib/python3.8/site-packages/lightgbm/basic.py", line 125, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

Process finished with exit code 1
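
Since max_bin=252 passes and max_bin=253 fails on this data, the boundary could also be located automatically on other datasets. Below is a minimal sketch of that idea, not part of the original repro. The helper names largest_passing_max_bin and gpu_fit_ok are hypothetical, and the search assumes that once a max_bin value fails, every larger value fails too, which has not been confirmed for this bug.

```python
from typing import Callable


def largest_passing_max_bin(fits_ok: Callable[[int], bool], lo: int = 2, hi: int = 255) -> int:
    """Binary-search the largest max_bin in [lo, hi] for which fits_ok(max_bin) is True.

    Assumption: fits_ok is monotone (True up to some threshold, False above it).
    Returns lo - 1 if even the smallest value fails.
    """
    best = lo - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits_ok(mid):
            best = mid      # mid works, look for something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid fails, look for something smaller
    return best


def gpu_fit_ok(max_bin: int, X, y) -> bool:
    """Hypothetical probe: True if a GPU fit with this max_bin does not raise."""
    import lightgbm as lgb  # imported lazily so the search helper works without LightGBM

    try:
        lgb.LGBMRegressor(max_bin=max_bin, device_type="gpu").fit(X, y)
        return True
    except lgb.basic.LightGBMError:
        return False
```

With X and y loaded from the pickle above, a call such as largest_passing_max_bin(lambda b: gpu_fit_ok(b, X, y)) would be expected to land on the last value that trains, at the cost of a handful of short fits.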

I think the old GPU/CUDA version will be abandoned. Also cc @shiyu1994 to follow up on this issue.