LightGBM: Not able to train with `dart` and `early_stopping_rounds`

Environment info

Operating System: MacOS Mojave and Ubuntu

CPU/GPU model: CPU

C++/Python/R version: Python

Error message

When using the dart boosting type together with early_stopping_rounds, the model is not trained correctly: the RMSE computed after training does not match the RMSE reported at the best iteration during training.

When I use gbdt, the model trains fine, and I can reproduce the RMSE reported during training.
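One possible cause, not confirmed anywhere in this report, is that DART rescales trees it has already added whenever it drops them in a later round. If so, keeping only the first N trees after training would not reproduce the ensemble as it stood at round N. The toy sketch below illustrates that idea with NumPy only. It follows the normalization described in the DART paper, uses a shrunken copy of the residual as a stand-in for a fitted tree, and is not LightGBM's actual implementation. All names and constants (`drop_rate`, `check_at`, and so on) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=50)   # toy regression target

learning_rate = 0.5       # illustrative values, not LightGBM defaults
drop_rate = 0.5
n_rounds = 20
check_at = 5              # hypothetical "best iteration" to revisit later

trees = []       # each "tree" is just its prediction vector on the training rows
snapshots = []   # ensemble prediction as it stood after each round

for _ in range(n_rounds):
    # Randomly pick existing trees to drop for this round.
    dropped = [i for i in range(len(trees)) if rng.random() < drop_rate]
    kept_sum = sum(
        (t for i, t in enumerate(trees) if i not in dropped),
        np.zeros_like(y),
    )
    # Stand-in weak learner: a shrunken copy of the residual of the kept trees.
    new_tree = learning_rate * (y - kept_sum)
    k = len(dropped)
    # DART-style normalization: dropped trees are rescaled in place,
    # and the new tree is scaled down to match.
    for i in dropped:
        trees[i] = trees[i] * k / (k + 1)
    trees.append(new_tree / (k + 1))
    snapshots.append(np.sum(trees, axis=0))


def rmse(target, pred):
    return float(np.sqrt(np.mean((target - pred) ** 2)))


as_trained = snapshots[check_at - 1]           # ensemble at round `check_at`
truncated = np.sum(trees[:check_at], axis=0)   # first trees with their final weights

print("rmse at round", check_at, "during training:", rmse(y, as_trained))
print("rmse of first", check_at, "trees after training:", rmse(y, truncated))
print("identical:", np.allclose(as_trained, truncated))
```

In this toy model the two predictions differ because the early trees were rescaled in later rounds. With a gbdt-style loop, where earlier trees are never modified, the truncated ensemble would equal the snapshot.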

Reproducible examples

import lightgbm
import numpy as np

np.random.seed(1234)

params = {
    "early_stopping_rounds": 100,
    "metric": "root_mean_squared_error",
    "objective": "regression",
    "num_boost_round": 1000,
    "boosting_type": "dart",
}

size = (245688, 470)
x = np.random.exponential(scale=10, size=size)
y = 2 * x[:, 0] + np.random.exponential(scale=2, size=(size[0],))

x_val = np.random.exponential(scale=10, size=(int(size[0] / 13), size[1]))
y_val = 2 * x_val[:, 0] + np.random.exponential(scale=2, size=(int(size[0] / 13),))

model = lightgbm.LGBMModel(**params)

model.fit(x, y, eval_set=[(x, y), (x_val, y_val)], verbose=50)
train_pred = model.predict(x)
rmse = np.sqrt(np.mean((y - train_pred) ** 2))
print(f"Train rmse: {rmse}")

Output:

UserWarning: Starting from version 2.2.1, the library file in distribution wheels for macOS is built by the Apple Clang (Xcode_8.3.1) compiler.
This means that in case of installing LightGBM from PyPI via the ``pip install lightgbm`` command, you don't need to install the gcc compiler anymore.
Instead of that, you need to install the OpenMP library, which is required for running LightGBM on the system with the Apple Clang compiler.
You can install the OpenMP library by the following command: ``brew install libomp``.
  "You can install the OpenMP library by the following command: ``brew install libomp``.", UserWarning)
UserWarning: Found `num_boost_round` in params. Will use it instead of argument
  warnings.warn("Found `{}` in params. Will use it instead of argument".format(alias))
UserWarning: Found `early_stopping_rounds` in params. Will use it instead of argument
  warnings.warn("Found `{}` in params. Will use it instead of argument".format(alias))
Training until validation scores don't improve for 100 rounds.
[50]    valid_0's rmse: 3.86284 valid_1's rmse: 4.04269
[100]   valid_0's rmse: 3.64912 valid_1's rmse: 3.88063
Early stopping, best iteration is:
[34]    valid_0's rmse: 2.60659 valid_1's rmse: 2.88739
Train rmse: 16.60181744687661
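To make the mismatch in the log explicit, here is a minimal sketch that compares the RMSE logged at the best iteration with the RMSE recomputed from `model.predict`. The helper names `rmse` and `matches_logged_metric` and the tolerance are hypothetical. The two numbers are copied from the output above.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error, the same formula used in the script above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def matches_logged_metric(recomputed, logged, rel_tol=1e-3):
    """True if the recomputed metric agrees with the logged one within rel_tol."""
    return bool(np.isclose(recomputed, logged, rtol=rel_tol, atol=0.0))


# Values taken from the log: valid_0 is the training set in eval_set.
logged_best_train_rmse = 2.60659           # "[34] valid_0's rmse"
recomputed_train_rmse = 16.60181744687661  # "Train rmse" printed after fit

print(matches_logged_metric(recomputed_train_rmse, logged_best_train_rmse))  # False
```

With gbdt, the same check is expected to pass, since the recomputed RMSE reproduces the logged value.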

Thanks for an amazing product! 👍

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 16 (1 by maintainers)

Most upvoted comments

ping @Laurae2 @jameslamb for R’s fix

R-package should be fixed in #2443.