LightGBM: [bug] Creating multiple boosters from file results in broken scores.

Using the master branch, if I create 2 boosters from file using the C API, I get broken scores from all but the first created booster:


BoosterHandle bh1, bh2;
int num_iterations;

LGBM_BoosterCreateFromModelfile("./LightGBM_model.txt", &num_iterations, &bh1); // comment this out and you get good results
LGBM_BoosterCreateFromModelfile("./LightGBM_model.txt", &num_iterations, &bh2);

// The score is only correct if I predict with bh1, or if I predict with bh2 without creating bh1 before bh2:
predict(bh2, values, NUM_FEATURES, num_iterations, &out_len, &ref_scores[0]); // score changes depending on whether bh2 is the first Booster created



Meaning, only the first booster created works properly.

I’m on Ubuntu 18.04, x86_64.

Thank you

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@AlbertoEAF I think I found the problem. Is your model file produced by an earlier version of LightGBM (earlier than the merge of linear_tree in #3299)? If so, the field is_linear= is missing from the model file. And when loaded by the latest version of LightGBM, the is_linear_ member of tree will be undefined, which causes undefined prediction behavior.

To fix this, let’s set the is_linear_ field to false when it is absent from the model file, for backward compatibility.

I’ll open a PR for this.
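For illustration only, the backward-compatible default described above could look roughly like the sketch below. This is not LightGBM's actual loading code: `TreeSketch`, `ParseKeyValues`, and `LoadTreeSketch` are hypothetical names, and the `key=value` layout with a `0`/`1` encoding for `is_linear` is an assumption made for the example.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a tree read from a model file (not LightGBM's real class).
struct TreeSketch {
  int num_leaves = 0;
  bool is_linear = false;  // default initializer, so the member is never indeterminate
};

// Split "key=value" lines into a map; lines without '=' are skipped.
std::unordered_map<std::string, std::string> ParseKeyValues(const std::string& text) {
  std::unordered_map<std::string, std::string> kv;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    const auto pos = line.find('=');
    if (pos == std::string::npos) continue;
    kv[line.substr(0, pos)] = line.substr(pos + 1);
  }
  return kv;
}

// Build a tree from model text, treating a missing is_linear field as false.
TreeSketch LoadTreeSketch(const std::string& text) {
  const auto kv = ParseKeyValues(text);
  TreeSketch tree;
  auto it = kv.find("num_leaves");
  if (it != kv.end()) tree.num_leaves = std::stoi(it->second);
  it = kv.find("is_linear");
  // Older model files have no is_linear entry: fall back to false explicitly.
  tree.is_linear = (it != kv.end()) ? (std::stoi(it->second) != 0) : false;
  return tree;
}
```

With this shape, `LoadTreeSketch("num_leaves=31\n")` yields a tree whose `is_linear` is `false`, so a model file written before the field existed loads the same way every time instead of depending on whatever happened to be in memory.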

Hello @shiyu1994 and @guolinke I managed to reproduce it. I created a new branch https://github.com/feedzai/LightGBM/tree/test-double-model-load which you can use (has the model and the profile code).

To test:

# Compile
mkdir build && cd build && cmake -DBUILD_PROFILING_TESTS=ON .. && make

# Go to main repo folder & run
cd ..
./lightgbm_profile_single_row_predict

You should get the following outputs, which differ even though they should be the same for both boosterHandles 1 & 2:

[image: screenshot of the outputs for boosterHandle 1 and boosterHandle 2]

If you comment out the creation of boosterHandle1 and its predictions, boosterHandle2 starts producing the results shown for boosterHandle1 in the screenshot above.

Hi there 😃,

Unfortunately I cannot provide that model file as it is sensitive, nor do I have much time on my hands to try to produce a simpler example this week. I can try at the weekend though, since it’s urgent.