machinelearning: LightGBM trainer exception
System information
- OS version/distro: Windows 10
- .NET Version (eg., dotnet --info): .NET Core 2.1
Issue
- What did you do? Ran MML command line: execgraph "C:\Benchmarking\automl_graph.json"
Contents of automl_graph.json:
{
"Inputs": {
"file_train": "D:\\SplitDatasets\\ExcitementFG2_train.csv",
"file_test": "D:\\SplitDatasets\\ExcitementFG2_valid.csv"
},
"Nodes": [
{
"Inputs": {
"CustomSchema": "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+",
"InputFile": "$file_train"
},
"Name": "Data.CustomTextLoader",
"Outputs": {
"Data": "$data_train"
}
},
{
"Inputs": {
"CustomSchema": "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+",
"InputFile": "$file_test"
},
"Name": "Data.CustomTextLoader",
"Outputs": {
"Data": "$data_test"
}
},
{
"Inputs": {
"BatchSize": 3,
"StateArguments": {
"Name": "AutoMlState",
"Settings": {
"Engine": {
"Name": "Rocket",
"Settings": {}
},
"Metric": "Accuracy",
"TerminatorArgs": {
"Name": "IterationLimited",
"Settings": {
"FinalHistoryLength": 100
}
},
"TrainerKind": "SignatureBinaryClassifierTrainer"
}
},
"TestingData": "$data_test",
"TrainingData": "$data_train",
"IgnoreColumns": ["cost"]
},
"Name": "Models.PipelineSweeper",
"Outputs": {
"Results": "$output_data",
"State": "$xyz"
}
}
],
"Outputs": {
"output_data": "C:\\Benchmarking\\01-ResultsOut.csv"
}
}
- What happened? Encountered an exception in LightGBM trainer
- What did you expect? A run to completion, w/o exception
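For reference, the CustomSchema string used by both Data.CustomTextLoader nodes can be unpacked to see which columns are loaded and with which type. The following is a minimal Python sketch, not part of MML or ML.NET; the helper name parse_columns is hypothetical, and it assumes each entry has the form col=Name:Type:Index or col=Name:Type:First-Last, which covers the entries in this graph.

```python
import re

# Schema string copied from the Data.CustomTextLoader nodes in the graph.
SCHEMA = "sep=, col=Label:R4:78 col=Features1:R4:0-77 col=Features2:R4:79-202 header=+"


def parse_columns(schema):
    """Return (name, type, first_index, last_index) for each col= entry.

    Hypothetical helper: it only handles the col=Name:Type:Index and
    col=Name:Type:First-Last forms that appear in this issue.
    """
    columns = []
    pattern = r"col=(\w+):(\w+):(\d+)(?:-(\d+))?"
    for name, kind, first, last in re.findall(pattern, schema):
        start = int(first)
        end = int(last) if last else start  # single index means a one-column range
        columns.append((name, kind, start, end))
    return columns


columns = parse_columns(SCHEMA)
for name, kind, start, end in columns:
    print(f"{name}: type={kind}, columns {start}..{end} ({end - start + 1} wide)")

# Every declared column uses the R4 (single-precision float) type.
all_numeric = all(kind == "R4" for _, kind, _, _ in columns)
total_width = sum(end - start + 1 for _, _, start, end in columns)
```

Under these assumptions, the schema declares one Label column and two feature ranges, all typed R4, so nothing in the loader configuration marks a column as categorical.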
Source code / logs
— Command line args —
dotnet MML.dll execgraph C:\Benchmarking\automl_graph.json
— Exception message —
System.InvalidOperationException
HResult=0x80131509
Message=Categorical split features is zero length
Source=Microsoft.ML.Core
StackTrace:
at Microsoft.ML.Runtime.Contracts.Check(Boolean f, String msg) in C:\MLDotNet\src\Microsoft.ML.Core\Utilities\Contracts.cs:line 497
at Microsoft.ML.Trainers.FastTree.Internal.RegressionTree.CheckValid(Action`2 checker) in C:\MLDotNet\src\Microsoft.ML.FastTree\TreeEnsemble\RegressionTree.cs:line 469
at Microsoft.ML.Trainers.FastTree.Internal.RegressionTree..ctor(Int32[] splitFeatures, Double[] splitGain, Double[] gainPValue, Single[] rawThresholds, Single[] defaultValueForMissing, Int32[] lteChild, Int32[] gtChild, Double[] leafValues, Int32[][] categoricalSplitFeatures, Boolean[] categoricalSplit) in C:\MLDotNet\src\Microsoft.ML.FastTree\TreeEnsemble\RegressionTree.cs:line 223
at Microsoft.ML.Trainers.FastTree.Internal.RegressionTree.Create(Int32 numLeaves, Int32[] splitFeatures, Double[] splitGain, Single[] rawThresholds, Single[] defaultValueForMissing, Int32[] lteChild, Int32[] gtChild, Double[] leafValues, Int32[][] categoricalSplitFeatures, Boolean[] categoricalSplit) in C:\MLDotNet\src\Microsoft.ML.FastTree\TreeEnsemble\RegressionTree.cs:line 189
at Microsoft.ML.Runtime.LightGBM.Booster.GetModel(Int32[] categoricalFeatureBoudaries) in C:\MLDotNet\src\Microsoft.ML.LightGBM\WrappedLightGbmBooster.cs:line 241
at Microsoft.ML.Runtime.LightGBM.LightGbmTrainerBase`3.TrainCore(IChannel ch, IProgressChannel pch, Dataset dtrain, CategoricalMetaData catMetaData, Dataset dvalid) in C:\MLDotNet\src\Microsoft.ML.LightGBM\LightGbmTrainerBase.cs:line 378
at Microsoft.ML.Runtime.LightGBM.LightGbmTrainerBase`3.TrainModelCore(TrainContext context) in C:\MLDotNet\src\Microsoft.ML.LightGBM\LightGbmTrainerBase.cs:line 126
at Microsoft.ML.Runtime.Training.TrainerEstimatorBase`2.Train(TrainContext context) in C:\MLDotNet\src\Microsoft.ML.Data\Training\TrainerEstimatorBase.cs:line 92
at Microsoft.ML.Runtime.Training.TrainerEstimatorBase`2.Microsoft.ML.Runtime.ITrainer.Train(TrainContext context) in C:\MLDotNet\src\Microsoft.ML.Data\Training\TrainerEstimatorBase.cs:line 158
at Microsoft.ML.Runtime.Data.TrainUtils.TrainCore(IHostEnvironment env, IChannel ch, RoleMappedData data, ITrainer trainer, RoleMappedData validData, IComponentFactory`1 calibrator, Int32 maxCalibrationExamples, Nullable`1 cacheData, IPredictor inputPredictor) in C:\MLDotNet\src\Microsoft.ML.Data\Commands\TrainCommand.cs:line 254
at Microsoft.ML.Runtime.Data.TrainUtils.Train(IHostEnvironment env, IChannel ch, RoleMappedData data, ITrainer trainer, IComponentFactory`1 calibrator, Int32 maxCalibrationExamples) in C:\MLDotNet\src\Microsoft.ML.Data\Commands\TrainCommand.cs:line 223
at Microsoft.ML.Runtime.EntryPoints.LearnerEntryPointsUtils.Train[TArg,TOut](IHost host, TArg input, Func`1 createTrainer, Func`1 getLabel, Func`1 getWeight, Func`1 getGroup, Func`1 getName, Func`1 getCustom, ICalibratorTrainerFactory calibrator, Int32 maxCalibrationExamples) in C:\MLDotNet\src\Microsoft.ML.Data\EntryPoints\InputBase.cs:line 189
at Microsoft.ML.Runtime.LightGBM.LightGbm.TrainBinary(IHostEnvironment env, LightGbmArguments input) in C:\MLDotNet\src\Microsoft.ML.LightGBM\LightGbmBinaryTrainer.cs:line 189
About this issue
- State: closed
- Created 6 years ago
- Comments: 17 (14 by maintainers)
Hey @daholste, I wasn’t able to reproduce this at all, neither in TLC nor in ML.NET. It also looks like the Models.PipelineSweeper and Rocket components in the graph (along with the execgraph command) were removed from ML.NET some time ago. In any case, there was no repro even when using LightGbm from the command line or the API, since the dataset contains only numerical columns and the "Categorical split features is zero length" error isn’t applicable, so I’m not sure why you were seeing it in the first place.
I do, however, have the same error reproduced in #3659, and I believe the underlying cause is the same. It happens deterministically when there is only one categorical feature and UseCategoricalSplit is true in LightGbm, and it is likely a bug in model conversion from LightGbm to FastTree. Please follow #3659 for details and updates. I am closing this issue. Please feel free to reopen if you find a repro that is distinct from the conditions described in the other issue.
cc: @vinodshanbhag @justinormont @guolinke @vKuryshev @mayoatte @rauhs @eyvindwa
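To make the failing condition concrete, here is a minimal Python sketch of the kind of invariant that the exception message suggests. It is not the ML.NET source: the function check_categorical_splits is hypothetical, and only its parameter names are borrowed from the RegressionTree constructor arguments visible in the stack trace (categoricalSplit, categoricalSplitFeatures). The assumption is that a node flagged as a categorical split must carry at least one categorical split feature.

```python
def check_categorical_splits(categorical_split, categorical_split_features):
    """Raise if a node marked as categorical has no categorical split features.

    Hypothetical illustration of the invariant behind the error message;
    the real validation lives in ML.NET's RegressionTree and may differ.
    """
    for node, is_categorical in enumerate(categorical_split):
        if not is_categorical:
            continue  # numeric splits carry no categorical feature list
        features = categorical_split_features[node]
        if features is None or len(features) == 0:
            raise ValueError(
                f"Categorical split features is zero length (node {node})"
            )


# A tree with only numeric splits passes the check.
check_categorical_splits([False, False], [None, None])

# A categorical split that lists its features also passes.
check_categorical_splits([True, False], [[3, 4], None])

# A node flagged as categorical but converted with an empty feature list fails.
try:
    check_categorical_splits([True], [[]])
    failed = False
    message = ""
except ValueError as err:
    failed = True
    message = str(err)
```

If the conversion from LightGbm to FastTree produced an empty feature list for a categorical node, a check of this shape would reject the tree, which is consistent with the bug described above.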
Hey, sent!
@daholste can you send me the dataset and code with which I can reproduce this issue? The same that you sent to Ivan 😃
I think @justinormont sent me a repro file some time ago, but I lost it. If someone can provide a reproducible code snippet, I would be more than happy to fix it.