catboost: catboost/libs/target/target_converter.cpp:64: Unknown class name: "0.6"
Problem: The above exception is thrown for certain target values.
catboost version: 0.13.1
Operating System: Linux
How to reproduce:
import catboost as cb
import numpy as np
print(cb.__version__)
model = cb.CatBoostRegressor(
    iterations=1,
    depth=1,
    loss_function='RMSE',
    # If you change the eval metric to RMSE it works
    eval_metric='AUC:border={}'.format(0.5),
    train_dir='/tmp/cbtest2',
)
x = np.array([[1.5], [0.1]])
# If you change the following line to: y = np.array([0.6, 0.4]) it works
y = np.array([0.99, 0.4])
pool = cb.Pool(x, label=y)
x_valid = np.array([[0.33]])
y_valid = np.array([0.6])
pool_valid = cb.Pool(x_valid, label=y_valid)
model.fit(X=pool, eval_set=pool_valid, use_best_model=False)
Full output:
0.13.1
---------------------------------------------------------------------------
CatBoostError Traceback (most recent call last)
<ipython-input-81-d2333a747008> in <module>
21 pool_valid = cb.Pool(x_valid, label=y_valid)
22
---> 23 model.fit(X=pool, eval_set=pool_valid, use_best_model=False)
~/.conda/envs/thehft-ml/lib/python3.7/site-packages/catboost/core.py in fit(self, X, y, cat_features, sample_weight, baseline, use_best_model, eval_set, verbose, logging_level, plot, column_description, verbose_eval, metric_period, silent, early_stopping_rounds, save_snapshot, snapshot_file, snapshot_interval)
2699 use_best_model, eval_set, verbose, logging_level, plot, column_description,
2700 verbose_eval, metric_period, silent, early_stopping_rounds,
-> 2701 save_snapshot, snapshot_file, snapshot_interval)
2702
2703 def predict(self, data, ntree_start=0, ntree_end=0, thread_count=-1, verbose=None):
~/.conda/envs/thehft-ml/lib/python3.7/site-packages/catboost/core.py in _fit(self, X, y, cat_features, pairs, sample_weight, group_id, group_weight, subgroup_id, pairs_weight, baseline, use_best_model, eval_set, verbose, logging_level, plot, column_description, verbose_eval, metric_period, silent, early_stopping_rounds, save_snapshot, snapshot_file, snapshot_interval)
1171
1172 with log_fixup(), plot_wrapper(plot, self.get_params()):
-> 1173 self._train(train_pool, eval_sets, params, allow_clear_pool)
1174
1175 if (not self._object._has_leaf_weights_in_model()) and allow_clear_pool:
~/.conda/envs/thehft-ml/lib/python3.7/site-packages/catboost/core.py in _train(self, train_pool, test_pool, params, allow_clear_pool)
864
865 def _train(self, train_pool, test_pool, params, allow_clear_pool):
--> 866 self._object._train(train_pool, test_pool, params, allow_clear_pool)
867 self._set_trained_model_attributes()
868
_catboost.pyx in _catboost._CatBoost._train()
_catboost.pyx in _catboost._CatBoost._train()
CatBoostError: catboost/libs/target/target_converter.cpp:64: Unknown class name: "0.6"
About this issue
- State: closed
- Created 5 years ago
- Comments: 21 (9 by maintainers)
Solved. My eval_set contained labels that the model had never seen during training. My y has roughly 1500 categories; removing the classes whose value count is 1 and stratifying the split by y solved the problem. I suggest throwing a more detailed exception so that people don't keep posting about this error.
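The fix described in this comment (drop classes that occur only once, then split so that every eval label also appears in training) could look roughly like the sketch below. It uses only the standard library; the helper names `unseen_labels`, `drop_rare_classes`, and `stratified_split` are made up for illustration and are not part of CatBoost.

```python
import random
from collections import Counter, defaultdict


def unseen_labels(train_labels, eval_labels):
    """Return the eval labels that never occur in the training labels."""
    return sorted(set(eval_labels) - set(train_labels))


def drop_rare_classes(rows, labels, min_count=2):
    """Keep only samples whose label occurs at least min_count times."""
    counts = Counter(labels)
    kept = [(r, l) for r, l in zip(rows, labels) if counts[l] >= min_count]
    return [r for r, _ in kept], [l for _, l in kept]


def stratified_split(rows, labels, eval_fraction=0.25, seed=0):
    """Split per class so every class in the eval part also appears in the train part."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, eval_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        # Always leave at least one sample of the class in the training part.
        n_eval = min(int(len(indices) * eval_fraction), len(indices) - 1)
        eval_idx.extend(indices[:n_eval])
        train_idx.extend(indices[n_eval:])

    def pick(idx):
        return [rows[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(eval_idx)


# Toy data (hypothetical): class "c" occurs only once.
rows = [[i] for i in range(9)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "c"]

rows, labels = drop_rare_classes(rows, labels)
(train_x, train_y), (eval_x, eval_y) = stratified_split(rows, labels)
missing = unseen_labels(train_y, eval_y)
print(missing)
```

Running `unseen_labels` on the real train and eval labels before calling `fit` is a cheap way to spot the condition that this comment identifies as the cause of the "Unknown class name" error.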
Another way to solve this problem is to define class_names explicitly. You can do this using:
catb_model = CatBoostClassifier(iterations=1000, learning_rate=0.05, loss_function='MultiClass', class_names=["1","2","3","4","5","6","7","8","9","10","11"])
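With many classes, the list of class names could be built from the data instead of typed by hand. The sketch below is one possible way to do that; `build_class_names` is a hypothetical helper, not a CatBoost function, and whether passing the result resolves the error on a given CatBoost version is not something this sketch verifies.

```python
def build_class_names(*label_collections):
    """Collect every distinct label from the given collections as sorted strings."""
    distinct = set()
    for labels in label_collections:
        distinct.update(str(label) for label in labels)
    # Sort numerically when every label parses as a number, else alphabetically.
    try:
        return sorted(distinct, key=float)
    except ValueError:
        return sorted(distinct)


# Hypothetical labels: together the two sets cover classes 1 through 11.
train_labels = [1, 2, 3, 5, 7, 11]
eval_labels = [4, 6, 8, 9, 10]
class_names = build_class_names(train_labels, eval_labels)
print(class_names)
```

The resulting list can then be passed as the class names argument in the CatBoostClassifier call shown in the comment above.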