autogluon: Tabular: Fit on Windows, load on MacOS/Linux causes exception
Hi, I am using the AutoGluon Tabular predictor. The predictor is set up as follows:
predictor = TabularPredictor(
    label="label1",
    verbosity=4,
    problem_type="regression",
    path="agModels-predicttest1",
    eval_metric='mean_absolute_error'
).fit(
    train_data=train_data_transformed.drop(
        ["label2", "label3", "label4", "label5"], axis=1),
    feature_generator=IdentityFeatureGenerator(),
    time_limit=3200,
    # visualizer='tensorboard',
    presets='medium_quality',  # 'optimize_for_deployment', 'best_quality'
    hyperparameters={"GBM": {
        'num_boost_round': ag.space.Int(lower=50, upper=2000, default=100),
        'num_leaves': ag.space.Int(lower=64, upper=1024, default=256),
        'learning_rate': ag.space.Real(3e-2, 2e-1, default=1e-1, log=True),
        'max_depth': ag.space.Int(lower=6, upper=10, default=8),
        'early_stopping_round': 10,
        'min_data_in_leaf': ag.space.Int(lower=10, upper=100, default=20)}},
    hyperparameter_tune_kwargs={'num_trials': 10, 'searcher': 'auto', 'scheduler': 'local'}
)
With the medium_quality preset and hyperparameter tuning, if I try to load the model after training with predictor_test1 = TabularPredictor.load("/home/hadoop/agModels-predicttest1/"), I get the following error:
Traceback (most recent call last):
File "/tmp/1665687139040-0/zeppelin_python.py", line 158, in <module>
exec(code, _zcUserQueryNameSpace)
File "<stdin>", line 7, in <module>
File "/usr/local/lib/python3.7/site-packages/autogluon/tabular/predictor/predictor.py", line 2901, in load
predictor = cls._load(path=path)
File "/usr/local/lib/python3.7/site-packages/autogluon/tabular/predictor/predictor.py", line 2833, in _load
predictor._set_post_fit_vars(learner=learner)
File "/usr/local/lib/python3.7/site-packages/autogluon/tabular/predictor/predictor.py", line 2787, in _set_post_fit_vars
self._learner.persist_trainer(low_memory=True)
File "/usr/local/lib/python3.7/site-packages/autogluon/tabular/learner/abstract_learner.py", line 704, in persist_trainer
self.trainer = self.load_trainer()
File "/usr/local/lib/python3.7/site-packages/autogluon/core/learner/abstract_learner.py", line 121, in load_trainer
path=self.trainer_path, reset_paths=self.reset_paths
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 2315, in load
obj.set_contexts(path)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 214, in set_contexts
self.path, model_paths = self.create_contexts(path_context)
File "/usr/local/lib/python3.7/site-packages/autogluon/core/trainer/abstract_trainer.py", line 224, in create_contexts
model_local_path = prev_path.split(abs_path, 1)[1]
IndexError: list index out of range
However, it works fine if I use presets='best_quality', where auto_stack=True. I am not sure why the medium_quality preset combined with hyperparameter tuning gives this error when the predictor is loaded.
About this issue
- State: closed
- Created 2 years ago
- Comments: 19 (1 by maintainers)
@Innixma you bet, I will follow these steps and follow up with a reply. Thanks!
Thanks for providing the additional information @Alex-Wenner-FHR and @rsj123!
I have this on my radar and will see if it can be fixed for the upcoming v0.7 release.
I ran into this problem too. I think the cause is in core\trainer\abstract_trainer.py, line 223:
self.path is a relative path; in my case it is
AutogluonModels/ag-20221128_161031/models/
In some cases the working directory is not the code directory, so abspath returns a wrong path. The stored model path then does not contain abs_path, and prev_path.split(abs_path, 1)[1] raises the IndexError.
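To make the failure mode concrete, here is a minimal sketch of the split pattern from the traceback. It is not AutoGluon's actual code: strip_prefix is a hypothetical helper and all example paths are made up for illustration. It only shows that str.split returns a single-element list when the separator is absent, so indexing [1] fails whenever the stored path does not contain the prefix computed at load time (for example because the working directory changed, or because the path was saved with Windows separators and loaded on Linux).

```python
def strip_prefix(prev_path: str, abs_path: str) -> str:
    """Hypothetical helper mirroring `prev_path.split(abs_path, 1)[1]`."""
    return prev_path.split(abs_path, 1)[1]


def split_fails(prev_path: str, abs_path: str) -> bool:
    """Return True if the split pattern raises IndexError for these inputs."""
    try:
        strip_prefix(prev_path, abs_path)
    except IndexError:
        return True
    return False


# Illustrative paths only (assumptions, not taken from a real run).
stored = "/home/user/project/AutogluonModels/run/models/GBM/"
same_prefix = "/home/user/project/AutogluonModels/run/"
other_cwd_prefix = "/tmp/notebook/AutogluonModels/run/"
stored_windows = "C:\\Users\\me\\AutogluonModels\\run\\models\\GBM\\"

# Prefix matches: the remainder after the prefix is returned.
relative = strip_prefix(stored, same_prefix)
print(relative)

# Prefix resolved from a different working directory: separator not found.
print(split_fails(stored, other_cwd_prefix))

# Path saved with Windows separators, prefix built on Linux: not found either.
print(split_fails(stored_windows, same_prefix))
```

If this is indeed the mechanism, a quick check is to compare the paths stored in the saved predictor directory against the absolute path resolved on the loading machine, and to load from the same working directory (or with an absolute path) to see whether the error goes away.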