sktime: [BUG] Automatic fitting with AutoETS fails

Describe the bug

Automatic fitting with AutoETS raises an error. It works fine when the parameters are set manually.

To Reproduce

instantiate with auto=True and fit

from sktime.datasets import load_airline
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.forecasting.ets import AutoETS

y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)


hw_auto_model = AutoETS(auto=True)
hw_auto_model.fit(y_train)

Expected behavior

A fit model with good params.

Stack trace

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-80-03aaf6ca0732> in <module>
----> 1 hw_auto_model.fit(y_train)

~/miniconda3/lib/python3.7/site-packages/sktime/forecasting/base/_statsmodels.py in fit(self, y_train, fh, X_train)
     39         self._set_y_X(y_train, X_train)
     40         self._set_fh(fh)
---> 41         self._fit_forecaster(y_train, X_train=X_train)
     42         self._is_fitted = True
     43         return self

~/miniconda3/lib/python3.7/site-packages/sktime/forecasting/ets.py in _fit_forecaster(self, y, X_train)
    287                 delayed(_fit)(error, trend, seasonal, damped)
    288                 for error, trend, seasonal, damped in _iter(
--> 289                     error_range, trend_range, seasonal_range, damped_range
    290                 )
    291             )

~/miniconda3/lib/python3.7/site-packages/joblib/parallel.py in __call__(self, iterable)
   1027             # remaining jobs.
   1028             self._iterating = False
-> 1029             if self.dispatch_one_batch(iterator):
   1030                 self._iterating = self._original_iterator is not None
   1031 

~/miniconda3/lib/python3.7/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    845                 return False
    846             else:
--> 847                 self._dispatch(tasks)
    848                 return True
    849 

~/miniconda3/lib/python3.7/site-packages/joblib/parallel.py in _dispatch(self, batch)
    763         with self._lock:
    764             job_idx = len(self._jobs)
--> 765             job = self._backend.apply_async(batch, callback=cb)
    766             # A job can complete so quickly than its callback is
    767             # called before we get here, causing self._jobs to

~/miniconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/miniconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573 
    574     def get(self):

~/miniconda3/lib/python3.7/site-packages/joblib/parallel.py in __call__(self)
    251         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    252             return [func(*args, **kwargs)
--> 253                     for func, args, kwargs in self.items]
    254 
    255     def __reduce__(self):

~/miniconda3/lib/python3.7/site-packages/joblib/parallel.py in <listcomp>(.0)
    251         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    252             return [func(*args, **kwargs)
--> 253                     for func, args, kwargs in self.items]
    254 
    255     def __reduce__(self):

~/miniconda3/lib/python3.7/site-packages/sktime/forecasting/ets.py in _fit(error, trend, seasonal, damped)
    271                     dates=self.dates,
    272                     freq=self.freq,
--> 273                     missing=self.missing,
    274                 )
    275                 _fitted_forecaster = _forecaster.fit(

~/miniconda3/lib/python3.7/site-packages/statsmodels/tsa/exponential_smoothing/ets.py in __init__(self, endog, error, trend, damped_trend, seasonal, seasonal_periods, initialization_method, initial_level, initial_trend, initial_seasonal, bounds, dates, freq, missing)
    454             )
    455             if seasonal_periods is None:
--> 456                 self.seasonal_periods = freq_to_period(self._index_freq)
    457             if self.seasonal_periods <= 1:
    458                 raise ValueError("seasonal_periods must be larger than 1.")

~/miniconda3/lib/python3.7/site-packages/statsmodels/tsa/tsatools.py in freq_to_period(freq)
    810     if not isinstance(freq, offsets.DateOffset):
    811         freq = to_offset(freq)  # go ahead and standardize
--> 812     freq = freq.rule_code.upper()
    813 
    814     if freq == 'A' or freq.startswith(('A-', 'AS-')):

AttributeError: 'NoneType' object has no attribute 'rule_code'

Versions 0.4.2

System:
    python: 3.7.6 (default, Jan 8 2020, 13:42:34) [Clang 4.0.1 (tags/RELEASE_401/final)]
    executable: /Users/jeffhale/miniconda3/bin/python
    machine: Darwin-18.7.0-x86_64-i386-64bit

Python dependencies:
    pip: 20.2.2
    setuptools: 49.6.0.post20200814
    sklearn: 0.23.2
    numpy: 1.18.1
    scipy: 1.4.1
    Cython: 0.29.17
    pandas: 1.1.1
    matplotlib: 3.2.2
    joblib: 0.16.0
    numba: 0.50.0
    pmdarima: 1.7.1
    tsfresh: 0.17.0

When installing the latest sktime I got:

About this issue

State: closed
Created 4 years ago
Comments: 15 (9 by maintainers)

Commits related to this issue

Merge branch 'master' into #435-autoETS-bug-fix — committed to HYang1996/sktime by mloning 4 years ago

Most upvoted comments

We’re proposing a uniform interface for fitted parameters, something which is also missing from scikit-learn: basically a get_fitted_params method, similar to get_params for hyper-parameters. But we haven’t implemented much of it yet. Would that be helpful? Other non-fitted attributes should probably just be exposed from the wrapped statsmodels results class.

mloning on Oct 20, 2020

Hi @discdiver, thank you for raising the issue.

This error occurs because the seasonal period sp is not specified when calling AutoETS. Passing it explicitly, for example AutoETS(auto=True, sp=12), works (this is what the forecasting example notebook does).

The reason is that AutoETS is built on statsmodels’ ETSModel. When no seasonal period is given, ETSModel tries to derive it from the frequency of the series index (its _index_freq attribute) via freq_to_period. In the traceback above that frequency is None, hence the AttributeError.

As far as I know, sktime currently does not have seasonal period embedded in its datasets yet. @mloning would you please confirm this and suggest if we have better ways to fix it?

HYang1996 on Oct 19, 2020

I’m digging around the sktime docs, statsmodels docs, and Google to try to figure out what the fitted_forecaster has in it and how to access it. It returns <statsmodels.tsa.exponential_smoothing.ets.ETSResultsWrapper>. I guessed that params was an attribute, and that returned a Series of floats. I guessed on params_names, and that returned a series of param names. So I think I’m getting somewhere.

It is still not clear where to find which arguments the best model used, e.g. what would I pass to an sktime ExponentialSmoothing object to achieve the same model?

@discdiver Based on the statsmodels documentation for the fit method, the fitted parameters can be obtained by setting return_params to True. The default in sktime’s version is False, but passing True to the constructor does not work:

hw_auto_model = AutoETS(auto=True, sp=12, return_params=True)

This is because we have automated the fitting process. One possible workaround is to fit the underlying model again after the optimal one has been chosen:

hw_auto_model = AutoETS(auto=True, sp=12)
hw_auto_model.fit(y_train)
hw_auto_model._forecaster.fit(y_train, return_params=True)

which gives the following results:

array([9.9990000e-01, 9.9980001e-01, 1.0000000e-08, 1.2900000e+02,
       1.2100000e+02, 1.3500000e+02, 1.4800000e+02, 1.4800000e+02,
       1.3600000e+02, 1.1900000e+02, 1.0400000e+02, 1.1800000e+02,
       1.1500000e+02, 1.2600000e+02, 1.4100000e+02, 1.3500000e+02,
       1.0000000e+00])

Is this what you were looking for?

HYang1996 on Oct 23, 2020

Yes, one complication here is that statsmodels returns a different object (the ETSResultsWrapper) rather than changing the state of the estimator as in scikit-learn. I’ve found myself in the same situation a few times where I wanted to inspect the object without really knowing where to look for what. scikit-learn is much easier to use in that respect.

I’m not sure how the model would translate into the ExponentialSmoothing model. It would be nice to have this in a much more consistent framework.

mloning on Oct 19, 2020

@HYang1996 Yes, I would suggest setting the default to sp=1 if that works, or raising an issue with statsmodels. In either case we should probably also use our input check check_sp from utils.validation.forecasting.

@discdiver You should be able to get the underlying fitted statsmodels object from hw_auto_model._fitted_forecaster.
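As a rough illustration of the kind of input check mentioned here, the sketch below validates a seasonal period before it reaches the model. The name validate_sp and its exact rules (None falls back to 1, anything other than a positive integer is rejected) are assumptions made for this example; it is not sktime's check_sp.

```python
import numbers


def validate_sp(sp):
    """Hypothetical seasonal-period check (not sktime's check_sp).

    Returns 1 (no seasonality) when sp is None, otherwise requires a
    positive integer and returns it as a plain int.
    """
    if sp is None:
        return 1
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(sp, bool) or not isinstance(sp, numbers.Integral) or sp < 1:
        raise ValueError(
            f"`sp` must be a positive integer or None, but found: {sp!r}"
        )
    return int(sp)
```

With a check like this, a bad value fails at construction or fit time with a message that names sp, instead of surfacing deep inside the wrapped library.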

mloning on Oct 19, 2020

Thank you @mloning !

Why does statsmodels not raise a more informative error here if the period cannot be inferred from freq in freq_to_period? freq is just an attribute on a pd.PeriodIndex or pd.DatetimeIndex.

ETSModel handles it this way:

       self.has_seasonal = self.seasonal is not None

       if self.has_seasonal:
           self.seasonal_periods = int_like(
               seasonal_periods, "seasonal_periods", optional=True
           )
           if seasonal_periods is None:
               self.seasonal_periods = freq_to_period(self._index_freq)
           if self.seasonal_periods <= 1:
               raise ValueError("seasonal_periods must be larger than 1.")
       else:
           # in case the model has no seasonal component, we internally handle
           # this as if it had an additive seasonal component with
           # seasonal_periods=1, but restrict the smoothing parameter to 0 and
           # set the initial seasonal to 0.
           self.seasonal_periods = 1

_index_freq attribute is included in their base model, so no informative error would be raised in this case.
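To make the failure mode concrete, here is a minimal sketch of a frequency-to-period lookup that fails with a readable message when the frequency is missing. The helper infer_seasonal_period and its small alias table are hypothetical and written only for this example; statsmodels' real freq_to_period handles more aliases and, as the traceback shows, does not guard against None.

```python
# Illustrative alias table (an assumption for this sketch, not the
# complete mapping used by statsmodels).
_PERIODS = {"A": 1, "Q": 4, "M": 12, "W": 52, "B": 5, "D": 7, "H": 24}


def infer_seasonal_period(freq):
    """Map a frequency alias such as "M" or "Q-DEC" to a seasonal period."""
    if freq is None:
        # This is the case that ends in "'NoneType' object has no
        # attribute 'rule_code'" in the traceback above.
        raise ValueError(
            "Cannot infer the seasonal period: the index has no frequency. "
            "Pass the seasonal period explicitly, e.g. sp=12 for monthly data."
        )
    code = str(freq).upper().split("-")[0]  # "Q-DEC" -> "Q"
    if code not in _PERIODS:
        raise ValueError(f"No default seasonal period for frequency {freq!r}.")
    return _PERIODS[code]
```

The point of the guard is only that a missing frequency is reported as such, with a hint at the fix, rather than as an AttributeError several frames down.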

We could change the default to sp=1. I think that’s what we use for the other forecasters at the moment.

This would work and we would need to select only non-seasonal models in the automatic fitting process if sp=1. Alternatively we could set seasonal=None if sp is not specified, and sp would be automatically set to 1.
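A minimal sketch of that restriction, assuming a hypothetical helper candidate_models and illustrative component ranges (the actual ranges used by AutoETS are not shown in this thread): when sp is 1, only non-seasonal combinations are generated.

```python
from itertools import product


def candidate_models(sp):
    """Enumerate (error, trend, seasonal, damped) combinations to try.

    Hypothetical helper for illustration; the ranges below are assumptions.
    """
    errors = ["add", "mul"]
    trends = ["add", "mul", None]
    # With sp=1 there is no seasonal cycle, so only seasonal=None is valid.
    seasonals = ["add", "mul", None] if sp > 1 else [None]
    damped_options = [True, False]
    return [
        (error, trend, seasonal, damped)
        for error, trend, seasonal, damped in product(
            errors, trends, seasonals, damped_options
        )
        # A damped trend only makes sense when a trend is present.
        if not (trend is None and damped)
    ]
```

Calling candidate_models(1) yields only combinations whose seasonal component is None, while candidate_models(12) also includes the additive and multiplicative seasonal variants.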

HYang1996 on Oct 19, 2020