pmdarima: Error: Input contains NaN, infinity or a value too large for dtype('float64'): pmdarima.predict()

Describe the bug
The method exits abruptly with the error below: ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

However, the data is clean and shows none of the conditions named in the error.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-56-cb145de96983> in <module>
      4 model_arima = auto_arima(data_tra, start_p = 0, start_q = 0, max_p = 12, max_q = 12, m = 12, start_P = 0, start_Q = 0, seasonal = False, error_action = 'ignore', suppress_warnings = True, stepwise = True)
----> 6 forecast_arima = model_arima.predict(n_periods = 18, return_conf_int = True, alpha = 0.05)

/opt/anaconda/envs/shared/lib/python3.7/site-packages/pmdarima/arima/arima.py in predict(self, n_periods, exogenous, return_conf_int, alpha)
    651             end=end,
    652             exog=exogenous,
--> 653             alpha=alpha)
    654 
    655         if return_conf_int:

/opt/anaconda/envs/shared/lib/python3.7/site-packages/pmdarima/arima/arima.py in _seasonal_prediction_with_confidence(arima_res, start, end, exog, alpha, **kwargs)
     81     conf_int = results.conf_int(alpha=alpha)
     82     return check_endog(f, dtype=None, copy=False), \
---> 83         check_array(conf_int, copy=False, dtype=None)
     84 
     85 

/opt/anaconda/envs/shared/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

/opt/anaconda/envs/shared/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    643         if force_all_finite:
    644             _assert_all_finite(array,
--> 645                                allow_nan=force_all_finite == 'allow-nan')
    646 
    647     if ensure_min_samples > 0:

/opt/anaconda/envs/shared/lib/python3.7/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
     97                     msg_err.format
     98                     (type_err,
---> 99                      msg_dtype if msg_dtype is not None else X.dtype)
    100             )
    101     # for object dtype data, we only check for NaNs (GH-13254)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
-----------------------------------------------------------------------------------------------------

To Reproduce
Steps to reproduce the behavior:

data:

[1872.0, 1452.0, 1476.0, 1404.0, 3048.0, 1788.0, 1080.0, 888.0, 2184.0, 2220.0, 1680.0,
612.0, 2124.0, 486.0, 1968.0, 924.0, 888.0, 1756.0, 1104.0, 876.0, 888.0, 1608.0, 1896.0,
648.0, 1524.0, 804.0, 816.0, 1944.0, 1512.0, 900.0, 1464.0, 876.0, 1464.0, 2136.0, 732.0, 
1764.0, 840.0, 1860.0, 792.0, 1728.0, 768.0, 1080.0, 876.0, 1716.0, 900.0, 1740.0, 888.0, 
2172.0, 486.0]

Code:

from pmdarima.arima import auto_arima

# `data` is the list of observations shown above
model_arima = auto_arima(data, start_p=0, start_q=0, max_p=12, max_q=12, m=12, start_P=0, start_Q=0,
                         seasonal=False, error_action='ignore', suppress_warnings=True, stepwise=True)
forecast_arima = model_arima.predict(n_periods=18, return_conf_int=False, alpha=0.05)

Versions

import pmdarima; pmdarima.show_versions()

System:
    python: 3.7.9 (default, Aug 31 2020, 12:42:55)  [GCC 7.3.0]
executable: /opt/anaconda/envs/shared/bin/python
   machine: Linux-4.4.0-1114-aws-x86_64-with-debian-stretch-sid

Python dependencies:
        pip: 20.2.3
 setuptools: 49.6.0.post20200917
    sklearn: 0.23.2
statsmodels: 0.12.0
      numpy: 1.19.1
      scipy: 1.5.2
     Cython: 0.29.21
     pandas: 0.25.3
     joblib: 0.16.0
   pmdarima: 1.7.1

Expected behavior
There should be no error.

Actual behavior

Additional context

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 2
  • Comments: 20 (5 by maintainers)

Most upvoted comments

The problem is caused by extremely large errors that arise when auto_arima cannot find a good solution.
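One way to check this explanation is to look for the first stage at which non-finite numbers appear. The sketch below is only an illustration: `non_finite_positions` and `diagnose` are hypothetical helpers, not part of pmdarima. It assumes the fitted model exposes the underlying statsmodels results as `arima_res_`, and it reads the forecast from there so that the finiteness check inside `predict()` is not triggered.

```python
import math


def non_finite_positions(values):
    """Return the indices of entries that are NaN or +/-infinity."""
    return [i for i, v in enumerate(values) if not math.isfinite(float(v))]


def diagnose(y, n_periods=18, alpha=0.05):
    """Fit auto_arima and print where non-finite values show up.

    Hypothetical helper for illustration; requires pmdarima to be installed.
    """
    # Imported lazily so the pure-Python helper above works on its own.
    import numpy as np
    from pmdarima.arima import auto_arima

    print("non-finite inputs:", non_finite_positions(y))

    model = auto_arima(y, seasonal=False, error_action="ignore",
                       suppress_warnings=True, stepwise=True)
    print("selected order:", model.order)
    print("non-finite parameters:", non_finite_positions(model.params()))

    # Assumption: arima_res_ is the statsmodels results object, whose
    # get_forecast() returns the point forecasts and confidence bounds
    # without validating them.
    forecast = model.arima_res_.get_forecast(steps=n_periods)
    mean = np.asarray(forecast.predicted_mean).ravel()
    bounds = np.asarray(forecast.conf_int(alpha=alpha)).ravel()
    print("non-finite forecasts:", non_finite_positions(mean))
    print("non-finite interval bounds:", non_finite_positions(bounds))
    return model
```

Calling `diagnose(data)` with the series from the report should show whether the input, the fitted parameters, the point forecasts, or only the confidence bounds are affected. If only the bounds are non-finite, that would be consistent with the explanation above, although this sketch does not prove it.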

It's the same error even after upgrading pmdarima.

import pmdarima; pmdarima.show_versions()

System:
    python: 3.7.9 (default, Aug 31 2020, 12:42:55)  [GCC 7.3.0]
executable: /opt/anaconda/envs/shared/bin/python
   machine: Linux-4.4.0-1114-aws-x86_64-with-debian-stretch-sid

Python dependencies:
        pip: 20.2.3
 setuptools: 49.6.0.post20200917
    sklearn: 0.23.2
statsmodels: 0.12.1
      numpy: 1.19.1
      scipy: 1.5.2
     Cython: 0.29.17
     pandas: 0.25.3
     joblib: 0.16.0
   pmdarima: 1.8.0

Error

    380     model_arima = auto_arima(data_tra, start_p = 0, start_q = 0, max_p = 12, max_q = 12, m = 12, start_P = 0, start_Q = 0, seasonal = False, error_action = 'ignore', suppress_warnings = True)
--> 381     forecast_arima = model_arima.predict(n_periods = len(tes), return_conf_int = False, alpha = ci_alpha)

/opt/anaconda/envs/shared/lib/python3.7/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
     97                     msg_err.format
     98                     (type_err,
---> 99                      msg_dtype if msg_dtype is not None else X.dtype)
    100             )
    101     # for object dtype data, we only check for NaNs (GH-13254)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Could you also share the versions of your supporting libraries, using pmdarima.show_versions()? I read somewhere that the pandas and statsmodels versions also matter.