prophet: Logistic Floor/Cap not being respected?
Pardon me in advance for being new to Prophet (and forecasting in general). I am trying to fit a curve to approximately 2 years’ worth of datetime data posted here, and to forecast a year’s worth of 5-minute intervals.
Whether I do a linear or logistic fitting, I do get some outrageous values that are larger or smaller than any of the inputted values (linear even produces negative values). I tried to leverage a logistic model to specify a floor and cap, but that didn’t stop deviant numbers either.
library(prophet)
library(dplyr)

df <- read.csv('http://bit.ly/2po0xPJ') %>% mutate(y = log(y))

# floor and cap
lower = quantile(df$y, .05)
upper = quantile(df$y, .95)
df <- df %>% mutate(floor = lower, cap = upper)

# modeling
m <- prophet(df,
             changepoint.prior.scale = 0.01,
             growth = 'logistic')

future <- make_future_dataframe(m, periods = (24*12*365),
                                freq = 60 * 5,
                                include_history = FALSE) %>%
  mutate(floor = lower, cap = upper)

# forecast every 5 minutes
forecast <- predict(m, future)
#prophet_plot_components(m, forecast)

write.csv(forecast %>%
            select(ds, yhat_lower, yhat_upper, yhat) %>%
            mutate(floor = exp(lower),
                   cap = exp(upper),
                   ln_yhat_lower = yhat_lower,
                   ln_yhat_upper = yhat_upper,
                   ln_yhat = yhat,
                   ln_floor = lower,
                   ln_cap = upper,
                   yhat_lower = exp(yhat_lower),
                   yhat_upper = exp(yhat_upper),
                   yhat = exp(yhat)
            ), 'problem_output.csv')
For instance, of the 105120 forecasted records outputted, 25131 are lower than the specified floor. Can someone please tell me what I’m doing wrong? Or if there is an expected behavior I’m not interpreting correctly?
About this issue
- State: closed
- Created 6 years ago
- Reactions: 7
- Comments: 19 (7 by maintainers)
Hi, I’m trying to forecast daily transactions and was wondering if there’s a way to put a minimum threshold on seasonalities, or whether that makes sense at all. I fit the model on two years of data using logistic growth with floor = 0 and am predicting the next year. I’m getting a positive trend; however, because of negative seasonalities, I’m still getting negative forecasts for transactions:

Thanks a lot for your help in advance! P.S. Is it possible/sensible to put a floor on the confidence bands?
Thanks for the clean repro. This is a case of bad model fit due to the daily seasonality overfitting. If you plot the forecast with plot(m, forecast), you can see that the in-sample fit seems pretty reasonable, but then the forecasted values are bad.
You can see what is happening if you look at the components plot, with prophet_plot_components(m, forecast): the daily seasonality has enormous swings of +/- 2 in the afternoon, which is what is messing up the forecast. This happens because, if you look at df$ds, there are no data with a time later than 12:59:00. With no data in the afternoon, the daily seasonality is fit poorly there. There’s some description of this happening with monthly data in the documentation here: https://facebook.github.io/prophet/docs/non-daily_data.html .
I’m wondering if this is an artifact of a bad conversion to 24-hour time. But if there really are only times no later than 12:59:00, then there are three things you can do to resolve this issue with the daily seasonality:
- Disable the daily seasonality: m <- prophet(df, changepoint.prior.scale=0.01, growth = 'logistic', daily.seasonality = FALSE).
- Use add_seasonality to add a daily seasonality with a stronger prior (smaller prior.scale).
I can imagine this issue coming up more frequently with sub-daily data; we should add better documentation of this behavior.
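A quick way to spot this kind of gap is to count observations per hour of day before fitting. The following is only a sketch in base R: the timestamps are made up to stand in for df$ds (mornings only, every 5 minutes), and the names ds, coverage and missing_hours are illustrative, not part of Prophet.

```r
# Hypothetical timestamps standing in for df$ds: three days at 5-minute
# resolution, keeping only observations before 13:00.
ds <- seq(as.POSIXct("2017-01-01 00:00:00", tz = "UTC"),
          as.POSIXct("2017-01-03 23:55:00", tz = "UTC"),
          by = 5 * 60)
ds <- ds[as.integer(format(ds, "%H", tz = "UTC")) < 13]

# Count observations per hour of day; hours absent from the history get 0.
hours <- factor(as.integer(format(ds, "%H", tz = "UTC")), levels = 0:23)
coverage <- table(hours)

# Hours of the day for which the daily seasonality has no data to learn from.
missing_hours <- as.integer(names(coverage)[as.vector(coverage) == 0])
print(missing_hours)
```

If missing_hours is non-empty, the daily seasonality is unconstrained at those hours, which is the situation described above.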
As a side note, the reason the forecast goes outside the upper and lower bounds with this bad seasonality is that the bounds apply only to the trend; seasonal fluctuations can push the forecast value outside them.
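The point about bounds applying only to the trend can be illustrated without Prophet at all. The sketch below uses a plain logistic curve (a simplification, not Prophet's piecewise trend) with made-up values for the floor, cap, growth rate and seasonal amplitude; all names are illustrative.

```r
# Assumed values, chosen only for illustration.
floor_val <- 2
cap_val   <- 5
t <- seq(0, 1, length.out = 1000)

# A simple logistic trend that saturates between floor_val and cap_val.
k  <- 8     # growth rate (assumption)
m0 <- 0.5   # midpoint (assumption)
trend <- floor_val + (cap_val - floor_val) / (1 + exp(-k * (t - m0)))

# Additive seasonality with swings of +/- 2, like the overfit daily component.
seasonal <- 2 * sin(2 * pi * 50 * t)
yhat <- trend + seasonal

trend_in_bounds <- all(trend >= floor_val & trend <= cap_val)
n_below <- sum(yhat < floor_val)
n_above <- sum(yhat > cap_val)
cat("trend within bounds:", trend_in_bounds, "\n")
cat("forecast points below floor:", n_below, "above cap:", n_above, "\n")
```

The trend never leaves the interval, but the sum of trend and seasonality does, both below the floor and above the cap.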
@deniznoah You might want to read this https://math.stackexchange.com/questions/2687851/what-does-ln-accomplish-on-a-regression-input/2688642#2688642
And this if you need a better understanding of Euler’s number and natural logarithms. https://www.youtube.com/watch?v=m2MIpDrF7Es
Well, like I said we can get fluctuations outside of lower/upper with the seasonality. But it would be nice to have some check for this type of situation where a big part of the seasonality has no data. At the very least there should be documentation about this, so I’m going to leave this task open for that.
@shaidams64 The floor/cap is just for the trend, and as you’ve observed the seasonality can push the forecast outside that.
Like @deniznoah suggested you could fit the model to the log of your data (with no floor), and then take the exp() of yhat and that would ensure positivity. It does, however, induce a different trend model and the exp() can sometimes be a bit sensitive to small changes in the history. So it might work or might not, you’d just have to see.
Multiplicative seasonality would also drive the seasonality to 0 as the trend goes to 0.
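The difference between these options can be sketched with toy numbers in base R. Nothing here calls Prophet; the declining trend, the seasonal shape and the 0.1 offset inside log() are all assumptions made for illustration.

```r
t <- seq(0, 1, length.out = 200)
trend <- 10 * (1 - t)                 # toy trend declining to 0
s <- 0.5 * sin(2 * pi * 10 * t)       # relative seasonal effect in [-0.5, 0.5]

# Additive seasonality with a fixed amplitude can push the forecast below 0.
additive <- trend + 5 * sin(2 * pi * 10 * t)

# Multiplicative seasonality scales with the trend, so it shrinks to 0 with it.
multiplicative <- trend * (1 + s)

# Modelling on the log scale and back-transforming with exp() is always > 0.
# The 0.1 offset only avoids log(0) in this toy example.
log_forecast <- log(trend + 0.1) + s
back_transformed <- exp(log_forecast)

cat("additive goes negative:", any(additive < 0), "\n")
cat("multiplicative stays >= 0:", all(multiplicative >= 0), "\n")
cat("exp(log-scale forecast) stays > 0:", all(back_transformed > 0), "\n")
```

Note that the back-transformed series is a different model from the additive one, which matches the caveat above that the log approach induces a different trend.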
Documentation now describes this issue in https://facebook.github.io/prophet/docs/non-daily_data.html , so I’ll go ahead and close.
@deniznoah But in Python and R, log() is precisely the natural logarithm, i.e. the inverse of e^x. On these platforms, log() uses base e by default.
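A short base R check of this, using arbitrary sample values: log() defaults to base e, so exp() undoes it, which is why the original script can fit on log(y) and report exp(yhat).

```r
x <- c(0.5, 1, 10, 250)   # arbitrary positive sample values

natural    <- log(x)        # natural logarithm (base e by default)
round_trip <- exp(natural)  # exp() inverts log()

# Other bases have to be requested explicitly.
base10 <- log(x, base = 10)

print(all.equal(round_trip, x))
print(all.equal(log(exp(1)), 1))
print(all.equal(base10, log10(x)))
```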