sktime: [BUG] `forecaster.update` fails on multi indexed data frame

Discussed in https://github.com/sktime/sktime/discussions/5833

<div type='discussions-op-text'>

Originally posted by ManuB68 January 25, 2024

I am trying to update a forecaster, but it doesn't seem to work on my multi-indexed data frame:

Python code:

import pandas as pd
from sktime.forecasting.var import VAR
# Older sktime releases expose this function in sktime.forecasting.model_selection.
from sktime.split import temporal_train_test_split

# Panel data: three patients, four daily observations each, two measurements.
df = pd.DataFrame(
    [['A', '2024-01-01', 1, 11], ['A', '2024-01-02', 2, 12], ['A', '2024-01-03', 3, 13], ['A', '2024-01-04', 4, 14],
     ['B', '2024-01-01', 101, 1], ['B', '2024-01-02', 102, 2], ['B', '2024-01-03', 103, 3], ['B', '2024-01-04', 104, 4],
     ['C', '2024-01-01', 10000, 10], ['C', '2024-01-02', 20000, 20], ['C', '2024-01-03', 30000, 30], ['C', '2024-01-04', 40000, 43]],
    columns=['Patients', 'Dates', 'M1', 'M2'])

df['Dates'] = pd.to_datetime(df['Dates'])

# Build the (Patients, Dates) multi-index.
df = df.reset_index(drop=True)
df = df.set_index(["Patients", "Dates"])

# Hold out the last date of each patient.
y_train, y_test = temporal_train_test_split(df, test_size=1)

fh = 1

forecaster = VAR()

forecaster.fit(y_train)
y_pred = forecaster.predict(fh=fh)

# Select the rows for the most recent date: a single date per patient.
Day = y_test.index.get_level_values("Dates")[-1]
y_update = y_test[y_test.index.get_level_values("Dates") == Day]

forecaster.update(y_update)  # raises the ValueError quoted below

y_pred_updated = forecaster.predict(fh)

The code might look overcomplicated because it was extracted from code that iterates over dates, and because of my limited knowledge of Python. Execution fails on the line forecaster.update(y_update)

with this message: ValueError: Need at least 3 dates to infer frequency

</div>

About this issue

  • State: open
  • Created 5 months ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

@yarnabrina, the series' frequency is inferred in order to retain the frequency information on the new slice of data. This fails in the edge case where the original data itself has fewer than 3 observations and no frequency set.
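The error text in the report matches the message pandas raises from `pd.infer_freq` when an index has fewer than three entries. The following is a minimal sketch of that edge case using pandas alone. It assumes that `pd.infer_freq` (or an equivalent call) is what runs on the update slice, which the thread does not confirm.

```python
import pandas as pd

# Three consecutive days are enough for pandas to infer a daily frequency.
full_index = pd.DatetimeIndex(["2024-01-01", "2024-01-02", "2024-01-03"])
inferred = pd.infer_freq(full_index)
print(inferred)

# A single-date index, like the per-patient slice passed to update, is not.
short_index = pd.DatetimeIndex(["2024-01-04"])
message = None
try:
    pd.infer_freq(short_index)
except ValueError as err:
    message = str(err)
    print(message)
```

Neither index has a frequency set at construction time, so in both cases inference is the only source of frequency information.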

You are right. I am just wondering whether, in this case, it would be better not to provide any frequency information if the original index has no frequency set…

But I think trying to infer one and ignoring the failure when inference is not possible is a good solution. Alternatively, we could make this more explicit by using an if statement that checks whether inference would be possible… This would probably be better with regard to maintainability and understandability.
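The two options discussed here can be sketched side by side. Both helpers below, `infer_freq_try` and `infer_freq_checked`, are hypothetical names for illustration and are not part of sktime. They assume a `DatetimeIndex` as input and return `None` when no frequency can be determined.

```python
import pandas as pd


def infer_freq_try(index):
    """Hypothetical helper: attempt inference and fall back to None on failure."""
    try:
        return pd.infer_freq(index)
    except (ValueError, TypeError):
        return None


def infer_freq_checked(index):
    """Hypothetical helper: only attempt inference when it can succeed."""
    # Reuse a frequency that is already set on the index.
    if index.freq is not None:
        return index.freqstr
    # pandas needs at least three dates to infer a frequency.
    if len(index) < 3:
        return None
    return pd.infer_freq(index)


short_index = pd.DatetimeIndex(["2024-01-04"])
daily_index = pd.DatetimeIndex(["2024-01-01", "2024-01-02", "2024-01-03"])

print(infer_freq_try(short_index), infer_freq_checked(short_index))
print(infer_freq_try(daily_index), infer_freq_checked(daily_index))
```

The explicit check states the precondition in the code itself, which is the maintainability argument made above, while the try/except variant also covers failure modes that the length check does not anticipate.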