darts: [BUG] Training never starts on TFT/Progress bar not working
Describe the bug I use a dataset composed of 20 features and a single target. All of the features are future covariates. I use the target's past, as well as the features' history, as past covariates. To the covariates I add datetime attributes (year, month, day of week, hour) and holidays. The dataset spans several years of hourly data; I also tried cutting down the number of samples to check whether it made a difference. I am successfully using the same dataset with other models (not from darts) and getting good results.
To Reproduce
import pandas as pd
import torch
from darts import TimeSeries
from darts.dataprocessing.transformers import Scaler
from darts.models import TFTModel
from darts.utils.likelihood_models import QuantileRegression
from darts.utils.timeseries_generation import datetime_attribute_timeseries

train_ratio = 0.90
look_back = 192
horizon = 192
n_outputs = 1

# `file_path` and `country` are defined elsewhere by the reporter
df = pd.read_csv(file_path, index_col=0)
training_cutoff = pd.Timestamp(df['Time'].iloc[round(len(df) * train_ratio)])
series = TimeSeries.from_dataframe(df, 'Time', value_cols=df.columns[1:])
train, val = series.split_after(training_cutoff)

scaler = Scaler()
train_transformed = scaler.fit_transform(train)
val_transformed = scaler.transform(val)
series_transformed = scaler.transform(series)

trgt_scaler = Scaler()
trgt_transformed = trgt_scaler.fit_transform(series['target'])

# Datetime attributes and holidays as covariates
covariates = datetime_attribute_timeseries(series, attribute='year', one_hot=False)
covariates = covariates.stack(datetime_attribute_timeseries(series, attribute='month', one_hot=False))
covariates = covariates.stack(datetime_attribute_timeseries(series, attribute='day_of_week', one_hot=False))
covariates = covariates.stack(datetime_attribute_timeseries(series, attribute='hour', one_hot=False))
covariates = covariates.add_holidays(country)

# Future covariates: the feature columns (known in advance)
f_covariates = covariates.stack(
    TimeSeries.from_times_and_values(times=series.time_index,
                                     values=df.iloc[:, 1 + n_outputs:].to_numpy(),
                                     columns=series.columns[n_outputs:]))
# Past covariates: target history plus the features' history
p_covariates = covariates.stack(
    TimeSeries.from_times_and_values(times=series.time_index,
                                     values=df.iloc[:, 1:].to_numpy(),
                                     columns=series.columns))

scaler_f_covs = Scaler()
f_cov_train, f_cov_val = f_covariates.split_after(training_cutoff)
scaler_f_covs.fit(f_cov_train)
f_covariates_transformed = scaler_f_covs.transform(f_covariates)

scaler_p_covs = Scaler()
p_cov_train, p_cov_val = p_covariates.split_after(training_cutoff)
scaler_p_covs.fit(p_cov_train)
p_covariates_transformed = scaler_p_covs.transform(p_covariates)

quantiles = [0.1, 0.25, 0.5, 0.75, 0.9]

model = TFTModel(input_chunk_length=look_back,
                 output_chunk_length=horizon,
                 hidden_size=32,
                 lstm_layers=1,
                 full_attention=True,
                 dropout=0.1,
                 num_attention_heads=4,
                 batch_size=32,
                 n_epochs=250,
                 add_relative_index=False,
                 add_encoders=None,
                 # likelihood=None,
                 # loss_fn=MSELoss(),
                 likelihood=QuantileRegression(quantiles=quantiles),  # QuantileRegression is the default
                 force_reset=True,
                 pl_trainer_kwargs={"accelerator": "gpu", "gpus": [0],
                                    "enable_progress_bar": True,
                                    "enable_model_summary": True},
                 optimizer_cls=torch.optim.SGD,
                 optimizer_kwargs={'lr': 0.01})

model.fit(train_transformed['target'],
          future_covariates=f_covariates_transformed,
          past_covariates=p_covariates_transformed)
Expected behavior Training should run to completion. Instead, it starts but gets stuck and never finishes a single epoch.
System:
- Python version: 3.9
- darts version: 0.17.0
About this issue
- State: closed
- Created 2 years ago
- Comments: 17 (4 by maintainers)
True, in that case could you try uninstalling ipywidgets?
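For reference, removing ipywidgets (the maintainer's suggestion) is typically done with pip; the exact command depends on how the environment was set up:

```shell
# Remove ipywidgets, whose notebook progress-bar integration can hang
pip uninstall -y ipywidgets
```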
Can you still try to use max_samples_per_ts=1 in the fit() call and tell if the problem persists?
Hi, I have tried both options: running on CPU, and running the new version 0.17.1 with both GPU and CPU. With all of the options it gets stuck at the beginning like so:
Could this be too many parameters? Does training usually run smoothly on a model this big? Or perhaps there are too many samples.
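The "too many samples" hypothesis is easy to sanity-check: with a sliding-window setup, the number of training samples per series grows with its length. A rough estimate (the 3-year figure below is an assumption; the issue only says "several years", and darts' exact windowing may differ):

```python
# Approximate number of sliding windows extractable from one hourly series,
# assuming one training sample per possible (input, output) window position.
hours_per_year = 24 * 365
n_points = 3 * hours_per_year          # assumed: ~3 years of hourly data
look_back, horizon = 192, 192          # input/output chunk lengths from the issue
n_samples = n_points - (look_back + horizon) + 1
print(n_samples)
```

This is why `max_samples_per_ts=1` is a useful diagnostic: it caps the dataset at one sample per series, so if training then proceeds, the dataset size (not the model) was the bottleneck.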
I could not reproduce the issue (on version 0.17.1). Could you try running it on CPU instead of GPU? With
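The comment above appears truncated after "With"; a plausible CPU configuration it may have pointed to (an assumption, not verbatim from the thread) is to swap the trainer kwargs passed to `TFTModel`:

```python
# Hypothetical CPU trainer kwargs; replace the GPU ones in the TFTModel call
pl_trainer_kwargs = {
    "accelerator": "cpu",
    "enable_progress_bar": True,
    "enable_model_summary": True,
}
print(pl_trainer_kwargs["accelerator"])
```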
It would also interest me whether you still get the issue in version 0.17.1.