sktime: [BUG] Conversion from_multi_index_to_3d_numpy(X=obj)
Describe the bug
I'd like to start by saying this is probably an issue with my code. However, whenever I run basic models from the examples, I keep running into a reshape error.
For instance, I first run my data through check_raise(X_train, mtype="pd-multiindex") and it returns True. For this example I made a time series dataset with 38 instances (subjects), each with 1 time point and 2 columns. Note that the math is the same with real data.
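For reference, a frame with this layout can be built roughly as follows. This is a minimal sketch with random values; the level names ("instance", "time") and column names ("var_0", "var_1") are assumptions, not the original data.

```python
import numpy as np
import pandas as pd

n_instances, n_timepoints, n_columns = 38, 1, 2

# Two-level row index: (instance, time). Every instance has the same time labels.
index = pd.MultiIndex.from_product(
    [range(n_instances), range(n_timepoints)],
    names=["instance", "time"],  # assumed names
)

rng = np.random.default_rng(0)
X_train = pd.DataFrame(
    rng.normal(size=(n_instances * n_timepoints, n_columns)),
    index=index,
    columns=["var_0", "var_1"],  # assumed names
)

print(X_train.shape)  # rows = instances * time points, columns = variables
```

The frame holds 38 * 1 * 2 = 76 values in total, which is the array size reported in the first error.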
To Reproduce
I run the code:
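from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline  # assumed: the original snippet does not show which make_pipeline was imported
from sktime.classification.distance_based import KNeighborsTimeSeriesClassifier

classifier = KNeighborsTimeSeriesClassifier(distance="euclidean", n_jobs=-1)
pipe = make_pipeline(classifier)
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
accuracy_score(y_test, y_pred)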
and it returns: ValueError: cannot reshape array of size 76 into shape (38,38,2)
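The numbers in the message can be checked with plain NumPy. The sketch below only reproduces the arithmetic; it does not call sktime, and the zero-filled array is a stand-in for the real data.

```python
import numpy as np

n_instances, n_timepoints, n_columns = 38, 1, 2

# Stand-in for the 2D values of the multiindex frame: 38 rows, 2 columns.
values = np.zeros((n_instances * n_timepoints, n_columns))

# Number of elements the target shape in the error message would need.
required = 38 * 38 * 2

try:
    values.reshape(38, 38, 2)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    message = str(err)

# The shape that matches the data as described (1 time point per instance).
X_3d = values.reshape(n_instances, n_timepoints, n_columns)
```

The data has 76 elements, while the shape (38, 38, 2) needs 2888, so the second dimension appears to have been inferred as 38 rather than 1.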
I also took another example this time using:
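from sktime.classification.compose import ColumnEnsembleClassifier
from sktime.classification.interval_based import TimeSeriesForestClassifier

clf = ColumnEnsembleClassifier(
    estimators=[
        ("TSF0", TimeSeriesForestClassifier(n_estimators=100), [0])
    ]
)
pipe = make_pipeline(clf)
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
accuracy_score(y_test, y_pred)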
and it returns: ValueError: cannot reshape array of size 38 into shape (38,38,1)
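One possible explanation, which is only a guess and is not confirmed anywhere in this issue: if the time dimension were inferred by counting distinct labels in the second index level across the whole frame, rather than rows per instance, then a frame in which each subject carries its own time label would produce a target shape of (38, 38, 1). The sketch below illustrates that hypothesis with made-up data; it is not sktime's code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row per instance, but each row has a different time label.
index = pd.MultiIndex.from_arrays(
    [np.arange(38), np.arange(38)],
    names=["instance", "time"],  # assumed names
)
X = pd.DataFrame({"var_0": np.zeros(38)}, index=index)

# Counting distinct labels per level over the whole frame.
n_instances = X.index.get_level_values(0).nunique()
n_time_labels = X.index.get_level_values(1).nunique()
inferred_shape = (n_instances, n_time_labels, X.shape[1])

# Counting rows per instance gives the actual series lengths.
per_instance_lengths = X.groupby(level=0).size()

print(inferred_shape, per_instance_lengths.unique())
```

If this is what happens, comparing the number of distinct time labels with the per-instance row counts on the real data should show the same mismatch.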
Expected behavior
I expected both models to run, since the data passed the check.
Additional context
When I add default padding with pad = PaddingTransformer(pad_length=None, fill_value=0), both models work. This is strange, as I am unsure what is being padded when all of my sequences are truncated to 1 time point each.
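To illustrate why this is puzzling, here is a rough NumPy sketch of what zero-padding to the longest series does. The helper pad_to_longest is hypothetical and is not sktime's PaddingTransformer; it only shows that nothing is added when all series already have the same length.

```python
import numpy as np

def pad_to_longest(series_list, fill_value=0):
    """Pad each (n_timepoints, n_columns) array with fill_value rows up to the longest length."""
    target = max(len(s) for s in series_list)
    return np.stack([
        np.pad(s, ((0, target - len(s)), (0, 0)), constant_values=fill_value)
        for s in series_list
    ])

# 38 series with 1 time point and 2 columns each: already equal length.
equal = [np.ones((1, 2)) for _ in range(38)]
padded_equal = pad_to_longest(equal)

# Unequal lengths for comparison: the shorter series gets two rows of zeros.
ragged = [np.ones((1, 2)), np.ones((3, 2))]
padded_ragged = pad_to_longest(ragged)
```

With equal-length input the values are unchanged, so if padding makes the models run, the difference presumably comes from how the data is restructured along the way rather than from any added values.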
Versions
System:
python: 3.10.7 (v3.10.7:6cc6b13308, Sep 5 2022, 14:02:52) [Clang 13.0.0 (clang-1300.0.29.30)]
executable: /Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10
machine: macOS-12.0.1-arm64-arm-64bit

Python dependencies:
pip: 22.2.2
setuptools: 59.8.0
sklearn: 1.1.2
sktime: 0.13.4
statsmodels: 0.13.2
numpy: 1.22.4
scipy: 1.8.1
pandas: 1.4.4
matplotlib: 3.6.0
joblib: 1.2.0
numba: 0.56.2
pmdarima: None
tsfresh: None
About this issue
- State: closed
- Created 2 years ago
- Comments: 18 (3 by maintainers)
I have a deadline for presenting the results on Friday. After that, I'll try to post here what went wrong with the pipeline, if it's easy enough to debug. The workaround was to rebuild the data from scratch and use 3D numpy arrays instead of a pandas multiindex.
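A minimal sketch of that kind of conversion, assuming equal-length series and the (n_instances, n_columns, n_timepoints) axis order that sktime uses for 3D numpy input. The frame below is made-up stand-in data with assumed level and column names.

```python
import numpy as np
import pandas as pd

# Stand-in multiindex frame: 38 instances, 1 time point, 2 columns.
index = pd.MultiIndex.from_product(
    [range(38), range(1)], names=["instance", "time"]  # assumed names
)
X_mi = pd.DataFrame(
    np.arange(76, dtype=float).reshape(38, 2),
    index=index,
    columns=["var_0", "var_1"],  # assumed names
)

# Build one (n_columns, n_timepoints) block per instance and stack them.
# This relies on every instance having the same number of rows.
X_3d = np.stack([
    group.to_numpy().T for _, group in X_mi.groupby(level=0, sort=False)
])

print(X_3d.shape)
```

Grouping by the instance level avoids inferring the time dimension from the index labels, so it does not depend on the time labels being shared across instances.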