pyts: SAX-VSM, constant time series error
Description
When running SAX-VSM on my time series, I get the following error:
At least one sample is constant.
I tried filtering out all the constant time series with
X = X[np.where(~(np.var(X, axis=1) == 0))[0]]
but to no avail.
I also tried fitting the model on a single non-constant array and still got the error. I suspect the error is raised when the SAX approximation would give the same symbol for an entire window, which would mean that the window itself is constant. For example, with a word size of 3, the error would appear if the SAX transform yielded ‘aaa’. Could that be the case?
Steps/Code to Reproduce
<<
import numpy as np
from pyts.classification import SAXVSM
X_train = np.array([[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2]])
y_train = np.array([1])
clf = SAXVSM(window_size=0.5, word_size=0.5, n_bins=3, strategy='normal')
clf.fit(X_train, y_train)
>>
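To check the hypothesis that constant windows (rather than constant series) are involved, here is a minimal sketch using only NumPy. It does not call pyts. The helper name `constant_window_mask` is hypothetical, and the window length of 10 is an assumption: it treats `window_size=0.5` as a fraction of the 20 timestamps.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def constant_window_mask(series, window_length):
    """Return a boolean mask with True for every sliding window that is constant."""
    # Hypothetical helper for illustration; not part of pyts.
    windows = sliding_window_view(np.asarray(series), window_length)
    return windows.max(axis=1) == windows.min(axis=1)


# Same series as in the reproduction: sixteen 1s followed by four 2s.
series = np.array([1] * 16 + [2] * 4)

# Assumption: window_size=0.5 corresponds to 10 of the 20 timestamps.
mask = constant_window_mask(series, 10)

print("whole series constant:", np.var(series) == 0)
print(mask.sum(), "of", mask.size, "windows are constant")
```

The whole series has non-zero variance, so the `np.var` filter keeps it, yet several of its length-10 windows contain only 1s. That is consistent with the error being about windows rather than whole samples.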
Versions
NumPy 1.20.1, SciPy 1.6.1, Scikit-Learn 0.24.1, Numba 0.53.1, pyts 0.11.0
Additionally, I get this error:
'n_bins' must be greater than or equal to 2 and lower than or equal to min(word_size, 26)
If n_bins represents the alphabet size of the PAA approximation, why must it be lower than or equal to the word size? Doesn’t that make combinations such as alphabet = {a,b,c,d,e} with word size = 3 impossible? (That shouldn’t be the case.)
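To make the question concrete, here is a minimal sketch of textbook SAX on one window, written from scratch. It is not pyts's implementation and does not explain why pyts enforces the constraint. The function name `sax_word` is hypothetical, and the breakpoints are assumed to be standard normal quantiles, as in the usual description of SAX.

```python
import numpy as np
from statistics import NormalDist


def sax_word(window, word_size, n_bins):
    """Textbook SAX for one window: standardize, PAA, then discretize."""
    # Hypothetical helper for illustration; not part of pyts.
    window = np.asarray(window, dtype=float)
    std = window.std()
    if std == 0:
        # A constant window cannot be standardized (division by zero).
        raise ValueError("constant window cannot be standardized")
    z = (window - window.mean()) / std

    # PAA: average over word_size roughly equal segments.
    paa = np.array([seg.mean() for seg in np.array_split(z, word_size)])

    # Breakpoints: n_bins - 1 quantiles of the standard normal distribution.
    breakpoints = [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]

    # Map each PAA value to a bin index, then to a letter.
    symbols = np.digitize(paa, breakpoints)
    return "".join(chr(ord("a") + s) for s in symbols)


# Alphabet of 5 symbols {a, b, c, d, e} with a word size of 3.
print(sax_word([0, 1, 2, 3, 4, 5, 6, 7, 8], word_size=3, n_bins=5))  # ace
```

In this sketch, an alphabet of 5 with a word size of 3 is well defined, which is why the `min(word_size, 26)` bound looks surprising. The same sketch also fails on a constant window, because standardization divides by a zero standard deviation.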
About this issue
- Original URL
- State: open
- Created 3 years ago
- Comments: 46 (25 by maintainers)
Thanks @johannfaouzi - I came across this error using 0.11; after installing from main (ff746ef), it’s working for me!