Towards Symbolic Time Series Representation Improved by Kernel Density Estimators. (arXiv:2205.12960v1 [cs.LG])

This paper deals with symbolic time series representation. It builds on
the popular mapping technique Symbolic Aggregate approXimation (SAX),
which is extensively utilized in sequence classification, pattern mining,
anomaly detection, time series indexing and other data mining tasks. However,
this method has the disadvantage that it works reliably only for time series
with a Gaussian-like distribution. In our previous work we proposed an
improvement of SAX, called dwSAX, which can deal with Gaussian as well as
non-Gaussian data distributions. Recently we have made further progress in our
solution, edwSAX. Our goal was to cover the information space optimally by
means of sufficient alphabet utilization, and to satisfy the lower bounding
criterion as tightly as possible. We describe our approach here, including
an evaluation on commonly used criteria such as time series reconstruction error
and Euclidean distance lower bounding, with promising improvements over SAX.
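
For readers unfamiliar with the baseline, the following is a minimal sketch of classical SAX as it is commonly described: z-normalize the series, reduce it with Piecewise Aggregate Approximation (PAA), then map each segment mean to a symbol using breakpoints that split the standard normal distribution into equiprobable regions. It is not an implementation of dwSAX or edwSAX, whose details are not given in this abstract. The function name `sax`, its parameters, and the requirement that the series length be divisible by the number of segments are assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Illustrative classical SAX: z-normalize, PAA, Gaussian breakpoints."""
    if n_segments <= 0 or len(series) % n_segments != 0:
        # Simplifying assumption of this sketch, not a limit of SAX itself.
        raise ValueError("series length must be divisible by n_segments")

    # 1. z-normalize so the values have zero mean and unit variance.
    mu, sigma = mean(series), pstdev(series)
    if sigma > 0:
        z = [(x - mu) / sigma for x in series]
    else:
        z = [0.0] * len(series)  # constant series

    # 2. PAA: replace each equal-length segment with its mean.
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # 3. Breakpoints are the quantiles of N(0, 1) that cut it into
    #    len(alphabet) equiprobable regions. This is the Gaussian assumption.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # 4. Map each segment mean to the symbol of the region it falls into.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


word = sax([0, 0, 0, 0, 10, 10, 10, 10], n_segments=2)
```

Because the breakpoints in step 3 are fixed by the standard normal distribution, a series whose normalized values are far from Gaussian can leave some symbols rarely or never used, which is the alphabet utilization problem the abstract refers to.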


