Time series data mining has received much greater interest along with the increase in temporal data sets from different domains such as medicine, finance, multimedia, etc. Representations are important to reduce dimensionality and generate useful similarity measures. High-level representations such as Fourier transforms, wavelets, piecewise polynomial models, etc., were considered previously. Recently, autoregressive kernels were introduced to reflect the similarity of the time series. We introduce a novel approach to model the dependency structure in time series that generalizes the concept of autoregression to local autopatterns. Our approach generates a pattern-based representation along with a similarity measure called learned pattern similarity (LPS). A tree-based ensemble-learning strategy that is fast and insensitive to parameter settings is the basis for the approach. Then, a robust similarity measure based on the learned patterns is presented. This unsupervised approach to represent and measure the similarity between time series generally applies to a number of data mining tasks (e.g., clustering, anomaly detection, classification). Furthermore, an embedded learning of the representation avoids pre-defined features and an extraction step which is common in some feature-based approaches. The method generalizes in a straightforward manner to multivariate time series. The effectiveness of LPS is evaluated on time series classification problems from various domains. We compare LPS to eleven well-known similarity measures. Our experimental results show that LPS provides fast and competitive results on benchmark datasets from several domains. Furthermore, LPS provides a research direction and template approach that breaks from the linear dependency models to potentially foster other promising nonlinear approaches.
- Pattern discovery
- Regression tree
- Time series
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computer Networks and Communications