Towards Efficient Similarity-Aware Time-Series Classification
Time Series Classification (TSC), as one of the most fundamental data mining tasks, has attracted tremendous research attention over the past two decades. Given a set of time series instances with their corresponding class labels (a training set), the goal of classification is to train a model that can accurately predict the class labels of future, previously unseen time series. However, when the training set is small, most existing approaches perform poorly due to the lack of labeled data. Semi-supervised time series classification has been proposed to address this issue. One recent such model, SIM-TSC, integrates similarity and deep learning into a single model. SIM-TSC uses a popular, shift-invariant distance measure called Dynamic Time Warping (DTW) to create a graph in which similar time series instances are connected by higher edge weights, and then trains a graph neural network on it. However, it requires the construction of a full similarity matrix, which has quadratic time complexity in the data size and thus limited usability on larger datasets. To address this challenge, instead of using a DTW similarity matrix to build the graph, we propose computing a fast, linear-time approximation that lower bounds the original DTW distances. The lower bounding distance we adopt is called LB-Keogh. The similarity matrix built with LB-Keogh serves as the backbone of the graph neural network. In our experiments, we evaluate our method on eight real-world datasets from the well-known UCR time series classification archive. The results demonstrate that our approach significantly reduces the running time of creating the similarity matrix for the graph, especially on large datasets, while achieving classification accuracy comparable to that of the DTW similarity matrix.
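As a minimal sketch of the lower bound mentioned above, the following Python function computes the standard LB-Keogh bound: it builds the upper/lower envelope of one series under a Sakoe-Chiba warping window of half-width `r` (a hypothetical parameter name chosen here for illustration) and sums the squared excursions of the other series outside that envelope, which takes linear time in the series length. This is a generic illustration of LB-Keogh, not the paper's actual implementation.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """LB-Keogh lower bound on the DTW distance between two
    equal-length series, using a warping window of half-width r.
    Runs in O(n) per pair (vs. O(n^2) for full DTW)."""
    query = np.asarray(query, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    n = len(query)
    total = 0.0
    for i in range(n):
        # Envelope of the query within the warping window around i.
        lo, hi = max(0, i - r), min(n, i + r + 1)
        upper = query[lo:hi].max()
        lower = query[lo:hi].min()
        c = candidate[i]
        # Only points falling outside the envelope contribute.
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (c - lower) ** 2
    return np.sqrt(total)
```

Because the bound never exceeds the true DTW distance, pairwise LB-Keogh values can stand in for DTW when populating the similarity matrix, trading some tightness for a large reduction in construction time.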
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.