Abstract:
The current Shapelet extraction algorithm for time series ordinal classification evaluates Shapelets by computing the Pearson or Spearman correlation coefficient between the Euclidean distances and the label distances from the time series to each Shapelet, which makes it inefficient. To address this problem, this paper first proposes a Shapelet measure, CD-Cover (concentration and dominance of coverage), based on time series represented with SAX (symbolic aggregate approximation). The measure takes into account both the concentration and the dominance of a Shapelet's coverage of the time series dataset. Second, this paper proposes a Shapelet extraction algorithm based on random sampling. The algorithm uses a Bloom filter to pre-prune Shapelet candidates and removes self-similar Shapelets to post-prune the extracted results. Experimental results on 11 public time series datasets show that the Shapelets extracted by the proposed algorithm have better ordinal classification ability than those extracted by existing methods, and that the proposed algorithm is also more computationally efficient.
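For readers unfamiliar with the SAX representation on which CD-Cover is built, the following is a minimal sketch of the standard SAX transform (z-normalization, piecewise aggregate approximation, then discretization with equiprobable Gaussian breakpoints). It is not the paper's implementation: the function name `sax`, its parameters, the default alphabet, and the requirement that the series length be divisible by the number of segments are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series to a SAX word (illustrative sketch only)."""
    n = len(series)
    # Assumption for simplicity: segments must have equal integer width.
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("len(series) must be a positive multiple of n_segments")

    # Step 1: z-normalize the series (a constant series maps to all zeros).
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        normed = [0.0] * n
    else:
        normed = [(x - mu) / sigma for x in series]

    # Step 2: piecewise aggregate approximation (mean of each segment).
    width = n // n_segments
    paa = [mean(normed[i * width:(i + 1) * width]) for i in range(n_segments)]

    # Step 3: breakpoints that split N(0, 1) into equiprobable regions,
    # one region per alphabet symbol.
    size = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(k / size) for k in range(1, size)]

    # Step 4: map each segment mean to the symbol of the region it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)
```

With this sketch, `sax([1, 2, 3, 4, 5, 6, 7, 8], 4)` yields the word `abcd`: each pair of adjacent values falls into a successively higher region of the normal distribution. Subsequences of such words are the kind of symbolic candidates that a SAX-based Shapelet measure can operate on.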