Classify time series — and see why.
SAX-VSM turns each class of time series into a weighted bag of SAX words. Classification is a cosine-similarity lookup, and because every word maps back to a subsequence, the model tells you which shapes drove its decision.
Two classic ideas, composed
SAX-VSM joins Symbolic Aggregate approXimation — a symbolic representation of time series — with the vector space model from information retrieval. Each class collapses into one tf·idf-weighted term vector; an unlabeled series is scored against each by cosine similarity and labeled by the closest.
Discretize
A sliding window combined with SAX converts all of a class's training series into a single bag of SAX words, one bag per class.
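The discretization step can be sketched in a few lines of Python. This is a minimal illustration rather than the library's implementation: the function names (`znorm`, `paa`, `sax_word`, `sax_bag`, `class_bag`) are hypothetical, the window length is assumed to be divisible by the number of PAA segments, the population standard deviation is used, and numerosity reduction is assumed to mean "skip a word that repeats the immediately preceding one".

```python
import bisect
from collections import Counter
from statistics import NormalDist, mean, pstdev


def znorm(xs, eps=1e-8):
    """Z-normalize a window. Assumption: near-constant windows become all zeros."""
    m, s = mean(xs), pstdev(xs)
    if s < eps:
        return [0.0 for _ in xs]
    return [(x - m) / s for x in xs]


def paa(xs, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment."""
    if len(xs) % segments:
        raise ValueError("sketch assumes len(xs) is divisible by segments")
    size = len(xs) // segments
    return [mean(xs[i * size:(i + 1) * size]) for i in range(segments)]


def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(window, segments, alphabet_size):
    """Map one window to a SAX word over the letters 'a', 'b', ..."""
    cuts = breakpoints(alphabet_size)
    letters = []
    for value in paa(znorm(window), segments):
        letters.append(chr(ord("a") + bisect.bisect_right(cuts, value)))
    return "".join(letters)


def sax_bag(series, window, segments, alphabet_size):
    """Slide a window over one series and count the resulting SAX words."""
    bag = Counter()
    previous = None
    for start in range(len(series) - window + 1):
        word = sax_word(series[start:start + window], segments, alphabet_size)
        if word != previous:  # numerosity reduction: drop consecutive repeats
            bag[word] += 1
        previous = word
    return bag


def class_bag(series_list, window, segments, alphabet_size):
    """Merge the bags of all training series that share a class label."""
    total = Counter()
    for series in series_list:
        total.update(sax_bag(series, window, segments, alphabet_size))
    return total
```

With a window of 4, 2 segments, and a 2-letter alphabet, `sax_word([0, 1, 2, 3], 2, 2)` yields `ab`: the first half of the window lies below the mean and the second half above it.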
Weight
tf·idf scales each word by how characteristic it is of its class, yielding one weight vector per class.
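As a rough sketch of the weighting step, the snippet below treats each class bag as one "document". It assumes a log-scaled term frequency, `log(1 + count)`, and an inverse document frequency of `log10(N / df)`, where `N` is the number of classes and `df` is the number of classes containing the word. The exact tf·idf variant is an assumption here, and the function name `tfidf` is hypothetical.

```python
import math
from collections import Counter


def tfidf(class_bags):
    """Turn {label: {word: count}} into {label: {word: tf-idf weight}}."""
    n_classes = len(class_bags)

    # Document frequency: in how many class bags does each word occur?
    df = Counter()
    for bag in class_bags.values():
        df.update(word for word, count in bag.items() if count > 0)

    weights = {}
    for label, bag in class_bags.items():
        weights[label] = {
            word: math.log(1 + count) * math.log10(n_classes / df[word])
            for word, count in bag.items()
            if count > 0
        }
    return weights
```

Under this scheme a word that occurs in every class gets an idf of `log10(1) = 0`, so it carries no weight, while a word unique to one class keeps a positive weight that grows with its count.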
Classify
An unlabeled series is discretized the same way and assigned the label of the most cosine-similar class vector.
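The classification step can be sketched as a cosine-similarity lookup over sparse word vectors. This is an illustrative sketch, not the library's API: `cosine` and `classify` are hypothetical names, the query is assumed to be a plain word-count dictionary, and each class is assumed to be a word-to-weight dictionary.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty or all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def classify(query_bag, class_weights):
    """Return the best-matching label and the per-class similarity scores."""
    scores = {
        label: cosine(query_bag, weights)
        for label, weights in class_weights.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Because the dot product is a sum over shared words, the individual terms `value * b[word]` show which words, and therefore which subsequences, contributed most to a class's score.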
A step-by-step walk through the math
The algorithm pages build SAX-VSM from the ground up — z-normalization, PAA, the SAX symbol table, sliding-window discretization and numerosity reduction, then tf·idf and cosine similarity — each with worked R examples and figures.

Get the library
Add the dependency
SAX-VSM ships as net.seninp:sax-vsm:2.0.0 on Maven Central.
<dependency>
  <groupId>net.seninp</groupId>
  <artifactId>sax-vsm</artifactId>
  <version>2.0.0</version>
</dependency>
Cite the work
Senin, P., Malinchik, S. SAX-VSM: Interpretable Time Series Classification Using SAX and Vector Space Model. ICDM 2013.