The clustering behavior of sliding windows
6 comments · March 19, 2025
jmole
You'd think this would be better known. In Fourier analysis and other frequency or wavelet decomposition techniques, the data you get out depends a ton on the window function you use.
For example:
https://en.wikipedia.org/wiki/Sinc_function
https://en.wikipedia.org/wiki/Window_function#Examples_of_wi...
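To make the windowing point concrete, here is a minimal sketch that compares a rectangular window (no tapering) with a Hann window on a sine whose frequency falls between DFT bins. The signal length, the frequency of 10.5 cycles, and the helper names are illustrative assumptions, and the naive DFT is only there to keep the example dependency-free.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2 - 1."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]


def leakage_ratio(spectrum, far_bin):
    """Magnitude at a bin far from the peak, relative to the peak."""
    return spectrum[far_bin] / max(spectrum)


N = 64  # assumed signal length
# 10.5 cycles over N samples: deliberately not aligned with any DFT bin.
signal = [math.sin(2 * math.pi * 10.5 * t / N) for t in range(N)]

rect_spectrum = dft_magnitudes(signal)  # rectangular window = raw samples
hann_spectrum = dft_magnitudes([w * x for w, x in zip(hann(N), signal)])

rect_leak = leakage_ratio(rect_spectrum, far_bin=25)
hann_leak = leakage_ratio(hann_spectrum, far_bin=25)
print(f"relative leakage at bin 25: rect={rect_leak:.5f} hann={hann_leak:.5f}")
```

Same samples, same transform, but the energy that shows up far from the true frequency differs a lot between the two windows.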
somecontext
Of further potential interest: This paper cites an earlier paper by Keogh and Lin with the provocative title "Clustering of time-series subsequences is meaningless", available online at https://www.cs.ucr.edu/~eamonn/meaningless.pdf
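For readers unfamiliar with the setup, this is a minimal sketch of what "sliding-window subsequences" means, assuming a step of one sample. The function name and the toy series are hypothetical. It only illustrates how much adjacent windows overlap and does not reproduce the argument of either paper.

```python
def sliding_windows(series, width, step=1):
    """Return every subsequence of length `width`, advancing by `step`."""
    return [series[i:i + width]
            for i in range(0, len(series) - width + 1, step)]


series = [0, 1, 4, 9, 16, 25, 36, 49]  # toy data
windows = sliding_windows(series, width=4)

# With step=1, each window repeats all but one sample of its predecessor.
shares_all_but_one = [a[1:] == b[:-1] for a, b in zip(windows, windows[1:])]

for w in windows:
    print(w)
```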
newer_vienna
Interesting proofs, though I would love more examples that show the dataset and the unexpected results, along with a description of what leads to these traps and how to avoid them.
PaulHoule
... if I get it right, the whole idea of clustering sliding windows is wrong, so the question of "what should you do instead?" is an interesting one.
I'd imagine two answers are: (1) for time series which are somewhat periodic, you might cut out individual days or weeks and try to cluster them against each other; (2) for time series which are intermittent, you might create some definition of an "event" (an earthquake, or a particle passing through a detector) and then cluster events; and maybe (3) for something episodic such as "heart rate during a workout", you would cluster episodes.
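A rough sketch of the first two ideas, under assumptions: fixed-length periods stand in for "days or weeks", and an "event" is defined as a run of consecutive samples above a threshold. Both function names and both definitions are illustrative choices. The resulting segments do not overlap, and they could then be passed to whatever clustering method you prefer.

```python
def split_into_periods(series, period):
    """Cut a series into non-overlapping chunks of length `period`.

    A trailing partial chunk is dropped so all segments have equal length.
    """
    usable = len(series) - len(series) % period
    return [series[i:i + period] for i in range(0, usable, period)]


def extract_events(series, threshold):
    """Collect runs of consecutive samples strictly above `threshold`."""
    events, current = [], []
    for value in series:
        if value > threshold:
            current.append(value)
        elif current:
            events.append(current)
            current = []
    if current:  # series ended in the middle of an event
        events.append(current)
    return events


hourly = list(range(50))  # toy data: 50 hourly readings
days = split_into_periods(hourly, period=24)

detector = [0, 0, 5, 7, 0, 0, 9, 0]  # toy intermittent signal
events = extract_events(detector, threshold=1)
```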
pottertheotter
Well, I thought this was going to be about home design and sliding windows.
Very interesting, but I wonder if it would hold for clustering with other distance metrics such as dynamic time warping (DTW).
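For reference, a minimal sketch of a DTW distance using the textbook dynamic-programming recurrence, with absolute difference as the per-sample cost (an assumption; squared difference is also common). It has no warping-window constraint and says nothing about whether the result discussed here carries over to DTW.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays on the same sample
                cost[i][j - 1],      # b advances, a stays on the same sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


bump = [0, 1, 2, 1, 0]
delayed_bump = [0, 0, 1, 2, 1, 0]  # same shape, shifted by one sample
print(dtw_distance(bump, delayed_bump))
```

Unlike a pointwise comparison, this handles sequences of different lengths and tolerates shifts in time.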