I have a question about comparing sampled streams of data (this could apply to any kind of data: gestural, audio descriptors, etc.). I am looking for ways to sample and compare input data streams of varying lengths to each other, to find similar or contrasting data 'phrases'. I'm doing this matching using the zsa.dist external from Mikhail Malt and Emmanuel Jourdan's zsa.descriptors library: http://www.e--j.com/?page_id=83
My question concerns averaging out data streams, as I want a consistent way to compare streams of varying lengths. So far I am achieving this by averaging the streams down to a fixed length - i.e. 50 data points. The way I am doing this is by taking a sampled stream and rounding its length up or down to the nearest multiple of the desired fixed length (if rounded down the list is truncated, if rounded up it is padded with 0's). I then divide this rounded length by the fixed length to get an evenly spaced window size with which to average out the data.
So if the original list has 258 elements and I want it averaged or down-sampled to 50 points:
Truncate the list to 250 elements,
Divide this length by the fixed length desired, 250/50 = 5,
Take the average value of every 5 elements,
Make a new list out of these averages.
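In case it helps to see the steps above outside of a patch, here is a minimal sketch in Python of the same block-averaging idea (the function name and the use of plain lists are my own choices, not anything from the zsa library):

```python
def downsample(stream, n_points=50):
    """Average a list of arbitrary length down to n_points values,
    following the steps above: round the length to the nearest
    multiple of n_points (truncating or zero-padding), then average
    each fixed-size window."""
    window = max(1, round(len(stream) / n_points))  # elements per average
    target = window * n_points                      # length rounded to nearest multiple
    if len(stream) > target:
        stream = stream[:target]                    # rounded down: truncate
    else:
        stream = stream + [0.0] * (target - len(stream))  # rounded up: pad with 0's
    return [sum(stream[i:i + window]) / window
            for i in range(0, target, window)]

# e.g. a 258-element stream: rounded to 250, window size 5, 50 averages out
result = downsample(list(range(258)))
```

One thing this sketch made obvious to me: when the list is rounded up, the zero-padding drags the last average toward 0, which may or may not be what you want.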
As this approach was arrived at by trial and error, I just wanted to know if anyone has tried to achieve anything similar and has approached it in a different way. Searching the net for approaches to down-sampling data like this hasn't proved very useful for my purposes, or maybe I'm not looking in the right places :-)
Any ideas on how to improve this approach, or if there is another approach that may be more accurate?
I have attached an example patch to illustrate the idea.
Thanks in advance,