Yeah, what I mean is to speed up the specific transients only.
Here's an example of recording amplitude & phase data to a buffer~, and then playing it back, via pfft~ and gen~. This could form the basic infrastructure for what you want to do.
I'd first try to get the transient detection going. For that you'll need to store state *per bin*. The simplest approach would be an onset threshold: a transient begins when the bin's amplitude rises above the threshold, and ends when it drops below it again. You could easily test this by scaling the output amplitude to zero for any bin where no transient is detected.
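To make the logic concrete, here is a rough sketch in plain Python rather than gen~, just to show the per-bin bookkeeping. The function name `gate_transients`, the frame layout (a list of frames, each a list of bin amplitudes) and the threshold value are all made up for illustration; in the patch the same state would live in per-bin storage inside gen~.

```python
def gate_transients(frames, threshold):
    """Zero every bin whose amplitude is not above `threshold`.

    `frames` is a list of FFT frames, each a list of per-bin amplitudes
    (a hypothetical layout chosen for this sketch).
    Returns (gated_frames, active), where `active` holds the per-bin
    transient flags as they stand after the last frame.
    """
    num_bins = len(frames[0]) if frames else 0
    # One flag per bin. With a single threshold the flag simply mirrors
    # the comparison, but keeping it per bin is what later steps rely on.
    active = [False] * num_bins
    gated = []
    for frame in frames:
        out = []
        for b, amp in enumerate(frame):
            active[b] = amp > threshold
            # Test output: pass the bin through only while a transient is flagged.
            out.append(amp if active[b] else 0.0)
        gated.append(out)
    return gated, active
```

If the gated output sounds like only the attacks are left, the detector is doing roughly the right thing.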
A better detector might use different rising/falling thresholds and slopes. I'd recommend looking at the gen~ vectral example as the simplest demo of per-bin processing (hint: it uses a [data vectorsize] object to store per bin information).
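The separate rising/falling thresholds amount to hysteresis, which could look something like the sketch below (again plain Python, not gen~). The names `detect_with_hysteresis`, `rise` and `fall` are hypothetical, and the slope test is left out; this only shows the two-threshold part.

```python
def detect_with_hysteresis(frames, rise, fall):
    """Per-bin transient flags using separate on/off thresholds.

    A bin switches on when its amplitude exceeds `rise` and only
    switches off again once it drops below `fall` (assumed fall < rise).
    Returns one list of per-bin flags for every input frame.
    """
    num_bins = len(frames[0]) if frames else 0
    active = [False] * num_bins  # state remembered per bin between frames
    flags = []
    for frame in frames:
        for b, amp in enumerate(frame):
            if not active[b] and amp > rise:
                active[b] = True       # onset
            elif active[b] and amp < fall:
                active[b] = False      # release
        flags.append(list(active))     # copy so later frames don't overwrite it
    return flags
```

Here the stored flag really matters: an amplitude between `fall` and `rise` keeps whatever state the bin was already in, which stops the detector from chattering around a single threshold.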
Once that works, you could use the presence of a transient to increase the playback speed *for that bin only*.
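One way to picture that last step is a separate read position per bin, advanced each frame at a rate that depends on the bin's transient flag. This is only a sketch under my own assumptions: the name `advance_read_positions`, the rate values and the wrap-around at the end of the recording are all hypothetical, and fractional positions would need some kind of interpolation between recorded frames.

```python
def advance_read_positions(positions, active, num_frames,
                           normal_rate=1.0, transient_rate=2.0):
    """Step each bin's read position through the recorded frames.

    `positions` holds one read position (in frames) per bin and
    `active` the matching per-bin transient flags. Bins flagged as
    transient move at `transient_rate`, the rest at `normal_rate`.
    Positions wrap at `num_frames` (assumes looped playback).
    """
    new_positions = []
    for pos, is_transient in zip(positions, active):
        rate = transient_rate if is_transient else normal_rate
        new_positions.append((pos + rate) % num_frames)
    return new_positions
```

Because each bin carries its own position, bins inside a transient run ahead of the others, so you'd probably want to think about how (or whether) they get back in step afterwards.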