Timbre identification and matching
Hi,
I'm working on a method that replaces slices of an original sound file with 'matching' slices from a database, using the MuBu/PiPo audio descriptors to measure the degree of timbre similarity between the two sources. Given the overwhelming number of possible descriptor combinations, I'd like to optimise the set of timbre-matching parameters.
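To make the question concrete, here is a minimal sketch of the matching step as I picture it, written in Python rather than Max. It assumes every slice has already been reduced to a vector of descriptor values (for example per-slice means); the function names and the per-descriptor `weights` are hypothetical and not part of MuBu/PiPo.

```python
import math

def column_stds(vectors):
    """Standard deviation of each descriptor across the database,
    used to put descriptors with different units on a comparable scale."""
    n = len(vectors)
    dims = len(vectors[0])
    stds = []
    for d in range(dims):
        mean = sum(v[d] for v in vectors) / n
        var = sum((v[d] - mean) ** 2 for v in vectors) / n
        stds.append(math.sqrt(var) or 1.0)  # constant column: avoid dividing by zero
    return stds

def distance(a, b, stds, weights):
    """Weighted Euclidean distance between two descriptor vectors,
    each dimension scaled by its standard deviation."""
    return math.sqrt(sum(w * ((x - y) / s) ** 2
                         for x, y, s, w in zip(a, b, stds, weights)))

def best_match(target, database, weights):
    """Index of the database slice closest to the target slice."""
    stds = column_stds(database)
    return min(range(len(database)),
               key=lambda i: distance(target, database[i], stds, weights))

# Hypothetical descriptor vectors: [spectral centroid in Hz, spectral flatness]
database = [[500.0, 0.1], [3000.0, 0.6], [1200.0, 0.2]]
print(best_match([1100.0, 0.25], database, weights=[1.0, 1.0]))
```

Which descriptors go into the vectors, and how they are weighted, is exactly the part I would like to optimise.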
- What are good sources of information on the subject of timbre identification and comparison?
- Are there commonly used descriptor combinations / correlations / hierarchies for this purpose?
- Any experience with using different sets of descriptors for specific sound qualities (noisy/pitched, percussive/sustained, consonant/vowel, ...)?
Thanks a lot.
I would say in terms of a set of continua these would be useful:
noisy->"pure" (white noise-> single sine tone)
inharmonic->harmonic (multiple identifiable partials and their relation with each other)
dullness->brightness (how much of the available bandwidth of human hearing is used)
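As a rough illustration, each of these continua can be approximated with one standard descriptor: spectral flatness for noisy vs. pure, an inharmonicity measure for inharmonic vs. harmonic, and spectral centroid for dull vs. bright. The Python below is only a sketch under those assumptions, not how MuBu/PiPo computes them; the function names are made up, and `inharmonicity` assumes the partial frequencies and amplitudes have already been estimated by some separate peak-picking step.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins from 0 Hz up to Nyquist (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_flatness(mags, eps=1e-12):
    """Noisy vs. pure: geometric mean / arithmetic mean of the power spectrum.
    Near 0 for a single sine tone, near 1 for a perfectly flat spectrum."""
    power = [m * m + eps for m in mags]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic

def spectral_centroid(mags, sample_rate, frame_size):
    """Dull vs. bright: magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_size * m for k, m in enumerate(mags)) / total

def inharmonicity(partials, f0):
    """Inharmonic vs. harmonic: energy-weighted deviation of each partial from
    its nearest integer multiple of f0, scaled by 2 / f0 (0 = perfectly harmonic).
    `partials` is a list of (frequency_hz, amplitude) pairs."""
    energy = sum(a * a for _, a in partials)
    if energy == 0:
        return 0.0
    deviation = sum(abs(f - max(1, round(f / f0)) * f0) * a * a for f, a in partials)
    return 2.0 * deviation / (f0 * energy)

# A 250 Hz sine at 8 kHz sample rate, 256-sample frame (exactly on bin 8)
sine = [math.sin(2 * math.pi * 8 * t / 256) for t in range(256)]
mags = magnitude_spectrum(sine)
print(spectral_flatness(mags), spectral_centroid(mags, 8000, 256))
```

In practice you would compute these per analysis frame and then summarise them per slice (mean, standard deviation, and so on) before comparing slices.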
There's also this paper, which I believe is considered a classic among attempts to describe a "timbre space":
Perceptual effects of spectral modifications on musical timbres
The Journal of the Acoustical Society of America 63, 1493 (1978); https://doi.org/10.1121/1.381843
It proposes spectral centroid, attack centroid and spectral flux as higher-order dimensions.
These can be found implemented in Max in the zsa.descriptors package:
https://cycling74.com/projects/project14-zsadescriptors
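If you want to prototype outside Max, spectral flux is easy to sketch by hand. The Python below uses one common definition (Euclidean distance between successive magnitude spectra, each normalised to unit sum so that loudness changes alone do not register); this is an assumption on my part, and the zsa.descriptors object may define or scale flux differently. The function names are hypothetical.

```python
import math

def normalise(mags):
    """Scale a magnitude spectrum to unit sum (left unchanged if it is silent)."""
    total = sum(mags)
    return [m / total for m in mags] if total > 0 else list(mags)

def spectral_flux(prev_mags, cur_mags):
    """Euclidean distance between two successive normalised magnitude spectra."""
    prev_n = normalise(prev_mags)
    cur_n = normalise(cur_mags)
    return math.sqrt(sum((c - p) ** 2 for c, p in zip(cur_n, prev_n)))

def mean_flux(frames):
    """Average flux over a slice, given its per-frame magnitude spectra."""
    fluxes = [spectral_flux(a, b) for a, b in zip(frames, frames[1:])]
    return sum(fluxes) / len(fluxes) if fluxes else 0.0

# Same spectral shape at twice the level: flux stays at (or very near) zero
print(spectral_flux([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

A steady sustained tone gives a low mean flux, while a slice whose spectrum changes quickly (attacks, consonants) gives a high one, so it complements the static descriptors.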