Totally remove any trace of vocals

jebb's icon

After running various algorithms (voice removal algorithms, for example, but others too), I often end up with a signal in which "traces" of voice remain.

Do you know of a DSP technique, STFT-based or otherwise, that could totally eliminate any trace of vocals?
(I don't care if it removes too much, but I want no vocals left at all.)

PS1: I'm not looking for an algorithm that needs to be trained on voice-free music or on isolated voice; I'm looking for one that can run in real time. A small latency is OK (up to 1 second, if a few FFT frames are needed in advance).
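To make the latency budget concrete, here is a minimal sketch of how many complete STFT frames fit inside a given look-ahead. The sample rate, FFT size and hop size below are illustrative assumptions, not values from my actual setup, and `lookahead_frames` is just a hypothetical helper name.

```python
def lookahead_frames(sample_rate, n_fft, hop, max_latency_s):
    """Number of complete STFT frames that fit in the latency budget.

    All parameter values used below are assumptions for illustration.
    """
    budget = int(max_latency_s * sample_rate)  # look-ahead in samples
    if budget < n_fft:
        return 0  # not even one full frame fits
    # The first frame needs n_fft samples, each further frame needs `hop` more.
    return (budget - n_fft) // hop + 1


# Assumed example: 44.1 kHz audio, 2048-point FFT, hop of 512 samples, 1 s budget.
frames = lookahead_frames(44100, 2048, 512, 1.0)
print(frames)
```

With these assumed settings the budget covers a few dozen frames, so "a few FFT frames in advance" fits comfortably within 1 second.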

PS2: I'm working with mono audio, so channel subtraction techniques won't work.
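For anyone unfamiliar with it, here is a small sketch of why channel subtraction needs stereo. It uses synthetic sine waves as stand-ins for a centre-panned voice and an off-centre instrument (both are assumptions made for the demo, as is the helper name `channel_subtract`).

```python
import numpy as np


def channel_subtract(left, right):
    """Classic 'karaoke' trick: anything identical in both channels cancels."""
    return left - right


sr = 8000                                # assumed sample rate for the demo
t = np.arange(sr) / sr
vocal = np.sin(2 * np.pi * 220 * t)      # stand-in for a centre-panned voice
guitar = np.sin(2 * np.pi * 330 * t)     # stand-in for an instrument panned left

# Stereo mix: voice equal in both channels, guitar louder on the left.
left = vocal + 1.0 * guitar
right = vocal + 0.2 * guitar
side = channel_subtract(left, right)     # voice cancels, 0.8 * guitar remains

# Mono file: both "channels" are the same signal, so everything cancels.
mono = 0.5 * (left + right)
mono_side = channel_subtract(mono, mono)  # all zeros, nothing useful left
```

On the stereo mix the voice disappears and the off-centre guitar survives; on the mono signal the subtraction returns silence, which is why this approach is not an option here.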

PS3: I've read a few research articles about voice extraction and tried their code, but I haven't found what I'm looking for: most algorithms aim for a good separation, and there are always traces of vocals left at the end...

Andro's icon

Every vocal is different (male, female, and so on), and this is normally an offline process, so good luck with it. There are far too many variables involved, I think.
With that many frequencies overlapping each other, I think it's impossible to remove them all.