Does anyone know what timestretching algorithm Melodyne uses?

maxplanck

http://www.celemony.com/cms/index.php?id=products_studio

Peter Ostry

Quote: maxplanck wrote on Wed, 14 May 2008 23:04
----------------------------------------------------
> Does anyone know what timestretching algorithm Melodyne uses?
----------------------------------------------------

I don't believe that you will find this out. Peter Neubaecker is a genius, and he has just amazed the whole computer-audio world with his new product, the polyphonic Melodyne, which is theoretically impossible at this quality. But he did it.

Here are links to two well-known pieces of software, but I am afraid the developers won't tell you the algorithms. Time stretching is one of the most valuable techniques in today's music industry:

Radius from iZotope:
http://www.izotope.com/products/audio/radius/

Pitch 'n Time from Serato:
http://www.serato.com/products/pnt/

====================

Again a big wish to Cycling'74:
Please let us have real links in the online forum. There are so many links on the page (appr. 50 of them), it makes really no difference when ours are functional too.

Axiom-Crux

Two easy ways of doing pretty good time stretching involve either granular synthesis or FFT spectral techniques. There are plenty of publicly available Max patches for both. Melodyne uses spectral techniques to extract the fundamental and overtones, based on very intelligently programmed FFT techniques and a lot of looking at spectrographs. He's not too shy about it; during one of his press releases he talked a bit about how he did it. It's the complexities that make it unique, and those are the things that others will lag behind for quite a while in discovering. The essential basics are pretty easy.
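To make the "easy basics" of the granular route concrete, here is a minimal sketch of a naive overlap-add stretch in Python. It is an illustration only, not Melodyne's method: the function name `granular_stretch` and the `grain` parameter are made up for this example, and it assumes a Hann window with 50% overlap and an input at least one grain long. Grains are read from the input at a slower hop than they are written to the output.

```python
import math

def granular_stretch(x, factor, grain=256):
    """Naive overlap-add time stretch: factor > 1 makes the sound longer."""
    hop_out = grain // 2                 # 50% overlap on the output side
    hop_in = hop_out / factor            # read position advances more slowly
    # Periodic Hann window: overlapping copies at 50% sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / grain) for i in range(grain)]
    n_grains = int((len(x) - grain) / hop_in) + 1
    out = [0.0] * ((n_grains - 1) * hop_out + grain)
    for g in range(n_grains):
        src = int(g * hop_in)            # where this grain is read from
        dst = g * hop_out                # where this grain is written to
        for i in range(grain):
            out[dst + i] += x[src + i] * window[i]
    return out

# A constant input should come out constant (apart from the fade in and out).
x = [1.0] * 2048
stretched = granular_stretch(x, 2.0)
```

Because the grains are not aligned to the waveform, this version produces audible phasing on tonal material. Fixing that is where the real work starts.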

stringtapper

Quote: Peter Ostry wrote on Wed, 14 May 2008 18:05
----------------------------------------------------
> Again a big wish to Cycling'74:
> Please let us have real links in the online forum. There are so many links on the page (appr. 50 of them), it makes really no difference when ours are functional too.
----------------------------------------------------

Although I love the new feature in Safari 3 that lets you highlight an unlinked address, right-click and select "Go to address". Very nifty.

Marcos

Quote: Peter Ostry wrote on Wed, 14 May 2008 19:05
----------------------------------------------------

Has anyone in the media actually used the software yet? I haven't looked at the mathematics behind it yet, but if it is theoretically impossible, then it IS impossible. Just like when people say they have invented a perpetual motion or free energy machine, the laws of thermodynamics won't ever let that happen.

Anyways, all they have right now are video demos, correct? Until we get the software in the hands of reviewers and such, we have no idea if it can do what it claims to do...unless this has already happened?

Adam Murray

Quote: marcoskohler wrote on Wed, 14 May 2008 16:53
----------------------------------------------------
> I haven't looked at the mathematics behind it yet, but if it is theoretically impossible, then it IS impossible. Just like when people say they have invented a perpetual motion or free energy machine, the laws of thermodynamics won't ever let that happen.

Not true. It is theoretically impossible to do it with 100% accuracy. Approximations can be made to do it with very high accuracy. This is what engineering is all about: making things work in practice regardless of theory.

What about the existing pitch and beat detection externals for MSP? They certainly don't work perfectly, but plenty of people use them and get a lot of value out of them. It's not a black/white, possible/impossible problem.

stepfrequencer

[quote]but if it is theoretically impossible, then it IS impossible[/quote]

People thought it was impossible, there's no actual theory proving without doubt that it is impossible

Marcos

ahhhh ok. Still, has anyone besides the company played with it? Any magazines, etc? Wonder how spot on it is.

Eli

Demos and such, like those that advertise the new Polyphonic Melodyne function, are created with just the right parameters (through trial and error) in order to make the product sound GREAT.
Nothing involving computers is perfect, since we've only been fucking with them for 50 years, but I think inventing the algorithm for separating an audio file's individual tones into a spectrum is an astounding achievement, not just in the electronic music world but in the computer community as well. Also, if you read a lot of psychoacoustics it doesn't seem so impossible.

Maurizio Giri

Quote: stepfrequencer wrote on Wed, 14 May 2008 19:06
----------------------------------------------------
> [quote]but if it is theoretically impossible, then it IS impossible[/quote]
>
> People thought it was impossible, there's no actual theory proving without doubt that it is impossible
----------------------------------------------------

A musically trained human brain can actually do it (recognize single voices in a polyphony): imho this is a proof that it IS possible, no?

Stefan Tiedje

marcos wrote:
> Has anyone in the media actually used the software yet? I haven't
> looked at the mathematics behind it yet, but if it is theoretically
> impossible, then it IS impossible. Just like when people say they
> have invented a perpetual motion or free energy machine, the laws of
> thermodynamics won't ever let that happen.

The claim that this is impossible is a smart marketing technique to
sell Peter as a genius (he is, no doubt). I listened to his talks, and
it's not so hard to understand what he is doing. He is using assumptions
which are close enough to reality to work with. One assumption is
that the sounds are harmonic. Any sound has its overtone series. Any
single pitch will fit a certain harmonic series...

Look for the lowest partial, look for its harmonic series, and assign
these partials to that pitch. Then look for strong partials which don't
fit into the series of the lowest. That's the next pitch. Collect its
series, etc. The next assumption would be that the harmonics decline.
That way you also find pitches which are played in a harmonic relation
to a lower one (if a partial is louder than expected, it's probably
also a played pitch and thus a fundamental...)
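The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea as described in this post, not Celemony's implementation. The function names, the 3% tolerance and the "louder than the previous partial" test are assumptions made up for the example, and the partials are assumed to be already extracted as (frequency in Hz, amplitude) pairs.

```python
def group_partials(partials, tolerance=0.03):
    """Greedily group (frequency_hz, amplitude) partials into harmonic series.

    Returns a list of (fundamental_hz, partials_in_that_series).
    """
    remaining = sorted(partials)         # lowest frequency first
    notes = []
    while remaining:
        f0 = remaining[0][0]             # lowest unassigned partial = next pitch
        series, rest = [], []
        for freq, amp in remaining:
            ratio = freq / f0
            harmonic = round(ratio)
            # Accept partials within `tolerance` of an integer multiple of f0.
            if harmonic >= 1 and abs(ratio - harmonic) <= tolerance * harmonic:
                series.append((freq, amp))
            else:
                rest.append((freq, amp))
        notes.append((f0, series))
        remaining = rest
    return notes

def louder_than_expected(series):
    """Partials louder than the one below them: candidates for a second note."""
    return [cur for prev, cur in zip(series, series[1:]) if cur[1] > prev[1]]

# Two notes a fifth apart (220 Hz and 330 Hz) that share the partial at 660 Hz.
partials = [(440.0, 0.5), (220.0, 1.0), (330.0, 0.8),
            (660.0, 0.6), (990.0, 0.2), (880.0, 0.1)]
notes = group_partials(partials)
suspects = louder_than_expected(notes[0][1])
```

Here the shared 660 Hz partial is swallowed by the 220 Hz series, and the declining-harmonics check is what flags it as belonging to another note as well.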

The main problem, and the main work, is researching the details needed to
create a resynthesis which sounds correct...

Mastery is 1% inspiration (roughly what I described above) and 99%
perspiration; that's what Peter is working on until the release at the
end of this year. If you buy the software, you pay for the
perspiration. The inspiration is surely covered by patents...

> Anyways, all they have right now are video demos, correct?

The videos showed a very impressive proof of concept. I am sure the
examples had been chosen for being impressive, that's part of the
business...

They won't show the material which doesn't work, but Peter was talking
about music which is not made for that technique. The next Melodyne will
cover this by using neural nets. (My bet, the inspirational part is
there already...)

As soon as you understand the technical concept, you will want to do your
own implementation. This will cost you a lot of perspiration. Most
people happily share the inspirational part...
Just recently we had a promising share of a granular stretch from
Mattijs Kneppers, btw; search the archives...

Stefan

--
Stefan Tiedje------------x-------
--_____-----------|--------------
--(_|_ ----|-----|-----()-------
-- _|_)----|-----()--------------
----------()--------www.ccmix.com

Stefan Tiedje

gavin Peters wrote:
> [quote]but if it is theoretically impossible, then it IS
> impossible[/quote]
>
> People thought it was impossible, there's no actual theory proving
> without doubt that it is impossible

The people who thought it was impossible are neither inventors nor
scientists. It's a quote from an amateur's point of view, and it means
it's impossible with the currently available (marketed) tools. Typical
talk of journalists at "professional journals". Using this quote in the
context of marketing and at the same time proving it wrong is a smart
way to gain interest... (Oh, he (the star) did it again...)

Stefan

maxplanck

I've run some tests on Melodyne 3.

I'm fairly sure that Melodyne uses the Hilbert transform to approximate the frequency and amplitude envelopes of signals.

The Hilbert transform is used to generate a complex-valued (analytic) signal from the original real-valued signal. This process is only completely accurate for a signal of infinite length; short of that, its frequency response consists of a main lobe of a certain width plus sidelobes, an imperfection it shares with filters and the Fourier transform. There are then equations for calculating the instantaneous amplitude (i.e. the amplitude envelope) and the instantaneous phase of the real-valued signal from the complex-valued signal. These calculations are only as accurate as your approximation of the complex-valued signal (i.e. the output of the Hilbert transform).

Instantaneous frequency is then calculated by taking the first derivative of instantaneous phase with respect to time.
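For anyone who wants to try these calculations, here is a minimal sketch in plain Python. It assumes the common FFT-based construction of the analytic signal (double the positive-frequency bins, zero the negative ones) and uses a naive DFT so it needs no external libraries. The function names are made up for this example, and nothing here is confirmed to be what Melodyne does.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short test signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Complex signal: real part is x, imaginary part is its Hilbert transform."""
    n = len(x)
    spectrum = dft(x)
    for k in range(1, n):
        if 2 * k < n:
            spectrum[k] *= 2             # positive frequencies: double
        elif 2 * k > n:
            spectrum[k] = 0              # negative frequencies: discard
        # 2 * k == n is the Nyquist bin, which is left unchanged
    return dft(spectrum, inverse=True)

def instantaneous_amplitude(z):
    return [abs(v) for v in z]

def instantaneous_frequency(z, sample_rate):
    """Phase advance between neighbouring samples, converted to Hz."""
    return [cmath.phase(b * a.conjugate()) * sample_rate / (2 * math.pi)
            for a, b in zip(z, z[1:])]

# An 8 Hz cosine of amplitude 0.5, sampled at 64 Hz for exactly one second.
sample_rate = 64
x = [0.5 * math.cos(2 * math.pi * 8 * n / sample_rate) for n in range(64)]
z = analytic_signal(x)
envelope = instantaneous_amplitude(z)
freqs = instantaneous_frequency(z, sample_rate)
```

The test tone fits a whole number of cycles into the analysis window, which is the ideal case. For real recordings the finite-length errors mentioned above show up at the edges of the window.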

I've done these calculations before in Scilab for signals that I was trying to analyze; they work very well.

Richard Lyons' "Understanding Digital Signal Processing" contains the most easily understandable explanation of these concepts that I've found.

As far as the time stretching algorithm, I HIGHLY doubt that any spectral resynthesis is taking place. My assumption here comes from having done some experimenting with spectral synthesis myself, as well as looking closely at Melodyne's input and output waveforms. It looks to me as if the time stretching, in a nutshell, simply consists of zero crossing detection then looping the region between every other zero crossing. Then there is an algorithm for deciding how long each of these loops/"grains" should be played, based upon the amount of time stretching applied. Lastly, there must be an algorithm defining when and how it crossfades between grains.
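Here is a rough Python sketch of the scheme guessed at above, to make it concrete. It is a guess at a guess: the function names and the way repeats are distributed for non-integer factors are assumptions for illustration, and the crossfading between grains is left out entirely. "Every other zero crossing" is taken to mean one rising crossing to the next, i.e. one full cycle.

```python
import math

def rising_zero_crossings(x):
    """Indices where the signal goes from negative to non-negative."""
    return [i for i in range(1, len(x)) if x[i - 1] < 0 <= x[i]]

def stretch_by_cycle_looping(x, factor):
    """Repeat each cycle (rising crossing to rising crossing) about `factor` times."""
    marks = rising_zero_crossings(x)
    if len(marks) < 2:
        return list(x)                   # nothing to loop
    out = list(x[:marks[0]])             # material before the first full cycle
    owed = 0.0
    for start, end in zip(marks, marks[1:]):
        owed += factor                   # each cycle "owes" `factor` repetitions
        repeats = int(owed)
        owed -= repeats                  # carry the fraction to the next cycle
        out.extend(list(x[start:end]) * repeats)
    out.extend(x[marks[-1]:])            # material after the last full cycle
    return out

# Ten cycles of a sine with a 16-sample period, offset to avoid exact zeros.
x = [math.sin(2 * math.pi * (n + 0.5) / 16) for n in range(160)]
doubled = stretch_by_cycle_looping(x, 2.0)
unchanged = stretch_by_cycle_looping(x, 1.0)
```

On a clean sine this works without any crossfade because every cycle starts and ends at the same point of the waveform. On noisy or polyphonic material the cycles no longer line up, which is where the grain-length and crossfade decisions come in.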

There are a few other techniques that I think it may use to increase the accuracy of its Hilbert Transform.

I was just wondering if anyone has a paper describing the exact algorithm used, or if there were papers published that describe the algorithms/techniques that are obviously or probably implemented in this software.

Axiom-Crux

I've done some spectral note extraction with MetaSynth and Max/MSP using simple FFT filtering. Heck, I think there's even a patch in the examples that I used and slightly modified for this; I think it was Forbidden Planet or something of that sort, which has a multislider list for which frequencies to filter and which to let through. But this is all kind of beside the time-stretching point.

Has anyone used the current Melodyne? It's awesome right now, and blows other auto-tuning methods out of the water. I've used it a lot on violin and vocal parts lately and it sounds perfect and totally natural.

maxplanck

Edit: I have not looked at the Hilbert Transform of signals that cover a large frequency bandwidth, I only looked at the output of the Hilbert Transform of signals consisting of a very narrow frequency bandwidth.

I'm certain that the HT+Amplitude Envelope Calculation is useful, if not the most useful method for calculating the amplitude envelope of a signal that contains harmonics and/or partials. In order to save computational expense, the software may instead simply detect peaks and interpolate linearly between them (I haven't taken a close look at the amplitude envelopes generated by the software, a quick comparison between these and the input waveform should yield the answer).

I just am not sure that the HT+Instantaneous Frequency Calculation is useful for calculating the instantaneous fundamental frequency of signals that contain harmonics and/or partials of frequency significantly different from the fundamental frequency.

There may be additional processing of the HT's output required in order to glean useful information about the instantaneous fundamental frequency. Or perhaps the original signal is simply lowpass filtered before being fed to the HT, in order to decrease the amplitude of the harmonics and/or partials.

Or maybe the signal is simply Fourier transformed using the traditional McAulay-Quatieri approach (FT + peak interpolation and partial tracking), and the frequency envelope of the fundamental is then simply used.
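As a small illustration of the "peak interpolation" step, here is a sketch of the common parabolic (three-point) refinement of a spectral peak in Python. The function name is made up for this example, it is usually applied to log-magnitude values, and it assumes bin `k` is a strict local maximum with a neighbour on each side. Partial tracking across frames is not shown.

```python
def parabolic_peak(magnitudes, k):
    """Refine a local maximum at bin k using its two neighbours.

    Fits a parabola through the three points and returns
    (fractional_bin, interpolated_height).
    """
    a, b, c = magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]
    offset = 0.5 * (a - c) / (a - 2 * b + c)   # lies between -0.5 and +0.5
    height = b - 0.25 * (a - c) * offset
    return k + offset, height

# Samples of a parabola whose true peak lies at bin 5.3 with height 10.
mags = [10 - (i - 5.3) ** 2 for i in range(10)]
peak_bin, peak_height = parabolic_peak(mags, 5)
```

Multiplying the fractional bin by sample rate / FFT size gives the frequency estimate in Hz, which is what the partial tracker would then follow from frame to frame.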

If any FT is occurring, and if HT is the core method used in calculating frequency and amplitude envelope, then the output of the FT is likely factored in to the time stretching and/or instantaneous frequency.

maxplanck

Yeah I've used the latest Melodyne, but only for time stretching. I need a good timestretching algorithm to use as an intermediate step in something else that i'm trying to do. I have some more tests to run on it in order to determine if it will fit the bill or not.

Stefan Tiedje

Max wrote:
> As far as the time stretching algorithm, I HIGHLY doubt that any
> spectral resynthesis is taking place. My assumption here comes from
> having done some experimenting with spectral synthesis myself, as
> well as looking closely at Melodyne's input and output waveforms.

My experimentation with spectral resynthesis gave results that were not
perfect, but HIGHLY better and more promising than anything else. Zero
crossings are useless if the signal has significant noise parts, or is
simply short. (Unless all you want to do is octaves... ;-)
In the interviews Peter points directly to spectral treatment, as
far as I remember...

Stefan

dodgeroo

have you asked around the soundhack forum?

On Thu, May 15, 2008 at 6:22 PM, Max wrote:
>
> Yeah I've used the latest Melodyne, but only for time stretching. I need a good timestretching algorithm to use as an intermediate step in something else that i'm trying to do.
>