ITT: Transpose

ADHD101's icon

So I'm working on a new patch. I'm not sure what the problem is, but I want to transpose an audio file, if that's possible...

1592.transpose.png
MIB's icon

transpose without changing the speed of the file? check out the helpfile for gizmo~

ADHD101's icon

gizmo~ just changes the pitch, it doesn't transpose, damn.
Can someone REALLY help me?

Mike S's icon

changing the pitch is transposing

brendan mccloskey's icon

As MIB and Mike S say, look at [gizmo~]; depending on your source material, [freqshift~] may also suffice, but: ensure you know exactly what transpose means.

to vary or alter the pitch of an audio source

Brendan
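
A small numeric sketch of why the "know exactly what transpose means" caveat matters. It assumes [freqshift~] behaves as a plain frequency shifter (it adds a fixed number of Hz to every partial), whereas transposition multiplies every partial by the same ratio. The function names and the 220 Hz test tone are made up for illustration; this is not how either object is implemented.

```python
def pitch_shift_partials(partials_hz, semitones):
    """Transpose: multiply every partial by the same equal-tempered ratio."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio for f in partials_hz]


def freq_shift_partials(partials_hz, shift_hz):
    """Frequency shift: add the same number of Hz to every partial."""
    return [f + shift_hz for f in partials_hz]


# First three partials of a hypothetical harmonic tone on A3 (220 Hz).
harmonic = [220.0, 440.0, 660.0]

transposed = pitch_shift_partials(harmonic, 12)  # up one octave
shifted = freq_shift_partials(harmonic, 220.0)   # fundamental also lands on 440 Hz

# Partials still in a 1:2:3 relationship after transposition...
print(transposed, [f / transposed[0] for f in transposed])
# ...but not after frequency shifting, so the timbre changes.
print(shifted, [f / shifted[0] for f in shifted])
```

Both versions put the fundamental on 440 Hz, but only the transposed one keeps the overtones in whole-number ratios, which is why a frequency shifter may or may not "suffice" depending on the source material.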

Roman Thilenius's icon

you transpose samples by playing them at different speeds.
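
To put a number on the speed approach: in equal temperament a transposition of n semitones corresponds to a playback rate of 2^(n/12). The sketch below is plain Python rather than a Max patch, and the helper names, the 8 kHz sample rate and the linear interpolation are assumptions made for illustration only.

```python
import math


def semitones_to_rate(semitones):
    """Playback-speed ratio for an equal-tempered transposition."""
    return 2.0 ** (semitones / 12.0)


def resample(samples, rate):
    """Read `samples` at `rate` times normal speed, with linear interpolation."""
    out = []
    last = len(samples) - 1
    pos = 0.0
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += rate
    return out


# One second of a 220 Hz sine at an assumed 8 kHz sample rate.
sr = 8000
tone = [math.sin(2.0 * math.pi * 220.0 * n / sr) for n in range(sr)]

# Up an octave: the rate is 2.0, so the result is half as long (and twice as fast).
up_octave = resample(tone, semitones_to_rate(12))
print(semitones_to_rate(12), len(tone), len(up_octave))
```

The shorter output is the tape-deck side effect: pitch and duration change together, which is exactly what gizmo~ avoids.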

Charles Baker's icon

@ADHD - your statement is quite puzzling: what do you think transposition is? Many encounter it only in the sense of "changing key center" to fit an instrument or voice's preferred range, as in "can we transpose Das Erlkonig down a third, it makes it much easier on my voice?". This added musical connotation should *not* disguise the fact that
transposition is changing pitch.
If you want to transpose, and keep the note values/tempo the same, use
gizmo~.
If you want to "transpose like a tape deck or sampler", where upward transpositions play faster and downward transpositions play slower, then Roman has the answer, and you *should* look at the Max/MSP tutorials on sampling, which are quite detailed and powerful in themselves.
ok?
cfb

MIB's icon

not to split hairs, but it should be "DER Erlkoenig". and I would love to see the pianist that is willing to transpose that to suit the singer ;)

Charles Baker's icon

Eins:
ja ja, der/das: too much correctness for a non-speaker ;-).

Two:
1 Tim Hoeckman, accompanist for Stephen Richards at Florida State University, often sight-transposes.
2 My old comp. teacher John Boda sight-transposed up a storm.
There was a story about Ernő Dohnányi at my old school (FSU in Tallahassee): he was challenged to a sight-reading contest by a young composer/pianist. The challenger was sat down and a brand new full orchestral score was placed in front of him. He jumped to it, and performed a decent rendition of the music, sight-reading from the score. Then Dohnányi was ushered in; he sat looking at the score for a few moments, then turned to the waiting musicians and asked "what key do you want this in?". He was awarded the win after he played a sight-read reduction transposed a minor third from the notated pitch.

Charles Baker's icon

@MiB = and I have to add, RE: Der Erlkonig: Yepper, you are correct sir, the notated repeated triplets are a bear in the original key, and in certain keys I imagine they would be almost impossible.

enough nineteenth century musical foo foo:
@ADHD = if you use gizmo~, there are tricks for improving the sound: I like to use two copies of gizmo~ set to the same transposition (or extremely near to the same transposition ;-) ), with a small delay on the input to one: this obscures some of the transposition artifacts to my ear.

brendan mccloskey's icon

@Charles Baker

sort of related (and probably quite common too) - perhaps a DSP ninja can explain why this 'trick' of combining a signal with a slightly delayed version of itself helps ameliorate transposition/AM artefacts. It's something I had to resort to after spending weeks trying to apply two phase-offset windows to a granulation engine; in the end I just went for one amplitude window, split into two with one half delayed by around 28ms. Works a treat and a lot less of a headache.

Brendan

Charles Baker's icon

As I *loosely* understand it (and my brother is the one we should ask, he teaches neurobiology and researches auditory neural pathways at Northwestern Med Center, Chicago):
This 'improvement' is because we are confusing something our auditory system does: a sub-cycle phase-matching 'imaging' done to the auditory signal by our brain. This is the secret of accurate binaural placement, the truth behind a strong image in the stereo field: where it can, the auditory system aligns the signals from the two ears by phase, with the inter-aural distance as part of the calculation. With this it can locate the sound quite accurately, just by resolving interaural phase differences and precise early echo timings, all without any visual support.
BUT... with a larger-than-inter-ear delay mixed back into the signal (especially one that has any variance at all), we are less likely to maintain a clear phase-aligned 'image' of the sound: we introduce this 'chorusing', and the ear is "image confused". It hears something with the same spectral balance as the original single signal, and appearing to share gestalt 'common fate' with it. But with the new signal the ear is free to hear the artifact-free parts of each copy as belonging to a single, hard-to-locate signal with few artifacts (i.e. inappropriate signal components), and all the little thumps, clicks and window-induced amplitude artifacts are heard as a (hopefully better masked) combined 'noise' signal. The effect is a perceptually smoother signal sounding remarkably like the original. Adjust the precise delay so that any amplitude "tremolo" introduced by the effect's signal windowing is smoothed out by the delayed signal's 'window tremolo', and the improvement is multiplied. This is really not hard to do by ear...
As I hear it, having *enough* time on the delay is critical to achieve the smoothing. Too short a delay, and it is just another early echo delay; the sound does not really 'chorus' (your 23 ms sounds just about right for me!!).
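
The "window tremolo" point can be checked on paper. Here is a rough Python sketch, not a model of what gizmo~ actually does internally: it assumes the amplitude window is a raised cosine (Hann) repeating every 56 ms, a hypothetical period picked so that Brendan's 28 ms is exactly half of it, and it measures how much the summed level of the direct and delayed copies wobbles.

```python
import math


def hann(phase):
    """Periodic raised-cosine (Hann) amplitude window; `phase` is in cycles."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * phase)


def ripple(period_ms, delay_ms, steps=1000):
    """Peak-to-peak variation of the summed window level over one period."""
    levels = []
    for k in range(steps):
        t = period_ms * k / steps
        direct = hann(t / period_ms)
        delayed = hann((t - delay_ms) / period_ms)
        levels.append(direct + delayed)
    return max(levels) - min(levels)


# Hypothetical 56 ms window period; delays in ms.
for delay in (0.0, 5.0, 28.0):
    print(delay, round(ripple(56.0, delay), 6))
```

With no delay the two windows rise and fall together and the tremolo doubles; with a delay of half the window period the dips of one copy line up with the peaks of the other and the summed level is flat. That is only the amplitude side of the story; the perceptual 'chorusing' described above is a separate matter.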

Lil' disclaimer, given my bro's employment: the above is just as I recall from various sources and discussions, and as practiced in my electro comp... this psychoacoustics is not my main field (despite the PhD in muscomp, I write code for insurance companies), just a lifelong interest. Lotsa good books on the subject, hint, hint.

cfb aka j2k

brendan mccloskey's icon

Hey thanks Charles for the comprehensive reply - perhaps I should have paid more attention during my Masters at SARC! Psycho-acoustics is a field with only a limited relation to my thesis, aaaargh more option anxiety!

Are you suggesting that a variable delay (as opposed to fixed) in the region of 20-30ms will have an even better 'improvement' effect?

Brendan

Charles Baker's icon

It will increase the amount of spectral issues and provide more 'confusion'. This generally ='s more 'fatness', 'solidity', 'richness', and in many cases helps with "fusing" the delayed signals into a single entity. This fusion effect of random and/or periodic pitch variance was first shown with 'fof' granular synthesis.
Each case must be judged on its own merits: one set of parameters = genius with one signal, a tasteless mess with another, more active signal, and just plain ugly with a thin bell or pluck. Always the answer is in the ear, right? But if u add random pitch variance, make sure it can be adjusted in period and amt: any level for too long gets.....uh, predictable?...'n that can harm the aural magic. Designing good chorus is an art...
well, nuf ramble.
Charlie

Roman Thilenius's icon

that reminds me of a kyma discussion where people wanted to build an
effect which makes a (mono or stereo) car motor sample sound more
like an actual recording.

using standard methods of making "more stereo" and shit will not get you
there, there is far more to explore in this field - and making a sound source
more "not from one point" but broader can also be very interesting with and
for the generation of classic synthesizer sounds.

-110

Charles Baker's icon

Reminded here of John Chowning's (?julius?perry? someone from CCRMA) quote on the CCRMA approach to sound projection/imaging in concerts (the CCRMA setup is widely copied (or reviled); it uses reversed 'rear' channels): he said something like "contrary to what you would think, you do not want point sources: the more dispersed signal gives a better, less tricky image". This is about projection, but might have import here.

I think using tricks to "confuse the phase coherent image" brings other perceptual mechanisms such as gestalt psych's "Common Fate" & "Common Origin" more into play: thus better perceptual "fusion", and less aural "picky-ness" over sampling-introduced aural oddities, such as loop/envelope artifacts.
just an idea: there is research that could be done here.
njoying the discussion...
l&k j2k
aka cfb