Vectorized convolution external on Intel Mac (à la buffir~)

Dec 13, 2006 at 12:16pm

Hi,

I coded a convolution external that is nearly identical to buffir~, but with added vector computation using Apple's convolution routine from the Accelerate framework.

On a PPC Mac, it was boosted by AltiVec and delivered 4x better performance than buffir~.
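
For readers following along, here is a minimal sketch of the kind of FIR convolution being discussed. It is not the code from the attached project, nor the actual buffir~ source. The function names, the buffer layout (`in` holds `n_out + taps - 1` samples) and the opt-in macro `USE_ACCELERATE` are assumptions made up for this illustration; `vDSP_conv` itself is the Accelerate routine, which correlates with a positive filter stride and convolves when given a pointer to the last coefficient and a stride of -1.

```c
#include <stddef.h>

#if defined(USE_ACCELERATE)            /* hypothetical opt-in macro for this sketch */
#include <Accelerate/Accelerate.h>     /* macOS only; link with -framework Accelerate */
#endif

/* Scalar reference FIR:
   out[n] = sum over p of in[n + p] * coeff[taps - 1 - p].
   `in` must hold n_out + taps - 1 samples (assumed layout). */
void fir_scalar(const float *in, const float *coeff, float *out,
                size_t n_out, size_t taps)
{
    for (size_t n = 0; n < n_out; n++) {
        float acc = 0.0f;
        for (size_t p = 0; p < taps; p++)
            acc += in[n + p] * coeff[taps - 1 - p];
        out[n] = acc;
    }
}

/* Same result, but handed to vDSP when USE_ACCELERATE is defined.
   No AltiVec or SSE statements appear here; the framework picks the
   vector code path for the CPU it runs on. */
void fir_block(const float *in, const float *coeff, float *out,
               size_t n_out, size_t taps)
{
#if defined(USE_ACCELERATE)
    /* Point at the last coefficient and step backwards (stride -1)
       so that vDSP_conv performs convolution rather than correlation. */
    vDSP_conv(in, 1, coeff + taps - 1, -1, out, 1, n_out, taps);
#else
    fir_scalar(in, coeff, out, n_out, taps);
#endif
}
```

Keeping the scalar version next to the vDSP call makes it easy to check that both produce the same output before comparing their CPU usage.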

Now I want to make it work on Intel Macs too.
I assumed the external would automatically use the Accelerate framework present on the system (and thus, on an Intel Mac, Apple's Intel-oriented library), but so far it shows no optimization on my Intel iMac (it even consumes more CPU cycles than the original buffir~…).

Would that mean that the Accelerate framework on Intel Macs is not really vector-optimized? Or, more probably, that something is wrong in my code? :-)

I would greatly appreciate any help or advice that gets me the same performance (or better :-)) that I got on PPC!
The CodeWarrior project is attached if needed.

Thanks!

Salvator

#29205
Jan 23, 2007 at 5:03pm

Anyone?

Hiring propositions are welcome!
Thanks,

Salvator

#90675
Jan 24, 2007 at 1:07am

Hi Salvator,

> I coded a convolution external that is quasi identical
> to buffir~, but with added vector calculation from apple
> convolution routine/accelerate framework.
>
> So on PPC mac, it was boosted by using altivec and provided
> 4x better performances than buffir~.

When you build your code against vecLib through the Accelerate framework, it will be AltiVec-optimized on G4 and G5 (not G3) and SSE-optimized on x86. This happens automatically, meaning you need no AltiVec statements in your code.
However, ‘pure AltiVec’ code translated to ‘pure SSE’ (SSE2) would have to be benchmarked against it to be sure about speed and stability.

> Now I want to make it work on intel mac too.
> I thought that the external would automatically use the
> accelerate framework present on the system (and thus, on
> an intel mac use apple intel oriented library) but so far,
> it show no optimization on my imac intel (it’s even consuming
> more CPU cycles than the original buffir~…)

You should get a boost on both architectures, while burning more CPU on G4/G5 and less on x86.
(On G3, vecLib will run in scalar mode!)

> Would that means that accelerate framework on intel mac
> is not really vector optimisated ?

It really is vectorized, for both audio and image processing.
(I used it for real-time optical processing, and it ran fast!)

> Or more probably that something is wrong in my code ? :-)

Despite the great help of Olaf Matthes, I gave up on translating vDSP code into a MaxMSP external :-(
Either it doesn’t work, or it’s even slower than the same code built for CoreAudio/AudioUnit.
(I now focus my work on CoreAudio (and Cocoa) only, and will see later whether something works for MSP.)
I suggest asking for help on Apple’s CoreAudio list.

> Attached in the codewarrior project if needed.

Not anymore! Xcode is its name ;-)

Bye,
Philippe

#90676
Jan 24, 2007 at 1:32am

Many thanks for the advice, Philippe!

> When building your code on vecLib through the Accelerate framework, it will be Altivec optimized on G4 and G5 (not G3), and SSE optimized on x86. Automatically, meaning no Altivec statements in your code.
> However, a ‘pure Altivec’ code translated to a ‘pure SSE’ (SSE2) should have to be compared to be sure about speed and stability.

Yes, that’s what I thought: that it would translate automatically. But actually there is no gain; it’s even 15% worse than buffir~,
so I guess something is wrong in my code…

> I suggest to ask for an help at the Apple’s CoreAudio list.

Thanks, I’ll give it a shot there.

> Not anymore! Xcode is its name ;-)

Did I say CodeWarrior? Oh, my bad, a typo… it is indeed an Xcode project! :-)
It compiles fine here on both PPC and Intel.
If you ever have time for some quick advice on the code…

Salvator

#90677
Jan 24, 2007 at 3:35am

Salvator wrote on Wed, 24 January 2007 02:32
—————————————————-
> If ever you have time for a quick advice on the code …

The problem is #include “z_dsp.h”,
and “z_dsp.h” in turn does #include “z_altivec.h”.

All this AltiVec stuff is inside MaxAPI.framework,
and we no longer need AltiVec on i386 CPUs.

Therefore, your code runs in AltiVec mode on ppc, but *not* in SSE on i386, despite the #include.
=> The AltiVec path turns your code into scalar on i386!!
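
One way to check part of this is to ask the compiler which vector instruction set it targets when building the external. The sketch below relies only on the predefined GCC/Clang macros `__VEC__`, `__ALTIVEC__` and `__SSE2__`; the function names are made up for illustration. It says nothing about what z_altivec.h does internally, nor about which code path vecLib selects at run time; it only reports how this translation unit was compiled.

```c
#include <stdio.h>

/* Return the vector instruction set this file was compiled for,
   according to the compiler's predefined macros. */
const char *simd_target(void)
{
#if defined(__VEC__) || defined(__ALTIVEC__)
    return "AltiVec";   /* PowerPC G4/G5 builds with AltiVec enabled */
#elif defined(__SSE2__)
    return "SSE2";      /* x86 builds with SSE2 enabled */
#else
    return "scalar";    /* no vector unit targeted by the compiler */
#endif
}

/* Hypothetical helper: print the result, e.g. once when the external loads. */
void print_simd_target(void)
{
    printf("compiled for: %s\n", simd_target());
}
```

If an i386 build reports "scalar" here, the compiler flags are worth a look before suspecting the framework.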

Sorry Salvator, there’s nothing I can do :-(
(I was not able to do anything for my own C++ code!?!)

Could someone else give us some help, please?

Kind regards,
Philippe

#90678
