lores~ code: understanding unroll and related functions
Hello everyone.
I am very new to programming Max externals.
To practice, I am making a Moog VCF-like filter. I have based my external on the SDK lores~ example.
My external works well, but I really want to understand everything I have implemented.
I don't fully understand the relationship between SMOOTHING_VERSION, lores_perform_unroll_smooth64, and maxvectorsize. Can someone explain this to me?
void lores_dsp64(t_lores *x, t_object *dsp64, short *count, double samplerate, long maxvectorsize, long flags){
    x->l_2pidsr = (2.0 * PI) / samplerate;
    lores_calc(x);
    x->l_a1p = x->l_a1; // store prev coefs
    x->l_a2p = x->l_a2;
    x->l_fcon = count[1]; // signal connected to the frequency inlet?
    x->l_rcon = count[2]; // signal connected to the resonance inlet?
    lores_clear(x);
    if (maxvectorsize >= 4) {
#if SMOOTHING_VERSION
        dsp_add64(dsp64, (t_object *)x, (t_perfroutine64)lores_perform_unroll_smooth64, 0, NULL);
#else
        dsp_add64(dsp64, (t_object *)x, (t_perfroutine64)lores_perform_unroll64, 0, NULL);
#endif
    }
    else
        dsp_add64(dsp64, (t_object *)x, (t_perfroutine64)lores_perform64, 0, NULL);
}
Basically, why do we need an unroll function?
Hi!
Regarding smoothing: I don't see any definition of lores_perform_unroll_smooth64, and the branch that references it is disabled at compile time (via #define SMOOTHING_VERSION 0), so it is never built. Perhaps it is a historical artifact?
Regarding unrolling: "loop unrolling" can sometimes(*) yield performance improvements.
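What is different in the unroll method? It calculates a value `n` which is 1/4 of the vector size. Then the main loop runs `n` times instead of once per sample, and each iteration computes four output samples in sequence instead of one. Since every iteration consumes four samples, this only works when the vector holds at least four samples (and a multiple of four), which is presumably why lores_dsp64 checks maxvectorsize >= 4 before choosing it.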
while (n--) {
    *out++ = yna = scale * (val = *in++) - a1 * ynb - a2 * yna;
    *out++ = ynb = scale * (val = *in++) - a1 * yna - a2 * ynb;
    *out++ = yna = scale * (val = *in++) - a1 * ynb - a2 * yna;
    *out++ = ynb = scale * (val = *in++) - a1 * yna - a2 * ynb;
}
vs. the non-unroll method:
while (sampleframes--) {
    val = *in++;
    temp = ym1;
    ym1 = scale * val - a1 * ym1 - a2 * ym2;
    ym2 = temp;
    *out++ = ym1;
}
(*) Whether unrolling actually improves things, and by how much, depends on a lot of factors (compiler, optimization settings, CPU). You can read more here: https://en.wikipedia.org/wiki/Loop_unrolling . There is also the caveat of premature optimization: be careful not to add loop unrolling everywhere before you know you need it (check with benchmarks and profiling tools first). :)
Hi Isabel,
Thank you so much for your explanation.
I now have a clear idea of what unrolling is.
For smoothing, yes, it seems to be a historical artifact.
Thank you very much also for the references. I will use loop unrolling carefully. :)