Zero Stuffing in gen~ (upsampling)
Hey peeps,
I've been digging deeper into lots of things and just searching for different options to play with when it comes to upsampling/oversampling, antialiasing and filters. I've got some examples of 4x interpolation; doing the process 4 times and then interpolating which filters out before nyquist... (from this forum thx to stkr and ernest)
but can I do a simple zero stuffing type upsampling in gen~?
I dont need the accuracy of oversampling, just the temporary higher nyquist frequency. I tried using interp, input 1 my signal, input 2 being 0 with a factor of 0.5 interpolation but I didn't appear to work. Has anyone else tried this?
I have done a 2x up and down sampling patch. It's partially working but not really in a shape to share currently, but I can maybe talk through it a bit.
The upsampling case has a signal input and two signal outputs for the odd & even signals. The input is written into a delay line. The outputs are the same delay line convolved with a sinc waveform (for the lowpass filtering), but moving through it in steps of 2. The first output is convolved with indices 0, 2, 4, .. etc, and the second output is convolved with indices 1, 3, 5, etc., which is the zero stuffing part of it.
The downsampling case is near identical, but using two inputs (both writing into a delay line each), and summing the two convolutions back to a single output.
// ASCII diagrams of upsampling:
// input signal as x[n], x[n-1], ... i.e. n[t] where t is delay index
// zero stuffed for 2x upsampling:
// n9 _ n8 _ n7 _ n6 _ n5 _ n4 _ n3 _ n2 _ n1 _ n0 _
// convolve with ideal IR kernels
// ir len == 9, buflen = 3
// out0 _ b0 b1 b2 (1)b2 b1 b0 _
// out1 _ b0 b1 b2 (1)b2 b1 b0 _
// IR len == 11, buflen = 4
// out0 _ b0 b1 b2 b3 (1)b3 b2 b1 b0 _
// out1 _ b0 b1 b2 b3 (1)b3 b2 b1 b0 _
// where b[t] indices the index into the IR buffer
// ASCII diagrams of downsampling
// in2 in1 in2 in1 in2 in1 in2 in1 in2 in1 in2 in1 in2 in1
// n6 n6 n5 n5 n4 n4 n3 n3 n2 n2 n1 n1 n0 n0
// _ b0 b1 b2 b3* b2 b1 b0 _
// _ b0 b1 b2 b3* b2 b1 b0 _
// _ b0 b1 b2 b3 b4* b3 b2 b1 b0 _
// _ b0 b1 b2 b3 b4* b3 b2 b1 b0 _
The actual IR I was using was 21 samples long, computed as a sinc function under a Blackman window. Here's the genexpr for that part (assuming a [buffer~ ir_buf @samps 21]):
blackman(x) {
return 0.42 - 0.5*cos(2*pi*x) + 0.08*cos(4*pi*x);
}
Buffer ir_buf;
// cutoff frequency in Hz
hz = in1;
// gain factor applied to all kernel to preserve throughput
gain = 2*hz/samplerate;
// if they change, recompute the IRs:
if (once || change(hz)) {
once = 0;
// length of the IR
buflen = dim(ir_buf);
sinc_midpoint = buflen-1;
// the *conceptual* length of the IR
// +3 because we trimmed the first & last (0),
// and middle occurs once
len = buflen*2 + 1;
// convert Hz to radians per sample
w = hz * twopi/samplerate;
// for each frame of the buffer:
for (i=0; i<sinc_midpoint; i+=1) {
// conceptual phase of the IR:
// (i+1 because we trimmed the first 0-valued sample)
p = (i+1)/(len-1);
// k==0 at the dirac pulse in the centre of the sinc
// (this is 1 sample after the buffer ends)
// k==1 at the sample just before it, etc. etc.
k = sinc_midpoint-i;
// per-sample multiple/divisor for sinc:
kw = k * w;
// compute sinc kernel for lowpass
lp = sin(kw)/kw;
// sinc function is infinite,
// window it here to force sinc to zero at IR ends
lp *= blackman(p);
// write to the buffer:
poke(ir_buf, lp * gain, i);
}
// lastvalue:
poke(ir_buf, gain, sinc_midpoint);
}
How are you going with this Graham? Have you made much progress?
Interested too!
:O You can do it Graham! Oversampling would be a great addition to Oopsy Daisy.
I've been playing around with for loops and convolution trying to get something going myself but I keep going around in circles at this point :(
I've tried a few different things and I think the issue is the iteration & index of buffers not lining up all the time and perhaps that the second filter should be processing the zeros? in any case, heres an example;
// declare buffers:
Data input(64);
Data effect(64);
Data output(64);
// the impulse response: (14 zero crossings, kaiser window)
Buffer IR("sinc");
// the length of the IR:
len = dim(IR);
// declare variables for sums and output:
filter = 0;
FX_init = 0;
FX_filter = 0;
Out = 0;
// for loop
for (i=1; i<len; i+=2) {
//store input samples
poke(input,in1,i,0,0);
// Read IR:
// (by default peek will ignore out-of-bound indices)
FIR = peek(IR, i, 0);
// len-i is to reverse buffer for convolution of input
In = peek(input, (len-i), 0);
// accumulate their product:
filter += FIR * In;
//inserts zeros by only placing samples at iteration points
poke(effect,filter,i,0,0);
//read (&scale from 1st sinc filter)
FX_init = peek(effect, len-i, 0)*0.812;
//read IR again
FIR2 = peek(IR, i, 0);
//apply effect and filter (skipping zeros coz we can?)
FX_apply = tanh(FX_init);
FX_filter += FIR2 * FX_apply;
//store & read to decimate
poke(output,FX_filter,i,0,0);
Out = peek(output, i, 0);
}
Out1 = Out;
FYI - Ive kind of been using this as some sort of guide; https://www.willpirkle.com/app-notes/multiratepolyphase/
I'm also really interested in this. Robert, did you get any further?
Not really @DAMU but perhaps @Graham Wakefield can tease us with his updated Oversampling example which is coming in the next gen~ book :P ???
There's a new book coming? That's exciting. Do you know when that is due? I've still been working on this as much as possible but my DSP knowledge isn't enough to do anything better than rudimentary upsampling.
Hi folks
There is a 2x spline upsampler and downsampler in the 5D filter. I optimized the coefficients so it doesnt use much CPU.
Hi Ernest, thank you so much for replying. I have looked at the relevant parts of the 5D filter and i'm quite confused about which pieces to use to make a more general up/downsampling algorithm in gen~
I'd ideally like to make a 16/32/64x version which works like poly~ and allows me to do weird synthesis/waveshaping techniques in gen~ with minimal aliasing.
Following Will Pirkles methods outlined above, I understand the general principles of how this is done but I definitely need some more guidance on how to implement it. I have purchased Graham and Gregory's new book today which will perhaps enlighten me when it arrives on Friday.
I totally understand if you'd rather not get dragged into this as I understand it isn't a trivial matter, but any more light you could shed on the answer would be very much appreciated.
That little oversampling algorithm is pretty cool ernest. I noticed a similar one for the SVF.
I haven't managed to get either of those to work in a more general context though, I got either the incorrect output or no output, not sure why. I'll revisit all this oversampling stuff again soon.
I just finished reading Generating Sound and it says that oversampling will be discussed in Book 2, which is exciting. I wonder if we can get a sneek preview if I ask Graham politely one more time?..
I've done a bit more digging into Will Pirkle's code, but one thing still stumps me...
lets assume we are doing 4x OS with an FIR of 512 taps;
- the FIR is split into 4 'sub band filters' - 128 taps each
- interpolation uses reverse indexing (eg; i-- instead of i++)
- due to zero stuffing we perform gain compensation (multiply input by the OS ratio)
- 4x convolution processes are performed on each input sample (parallel)
- this results in 4 new outputs for every new sample
... then we process each sample individually, let's say with tanh, that means theres 4 parallel process taking place and we store these 4 samples into a buffer for the decimator.
- then the decimator is the same as the interpolator but with forward indexing (i++)
- we accumulate these 4 samples into a single sample for output.
Now for the questions... At first I would assume we need a 128 tap buffer for the input to convolve/interpolate but then why does the decimator only use 4 samples for the input? If we don't need it then are we still performing 512 (128 * 4) multiplications (convolution) on each sample input sample? how do we step through 128 indexes (sub band filters) within 1 sample frame?
I think after knowing that we can crack this 🤓
a kind individual posted a reaktor tutorial and ported it over to gen objects , then i iterated it a little by reversing the order of the output from the interpolator and increased the taps and it works, but its terribly inefficient (uses 10x as much cpu as poly but is linear phase).
patch: https://pastebin.com/wSRP4DrJ
tutorial: https://www.adsrsounds.com/reaktor-tutorials/oversampling-in-reaktor-part-ii/
We still need to design and run the filter at the original sr to get it down to 64 taps and then skip all the zeros and we should be onto a winner.
What a great idea. I should have thought to look for something in reaktor. It sounds good to me. Is the redesigned filter intended to lower the cpu or improve the sound quality?
Is it possible to oversample by 8x, 32x using this method? Or would I need to redesign the filters again?
because it is objects and not code, each OS case would have to be a different patch (or at least one very big patch with lots of switching which would probs be less efficient again). I'm working on doing it all in codebox so that changes can be made dynamically but im trying to get it as optimized as possible first. I'm certain that im missing something simple again.
But I think with linear phase oversampling we have to expect more resources being used than IIR filters used with poly. I'm hoping that we can get it so its related to the os ratio. eg; 4x OS uses 4x more cpu than poly up4. The goal should be better aliasing rejection and linear phase filtering at the cost of slightly more cpu.
I'm pretty determined to get this going...
"kind individual" here. I've updated the patch slightly to use a welch window. Note that other windows are left in the codebox for anyone who wants to futz around. I'm personally quite happy with the current filter performance as the rolloff compared to poly~'s implementation is only really noticeable above 18kHz.

Right now this version of 4x oversampling is only using 1.6x the resources as the poly~ comparison, at least on my system.
@damu: this shouldn't be that awful to manually expand out to higher oversampling ratios if you wanted. You'd need to break up the polyphase filter core slightly but it's not that annoying to deal with.
that patch doesn't seem to load properly on my end as I have no output from the gen~ patcher (EDIT; Max was just freaking out, works fine) but I'll upload my codebox example shortly with your new windowing functions and have the oversampling as functions so its easier to integrate into a patcher. I was trying all different ways of doing things but only your way has worked so far so not sure how much more we can optimize it.
^ Here it all is as genexpr which made it about twice as efficient for me. I started putting it into functions but was getting too many errors, so this will do for now. I still haven't put it in a M4L device yet to get a better measure of performance but at least we can say its done, I guess. Hopefully others can improve it more.
Thanks for commenting the code so thoroughly. I'm determined to understand this. I will try to make an adapted 16x version when I have a free day as that should make the process clear to me.
At present this works well for processing a signal, you've used tanh for the example. How could I create some oversampled oscillator using this same architecture?
There are other ways to oversample an oscillator that are more efficient, this approach would be more suitable for non-linear processing such as tanh (or any waveshaper, limiter etc). In the latest gen~ book there are examples of sinc-interpolation with mip mapping which is ideal for wavetable oscillators.
I'm positive there are many ways we can get this more efficient. Perhaps different filter arrays for each filter might speed things up, using fractional delay lines for the filters so the filters can be symmetrical, doing 2 steps of up/down sampling with diff filters to achieve the same oversampling and manually fine tuning each filter in various ways. This definitely isn't a trivial thing and very much based on the needs of the developer, I assume that each developer has their own tweaks or ways they like to do things.
so 5x OS allows for an odd length FIR and with 65 taps and hann window theres a latency of 12 samples and the aliasing rejection is better than poly and uses just under 2x as much cpu as poly. I tried an oversampling ratio of 9 but more oversampling didn't get me better aliasing rejection and used more cpu. so this seems like the best compromise. Time to try it in some M4L devices...
God damn, this is running very well. I'll also take a stab at modifying it to run a little faster. Right now on my M1 system it's showing just shy of 3x against poly~. Worst case I'll just convert it so the functions are easy to call for anyone who wants to incorporate it into their processes. Thanks for the sweet update!
Edit: at this point, it may make sense for us to try building out a library. I also want to make a real lightweight 2xOS version with a half band filter IIR frontend.
Hey peeps, little update. I implemented a similar iteration of the oversampling I last posted, in an update for my GMaudio Clipper device. I've been waiting two years for this to come a reality 🥳 I wrote a little article about the oversampling here;
https://fixationstudios.com.au/the-best-clipper-for-ableton-live/