Demystifying Digital Filters, Part 2

In the first tutorial in our series on digital filters, you learned that filters are systems that act on signals.

If the system is linear and time-invariant, you can use impulse responses to entirely characterize the system, too.

You saw that convolution and recursion are two algorithms that give us two varieties of filters: FIR (finite impulse response) and IIR (infinite impulse response).

This next part of our tutorial series will explore FIR filters in more depth. It will cover:

The advantages and disadvantages of FIR filters
Design and implementation of FIR filters
The windowing method
A brief look at the optimum-design method
Resources for FIR filtering in Max

FIR filter advantages and disadvantages

Before you dive into making an FIR filter, it’s worth knowing what they do well and what they do poorly.

Advantages:

It’s easy to design an FIR filter for linear phase response
The stability of an FIR filter is guaranteed (there are no feedback loops!)

Disadvantages:

They’re computationally intensive (convolution is a computationally expensive operation)
It’s harder to change the filter parameters in real time

Designing and implementing FIR filters

In the last part of this series, you were introduced to the general process for designing an FIR filter:

You create a desired frequency response — that involves drawing out the frequency response curve, sampling it, and put that result into a vector of values
Compute the FFT of the frequency response to get the impulse response
Use the impulse response as the filter kernel and convolve the input with the impulse response you found in step 2

The three-step procedure described above is theoretically sound, but when you start getting into the implementation, there’s a problem that arises — for nearly all situations, the impulse response you will compute in step 2 will be infinite, rather than finite.

Why does this happen, and what do we do about it?

A brick-wall lowpass filter

To see this issue firsthand, let’s consider an example.

Suppose you want to create a brick-wall lowpass filter (in other words, a filter with a perfectly flat passband, infinitely narrow transition band, and perfectly flat stopband). The frequency response for this would be would be a rectangle, like this:

To get the impulse response, you’d use the FFT. (Since the details of the FFT are not the focus of this series, let’s skip straight to the results):

This particular function appears so often in Fourier analysis that it has its own name: the sinc function. This function shows up all over in DSP texts, so it’s worth knowing. The equation for the sinc function is

There are two important characteristics of the function to consider.

It has an infinite domain. In other words, it outputs a value for every x.

You might notice that if x is zero, it appears that you’ll run into a division by zero. Mathematically, this is no issue, and it can be proved with a little calculus that the result is actually 1, not undefined. (Note that a computer will not have the insight that x = 0 is a valid input, so if you’re coding up your own sinc function, you’ll need to handle this special case.)

Related to the point above, the sinc function is “non-causal,” meaning it has non-zero values for negative values of x.

A non-causal impulse response of a system indicates that it is using feedback loops (and thus depends on samples in the future). This means the system is recursive.

Knowing these two characteristics of sinc (and thus the characteristics of the impulse response of our brick-wall filter), you should now see how you might end up with a filter kernel that is unsuited for an FIR filter.

This is just one example, though, and it is healthy to remain skeptical of how likely you’ll end up in this situation! To quell those concerns, you can research more about the duality property of the Fourier transform. To summarize the property, finite signals in one domain (like the frequency response we designed above) become infinite in another (like the sinc function) and vice-versa.

Now that you have an idea of why the issue will arise, let’s explore one of the most common solutions - windowing.

Windowing method

You can choose to “zero out” parts of the impulse response by multiplying the impulse response by a window function, a signal full of zeros except for a section of interest. By multiplying a signal with a window function, the result will look like a finite version of the original signal.

As an illustration, take a look at this arbitrary sinusoidal function (an infinite signal), a rectangular window, and the result of multiplying the two together:

You may wonder why you need to bother with the idea of a window function at all when you could just simply “crop” the infinite signal. The reasoning goes back to a property of linear time-invariant systems — multiplication in the time domain is convolution in the frequency domain, and vice versa. This means when you crop a sinc impulse response by multiplying with a rectangular window, you have just convolved the frequency response with the same rectangular window! In other words, by windowing the impulse response, you’re going to change the original frequency response, too.

All is not lost, though! If you are careful about selecting a window function, you can minimize the effects on the new frequency response and keep your original design.

There are many window functions to choose from, each one with their own optimizations.

You should notice that a common characteristic of window functions is their tapered edges. Much of window design comes down to just how this taper occurs, and sometimes window functions are simply called taper functions.

Pro-tip: If you’re interested in understanding why tapering is so crucial, you’ll want to read about the Gibbs phenomenon.

To help pick from all of those window functions, you can use the same tools you’ve learned about already — impulse and frequency response analysis.

Let’s analyze the responses of two popular windows, Hamming and Blackman.

First, notice that the Blackman window has better stopband attenuation — at frequencies further from the “main lobe,” the gain drops off more quickly. The tradeoff for this is that the frequency response of the Blackman window will have a slower roll-off than the Hamming window.

Another factor to keep in mind is the window length. The longer of a window you use, the more of the original signal will remain (so you’ll be using more of that impulse response you worked for earlier!), which will give you more accuracy.

The tradeoff for this is computation speed and delay — the larger your window, the larger your filter kernel will be and the longer you’ll have to run expensive convolution computations and the longer of a delay line you’ll need.

Let’s say you decide that you’re comfortable with a slightly slower rolloff in exchange for better stopband attenuation, so you choose the Blackman window.

Now you’re ready to apply the window to the sinc impulse response from earlier.

With that, you now have a finite impulse response ready to act as a filter kernel!

Optimum design methods

While the windowing method is straightforward, there are other ways to design FIR filter kernels. The main alternative to the windowing method is known as “optimum design.”

We’re not referring to the method as “optimal” because it is necessarily better than other methods. Instead, the name comes from the mathematical problem of optimization.

So far, you’ve seen that designing FIR filters with the windowing method involves some trial and error. You must:

Design your filter response.
Find the impulse response.
Try out some windows.
Examine the results.

What if you wanted to specify a parameter like, say, the maximum amount of ripple in the passband and not go through these computations over and over?

The answer, you may have guessed, is with optimum design.

The idea is that if you can describe the error of your resulting frequency response (which is shown in blue in the above illustration) compared to the ideal case (which is shown in orange), you can minimize that function for some amount of error you are willing to accept and use the resulting parameters.

That amount of error is sometimes referred to as the tolerance because it represents how much you will tolerate the result deviating from the ideal case.

A deeper discussion on the topic of optimum design is out of the scope of this series, but if you are interested in going further, check out the freely-available optimal FIR filters section of Spectral Audio Signal Processing by Julius O. Smith III or Chapter 7 of the classic textbook Digital Signal Processing by Alan V. Oppenheim and Ronald W. Schafer.

FIR resources in Max

To work with FIR filters in Max, there are a number of objects and examples at your disposal.

The buffir~ object takes a reference to a buffer~ containing an impulse response and uses it as the filter kernel. For help, see:
buffir~.maxhelp
buffir-eq.maxpat
Patching in gen~, which gives you single-sample control, is a natural fit for designing impulse responses. For an example, check out:
gen~.buffir.maxpat
The HISSTools Impulse Response Toolbox package contains externals and abstractions that are helpful for building FIR filters.
bufconvolve~ for offline (non-real time) convolution and deconvolution of buffers
iruser~ to create IRs from a specified frequency response and output phase

Up Next

For the third part of this series, get ready to dig into the other category of digital filters — IIR filters!

References

Introduction to Digital Filters with Audio Applications and Spectral Audio Signal Processing by Julius O. Smith III
The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith
Digital Signal Processing by Alan V. Oppenheim and Ronald W. Schafer

Learn More: See all the articles in this series

by Isabel Kaspriskie on March 2, 2021

👽'tW∆s ∆lienz👽

awwww snapzzzzz! this series is the best ever! i needed this one, in particular, for years now. thank you thank you thank you. it has all these things i'm studying about in parts(sinc function, Gibbs phenomenon(something like this occurs in Chebyshev polynomials, too... "Runge's phenomenon"?), duality property of FFT, FIR vs. IIR, etc. etc...), but i can never put it all together so clearly.. i need more like this to help me out on these subjects - it's been difficult to gain a cohesive engineering-degree understanding from the fragmented approach of my limited music-degree perspective. Thank You Again! 🙌

(edit: my 'fragmented approach' is now made smoother by applying the 'window' of articles like these! 😃 😄 😁)

Lilli Wessling Hart

Thanks, Raja! Copy/paste was not my friend today. Should be fixed now.

👽'tW∆s ∆lienz👽

Thank You, Lilli "Lifeblood of Cycling74" Wessling Hart! 🙌

Isabel Kaspriskie

Thanks for the kind words, Raja! It's often that we have many of the pieces floating around in our brains, and the "aha" moment comes when those pieces start to fit together, right? Or just knowing what words to Google and read about.

As you mention Runge's phenomenon, that's indeed similar in its appearance to Gibb's phenomenon with all of the wild fluctuations that start to happen. I'll nerd out on everyone to talk a little about the differences between the two! :)

So Runge's phenomenon happens when you use high-order polynomials to interpolate between evenly-spaced (evenly-sampled!) points. "Chebyshev nodes" try to fix this by changing from evenly-spaced points to ones that are more heavily sampled at the edges. (In fact, to think about exactly how the sampling is done, imagine a circle with evenly distributed points along the circumference. Then imagine picking up that flat circle and looking at it from the side, sort of like looking at the edge of a circular piece of paper, if that makes sense. The points will smoosh together on the edges, and that's the same spacing that Chebyshev nodes use.) This "trick" helps reduce those crazy oscillations that happen when trying to interpolate sampled points.

Gibb's phenomenon is more related to sharpness of edges and how it's hard to represent sharp edges only mixing together sinusoids. Basically, the sharper you want the edges in one function, like in a square wave, the higher number of sinusoids you need to mix together to represent it in the Fourier way. Eventually you'd start nearing an infinite number of sinusoids/harmonics. So you can think about the Fourier series' "convergence at discontinuities" to see how those sharp edges will behave. There's lots of stuff out on the Internet to read about convergence and Gibbs phenomena which illustrate this convergence concept nicely. Basically, it doesn't quite converge the way you'd think. (Like I know I'd think adding more and more sinusoids would smooth it out, but that doesn't actually happen.)

Anyway, a digression, but cool ideas, right?