Efficient Constant Q Transform

Cortexelus's icon

Is there any way to implement a Constant Q transform in MaxMSP? A CQT is like Fourier transform except its bins are logarithmically-spaced instead of linear, so that all octaves have equal resolution. Which makes way more sense for music analysis. http://en.wikipedia.org/wiki/Constant_Q_transform#Comparison_with_the_Fourier_Transform

Here is a paper by Miller Puckette on an efficient algorithm which makes use of the FFT
"An Efficient Algorithm for the Calculation of a Constant Q Transform"
http://www.wellesley.edu/Physics/brown/pubs/effalgV92P2698-P2701.pdf

Here's another paper on an efficient algorithm:
"Constant-Q transform toolbox for music processing"
http://www.elec.qmul.ac.uk/people/anssik/cqt/

If there is already a Max object for this, or if this is a problem which has already been solved, please let me know. Otherwise, any help in pointing me in the right direction as to how this ought to be best implemented would be awesome.

mzed's icon

I don't know of an implementation of this transform in Max off the top of my head. There are some good clues in this paper:

I haven't read it that closely, but I think it wouldn't be rocket surgery to port that matlab implementation into a max object written in C. Or gen~, perhaps.

Err0r_500's icon

I'm also very interested and will follow this thread. I'm totally aware that it's really over my capabilities both in maths and max and in my understanding the problem (for what I'd like to do) with Puckette's paper is that the process described is not inversible... but you can also have a look at this one : "Constructing an inversible constant-q transform with nonstationary Gabor frames" http://www.univie.ac.at/nonstatgab/pdf_files/dohogrve11_amsart.pdf

Chris Vik's icon

I know this is quite old, but before I try to wrangle this myself ( and it will be a dodgy-arse wrangle ) was there anyone that found an answer?

I've been digging around to attempt a log spread fft freq output and much of the maths is quite over my head. After finding the "constant q transform" method, it seems like a great potential solution ( as opposed to simply reallocating linear bins into log bins as a post ftt process ).

C

wbertrand's icon

The constant-q transform (as proposed by M.S.P. in the paper linked by cortexelus) has been implemented in Max as an eternal object. I have not yet verified myself that it works, and this thread is so old that the object may have been developed by one the early posters themselves, but here it is:
http://grrrr.org/research/software/constantq/. I figured I should post here for any future viewers of this thread.

Jesse's icon

Link?

wbertrand's icon

Okay, so the complied project is dependent on FFTW libraries, available here: http://www.fftw.org/index.html#documentation
However, after installing these libraries (./configure, make, make install), [constantq~] is still referencing a non-existent file path, "usr/local/lib/libfftw3f.3.dylib"
Note: I have zero knowledge of C and program development

Also think I'll try and look at his source code, though it's dependent on a development later he built (flext) and another external library (blitz++)

wbertrand's icon

After lots of trial and error, I've identified (and resolved) problems with installation, and my fftw installation is now compatible with [constantq~]. In the ./configure stage, some special flags are required. The installation must be compiled for 32-bit architecture, which is a special flag for the compiler, specified by the "CC..." portion of the command. Secondly, the default installation is a static library, where dynamic is required; this is fixed through "--enable-shared." Lastly, the installation has to have the "--enable-float" flag. The command which worked for me is as follows:

./configure CC="gcc -arch i386" CXX="g++ -arch i386" --enable-shared --enable-float

Gussi's icon

You could also make your own implementation of a constant Q transform, by using the native msp object:
fffb~

- Make chromatically spaced filters, by using a couple of fffb~'s, depending on how many bands you need.
(each object has a max filter number of 50) - read help file.
Or use quarter-note spaced filters, like the transform described by Judith C. Brown, in her paper
(Calculation of a constant Q spectral transform, 1990).

- Then get the RMS/peak amplitude for each band, and collect it in a list.
now you've got your own chroma-vector of size n...

You could potentially vary the q setting for each filter, to get variable bandwidth/time resolution for each band.

You may also adjust the gain of each filter in fffb~, this allows you to implement stuff like fletcher-munson - equal loudness curves,
to more accurately describe the signals harmonic content, in a way that relates to human hearing.

It might not be the least cpu intensive solution, but its flexible and easy to work with.
I've managed to achieve decent polyphonic analysis on old win xp computers, with max/msp 4.5 :-)

It might be worth a shot.

- gus

Nicholas Solem's icon

I assume no one on here has since implemented an efficient constant-q transform? I'm trying to wrap my head around the Puckette & Brown paper referenced above, in an attempt to get pfft~ to transform to a constant q. But I want to make sure no one has done this already, as this is only a very small part of a bigger project.

The [constantq~] object referenced above doesn't seem like it will work for my purposes, because it outputs a list, as opposed to signal rate info. And I don't yet have C or C++ skills to modify it to my liking.

And Gus, I suppose that's an option, but probably too inefficient compared to converting an fft into a constant q, as described in the Puckette & Brown paper.

Graham Wakefield's icon

If all you really want is to dimension-reduce the FT spectrum into something closer to audible perception, maybe a bark or Mel scale analysis can work. There's a sketchy start toward this here: https://cycling74.com/forums/gen-sharing-is-barking Essentially this is chunking geometrically increasing numbers of input FFT bins into output CQ bins. (Which is related to what the morlet wavelet transform does too) But unlike what the P&B paper is talking about, this sketchy approach doesn't allow you to chose the f0 or the Q ratio. The f0 is going to be the center frequency of the lowest FFT bin, and the ratio between CQ bins is going to be powers of 2.

I'm not fully grasping the P&B paper, but it seems that given an f0 and Q ratio, one needs to precompute a matrix (a kernel) mapping a subset of the input FFT bins into each output CQ bin. I'd guess that this could also be done as a gen~-in-pfft~ thing, but would likely need some codeboxing.

My other question is -- when you say you prefer the output to not be a list, but a signal, I'm not sure what this would actually look like. The number of CQ bins should be less than the FFT frame size, so I guess the CQ bins *could* be packed into an audio signal -- but how would you use them? What do you want to do with the analysis?

mzed's icon

FWIW, analyzer~ will output bark bands. Also, zsa.mel~ and zsa.bark~

Nicholas Solem's icon

Graham: Actually, I don't mind if f0 is automatically the lowest FFT bin. However, I believe that this won't work for what I need either, unfortunately.

The problem is that, after getting logarithmically spaced bins, I need to do some processing, and then convert back to linearly spaced bins. This needs to be as close to lossless as possible... and I believe that chunking several bins to get that spacing will end up losing data that cannot be retrieved after further processing.

And maybe the above will answer your question about list vs. signal. This is to be further processed, as opposed to merely viewed as analysis. Ideally, the signal would still look akin to the output out fftin~, carrying a real part per bin, an imaginary part per bin, and the current bin index.

mzed: unfortunately, those won't work for me, as they still output a list.

Graham Wakefield's icon

About getting a close-to-lossless reversible transformation. There's a time/space tradeoff that corresponds to the spectral layout -- we have more high frequency bins than low frequency bins because there's more cycles per unit time at high frequencies, i.e. more information. Working in a log-spaced transform, if it is executed in fixed-sized chunks (which is true if it is derived from a Fourier transform), means discarding a lot of information, which means non-lossless reconstruction. A way around this would is to analyze different bands at different window sizes, where high frequency bin analyses run more frequently than the low frequency bin analyses. A wavelet-based analysis can do this.

Years ago I got interested in this problem and after reading the Computer Music Tutorial wanted to learn about wavelets. The easiest thing to do was a time-domain Haar wavelet analyzer & synthesizer, which was able to losslessly reconstruct the analyzed signal. The intermediate format has typically 10-13 bins (depends on how low frequency you want to go), log spaced. But the information is preserved: a higher frequency bin has more samples per window than a lower bin, so overall the same amount of information is present. Actually it could be fun to knock this up as a gen~ example, the process is interesting. But frankly most of the processes you apply in the intermediate domain result in some noisy modulation artifacts. Which isn't surprising considering the simplicity of the Haar wavelet.

Fancier wavelet and dictionary-based analyses can do better compression, but are much harder to write, and might not be easier to operate on in their transformed domains.

Graham Wakefield's icon

Here's that Haar wavelet thing in gen~:

Max Patch
Copy patch and select New From Clipboard in Max.

Nicholas Solem's icon

@Graham: thanks for that. Could any arbitrary wavelet potentially be used for this? I wonder if these distortions could be aided from different wavelet types.
I messed around with wavelet~ from CNMAT too, but it's great that your implementation here provides frame and bin indexes.

FWIW, I actually got my logarithmic resampling to work! The problem was that I was trying to resample real and imaginary parts. Now, I'm only processing the magnitude with a logarithmic read function, then to get back to linear spacing, reading back from the logarithmically processed spacings exponentially. Then, this whole time phase was unprocessed, and is delayed to be matched with the re-read, re-linearized magnitudes.

Going back to the time domain, it sounds MUCH better than doing this with separate real and imaginary parts, but there is still some noise, esp. in the high frequencies. This problem is greatly helped by placing the pfft~ patch within a poly~ and upsampling by a factor of 4 or 8. That way, the bins that are most inaccurate by interpolating are irrelevant to the final output.

webe's icon

Waking this up again: I tried finding a compiled, up-to-date version of the constant-Q transform external. Does anyone know of an updated version?
The version from grrrr yields: "constantq~: unable to load object bundle executable"

Martin Beck's icon

Bump. I am also interested in...
got my attention after watching this video especially the spectrum shown at 15:20
https://www.youtube.com/watch?v=ONJVJBmFiuE

AudioMatt's icon

The matlab version doesn't seem toooo tooo complicated. Split the work between 8 people and you've got somewhere in the neighborhood of 25 lines of noncommented code each.

function varargout = cqt(x,varargin)
%Constant-Q nonstationary Gabor transform
%   CFS = CQT(X) returns the constant-Q transform of X. X is a
%   double-precision real- or complex-valued vector or matrix. X must have
%   at least four samples. If X is a matrix, CQT obtains the constant-Q
%   transform of each column of X. CFS is a matrix if X is a vector, or
%   multidimensional array if X is a multichannel signal. The array, CFS,
%   corresponds to the maximally redundant version of the CQT. Each row of
%   the pages of CFS corresponds to passbands with normalized center
%   frequencies (cycles/sample) logarithmically spaced between 0 and 1. A
%   normalized frequency of 1/2 corresponds to the Nyquist frequency. The
%   number of columns, or hops, corresponds to the largest bandwidth center
%   frequency. This usually occurs one frequency bin below (or above) the
%   Nyquist bin. Note that the inverse CQT requires the optional outputs G
%   and FSHIFTS described below.
%
%   [CFS,F] = CQT(X) returns the approximate bandpass center frequencies F
%   corresponding to the rows of CFS. The frequencies are ordered from 0 to
%   1 and are in cycles/sample.
%
%   [CFS,F,G,FSHIFTS] = CQT(X) returns the Gabor frames G used in the
%   analysis of X and the frequency shifts FSHIFTS in discrete Fourier
%   transform (DFT) bins between the passbands in the rows of CFS. CFS, G,
%   and FSHIFTS are required inputs for the inversion of the CQT with ICQT.
%
%   [CFS,F,G,FSHIFTS,FINTERVALS] = CQT(X) returns the frequency intervals
%   FINTERVALS corresponding the rows of CFS. The k-th element of FSHIFTS
%   is the frequency shift in DFT bins between the ((k-1) mod N)-th and (k
%   mod N)-th element of FINTERVALS with k = 0,1,2,...,N-1 where N is the
%   number of frequency shifts. Because MATLAB indexes from 1, this means
%   that FSHIFTS(1) contains the frequency shift between FINTERVALS{end}
%   and FINTERVALS{1}, FSHIFTS(2) contains the frequency shift between
%   FINTERVAL{1} and FINTERVAL{2} and so on.
%
%   [CFS,F,G,FSHIFTS,FINTERVALS,BW] = CQT(X) returns the bandwidths BW in
%   DFT bins of the frequency intervals, FINTERVALS.
%
%   [...] = CQT(X,'SamplingFrequency',Fs) specifies the sampling frequency
%   of X in hertz. Fs is a positive scalar.
%
%   [...] = CQT(...,'BinsPerOctave',B) specifies the number of bins per
%   octave to use in the constant-Q transform as an integer between 1
%   and 96. B defaults to 12.
%
%   [...] = CQT(...,'TransformType',TTYPE) specifies the 'TransformType' as
%   "full" or "sparse". The "sparse" transform is the minimally redundant
%   version of the constant-Q transform. If you specify 'TransformType' as
%   "sparse", CFS is a cell array with the number of elements equal to the
%   number of bandpass frequencies. Each element of the cell array, CFS, is
%   a vector or matrix with the number of rows equal to the value of the
%   bandwidth in DFT bins, BW. If 'TransformType' is "full" without
%   'FrequencyLimits', CFS is a matrix. If 'TransformType' is "full" and
%   frequency limits are specified, CFS is a structure array.
%
%   [...] = CQT(...,'FrequencyLimits',[FMIN FMAX]) specifies the frequency
%   limits over which the constant-Q transform has a logarithmic frequency
%   response with the specified number of frequency bins per octave. FMIN
%   must be greater than or equal to Fs/N where Fs is the sampling
%   frequency and N is the length of the signal. FMAX must be strictly less
%   than the Nyquist frequency. To achieve the perfect reconstruction
%   property of the constant-Q analysis with nonstationary Gabor frames,
%   both the zero frequency (DC) and the Nyquist bin must be prepended and 
%   appended respectively to the frequency interval. The negative
%   frequencies are mirrored versions of the positive center frequencies
%   and bandwidths. If the TransformType is specified as 'full' and you
%   specify frequency limits, CFS is returned as a structure array with the
%   following 4 fields: 
%   c:              Coefficient matrix or multidimensional array for the
%                   frequencies within the specified frequency limits. This
%                   includes both the positive and "negative" frequencies.
%   DCcfs:          Coefficient vector or matrix for the passband from 0 to
%                   the lower frequency limit.
%   Nyquistcfs:     Coefficient vector or matrix for the passband from the
%                   upper frequency limit to the Nyquist.
%   NyquistBin:     DFT bin corresponding to the Nyquist frequency. This
%                   field is used when inverting CQT.
%
%   [...] = CQT(...,'Window',WINNAME) uses the WINNAME window as the
%   prototype function for the nonstationary Gabor frames. Supported
%   options for WINNAME are "hann", "hamming", "blackmanharris",
%   "itersine", and "bartlett". WINNAME defaults to "hann". Note that these
%   are compactly supported functions in frequency defined on the interval
%   (-1/2,1/2) for normalized frequency or (-Fs/2,Fs/2) when you specify a
%   sampling frequency.
%
%   CQT(...) with no output arguments plots the constant-Q transform in the
%   current figure. Plotting is only supported for a single-vector input.
%   If the input signal is real, the CQT is plotted over the range
%   [0,Fs/2]. If the signal is complex-valued, the CQT is plotted over
%   [0,Fs).
%
%   % Example:
%   %   Plot the constant-Q transform of a speech sample using the maximally 
%   %   redundant version of the transform and 48 bins per octave. Set the 
%   %   frequency limits from 100 to 6000 Hz.
%
%   load wavsheep;
%   cqt(sheep,'SamplingFrequency',fs,'BinsPerOctave',48,...
%   'FrequencyLimits',[100 6000])
%
%   References: 
%   Holighaus, N., Doerfler, M., Velasco, G.A., & Grill,T.
%   (2013) "A framework for invertible real-time constant-Q transforms",
%   IEEE Transactions on Audio, Speech, and Language Processing, 21, 4, 
%   pp. 775-785.
%
%   Velasco, G.A., Holighaus, N., Doerfler, M., & Grill, Thomas. (2011)
%   "Constructing an invertible constant-Q transform with nonstationary
%   Gabor frames", Proceedings of the 14th International Conference on 
%   Digital Audio Effects (DAFx-11), Paris, France.
%   
%   See also ICQT

% Check number of input and output arguments
narginchk(1,11);
nargoutchk(0,6);

% Validate attributes on signal
validateattributes(x,{'double'},{'finite','nonempty','2d'},'CQT','X');

if isvector(x)
    N = numel(x);
    x = x(:);
    numsig = 1;
else
    % Columns as signals
    N = size(x,1);
    numsig = size(x,2);
end

% Plotting not supported for multidimensional inputs
if nargout == 0 && numsig > 1
    error(message('Wavelet:FunctionOutput:PlotTooManyDims'));
end

% Signal must have at least four samples
if N < 4
    error(message('Wavelet:synchrosqueezed:NumInputSamples'));
end

params = parseinputs(N,varargin{:});
if any(imag(x(:)))
    params.sigtype = 'complex';
end
Fs = params.Fs;
numbins = params.numbins;
fmin = params.freqlimits(1);
fmax = params.freqlimits(2);
% Nyquist frequency
NF = Fs/2;

% Set minimum bandwidth in DFT bins
minbw = 4;

%   Q-factor for CQ-NSGT and inverse for determining bandwidth Bandwidths
%   are $\varepsilon_{k+1}-\varepsilon_{k}$ where $\varepsilon_k$ is the
%   k-th center frequency. Described in Holighaus et. al. (2013). CQ-NSGT
%   Parameters: Windows and Lattices
Q = 1/(2^(1/numbins)-2^(-(1/numbins)));
% BW = CF*Q^(-1) so obtain 1/Q
Qinv = 1/Q;
%   Default number of octaves. Velasco et al., 2011 CQ-NSGT Parameters:
%   Windows and Lattices
bmax = ceil(numbins*log2(fmax/fmin)+1);


%   Note: fmin is rendered exactly. fmax is not necessarily
%   freq = fmin.*2.^((0:numbins*numoct-1)./numbins);
%   frequencies here are in hertz.  
freq = fmin.*2.^((0:bmax-1)./numbins);
freq = freq(:);

% Remove any bins greater than or equal to the Nyquist
% First find the first frequency greater than or equal to fmax, this should
% not be empty by construction. We will append the Nyquist so ensure we
% are not at the Nyquist or beyond.
idxgeqFmax = find(freq >= fmax,1,'first');
if freq(idxgeqFmax) >= NF
    freq = freq(1:idxgeqFmax-1);
else
    freq = freq(1:idxgeqFmax);
end

% First record number of bins, 1,....K prior to pre-pending DC and
% appending the Nyquist
Lbins = numel(freq);
% Now prepend DC and append Nyquist
freq = [0; freq; NF];
% Store Nyquist Bin
params.NyquistBin = Lbins+2;
% Store DC bin
params.DCBin = 1;
% Mirror other filters -- start with one bin below the Nyquist bin and go
% down to one bin above DC
freq = [freq; Fs-freq(end-1:-1:2)];

% Convert the frequency bins to approximate index of DFT bin.
fbins = freq*(N/Fs);

% Determine bandwidths in DFT bins. For everything but the DC bin and the
% Nyquist bins the bandwidth is \epsilon_{k+1}-\epsilon_{k-1}. Total number
% of bandwidths is now 2*Lbins+2
bw = zeros(2*Lbins+2,1);

%   Set bandwidth of DC bin to 2*fmin -- these are bandwidths in samples
%   (approximately), we will round to integer values.
bw(params.DCBin) = 2*fbins(2);
% Set Lbin 1 such that cf/bw = Q
bw(2) = fbins(2)*Qinv;
%   Set the bandwidth for the frequency before the Nyquist such that 
%   cf/bw = Q
bw(Lbins+1) = fbins(Lbins+1)*Qinv;
% Set the original k = 1,....K-1
idxk = [3:Lbins, params.NyquistBin];
% See Velasco et al. and Holighaus et al. CQ-NSGT Parameters: Windows
% and Lattices
bw(idxk) = fbins(idxk+1)-fbins(idxk-1);
% Mirror bandwidths on the negative frequencies
bw(Lbins+3:2*Lbins+2) = bw(Lbins+1:-1:2);
% Round the bandwidths to integers
bw = round(bw);

% Convert frequency centers to integers. Round down up to Nyquist. Round up
% after Nyquist.
cfbins = zeros(size(fbins));
% Up to Nyquist floor round down
cfbins(1:Lbins+2) = floor(fbins(1:Lbins+2));
% From Nyquist to Fs, round up
cfbins(Lbins+3:end) = ceil(fbins(Lbins+3:end));


% Compute the shift between filters in frequency in samples
diffLFDC = N-cfbins(end);
fshift = [diffLFDC; diff(cfbins)];

% Ensure that any bandwidth less than the minimum window is set to the
% minimum window
bw(bw<minbw) = minbw;


% Compute the frequency windows for the CQT-NSGFT
g = wavelet.internal.cswindow(params.winname,bw,Lbins);

% Obtain DFT of input signal
xdft = fft(x);

% Depending on transformtype value
if strcmpi(params.transformtype,'full')
    M = max(cellfun(@(x)numel(x),g))*ones(numel(g),1);
    [cfs,winfreqrange] = cqtfull(xdft,g,cfbins,M,params);
    
elseif strcmpi(params.transformtype,'reduced')
    M = zeros(size(bw));
    HFband = bw(params.NyquistBin-1);
    M([2:params.NyquistBin, params.NyquistBin+1:end]) = HFband;
    M(params.DCBin) = bw(params.DCBin);
    M(params.NyquistBin) = bw(params.NyquistBin);
    [cfs,winfreqrange] = cqtfull(xdft,g,cfbins,M,params);
    
elseif strcmpi(params.transformtype,'sparse')
    [cfs,winfreqrange] = cqtsparse(xdft,g,cfbins,bw,params);

end


if nargout == 0 

    % Plot if no output arguments
    if strcmpi(params.transformtype,'sparse')
        cfs = sparsecoefstofull(cfs,params);
    end
    
    plotcqt(cfs,freq,Fs,N,params);
    
else

    varargout{1} = cfs;
    varargout{2} = freq;
    varargout{3} = g;
    varargout{4} = fshift;
    varargout{5} = winfreqrange;
    varargout{6} = bw;
    
end
%--------------------------------------------------------------------------
%--------------------------------------------------------------------------
function [cfs,winfreqrange] = cqtsparse(xdft,g,cfbins,M,params)
Nwin = numel(g);
N = size(xdft,1);
winfreqrange = cell(Nwin,1);
numsig = size(xdft,2);
cfs = cell(numel(g),1);
% Algorithm for forward CQT due to Holighaus and Velasco
for kk = 1:Nwin
    Lg = numel(g{kk});
    % Note: Flip halves of windows. This centers the window at zero
    % frequency
    win_order = [ceil(Lg/2)+1:Lg,1:ceil(Lg/2)];
    % The following are the DFT bins corresponding to the frequencies in
    % the bandwidth of the window
    win_range = 1+mod(cfbins(kk)+(-floor(Lg/2):ceil(Lg/2)-1),N);
    winfreqrange{kk} = win_range;
    tmp = zeros(M(kk),numsig);
    % Multiply the DFT of the signal by the compactly supported window in
    % frequency. Then take the inverse Fourier transform to obtain the
    % CQT coefficients
    tmp(win_order,:) = ...
        xdft(win_range,:).*g{kk}(win_order);
    cfs{kk} = ifft(tmp);
    
end

if strcmpi(params.sigtype,'real')
    cfs = cellfun(@(x)(2*size(x,1))/N*x,cfs,'uni',0);
else
    cfs = cellfun(@(x)size(x,1)/N*x,cfs,'uni',0);
end
%--------------------------------------------------------------------------
%--------------------------------------------------------------------------
function [cfs,winfreqrange] = cqtfull(xdft,g,cfbins,M,params)
N = size(xdft,1);
Nwin = numel(g);
winfreqrange = cell(Nwin,1);
numsig = size(xdft,2);
% Using hop size of highest frequency band before the Nyquist.
cfstmp = cell(Nwin,1);
% Algorithm for forward CQT due to Holighaus and Velasco
for kk = 1:Nwin
    Lg = numel(g{kk});
    % Note: Flip halves of windows 
    win_order = [ceil(Lg/2)+1:Lg,1:ceil(Lg/2)];
    % The following are the DFT bins corresponding to the frequencies in
    % the bandwidth of the window
    win_range = 1+mod(cfbins(kk)+(-floor(Lg/2):ceil(Lg/2)-1),N);
    winfreqrange{kk} = win_range;
    rowdim = M(kk);
    tmp = zeros(M(kk),numsig);
    tmp([rowdim-floor(Lg/2)+1:rowdim,1:ceil(Lg/2)],:) = ...
        xdft(win_range,:).*g{kk}(win_order);
    cfstmp{kk} = ifft(tmp);
        
end

if strcmpi(params.sigtype,'real')
    cfstmp = cellfun(@(x)(2*size(x,1))/N*x,cfstmp,'uni',0);
else
    cfstmp = cellfun(@(x)size(x,1)/N*x,cfstmp,'uni',0);
end

if strcmpi(params.transformtype,'reduced')
    DCcfs = cfstmp{1};
    Nyqcfs = cfstmp{params.NyquistBin};
    cfstmp([1 params.NyquistBin]) = [];
    cfstmp = cell2mat(cfstmp);
    Numtpts = max(M(2:params.NyquistBin-1));
    c = reshape(cfstmp,Numtpts,Nwin-2,numsig);
    % Permute frequency and hop so that the coefficients matrices are
    % frequency by time
    c = permute(c,[2 1 3]);
    cfs = struct('c',c,'DCcfs',DCcfs,'Nyquistcfs',Nyqcfs,'NyquistBin',...
        params.NyquistBin);
else
    cfs = cell2mat(cfstmp);
    Numtpts = max(M);
    cfs = reshape(cfs,Numtpts,Nwin,numsig);
    % Permute frequency and hop so that the coefficients matrices are
    % frequency by time
    cfs = permute(cfs,[2 1 3]);
end

%--------------------------------------------------------------------------
%--------------------------------------------------------------------------
function plotcqt(coefs,freq,Fs,L,params)
if strcmpi(params.sigtype,'real') && ...
        (strcmpi(params.transformtype,'full') || ...
        strcmpi(params.transformtype,'sparse'))
    freq = freq(1:params.NyquistBin);
    coefs = coefs(1:params.NyquistBin,:);
elseif strcmpi(params.sigtype,'real') && strcmpi(params.transformtype,'reduced')
    freq = freq(2:params.NyquistBin-1);
    % Nyquist frequency has already been removed from c field
    coefs = coefs.c(1:params.NyquistBin-2,:);
elseif strcmpi(params.sigtype,'complex') && strcmpi(params.transformtype,'reduced')
    freq([1 params.NyquistBin]) = [];
    coefs = coefs.c;
end

Numtimepts = size(coefs,2);
t = linspace(0,L*1/Fs,Numtimepts);

if params.NormalizedFrequency
    frequnitstrs = wgetfrequnitstrs;
    freqlbl = frequnitstrs{1};
    ut = 'Samples';
    xlbl = ...
        [getString(message('Wavelet:getfrequnitstrs:Time')) ' (' ut ')'];
    
else
    [freq,~,uf] = engunits(freq,'unicode');
    [t,~,ut] = engunits(t,'unicode','time');
    freqlbl = wgetfreqlbl([uf 'Hz']);
    xlbl = ...
        [getString(message('Wavelet:getfrequnitstrs:Time')) ' (' ut ')'];
    
end
ax = newplot;
hndl = surf(ax,t, freq, 20*log10(abs(coefs)+eps(0)));
hndl.EdgeColor = 'none';
axis xy; axis tight;
view(0,90);
h = colorbar;
h.Label.String = getString(message('Wavelet:FunctionOutput:dB'));
ylabel(freqlbl);
xlabel(xlbl);
title(getString(message('Wavelet:FunctionOutput:constantq')));

%-------------------------------------------------------------------------
%-------------------------------------------------------------------------
function coefs = sparsecoefstofull(cfs,params)

if strcmpi(params.sigtype,'real')
    cfs = cfs(1:params.NyquistBin);
end

numcoefs = max(cellfun(@(x)numel(x),cfs));
coefs = zeros(numel(cfs),numcoefs);

for kk = 1:numel(cfs)
    tmp = cfs{kk};
    x = linspace(0,1,numel(tmp));
    xq = linspace(0,1,numcoefs);
    F = griddedInterpolant(x,tmp);
    F.Method = 'nearest';
    coefs(kk,:) = F(xq);
end

%--------------------------------------------------------------------------
%--------------------------------------------------------------------------
function params = parseinputs(N,varargin)
params.sigtype = 'real';
params.NormalizedFrequency = true;
params.Fs = 1;
params.winname = 'hann';
p = inputParser;
addParameter(p,'SamplingFrequency',[]);
addParameter(p,'BinsPerOctave',12);
addParameter(p,'FrequencyLimits',[]);
addParameter(p,'Window','hann');
addParameter(p,'TransformType','full');
parse(p,varargin{:});
if ~isempty(p.Results.SamplingFrequency)
    params.NormalizedFrequency = false;
    params.Fs = p.Results.SamplingFrequency;
    validateattributes(params.Fs,{'numeric'},{'scalar','positive'},...
        'CQT','SamplingFrequency');
end


params.numbins = p.Results.BinsPerOctave;
validateattributes(params.numbins,{'numeric'},...
    {'integer','scalar','>=',1,'<=',96},'CQT','BinsPerOctave');
params.freqlimits = p.Results.FrequencyLimits;
if ~isempty(params.freqlimits)
    validateattributes(params.freqlimits,{'numeric'},{'numel',2,'positive','>=',params.Fs/N,...
        '<=',params.Fs/2,'increasing'},'CQT','FrequencyLimits');
end
params.transformtype = p.Results.TransformType;
validtypes = {'sparse','full'};
params.transformtype = ...
    validatestring(p.Results.TransformType,validtypes,'CQT','TransformType');
if ~isempty(params.freqlimits) && strcmpi(params.transformtype,'full')
    params.transformtype = 'reduced';
end
if isempty(params.freqlimits)
    params.freqlimits = [params.Fs/N params.Fs/2];
end
validwin = {'hann','hamming','itersine','blackmanharris','bartlett'};
params.winname = validatestring(p.Results.Window,validwin,'CQT','Window');
Roman Thilenius's icon


You could potentially vary the q setting for each filter, to get variable bandwidth/time resolution for each band.

in an old patch of mine i was doing something like that - one time for chromatic series and one time for overtones - but it doesnt really work good.

does anyone know how you have to set the Q settings in fffb for a series of bands and/or in relation to the band frequency?

AudioMatt's icon

Roman,

I just found this 28 year old text by MILLER (F-ing) PUCKETTE describing a constant Q transform. (An efficient algorithm for the calculation of a constant Q transform)

It's this guy's understanding that the reason it's constant-Q is because you get nearly identical performance to an FFT.

Roman Thilenius's icon


i´ve already copied so much from him, so it probably doesnt matter if we repeat that.

isnt that the same as his "proposal for a nonlogarithmic fft whitepaper" (or similar titled)?

however, in my case, i dont need "true" CQT, it is mainly for roughly resynthesizing stuff with wavetables or additive synthesis, with some sonic artefacts beeing part of the plan.

so fffb will be fine once i fix this Q issue.

it actually even has a big benefit over DFT/FFT/CQT, because you can set the frequencies exactly to what you want then to be (for example normalize the pitch to middle C or whatever.)