Image to Sound Conversion (via pfft) Sounds Choppy? [SOLVED]

    Feb 27 2014 | 9:25 pm
    I'm trying to convert arbitrary images to sound, but I'm getting very clicky, noisy results compared to software like Photosounder.
    Does anybody have a clue about how I could attain smooth-sounding results from my inverse FFT?
    (I attached a zip with my patch and 2 example images: 1 speech sonogram and 1 trippy fractal.)
    I’ve been learning a lot from Tadej Droljc's Sonographic Sound Processing suite along with the spectral tutorials from Jean-Francois. FFT is slowly being demystified. Thanks to them both!

    • Aug 15 2015 | 8:44 pm
      I'm replying to my own post with 2 important answers:
      1) It's all about the phase plane. The IFFT method relies on an image with 2 planes: amplitude-difference and phase-difference. The phase-difference plane is inherently missing from whatever image you intend to sonify. If it's left empty, the result is robotic and choppy. If you fill it with noise, the smoothness improves marginally. If you scale that noise between 0 and pi, it improves dramatically. I finally ran into this nugget of info in a paper or forum post somewhere, but I forgot where. Sorry! (A demo patch is attached where you can load an image and toggle between phase plane fill modes: "matrix2sound test ZLP-PiNoise.maxpat")
      2) The smoothest image sonification patches I've made thus far have used an FFT filter applied to a noise source. In most cases this seems to result in more natural sounding translations than what I mention above, at the cost of high CPU usage. (Demo patch is attached: "matrix2sound test ZLP-FFT-filter.maxpat")
      You can find the same basic architecture in...
      Help menu / examples / fft-fun / forbidden-planet
      ... but driven by a multislider instead of a matrix.
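      For readers without Max, the noise-filter idea in 2) can be sketched in Python with NumPy. This is only a rough sketch and not a port of the attached patch: the function name, the Hann window, the hop size, and the assumption that image rows run from low to high frequency are all mine (flip the image with image[::-1] if row 0 is the top of the picture).

```python
import numpy as np

def sonify_by_filtering_noise(image, fft_size=1024, hop=256, seed=0):
    """Shape white noise with an image used as a time-varying spectral gain.

    image: 2D array, rows = frequency (low to high, an assumption),
           columns = time frames, values roughly in [0, 1].
    """
    rng = np.random.default_rng(seed)
    n_bins = fft_size // 2 + 1
    n_frames = image.shape[1]

    # Stretch the image rows onto the FFT bins by linear interpolation.
    src = np.linspace(0.0, 1.0, image.shape[0])
    dst = np.linspace(0.0, 1.0, n_bins)
    gains = np.stack(
        [np.interp(dst, src, image[:, t]) for t in range(n_frames)], axis=1
    )

    window = np.hanning(fft_size)
    noise = rng.uniform(-1.0, 1.0, size=hop * (n_frames - 1) + fft_size)
    out = np.zeros_like(noise)

    for t in range(n_frames):
        start = t * hop
        frame = noise[start:start + fft_size] * window
        spectrum = np.fft.rfft(frame)
        spectrum *= gains[:, t]  # the image column acts as the filter curve
        # Window again on resynthesis, then overlap-add into the output.
        out[start:start + fft_size] += np.fft.irfft(spectrum, n=fft_size) * window

    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out
```

      Because the phases come from real noise rather than from an invented phase plane, each frame already has plausible phase relationships; the price is one forward and one inverse FFT per frame.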
      Also see Patch 28 from the BazTutorials Patch-A-Day series on YouTube.
      Also see ARSS (The Analysis & Resynthesis Sound Spectrograph), an open-source command-line utility with great examples. It later became Photosounder, the commercial software I mentioned in my original post.
      DISCLAIMER: I'm no FFT expert. These patches did the job for me, but maybe some gurus will chime in to correct my mistakes.
    • Aug 20 2015 | 5:31 pm
      Well done, ZLP. I would even scale the noise to [0; 2PI] (or [-PI; PI]) instead of [0; PI].
      To hear the differences even more dramatically, make a version of your patch with a smaller FFT size: 2048 or 1024.
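      The phase-fill comparison discussed in this thread can also be tried outside Max. The Python/NumPy sketch below is a simplified stand-in, not the attached patch: it assumes the phase-difference plane is accumulated frame to frame into a running phase, and the function name, Hann window, hop size, and low-to-high row order are illustrative choices of mine.

```python
import numpy as np

def sonify_with_phase_fill(image, phase_range=(-np.pi, np.pi),
                           fft_size=1024, hop=256, seed=0):
    """Inverse-FFT an image whose missing phase-difference plane is filled
    with uniform noise drawn from phase_range (use (0, 0) for "empty").

    image: 2D array, rows = frequency (low to high, an assumption),
           columns = time frames, values roughly in [0, 1].
    """
    rng = np.random.default_rng(seed)
    n_bins = fft_size // 2 + 1
    n_frames = image.shape[1]
    src = np.linspace(0.0, 1.0, image.shape[0])
    dst = np.linspace(0.0, 1.0, n_bins)
    window = np.hanning(fft_size)
    out = np.zeros(hop * (n_frames - 1) + fft_size)

    lo, hi = phase_range
    running_phase = np.zeros(n_bins)
    for t in range(n_frames):
        # Image column stretched onto the FFT bins = magnitude plane.
        magnitude = np.interp(dst, src, image[:, t])
        # Noise stands in for the missing phase-difference plane;
        # accumulating it gives the phase used for this frame.
        running_phase = running_phase + rng.uniform(lo, hi, size=n_bins)
        spectrum = magnitude * np.exp(1j * running_phase)
        frame = np.fft.irfft(spectrum, n=fft_size) * window
        out[t * hop:t * hop + fft_size] += frame  # overlap-add

    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out
```

      Rendering the same image with phase_range set to (0, 0), (0, np.pi), and (-np.pi, np.pi), and with fft_size set to 2048 and then 1024, gives one way to listen for the differences described above.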