Best way to replicate this visualization

Rodrigo Cadiz:

Hi! I am trying to replicate something like this in Jitter:

I wonder what the best strategy is. So far, I can isolate a single column of a video using jit.gl.pix and a crop operation. I need the data contained in that column for sonification purposes, and I can get it with something like jit.spill. However, I don't know how to proceed from there. I have been thinking of storing the pixel values in a buffer~ object (using jit.peek or something like that) and then drawing the buffer content on top of the image vertically, as in the video. But I don't really know how to do that effectively, both in terms of visual results and computational resources. Also, could something like this be achieved using GL operations only? Any ideas? Many thanks!
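For reference, the column-isolation step can be sketched roughly like this in a jit.gl.pix codebox (this version masks everything outside the column instead of cropping, and the parameter names are made up for the example):

// jit.gl.pix codebox: keep only one column of the input, zero everything else
Param linepos(0.5);   // x position of the column, normalized 0..1
Param width(0.002);   // half-width of the column, normalized

c = sample(in1, norm);                              // input pixel at this position
incolumn = abs(swiz(norm, "x") - linepos) < width;  // 1 inside the column, 0 outside
out1 = c * incolumn;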

👽'tW∆s ∆lienz👽:

Hi, it's a very cool project, but the description for the video is not detailed enough to say what's really going on:

The translation begins on the left side of the image and moves to the right, with the sounds representing the position and brightness of the sources. The light of objects located towards the top of the image is heard as higher pitches, while the intensity of the light controls the volume.

^the detail they miss is that it's not just the 'pitches' that differ depending on vertical location, but also the instrument within the orchestra... so the visualization is not a straight audio waveform, but more like a compilation of (what looks to me like) bandpass responses tuned per instrument and then fed into a spectrum analyzer (or else it could just be a visually windowed compilation of each instrument's waveform, all in one vertically rectified line)...

and, also, the visuals can be done in all gl operations, but at some point you'd need to handle audio, so i'm not sure the entire thing could be done within that context (there would be an analysis of audio that can't completely be done in gl)... unless... you cheated: you could create the visual in its entirety, including (what i will call) "the rectified waveform of bandpass analysis", and then sonify from that artificially created audio-waveform (so everything is visualized first in gl, then spit out to audio at the end).

as for this:

I have been thinking of storing the pixel values in a buffer~ object (using jit.peek or something like that) and then drawing the buffer content on top of the image vertically, as in the video.

this sounded to me like a good idea at first, but thinking it through, it doesn't feel necessary. the pixel values are just 'triggers' for audio to begin from, so they can be event-rate, or at the very least they don't inherently require an audio-rate level of specificity (where they're stored, whether in jit.matrix or dict or coll or buffer~, etc., doesn't matter)... so maybe you could start from the point of how you're going to 'sonify', and then figure it out from there:

create a patch first where you simply read each vertical column of pixels, and figure out what you want to do with just one column of that information... from there, if you end up sonifying that data, you can then decide how to visualize it, or you might decide to visualize right away (maybe by creating some type of flashy shader effect that reacts to the brightness of pixels...), and then sonify that(?)

i might opt for the first route: gather a column of data, trigger some audio, send it through bandpass filters per instrument in order to highlight their part of the frequency spectrum, and then create a visualization of that spectrum to map on top of the video.
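to make the 'gather a column, sonify it' step a bit more concrete, here's a minimal gen~ codebox sketch: it treats each row's brightness as the amplitude of one sine partial. the buffer~ name "column", the 64-row count, and the basefreq/spread parameters are all assumptions for the example; a real patch would fill "column" from the current slice however you like, and could swap the plain sines for per-instrument bandpassed sources:

// gen~ codebox: additive sonification of one image column
Buffer col("column");   // assumed buffer~ holding one 0..1 luminance value per image row
Data phases(64);        // one phase accumulator per partial
Param basefreq(100);    // pitch of the bottom row, in Hz
Param spread(40);       // Hz added per row going up

n = 64;                 // assumed number of rows in the slice
sum = 0;
for (i = 0; i < n; i += 1) {
    amp = peek(col, i, 0);                  // brightness of row i -> amplitude
    f = basefreq + (n - 1 - i) * spread;    // row 0 is the top of the image in jitter, so top rows -> higher pitch
    ph = wrap(peek(phases, i) + f / samplerate, 0, 1);
    poke(phases, ph, i);
    sum += sin(ph * twopi) * amp;
}
out1 = sum / n;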

_________________________

and finally, how to technically do all this? i have no idea!

🤣 😋(<-this second one is my 'stupid-face' emoji, not 'yummy-face' emoji 😇)

it sounds like an insanely brilliant project, though. figured i'd lend some ideas until some smart people come along. i do think, if you're looking for better clues on how to get technically proficient at stuff like this in Max, you can look up "Federico Foderaro" and "Matteo Marson" tutorials on youtube and patreon, and you'll find many gl explanations that can help (both in jit.gl.pix and in coded OpenGL form).

here's Federico's youtube:

and Matteo has a free video on his patreon that can help immensely with understanding how to draw reactive-audio/waveforms with jit.gl.pix:

__________________________

hope it can help 🍄

TFL:

👽'tW∆s ∆lienz👽 gives a lot of clues already!

Given that we want to sonify a still image, I would load it both in a jit.gl.texture for visualization, and in a jit.matrix for sonification.

My next step would be to simply draw a vertical line on the image (in the jit.gl world) and display it, and get a vertical slice of the image (in the jitter world) at the same position as the vertical line.

Actually, here it is:

Max Patch
Copy patch and select New From Clipboard in Max.

From here, it's up to you to make decisions about the sonification process.

And I guess at the end you'll want to draw the line in a completely different way, to make it a curve following the luminance, instead of a straight line.
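One fragment-side way to sketch that is a jit.gl.pix codebox that bends the scan line to the right according to the luminance of the scanned column. This is only a sketch (parameter names and the displacement amount are made up; the weights are the usual Rec. 601 luma coefficients):

// jit.gl.pix codebox: displace the scan line by the brightness of the scanned column
Param linepos(0.5);      // x position of the scan line, normalized 0..1
Param amount(0.1);       // how far brightness pushes the line to the right
Param thickness(0.002);  // half-width of the drawn line

c = sample(in1, norm);                                    // underlying image pixel
col = sample(in1, vec(linepos, swiz(norm, "y")));         // pixel of the scanned column at this row
luma = dot(swiz(col, "rgb"), vec(0.299, 0.587, 0.114));   // brightness of that pixel
oncurve = abs(swiz(norm, "x") - (linepos + luma * amount)) < thickness;
out1 = mix(c, vec(1, 1, 1, 1), oncurve);                  // draw the displaced line in white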

LSka:

use jit.gl.mesh to display the "waveform"

Max Patch
Copy patch and select New From Clipboard in Max.
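In essence, the slice-to-mesh step could be a jit.gen codebox that turns the 1-pixel-wide slice into vertex positions for a jit.gl.mesh set to @draw_mode line_strip. This is only an illustration, not the attached patch: it assumes a 1-plane luminance slice (e.g. via jit.rgb2luma), the parameter names are made up, and the output may need converting to float32 before jit.gl.mesh:

// jit.gen codebox: convert a 1-plane luminance slice (1 pixel wide) into vertex positions
Param linepos(0.);   // x position of the line in GL coordinates
Param amount(0.2);   // how far brightness pushes vertices to the right

luma = in1;                      // assumes a 1-plane slice, values 0..1
x = linepos + luma * amount;
y = 1. - swiz(norm, "y") * 2.;   // top row -> y = 1, bottom row -> y = -1
out1 = vec(x, y, 0.);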

Pedro Santos:

Hi! See the following topic from a few years ago.

It deals with a similar subject and can be useful:

TFL:

LSka, it seems that you copy-pasted the original patch instead of the one with your modification?

I don't know why I was so convinced that the "waveform" would have to be drawn as fragments in a jit.gl.pix or slab instead of using geometry.

Here is my take using jit.gl.mesh:

Max Patch
Copy patch and select New From Clipboard in Max.

And here's a variation with jit.gl.path, maybe more suited here:

Max Patch
Copy patch and select New From Clipboard in Max.

Pedro Santos:

And here's an example of sonification of a slice of an image using FFT.

Max Patch
Copy patch and select New From Clipboard in Max.