I'm trying to recreate an optical soundtrack reader in Jitter. I've used Max for quite a while but have only just started using Jitter. Essentially I want to convert video/images to sound by recreating the technique used in optical soundtrack reading.
In theory I should be able to do this by splitting each frame into 1764 rows (at a 44100 Hz sample rate and 25 fps, 44100 / 25 = 1764 samples per frame), taking the average brightness of each row, and mapping that to the range 1 to -1 so an audio signal can be output.
So the reading of these 1764 rows has to occur within 40 ms (one frame at 25 fps), be added to a buffer, and output.
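Not a Jitter patch, but the per-frame mapping I have in mind can be sketched in Python (the frame width and function name are just for illustration):

```python
import numpy as np

FPS = 25
SAMPLE_RATE = 44100
ROWS = SAMPLE_RATE // FPS  # 1764 rows -> 1764 samples per 40 ms frame

def frame_to_samples(frame: np.ndarray) -> np.ndarray:
    """Convert one grayscale frame (ROWS x width, values 0-255)
    into ROWS audio samples in the range -1..1."""
    row_means = frame.mean(axis=1)   # average brightness of each row
    return row_means / 127.5 - 1.0   # map 0..255 -> -1..1

# One synthetic frame: a vertical brightness ramp, 640 px wide
frame = np.tile(np.linspace(0, 255, ROWS)[:, None], (1, 640))
samples = frame_to_samples(frame)
print(samples.shape)  # (1764,)
```

Each frame then contributes exactly one frame-length's worth of audio to the buffer.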
To do this I'm looking for advice on how to:
- Split the video into a grid of 1764 rows. I'm using rgb2luma, so all videos will have only 1 plane. I was thinking I could convert all videos to a height of 1764 pixels and then read through each row, but this could be really processor-heavy and I'm not sure how to do it.
- Read a whole row of cells at once to obtain its average brightness, rather than having to cycle through and add each cell individually, which would not be as quick as required.
- Any other objects, techniques, or tutorials that could be useful for me to look at.
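To make the second point concrete, here is the kind of whole-row averaging I mean, sketched in Python rather than Jitter: one vectorized operation over the matrix instead of a cell-by-cell loop (the frame size is arbitrary):

```python
import numpy as np

def rows_at_once(frame: np.ndarray) -> np.ndarray:
    """Average every row in one vectorized operation -- the
    equivalent of collapsing a 1-plane matrix to a width of 1,
    instead of looping over individual cells."""
    return frame.mean(axis=1)

frame = np.random.randint(0, 256, size=(1764, 640)).astype(np.float64)

# Slow, cell-by-cell version (what I want to avoid)
slow = np.array([sum(row) / len(row) for row in frame])

# Fast, whole-row version
fast = rows_at_once(frame)
print(np.allclose(slow, fast))  # True
```

Both give the same 1764 averages, but only the vectorized form seems likely to fit in the 40 ms budget.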