Help recreating jit.op in opengl using slab/jit.gl.pix
I'm using the jit.openni object for an installation project and I've figured out how to scan and cut out near and far points of the depth map using jit.op. Lately I've been trying to convert a lot of my patches to run on my GPU to improve performance, but I'm still completely flummoxed as to how to recreate many of my jit.op processes in either jit.gl.pix or jit.gl.slab. Attached is my attempt at recreating the process in both pix and slab, although I'm well aware that both attempts do not work and are probably way too literal a translation of the jit.op version. Any help on how to go about recreating this in slab or pix would be greatly appreciated.
hey matt!
definitely a good idea to make use of gen for this.
you don't need to create a separate gl.pix object for each operation, just combine them all in a single gen file:
i took a somewhat different approach for my kinect project.
i believe there were some weird issues handling the long data type with gl.pix, so i used the jit.pix gen object instead (functions the same, but on the cpu instead of gpu).
this will still be a speed improvement over the multiple jit.op object approach.
i basically clip the input to user-defined parameters, and then scale that to the char data type range (0 - 255). you can even make use of the scale exponential parameter for non-linear scaling, which can be cool (although word of warning, any exp value != 1 will eat more cpu cycles).
i send this output to a char matrix, and this allows me to further refine the input by using the jit.matrix srcdim attributes.
this allows you to really zoom in and get nice details on a specific range of the depth image.
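For readers without the patch, the clip-then-scale idea described above can be sketched per pixel in plain Python. This is only an illustration of the arithmetic, not the actual gen patch: the function name `clip_and_scale`, the parameter names `near`, `far` and `exponent`, and the choice to round the result are all assumptions made for the example.

```python
def clip_and_scale(depth, near, far, exponent=1.0):
    """Clip a depth value to [near, far], then map it onto the char range 0-255.

    Hypothetical helper: names and rounding behavior are assumptions,
    not taken from the original patch.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    clipped = min(max(depth, near), far)          # clip to the user-defined range
    normalized = (clipped - near) / (far - near)  # 0.0 .. 1.0
    shaped = normalized ** exponent               # exponent 1.0 keeps the mapping linear
    return int(round(shaped * 255))               # scale to the char range


# A made-up row of depth values
row = [400, 800, 1500, 2200, 4000]
print([clip_and_scale(d, near=800, far=2200) for d in row])
print([clip_and_scale(d, near=800, far=2200, exponent=2.0) for d in row])
```

Anything nearer than `near` lands on 0 and anything beyond `far` lands on 255, so the whole char range is spent on the slice of depth you care about.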
Hey Rob!
Thanks for the response. I tried implementing some of your ideas and couldn't get them working so I upgraded from 6.0.8 to 6.1.3 and it started working so yay for current versions of Max!
Your method is great. Cutting out near and far points has much more of a gradient flow than my method. I can't totally wrap my head around the code cause it's a different way of thinking for me but always nice to be exposed to that.
One thing I'm having trouble with is when I try recreating my original jit.op method in jit.pix by replicating the same operators I can get it to work if I separate each step but if I compile it all into one jit.pix it doesn't work. Again, I'm probably approaching gen too literally but am curious as to why that is. I attached another patch showing how I've approached it.
Also, my jit.op method seems to get the most consistent & highest frame rate of the 4 methods in this patch so I'm trying to figure out if switching to a jit.pix method will actually improve my performance.
yeah i see what you mean.
according to gen-master wes, it has to do with floating point rounding. everything is processed as doubles internally, and then converted back to long on output. the divisions create small fractions that get mapped to 0 when output, but not when processed internally.
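A rough way to see the effect in plain Python, assuming each separate object truncates its result to an integer on output while a single object carries the fraction through to the end. The divide-by-4000 and multiply-by-255 steps are made up for illustration and are not the operators from the attached patch.

```python
def separate_steps(depth):
    """Two chained objects: each one converts its result back to long on output."""
    step1 = int(depth / 4000)   # the fraction is dropped between objects
    step2 = int(step1 * 255)
    return step2


def single_pass(depth):
    """One object: the intermediate stays a double, only the final result is truncated."""
    return int(depth / 4000 * 255)


for depth in (1000, 2000, 4000, 6000):
    print(depth, separate_steps(depth), single_pass(depth))
```

The two versions only agree when the intermediate result happens to be a whole number, which is why a chain that was tuned as separate integer steps can behave differently once it is merged into one object.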
in this case, you might be better sticking with the jit.op's.
you could also try outputting as float32 from jit.openni, and using jit.gl.pix, but not sure if that will be more efficient.
as you've demonstrated, always a good idea to profile and not assume gen will perform better (as i did in my original response).
I don't have a kinect to test with and haven't used jit.openni, but from what I gather you're processing long matrices. In jit.gen and jit.pix, long matrices are internally converted to doubles with @precision auto. You can specify @precision float32. It may give you a speed boost.
I'm quite surprised that you get better performance from many jit.pix than from one. Each jit.pix will incur conversion costs between long/double formats and a single jit.pix will make better use of the cache since all the calculations are done at once and memory is continually cycled through for each operation.
You won't see any output on the jit.op side of this patch, but a comparison between jit.op and jit.pix shows that jit.pix is faster except for very small matrix sizes:
I got a slight performance boost by using attribute @type float32 in jit.openni and using jit.gen instead of jit.pix, since jit.openni outputs single-plane matrices and jit.pix only works with 4 planes.
Also, I don't know what your purpose is, but you could consider that the depthmap values output by jit.openni are in millimeters, so if you just need to cut out close and far objects, you can simply use the gtep and ltep operators in jit.gen.
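As a sketch of that thresholding idea in plain Python: the functions below assume the "pass" comparison semantics (output the left input when the comparison is true, otherwise 0), and the names `cut_depth`, `near_mm` and `far_mm` are made up for the example.

```python
def gtep(a, b):
    """Assumed semantics: pass a through if a >= b, otherwise output 0."""
    return a if a >= b else 0


def ltep(a, b):
    """Assumed semantics: pass a through if a <= b, otherwise output 0."""
    return a if a <= b else 0


def cut_depth(depth_mm, near_mm, far_mm):
    """Keep a depth value (in millimeters) only if it lies within [near_mm, far_mm]."""
    return ltep(gtep(depth_mm, near_mm), far_mm)


# A made-up row of depth values in millimeters
row = [500, 800, 1500, 2200, 3000]
print([cut_depth(d, near_mm=800, far_mm=2200) for d in row])
```

Values inside the range come out unchanged in millimeters and everything else becomes 0, so no scaling step is needed just to mask out near and far objects.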
@wes Just to clarify, I wasn't getting better performance from multiple jit.pix as opposed to a single one containing all the operations. I was commenting that when I separated each operator as its own jit.pix it worked the way I intended it to, but when I compiled all the operators into one jit.pix it didn't work at all. Again, my understanding of gen is admittedly shaky so I'm guessing it's just a programming error on my part.
@LSKA That's a great way to approach it! I was so intent on recreating my previous method in jit.op (which admittedly came about from lots of semi-random trial and error rather than logical thinking) that I hadn't spent much time thinking about other ways to achieve it.