That old problem: Kinect to surface (mesh or nurbs?) with data filtering

matheus leston's icon

I'm developing a patch that uses a Kinect (the 360 version) to create portraits, and I'm facing (no pun intended) the problem of creating a surface while also filtering the data.

I tried two different approaches:

[Max patch attached: copy and select New From Clipboard in Max.]

- jit.gl.mesh: using the amazing xray.jit.sift, I was able to filter out the points that I don't need rendered. The problem is that the resulting matrix is actually one long line, so I can't create a proper surface.

[Max patch attached: copy and select New From Clipboard in Max.]

- jit.gl.nurbs: I can't manage to filter the points with xray.jit.sift. Using jit.scanwrap to make a long line, I get the error "jit.gl.nurbs: requires order …"

Is there a solution? Which solution is better? If using jit.gl.mesh, is there a way to organize the indexes (nearest points or something like that) in order to create a proper surface? Or is there a way to filter what is being sent to jit.gl.nurbs?

diablodale's icon

What is your end goal? Technically or artistically, what do you want?
I'm not able to figure out your goal from your patches.

matheus leston's icon
[Max patch attached: copy and select New From Clipboard in Max.]

Diablodale, this is sort of what I'm looking for; an image and the patch are attached:

I want to generate those surfaces from a face, but at the same time I want to be able to filter the background below a certain depth. The only way I could do it was with xray.jit.sift and jit.gl.mesh, but then the "tri_grid" draw method loses the correlation between the nearest points, and the triangles are drawn in sequential order (left to right, top to bottom), not as a grid.

[Screenshot attached: Screen-Shot-2016-01-29-at-10.38.25-AM.png]
diablodale's icon

To try and clarify: are you wanting a tri-grid image of just a face? If so, that's a lot of hand work, because you'll need to edit out the neck, chest, and other parts of the body.

Are you wanting a pointcloud of whatever is in the camera's view, minus points below/above a depth value? Drawn as points? Or drawn as a tri-grid, as in your picture?

Must it be a pointcloud? Or can it be a 3d face model of 60 or maybe up to 1000 points?

And you write "below a certain depth". Below would mean the points on the nose would start to disappear first, then eyes, then ears, etc. Do you really mean below, or do you mean to filter out the background points that are beyond a certain distance?

matheus leston's icon

Diablodale, sorry I couldn't explain properly. Explaining complex things in English is not my forte ;) I will try my best.

I'm helping a friend, who is also an artist, develop a system to generate portraits. The main concept is to draw them with polygons that can be edited, creating deformations. A realistic representation is not the focus. But first I'm working on the capture part of the project and leaving the editing for later.

My first thought was that using face tracking and a 3D face model would be too "general": it would not properly capture the differences between each face, even if the differences captured by the Kinect are "low res". Am I wrong about this? So I thought it would be better to generate a pointcloud and draw the triangles (hence the tri_grid) based on those points.

Editing out the rest of the body is not an issue. My only concern is filtering the background, removing points beyond a certain threshold on the z axis (depth, in terms of proximity to the camera, not distance). I could do that fairly easily with xray.jit.sift, and I was able to generate the pointcloud with no issues, but now I can't create the tri_grid, as the points have lost their x and y positions in the matrix.

I thought about analyzing the x and y planes of the coordinate matrix and creating an index matrix to feed jit.gl.mesh, by finding the nearest point in each direction, but I wasn't successful. I also tried getting a similar result with nurbs, since it creates surfaces, but I couldn't filter the z coordinate (I needed jit.scanwrap to transform the matrix into a single row, and jit.gl.nurbs didn't work).

Is there a way to reconstruct the tri_grid after the points have been filtered on the z coordinate? As the resulting matrix is one-dimensional with variable length, I can't figure out a way to find the corresponding triangle indexes.

Or maybe face tracking is the way to go? I got interested in the 1,000-point face model you mentioned. Would it be able to capture one's facial features?

Spa's icon

use geometry shaders

matheus leston's icon

SPA, thank you for your answer. To be honest, this is not something I was familiar with, and I got really interested. Since yesterday I've been trying to learn more about it by looking at the few examples in Max as well as here in the forum. But as my experience with custom shaders is really limited (I've just used a few for extremely basic stuff and switched to Gen as soon as I could), I'm having a hard time wrapping my head around how I can use this to achieve what I want. Could you explain a little more, or point me to some reference about it?

I also saw a 2012 post here in which you said you achieved exactly what I described. Could you share your shader? It could be a valuable way to learn more about this kind of thing.

diablodale's icon

Using a geometry shader might not do all you need. The geometry shader is applied during the render process, so you will not get a dataset as output that you can then edit by hand. Using SPA's approach would be:
1) collect pointcloud data from Kinect
2) hand edit point cloud to make the artistic deforms or changes you want
3) feed points into rendering engine (for example, jit.gl.mesh)
4) geometry shader runs to remove triangles further than xxx meters
5) you see the visual

Step (2) could also remove the background; then you don't need the geometry shader in step (4). There are some open-source tools that allow editing pointcloud data and converting it into a mesh. I have not used any of them. Here are some of them:
https://www.google.de/search?q=pointcloud+mesh+editing&oq=pointcloud+mesh+editing&aqs=chrome..69i57.3461j0j7&sourceid=chrome&es_sm=122&ie=UTF-8

Another option is to use my dp.kinect2 object (runs on Windows). With it, you can get a high-definition face model that has 1347 vertices and 2630 triangles. It is only the face; no neck and no shoulders.

matheus leston's icon

Diablodale, thank you so much for your complete answer.

I thought that what SPA suggested was a little different from what you proposed. Something like this:

1) collect point cloud
2) remove points further than x meters
3) apply deformations to the remaining points
4) rendering engine (jit.gl.mesh) -> as I described before, it should be rendered as points, not as triangles
[so far I could do all of this fairly easily]
5) use a geometry shader to render the point cloud as triangles, "detecting" nearest points to connect.

Could a geometry shader do this kind of thing? Through my research, I found that it might be possible (as in the old post from SPA I quoted, or others like this one: https://cycling74.com/forums/first-steps-to-generate-polygon-mesh-and-line-mesh-from-video-input-2), but I don't know quite how.

I'm really interested in using your object. I have read a lot about it, but I didn't know it could deliver that many points from a face. It seems to be exactly what I need. The problem is that I'm using a Kinect 1 and, although a Kinect 2 is pretty easy to find, the USB adaptor is extremely rare here in Brazil. I'm trying to figure out a way to buy the adaptor, and as soon as I get my hands on it, I will get dp.kinect2.

diablodale's icon

A geometry shader runs in OpenGL and operates on "primitives". Primitives are things like points or triangles.
http://www.informit.com/articles/article.aspx?p=2120983&seqNum=2

You do not have access to the entire pointcloud in a geometry shader, and without access to all of the points, you can't search for the nearest one. Perhaps you could use a displacement map and send the data into the OpenGL pipeline as a texture, and then search through the texture for the nearest neighbor. However, this quickly becomes complex.

You only have access to the vertices in the primitive. For example, if you are drawing points, then you only have access to the single vertex. If you are drawing triangles, then you have access to the 3 vertices of the triangle. You can create new/change the primitive type. For example you could receive points and output triangles. However, you still only have access to the single vertex (point input) or the 3 vertices (triangle input).
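
For illustration, here is an untested sketch of a minimal pass-through geometry shader in the GL_EXT_geometry_shader4 style that the Max 7 example shaders use (it would live inside a .jxs wrapper, which also declares the input and output primitive types):

#version 120
#extension GL_EXT_geometry_shader4 : enable

// Pass-through geometry shader: the only data visible here are the
// vertices of the current primitive (gl_VerticesIn of them).
void main() {
    for (int i = 0; i < gl_VerticesIn; i++) {
        gl_Position = gl_PositionIn[i]; // clip-space position from the vertex shader
        EmitVertex();
    }
    EndPrimitive();
}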

If you only need the visual and don't need the data after the culling (filtering out the background), then try using a far clip: no need for a shader, it's built into the Max OpenGL objects as an attribute. @far_clip removes primitives beyond the given distance. If you don't like this effect, then you can write a geometry shader. The link above has some good examples you could extend.

matheus leston's icon

Thank you for the great explanation. I believe I understand it better now, and indeed it doesn't seem to solve my problem. I'm left with two possibilities:

1) Use the mesh as triangles, with no data filtering, and cull with @far_clip. I have tried it before, but I couldn't do something simple: define the point of view of the @far_clip separately from the camera. Is that possible? I want to be able to rotate the object or the camera but keep the background clipping at the same point. That seems like it should be an easy task, but I wasn't able to do it.

2) Find a way to search for the nearest triangle. That was my first thought: find the triangles and feed the indexes in the correct order into the last inlet of jit.gl.mesh. As the filtered matrix keeps the x and y coordinates in the first and second planes, it should be doable, but I couldn't find a way to do it. I tried jit.gen and an assortment of xray.jit objects, but couldn't find a proper way.

The third solution (and probably the best one) is to use dp.kinect2 with a Kinect v2. I'm trying to get the adaptor here in Brazil, which is more expensive than the Kinect itself (?), but it seems that I would get the best resolution and precision. For now, though, I'm stuck with the Kinect v1.

diablodale's icon

far_clip is a core feature of OpenGL. I think it is always relative to the camera.

Nearest-neighbor search is a widely researched topic in computer science. There are many algorithms that try to reduce its large computational expense. https://en.wikipedia.org/wiki/Nearest_neighbor_search

You might combine a few things like:
1) collect point cloud
2) deform/edit points for your art; keep the background points unchanged
3) feed the edited pointcloud into jit.gl.mesh to draw as triangles
4) write a geometry shader for jit.gl.mesh. Write it to accept a parameter "z distance from origin (0,0,0)", then discard any triangle that has a point further than your parameter value. Pseudocode: if (trianglepoint.z > zdistancefilterparam) then discard; (see the sketch below)

The geometry shader for jit.gl.mesh can be written to use a local coordinate system for the distance filtering. This would allow you to filter based on the original pointcloud values which have the 0,0,0 origin of the Kinect.
Then you can use a jit.gl.camera to change your view of the jit.gl.mesh.
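
An untested sketch of that step (4) shader, in the same GL_EXT_geometry_shader4 style; the names zdistancefilter and localpos are made up, and the uniform would be bound as a parameter in the .jxs file:

#version 120
#extension GL_EXT_geometry_shader4 : enable

uniform float zdistancefilter; // hypothetical param, bound in the .jxs wrapper

// Assumes the vertex shader passes the untransformed position through,
// e.g. "varying vec3 localpos; ... localpos = gl_Vertex.xyz;"
varying in vec3 localpos[3];

void main() {
    // Cull the whole triangle if any vertex is beyond the cutoff,
    // measured in the Kinect's own (local) coordinate system.
    for (int i = 0; i < 3; i++) {
        if (localpos[i].z > zdistancefilter) return; // emit nothing
    }
    for (int i = 0; i < 3; i++) {
        gl_Position = gl_PositionIn[i];
        EmitVertex();
    }
    EndPrimitive();
}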

matheus leston's icon

I believe the clipping method is far more efficient than trying to find the nearest neighbors. I will study geometry shaders some more and try to build what you suggested, filtering based on the origin point.

By the way, I guess I will buy a Kinect 2 and use one of the available hacks to run it without the adapter, which is impossible to find here in Brazil. But I couldn't find examples of the face tracking dp.kinect2 does. Do you know where I can find some images, or could you share some? Am I right to assume that it will generate a pointcloud (with thousands of points, as you described) that could feed jit.gl.mesh?

matheus leston's icon

(a pointcloud of the face only)

diablodale's icon

The demo patches that come with dp.kinect and dp.kinect2 both have a face patch in them, in a subpatcher with the more advanced features. Both externals will output points, triangles, and indices to triangles. You can choose the method you want using the @face3dmodel attribute.

I added a picture of the 3D face data being rendered by jit.gl.mesh to each wiki. You can see the pictures and the face-model documentation for the original Kinect sensor at:
https://github.com/diablodale/dp.kinect/wiki/Message-based-Data#face-tracking
and for the Kinect v2 sensor at:
https://github.com/diablodale/dp.kinect2/wiki/Message-based-Data#face-tracking

matheus leston's icon

Wow, the picture of the face render from the Kinect v2 is really impressive. No doubt this is the best solution for what I want.
I just bought the Kinect and now I'm trying to solve the USB issue. As soon as I figure it out, I will get dp.kinect2, which seems to be amazing.
Diablodale, thank you for all your help!

Reilly Donovan's icon

I was wondering if any progress has been made on converting a Kinect pointcloud into a mesh. I've been able to adjust the different rendering parameters (triangles, quads, etc.), but I can't seem to figure out how to fill in the vertices with faces. Also, when I use any draw mode other than points, everything seems to extrude from the origin of the Kinect, making it difficult to see the reconstruction. Is there a way to remove, or apply a threshold to, the pointcloud so I can remove the vertex at the Kinect's origin? I'm also curious about implementing two or more Kinect v2 units; is there a way to do this? Thanks for your help; your plugin is great, and I will be buying it in the next couple of days.

Here is a sample of what I've done so far; it involves bringing a live feed of the Kinect v2 into a VR environment so you can see your own body as a pointcloud. https://youtu.be/2wqX2uylHps

diablodale's icon

I haven't done any new work related to this. I believe the ideas above are still options.
Personally, if you don't need the vertex data after filtering, I would experiment with jit.gl.mesh and a geometry shader that filters out the triangles you don't want (like the ones having the model origin as a vertex). The same jxs can also compute normals if you want the triangles you create in the geometry shader to respond to lighting.

After you pass the vertex position from the vertex shader to the geometry shader, you can compare all three vertices of a triangle to see if any are at the model's origin; if so, discard that triangle. Otherwise, calculate the normal and transform the vertices with the modelviewprojection matrix.
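
An untested sketch of that idea, again in the GL_EXT_geometry_shader4 style and again assuming a localpos varying passed from the vertex shader:

#version 120
#extension GL_EXT_geometry_shader4 : enable

varying in vec3 localpos[3]; // model-space positions from the vertex shader
varying out vec3 normal;     // flat normal, consumed by the fragment shader

void main() {
    // Discard any triangle that has the model origin as a vertex.
    for (int i = 0; i < 3; i++) {
        if (length(localpos[i]) < 1e-6) return;
    }
    // One face normal for the whole triangle, from two edge vectors.
    vec3 n = normalize(cross(localpos[1] - localpos[0],
                             localpos[2] - localpos[0]));
    for (int i = 0; i < 3; i++) {
        normal = gl_NormalMatrix * n; // rotate the normal into eye space
        gl_Position = gl_PositionIn[i];
        EmitVertex();
    }
    EndPrimitive();
}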

jit.gl.mesh can also calculate normals for you. It has some limitations, but they are so technical it doesn't make sense to discuss them unless you experience the issues.

Max 7 includes some example geometry shaders at C:\Program Files\Cycling '74\Max 7\resources\media\jitter\shaders\geometry

Microsoft limits you to one Kinect v2 per computer. I wrote about alternatives here: https://cycling74.com/forums/kinect-hd

Reilly Donovan's icon

Thanks for your feedback; I will move forward with your suggestions and report back. I'm working on a way to record and play back the depth and color streams as a pointcloud. I've successfully recorded the depth and color to a .mov, the depth being a bit of a trick, as the recording doesn't support float32; I converted it with a jit.matrix object to char, then connected that to a jit.pack object to combine the depth signal into one grayscale video.

I assume this becomes an issue on the other end, when I try to connect the depth video playback to the depth input of the jit.gl.mesh object that receives the depth/color data in your pointcloud example. I can't seem to get the results one would expect when connecting a recording of the color/depth data to the same inputs as the live stream from the Kinect v2. Any suggestions on how to get recorded depth/color data to play back in your pointcloud example? Thanks, and sorry if this inquiry is redundant; it seemed that I was logged out while submitting a previous one.

diablodale's icon

Recording has a thread of its own at https://cycling74.com/forums/record-dp-kinect-matrices/
I recommend you provide your patch when you ask there, so everyone can comment on it and find it.