face recognition - state of the art?
hi everybody,
i'm currently embarking on a project using face recognition and wonder what options i have - especially since the only two solutions i found are at least four years old... what is currently the best tool for this?
my requirements are: recognize a face, track its position (including the viewer's distance) and viewing angle (head tilt, up/down + left/right). and i need to be able to select a camera and get the live output into max for further processing as well.
the only two options i found so far are:
- cv.jit: i found a bunch of example patches (https://cycling74.com/wiki/index.php?title=Jitter_Recipes) including tilt at least in one direction. i have the impression though that it will need a lot of data smoothing and customization to get to a robust solution.
- faceosc: https://github.com/kylemcdonald/ofxFaceTracker / max-patch: http://www.tomhall.com.au/blog/?p=3604
it seems to have some trouble (at least on macos) recognizing faces, but once it finds a face, the results seem quite good. i do not see any way to get the clean camera image back to max though, or even to select which camera this app uses.
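(for reference, the data smoothing that cv.jit's raw output tends to need can start as simple as a one-pole filter on each tracked coordinate. a minimal python sketch - the alpha value is just a guess you'd tune by eye:)

```python
# hypothetical one-pole (exponential) smoother for one noisy tracked
# coordinate; lower alpha = smoother but laggier response.
def make_smoother(alpha=0.2):
    prev = [None]  # holds the last smoothed value
    def step(x):
        prev[0] = x if prev[0] is None else alpha * x + (1 - alpha) * prev[0]
        return prev[0]
    return step

smooth_x = make_smoother()
for raw in [100.0, 104.0, 98.0, 300.0, 101.0]:  # 300 = a tracking glitch
    smooth_x(raw)  # the glitch gets heavily damped
```

(you'd run one such smoother per coordinate: x, y, tilt, etc.)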
are there any other options? or more robust examples with cv.jit? thanks in advance for any input!
k
- or what about a kinect v2 and dp-kinect2? (https://hidale.com/shop/dp-kinect2/)?
can anyone tell me whether this works well and provides the stuff i need (see above)?
and is there any solution for mac (dp-kinect is windows only - i would switch to windows for this, but only if i don't have to...)?
- another alternative i found here: https://vimeo.com/53367663 / https://github.com/tetard/oo
but i found neither a compiled version of this nor an example patch. does anyone understand more about this than i do?
any other options? what would be my best bet?
k
please... doesn't anyone have some experience with this and can give some advice?
do we really have to reinvent the wheel with max every single time (which in this case would be me getting into cv.jit)?
k
not quite sure what you mean about (re)invent the wheel.
i've been face tracking recently. cv.jit is still very capable - it was even updated 4 days ago: https://github.com/Cycling74/cv.jit . the OpenCV community is still active, although I haven't been successful in using any of their newer tracking XMLs in Max - this new update might provide an easier way(?). I just started with the cv.jit.faces help patch and continued from there. the object spits out an index number and coordinates. making those coordinates relevant to the physical dimensions of your space will take a lot of measurements and tweaking, but it's totally doable.
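(a sketch of that measure-and-tweak step, as a two-point linear calibration per axis - all the pixel values and distances below are made-up example measurements, not anything cv.jit specific:)

```python
# hypothetical two-point calibration: map a tracked pixel coordinate
# to a position in the room, one axis at a time.
def make_axis_map(px_a, m_a, px_b, m_b):
    # (px_a, m_a) and (px_b, m_b): pixel reading and measured metres
    # at two known spots in the space
    scale = (m_b - m_a) / (px_b - px_a)
    return lambda px: m_a + (px - px_a) * scale

x_to_metres = make_axis_map(80, 0.0, 560, 2.4)  # made-up measurements
x_to_metres(320)  # somewhere near the middle of the 2.4 m span
```

(repeat with its own reference measurements for the y axis; more sample points and a least-squares fit would be the next refinement.)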
I haven't really experimented with the Kinect, although it is also a great solution - on a mac, not so much (as you've said), and I'm not sure if anyone has even integrated the Kinect2 into Max on the Windows side. Kinect2 is where it's at.
Quartz Composer also has a great face detector patch, so you could use that in conjunction with Max. Use Syphon in this case to pass video between the two applications.
hi greg,
thank you a lot for your reply!
i guess maybe i just don't really get this cv.jit thing: this github repository contains updated patches, but they all rely on these externals http://jmpelletier.com/cvjit/ , right? or do i need to compile the source code there (which is beyond my skills)?
because from what i understand, cv.jit does provide the underlying functionality, but if i want to get to a robust, easy-to-use solution which provides me with what i need (head position, distance, rotation - pretty standard, i guess), it would require a lot of work. that is probably, as you put it, "totally doable", but to me it feels like "reinventing the wheel", since i am sure it has been done many times. or is there some advanced example patch i missed?
apart from that, i looked some more into dp.kinect2, and as far as i understand, it should do what i need - but from just looking at the patch (without having a kinect yet) it's hard to tell. is this really as simple as it sounds? maybe someone who actually uses it can tell me... e.g. in the subpatcher "p draw_face" i see variables for rotatexyz and position (xyz) being handed over... isn't this exactly what i am looking for?
thanks again to everyone helping me!
k
@Karl, here: https://github.com/Cycling74/cv.jit/releases you can download the new compiled externals which are Max7 and 64-bit compatible.
thanks for pointing me there, don't know how i could miss that...
anyway: i did some more searching regarding cv.jit, and the most elaborate example patch i found is this: https://cycling74.com/forums/jit-faces-3d-camera-movement-tracking/ (last post). it still does not do two-axis rotation and distance, however, and it loses track often, so i guess i'll just try to get a kinect and hope to get it to work (the hardware requirements seem rather strict)... i'll post the results.
regardless: if anyone knows of a more robust solution with cv.jit (or some other software), please let me know - because installing a kinect plus a camera to do basically the same thing does not seem very efficient...
thanks again for both of your help so far!
k
ps: just for reference, i found another tracking device here: http://www.fovio.com/ - but i do not see prices, and being aimed at business clients it will surely be way more expensive. i did not find any bridge to max either.
try Quartz Composer. Apple's "Detector" patch is fairly good and doesn't seem to be a resource hog like cv.jit (especially at higher resolutions). it tracks (x, y, width, height) of the whole face and (x, y) of each eye and your mouth, and it has something called faceAngle, though i'm not quite sure what it does. the downside is that there is virtually no documentation on this patch and little support for QC itself (the Facebook page is the best place to ask questions about it).
check out this composition for a good use of it, plus using an iterator to capture multiple faces: http://qcdesigners.com/index.php/forums/topic/17/fun-demo-of-detector-patch
with this, he is comparing the coordinates of each eye against each other to determine tilt. distance from camera to face will only be accurately possible with IR emitters/receivers (something the kinect has), or something you can cheaply and discreetly set up with an arduino and sensors. otherwise you'll have to calibrate the system: stand 0.5 meters away and see how large the bounding box around your face is, stand 1 meter away and see how big it is then, and use that math (hopefully it's linear) to determine the correlation between distance from camera and size of bounding box.
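(a sketch of that calibration, with one caveat: under a simple pinhole-camera model the box width falls off roughly as 1/distance rather than linearly, so distance × width is approximately constant. the sample measurements below are entirely made up:)

```python
# hypothetical distance calibration from bounding-box width.
# assumption: pinhole model, so distance * width_px ≈ constant k.
samples = [(0.5, 260.0), (1.0, 130.0), (2.0, 66.0)]  # (metres, box width px)
k = sum(d * w for d, w in samples) / len(samples)  # average the constant

def estimate_distance(width_px):
    # smaller box -> face is further away
    return k / width_px
```

(if your measured pairs don't fit d·w ≈ const, a lens with distortion or a detector that pads the box could be why - more sample points would tell.)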
the QC detector seems to hold my face more consistently and with less shakiness than cv.jit; however, it doesn't seem to do as well as cv.jit at far distances.
any video in QC can be syphoned to Max, or you can send the values and coordinates over OSC to Max as well.
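(the OSC route is simple enough to do by hand from any scripting environment. a python sketch of a minimal OSC 1.0 message going to a hypothetical [udpreceive 7400] in Max - the port and the /face/xy address are just placeholders:)

```python
import socket
import struct

def osc_pad(b: bytes) -> bytes:
    # OSC strings are NUL-terminated and padded to a 4-byte boundary
    return b + b"\x00" * (4 - len(b) % 4)

def osc_message(address: str, floats) -> bytes:
    # minimal OSC 1.0 encoder: address, type-tag string, big-endian float32 args
    tags = "," + "f" * len(floats)
    return (osc_pad(address.encode()) + osc_pad(tags.encode())
            + b"".join(struct.pack(">f", f) for f in floats))

# fire the face centre (x, y) at Max on localhost
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(osc_message("/face/xy", [0.5, 0.25]), ("127.0.0.1", 7400))
```

(on the Max side, [udpreceive 7400] followed by [route /face/xy] would unpack the two floats.)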
does quartz composer still exist?? i thought it had been discontinued??
anyway, i saw that OO video some time ago and remember being amazed. it is an openFrameworks library, so it's designed to work with openFrameworks - it's all c++, and you would need to use OF and compile everything yourself.
i'm not fully done yet but today arrived the kinect i ordered and the first results are awesome.
rotation, position, distance - all there, almost right out of the box (that "box" being dale's dp.kinect2, which does a fantastic job). and it's really robust too - it rarely loses track, even if i bounce up and down. my office mates are starting to wonder what's wrong with me.
i had a look at quartz composer tracking too, but not very thoroughly, since i need to move this project to windows in the end anyway. my first impression, however, was that it's more comparable to cv.jit - maybe a tad better, but nothing anywhere close to the performance of the kinect.
i guess it's really the hardware (ir) - or rather, the software being well calibrated to the hardware - that makes this thing work so well.
thank you all for your help!
k
p.s.: since the kinect is rather picky regarding hardware: it does work on a macbook pro retina, mid 2012 (2,6 GHz i7), with windows 8.1 pro installed via bootcamp.
very cool! please keep us updated with any progress and patches you make. I'm very interested in trying out Kinect in the future. I wish the whole enclosure was a bit smaller though...
anyways, which model did you get? if the Kinect 2 works with bootcamp, that would be really awesome.
i think that kinect is a model 1520 (i have it mounted in a setup i currently can't take apart, but i'm pretty sure).
and yes, i installed my windows 8.1 pro partition via bootcamp. machine is a macbook pro retina, mid 2012 (2,6 GHz i7, 16 GB RAM).
curiously the kinect configuration verifier tool just crashes when started but the sdk (v 2.0) works fine and so does the kinect.
attached below is the facetracking bit from dp.kinect2 i use (basically just a stripped down version of the example patch provided with dp.kinect2).
best
k