new dp.kinect3 for new Azure Kinect v3 sensor
Hi all. I've released dp.kinect3 for the new Microsoft Azure Kinect (the 3rd generation Kinect). https://hidale.com/shop/dp.kinect3/
This is an early alpha release to allow existing dp.kinect2 users to experiment with their new Azure Kinect sensor and provide feedback on dp.kinect3 functionality.
The production-ready release later this year will have most features of dp.kinect2, making it easier to upgrade your patches to the new Kinect sensor. I'll add new features too. This early release is limited to some basic features, including...
- Depth, color, and infrared sensor output in many pixel formats
- Point clouds
- More resolutions and control of the frame rate
- Some post-processing, like horizontal flip and undistortion
- Option to use hardware acceleration and GPUs for image processing
This early release has not been performance optimized. You may encounter errors or crashes in dp.kinect3. These early releases are still in development, not completely tested, and do not yet have robust error handling.
More info at https://hidale.com/shop/dp.kinect3/
Happy patching!
Great, Dale.
Will you implement skeleton detection in the future?
Does the new microsoft library make it possible?
Thanks
Definitely *will* have skeleton detection. That and aligning the images are the two features I am working on now.
Full disclosure: I am not satisfied with the frame rate performance of Microsoft's skeleton detection technology with the new v3 sensor. This is a widely discussed issue https://github.com/microsoft/Azure-Kinect-Sensor-SDK/issues/514 and Microsoft's official statement is, "...[in the next semester] we are focusing on a more performant DNN." We are now waiting on that performance update.
Until then, high framerate body tracking needs strong hardware. For GPU it requires an NVIDIA GTX 1070 or better. https://docs.microsoft.com/en-us/azure/kinect-dk/system-requirements#body-tracking-host-pc-hardware-requirements
Naturally, your specific hardware and your specific body tracking usage may be great, good, or poor. It is possible to body track on older GPUs (one of my test machines is a 7-year-old Intel i7-3720 + NVIDIA GTX 680). It is also possible to body track on CPU only, albeit slowly.
I have explored other body tracking technology but have not yet found a solution that performs well and has a licensing model that works. One possible company is reworking their licensing/pricing and I'm waiting to see the outcome. I will keep looking :-)
Thanks for the detailed answer.
We're so lucky to have someone with your expertise working on the kinect for Max.
My main use of the Kinect is multi-body tracking, with as short a latency as possible. Reading the linked GitHub discussion, it's quite surprising that the new SDK pipeline gives such poor results.
In the case of using an 'optimal' GTX 1070, do you have any data on the real framerate obtained, the latency, and whether it avoids backlogging?
Does it give open/closed hand states?
What is your experience? Any discussions?
With dp.kinect2, I obtain around 5-6 frames of latency at 60 fps (83 to 100 ms) for my whole pipeline, from a movement to onscreen display with interaction.
2-3 frames for dp.kinect2 skeleton output, 1 frame for smoothing, 1 frame for interactive scripts, 1 frame for display (projector latency).
I think this 83 ms delay is workable and acceptable to the final public/user. But I think a 150 ms delay would already be too much, and would not allow a good experience and reactivity.
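The frames-to-milliseconds arithmetic above can be sketched as follows (a minimal calculation assuming the 60 fps pipeline described; the frame counts are the ones quoted, not measurements of mine):

```python
# Convert pipeline latency measured in frames to milliseconds,
# assuming a 60 fps pipeline as described above.
FPS = 60
FRAME_MS = 1000 / FPS  # ~16.7 ms per frame

def latency_ms(frames: float) -> float:
    """Total latency in milliseconds for a given number of frames of delay."""
    return frames * FRAME_MS

# 5-6 frames of total delay:
print(round(latency_ms(5)))  # ~83 ms
print(round(latency_ms(6)))  # 100 ms
```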
Is the Azure sensor less susceptible to high ambient lighting than the K4W?
In particular, given that the K4W halves the framerate to 15 fps (or even stops tracking) when it does not get enough tracked depth points in a bright environment?
It is too early for me to provide performance statistics; I'll know more in a month or two as I have more functionality working and a good pass at performance tuning. I'm still writing core code ;-)
Microsoft's new body tracking API adds joints for handtip and thumb. At the same time, they removed all of the body properties: restricted, handstate (open, lasso, closed), and lean. Perhaps an open/closed hand state can be derived by looking at the coordinates of hand, handtip, and thumb. The coords of the three will always form a triangle, and a single line between thumb and handtip. Perhaps if the distance between thumb/handtip -or- the sum distance between all three is below some value...that could be considered closed. And if not closed, then open. Brainstorming ideas.
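That brainstormed heuristic could be sketched like this. Everything here is hypothetical: the joint positions, the `hand_state` helper, and the `closed_threshold` value are illustrations, not part of the Kinect SDK, and any real threshold would need tuning.

```python
import math

def dist(a, b):
    """Euclidean distance between two 3D joint positions (in meters)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def hand_state(hand, handtip, thumb, closed_threshold=0.12):
    """Guess open/closed from the perimeter of the hand/handtip/thumb triangle.

    closed_threshold is a made-up value in meters; it would need tuning
    per person and per distance from the sensor.
    """
    perimeter = dist(hand, handtip) + dist(handtip, thumb) + dist(thumb, hand)
    return "closed" if perimeter < closed_threshold else "open"

# Hypothetical joint positions (meters, camera space):
open_hand = hand_state((0, 0, 1.0), (0, 0.09, 1.0), (0.05, 0.04, 1.0))
closed_hand = hand_state((0, 0, 1.0), (0, 0.03, 1.0), (0.02, 0.02, 1.0))
print(open_hand, closed_hand)  # open closed
```

A real version would likely also need smoothing over several frames, since per-frame joint jitter could flip the state rapidly near the threshold.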
How does the Kinectv3 operate in "high" light scenarios? Microsoft writes in their specifications, "Global shutter that allows for improved performance in sunlight." I don't yet have experience with this. The color camera exposure can be manually controlled. The topic of strong infrared sources (ir leds, stage lights, the sun) is always present when infrared sensors are involved. Some users want this. They put infrared LEDs on dancers or use stage lights to identify and track movement using the Kinect's irmap. But that same stage light might make it more difficult for the SDK's neural net to calculate depth and 3d skeleton joint positions. More practical experimentation is needed with both the new sensor hardware and the software.
Conversely, the new Kinectv3 color camera is more sensitive in "low" light. My office tends to be low-light and I do not experience a 15fps autoswitch. [FYI: the previous Kinectv2 sensor's color camera will automatically switch to 15fps in "low" light. And when it does, it forces the depth camera to do the same. There is no ability to control the behavior. As a hack, I sometimes attach an LED nearby and shine it directly into the color camera of the Kinectv2 to get 30fps.]
From what I understand by reading Kinectv3 code/issues, this fps autoswitch was a problem again, then fixed, then half-fixed. I'll know more in a few weeks when I get to camera exposure controls. As I understand, we can switch the Kinectv3 to manual camera control to override any automatic behavior -- yet now we have to manually control exposure. :-/ The idea of a fps-priority framerate setting was removed from the Kinectv3 SDK. I need to get to that code and experiment more.
Body tracking is available in the v1.3.20200905 update. Download at https://hidale.com/shop/dp-kinect3/
- Body tracking has an additional download from Microsoft for the neural net data. Instructions are in INSTALL.md in the download.
- Body tracking needs a recent, strong GPU for high framerates. Details are above in this thread and in the CHANGELOG.md in the download.
- No fixed limit on the number of bodies it can track.
- Aligning depth/ir/pointcloud to the color resolution is supported; the reverse is not yet available.
- Ongoing refactoring and bugfixes.
More patches, examples, tutorials, helpfiles from dp.kinect2 now work. Change the object from (dp.kinect2) to (dp.kinect3) and try them. 👍
Playermap, more post-processing, reworked body tracking GPU usage and framerates in the v1.3.20201007 update. Download at https://hidale.com/shop/dp-kinect3/
- playermap output
- align, undistort, and flipx work with more outputs
- synchronization of depth, color, ir, player, and skeleton data
- more utilization of GPU/CPU in the body tracking neural net
- fixed alignment, flipx, and OpenCL bugs and crashes
Your dp.kinect and dp.kinect2 patches might work by changing the object from (dp.kinect?) to (dp.kinect3). 🚧👍
Hi all. New update v1.3.20201027 at https://hidale.com/shop/dp-kinect3/ includes improvements in backwards compatibility, infrared quality, skeleton joints, and alignment of coordinate spaces. Full changelog is in the download.
Happy new year 🥳 I have a significant new update v1.3.20210105 at https://hidale.com/shop/dp-kinect3/
It has more reliable processing of Kinect data and workarounds for OpenCV issues. I have been diagnosing OpenCV issues for many weeks -- some of my fixes are now part of OpenCV itself. I can now focus more on dp.kinect3 features.
Good progress included in the alpha update v1.3.20210207 at https://hidale.com/shop/dp-kinect3/
Significant improvements in parallel/async processing. Computation can now be spread across the CPU, iGPU, and dGPU, with each working in parallel, both across devices and within each device.