Low latency video input – seemingly impossible!
I apologize for the long message but it includes some useful resources that attempt to clarify the problem. I’ve had this problem for a number of years and have yet to find a reasonable solution.
How do I get low latency video from an external camera into Jitter (or any software, for that matter), perform simple manipulations, and then send it out to a projector?
I need the projected image to not apply a noticeable delay to the original source. If I am filming someone talking I want the sound coming from their lips and projected image to appear "live" without latency. I’m at the point where I’d like the image to be better than SD but will settle for anything.
A humble request in 2012, given all this computing power, no?
I have a very fast MacBook Pro with a good graphics card, a Thunderbolt port, and FireWire 800. I also have a Mac Pro desktop.
I own a Canopus ADVC55 FireWire converter and have tried many different cameras with it. I still can’t achieve low enough latency. I’ve found that analog cameras seem to work better than digital cameras with analog outputs, but it’s still not good enough. I’ve tried the same camera setup in Modul8 and Isadora with similar latency results, so I assume the problem isn’t Jitter or my not-so-great Jitter programming.
I read about the new Blackmagic Intensity. Sounds great! But after reading this thread (http://cycling74.com/forums/topic.php?id=38378) it doesn’t seem like a solution, especially considering that it uses the only Thunderbolt port and can’t daisy-chain to allow a projector. Lame!
It was suggested that the DFG 1394-1e was a good option, but it is now discontinued. The Matrox MXO2 Mini seems OK but again doesn’t solve the Thunderbolt daisy-chain problem.
Frieder Weiss developed the Eyecon system and created an amazing visual system for Chunky Move. He suggests the FALCON PCI Bus Framegrabber. Unfortunately it will not work with a laptop and I don’t even think it works on Mac.
Vade wrote an article 5 years ago discussing an older Intensity card, but says that it "requires some fairly in depth knowledge of Jitter" to make it work.
So after all of this I still am left without a solution. It’s frustrating because it seems like a simple operation that many people would want to do.
Are there any options left aside from getting a PC with the Falcon grabber?
Your best bet is to get a PC running Windows XP.
Get the Aiptek 5900 DV cam.
It works as a high-quality webcam.
(Onboard software controls let you adjust sharpness, gamma, contrast, etc.)
It can output live via its bundled software.
I used the same setup and displayed footage in real time that I could tweak and control on my projector.
Matrox does make a PCIe card for the MXO2 Mini – I have one and it works well in my Mac Pro tower. I also have the ExpressCard interface, but you need a 17" MacBook Pro to make that work. I have not tried the Thunderbolt option and likely won’t.
People have also reported good luck with using the Intensity Pro PCI-e card from Blackmagic, which is a cheaper system.
Any update on this? I’m also searching for a portable low-latency video input solution for the current MacBook Pro with Thunderbolt and USB 3.0, but it doesn’t have an ExpressCard slot anymore. How is the latency with the MXO2 Mini?
Beware of the Matrox MXO2 Mini. I have one with the PCIe interface and there is very long latency (feels like 1/4 or 1/2 second!). I also have the Blackmagic Intensity PCIe card and there is hardly any latency (about one frame). I assume the Thunderbolt version is similar, but I’ve never tried it. (Bummer about the lack of Thunderbolt passthrough, though.)
FYI: There is no "in-depth knowledge of Max" required beyond the OpenGL optimizations that should be used for all capture and rendering pipelines anyway (capture in uyvy with "@unique 1", do the rgb conversion with a shader, only bang the OpenGL renderer when a new frame arrives, etc.). See the OpenGL tutorials and search for vade’s "jitter movie playback optimizations" for ideas. I think he mentioned the "in-depth knowledge" because that blog post covered several programming systems, and some of them are more optimized out of the box.
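To illustrate what that shader step actually computes – this is not Jitter code, just a Python/numpy sketch of the BT.601-style uyvy→rgb math; the exact coefficients are my assumption, and a real shader does this per-texel on the GPU:

```python
import numpy as np

def uyvy_to_rgb(uyvy):
    """uyvy: flat uint8 array, packed U0 Y0 V0 Y1 ... (one row of pixels)."""
    u = uyvy[0::4].astype(np.float32) - 128.0   # one U per pixel pair
    y = uyvy[1::2].astype(np.float32)           # one Y per pixel
    v = uyvy[2::4].astype(np.float32) - 128.0   # one V per pixel pair
    u = np.repeat(u, 2)   # 4:2:2 subsampling: chroma is shared by 2 pixels
    v = np.repeat(v, 2)
    r = y + 1.402 * v
    g = y - 0.344 * u - 0.714 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)

# neutral chroma (U = V = 128) should give grey: r = g = b = y
print(uyvy_to_rgb(np.array([128, 200, 128, 50], dtype=np.uint8)))
```

The point of doing this on the GPU is that it’s a trivially parallel per-pixel multiply-add, so it costs essentially nothing there compared to a CPU colorspace conversion.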
You will never get low latency with a camera connected to the Canopus box because it converts your cam’s analog SD signal to DV which then must be decoded on the CPU causing unavoidable latency. I have the DFG box for analog SD capture and it has virtually no latency.
Overall I’m leaning toward the Intensity for your application.
Thanks for the advice. I’ll check out the Blackmagic stuff. If anyone has experience with either the Intensity Shuttle or Extreme, please tell. Seems like the Shuttle USB 3.0 doesn’t work with Mac. For Mac you need the Thunderbolt version.
I’m needing low latency for live processing of staged video capture (theater project). I’m also needing to upgrade my macbook from several gens.
I’m eyeing the BlackMagic Extreme for capture from hdmi (directly from an HMC-40).
I am wondering if anyone has had experience with the new Retina MacBook Pro that has onboard HDMI out. Would that solve the problem of Jitter not being able to both consume and broadcast a signal through the Thunderbolt port, since you could broadcast out the HDMI port instead?
I ask because otherwise I would stick with the non-Retina MacBook Pro to keep my FireWire 800…
sorry for reopening this older thread.
I am starting to work on a project where I will need 3 live inputs with no (or almost no) latency. I would like to go better than SD.
I have different choices and wonder what would be the best way to go. Right now I am using a MacPro (actually a Hackintosh) which I could reconfigure into a Windows if necessary.
A) getting 3 HD-SDI security cameras and using a Blackmagic DeckLink Duo + 1 Blackmagic DeckLink Extreme. I have both cards "lying around" but have never tested whether they all work at once. I also don’t know about the quality of security cameras.
B) setting up a Windows system and getting 3 PCIe gamer capture cards (something like this: http://www.amazon.de/StarTech-com-Express-Video-Capture-Karte/dp/B007U5MGBE/ref=sr_1_11?ie=UTF8&qid=1432898744&sr=8-11&keywords=hdmi+capture+card – sorry, it’s a German site, but you’ll get the idea). The card can capture component or HDMI, so I would have the choice of getting either 3 HD(V) cameras with component out or HDMI out.
C) something else I haven’t yet thought of :) Maybe a multi-stream capture card??
What’s your opinion? Where are my best chances of getting 3 live inputs at low/no latency (720p would be OK): SDI, HDMI, or component?
Good day to all,
I have had some good experience with some simple USB webcams, especially using Logitech c310 (http://www.logitech.com/en-gb/product/hd-webcam-c310), I have encountered very little latency overall (around 1 or 2 frames, tops).
Is there any way to control those web-cams in terms of exposure, focus etc.?
I usually use FW / USB3 PointGrey cameras for machine vision.
They have the lowest latency I’ve encountered, and you can programmatically control exposure, shutter etc (on some of them).
(Bear in mind that projectors can usually introduce a lot of latency too.)
I haven’t used them in Max/Jitter, though – only directly in C++ with the Point Grey SDK on Windows, and libdc1394 on Mac. I’d be curious to hear if there are Max externals. Actually, I’d also be curious whether it’s possible to have a cross-platform Max patch that uses the relevant external – i.e. a Windows .mxe that uses the Point Grey SDK and an OS X .mxo that uses the libdc external, responding to the same messages/attributes – so you can develop a patch that uses the camera and works seamlessly on both Windows and Mac. (In my native C++ apps I use #ifdef to include and compile the relevant SDK for the platform, so the same application code can be compiled and run on both.)
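(For what it’s worth, the runtime analogue of that #ifdef pattern, sketched in Python with entirely made-up backend names, looks like this – the point being one code path per platform hidden behind a single interface:)

```python
import sys

def pick_camera_backend():
    """Choose a capture backend per platform.
    Backend names here are hypothetical placeholders, not real modules."""
    if sys.platform == "win32":
        return "ptgrey_sdk"   # hypothetical Windows wrapper
    if sys.platform == "darwin":
        return "libdc1394"    # hypothetical Mac wrapper
    return "v4l2"             # hypothetical fallback for other platforms

print(pick_camera_backend())
```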
@Mr.L : on Mac I use this app: https://itunes.apple.com/en/app/webcam-settings/id533696630?mt=12
@Memo Akten: Yeah, Point Grey are great… mine is the poor man’s solution!
Re: cross-platform patches: I think this is totally doable. The externals should just have the same name and be in Max’s file path.
The advantage would basically be (besides low latency) saving a capture device, right?
Have you used more than one camera at a time? (Would Max be able to recognize 3?)
I still see a disadvantage in not having manual control and no preview for the actor.
How about image quality?
@Memo Akten: there are also industrial cameras with USB 2.0. Are they sufficient, or is it better to go for 3.0 (quite expensive)?
@LSKA: good idea about the app. I will try out one of those Logitechs – it’s worth the money… Have you used more than one?
Yes, I used up to 2 cameras, but I think that’s only a limit of my setup. (I use a MacBook Pro with 2 USB ports, and so far I have only been able to get one working camera per USB port – I guess it’s a bandwidth/power consumption issue.) I think with a desktop and more USB ports you can have more cameras attached.
I have used multiple Blackmagic cards in a single machine and have had good luck using the Black Syphon application, which provides feeds into Jitter via Syphon. The program is maintained by Vidvox and has been very reliable.
Anecdotally, SDI is lower latency than analog, HDMI, or USB. Less overhead in the protocol. Most people I know who are looking for lowest latency are using SDI cameras connected to dedicated digitizers, but you’ll have to test the specific models you’re targeting.
For multiple SDI cameras I’d probably buy the Decklink Quad
As Memo points out, many lower end projectors themselves have large framebuffers, which results in high latency.
Hi all, I’d welcome any recent tips and recommendations on this topic as I’m embarking on a project that needs very fast response times to captured video input.
After reading the posts above I’ve put together the following shopping list of initial hardware:
A 3G-SDI camera that runs at 60fps or above.
And a choice of three capture cards:
AJA U-TAP (USB 3, up to 1080p60)
Blackmagic Mini Recorder (Thunderbolt, up to 1080i60 – not capable of 1080p60)
Epiphan AV.io (Thunderbolt)
All claim to be low-latency, but I really don’t know which is truly fastest. The Blackmagic device only captures 60fps up to 720p, but I don’t think this will be a limitation as I’m likely to use that resolution.
I’ll work on whatever platform responds fastest, which may turn out to be a PC with a bus-based capture card, but from what I’ve read, Thunderbolt 2’s intrinsic latency should be similar. Anyone have an opinion on that? If I could use my MacBook Pro without compromising anything, that would definitely be a bonus!
Thanks in advance for any help.
I’ve used the Blackmagic Ultra Recorder HD-SDI input box with great success.
You can use it natively within Jitter, or save some CPU cycles with the Black Syphon app and bring the signal into Jitter through ‘jit.gl.syphonclient’.
Black Syphon app here – http://vdmx.vidvox.net/blog/black-syphon
This combination has given me the lowest latency.
At 720p60/50 you should be able to keep it down to 2 frames of delay which is perfectly acceptable as other things in the output chain will add latency.
I’ve used this Sonnet PCIe expansion box, which has a Thunderbolt loop-through port to continue the feed to a 2nd monitor or projector –
You’ll need a capture card as well; I’ve used a BMD DeckLink Duo, which gives you 2 inputs.
Be careful in your Jitter patches: bringing video in through Syphon, into a matrix (CPU), and back to the GPU can add up to a frame of latency. Best practice is to keep it all on the GPU if low latency is your intention.
Also it can depend on which camera you use, I’ve tried a few PTZ Sony cams which added 2-3 frames of latency.
I’ve used the Blackmagic Mini Recorder extensively, with both HDMI and SDI inputs. Works perfectly. I do use Black Syphon to input textures into Max, as I posted previously. This is the highest performance solution I’ve found, particularly on today’s multi-core systems – the capture/preview process is put onto another thread by running a separate application. For that reason I do recommend Blackmagic.
I haven’t tried the Epiphan or AJA solutions, but they look useful for systems that only have USB 3.0 ports available.
Thanks very much for those helpful insights. My main application will be generating reactive visuals from musicians performing live, starting with a conductor’s hands. Before committing to the above setup I’m investigating Point Grey specialty cameras mentioned upthread such as the following:
This is mono 720p but very high frame rate (> 100fps). This camera choice would mean I wouldn’t be able to output visuals superimposed on the source image, but I wasn’t likely to do that anyway. This is a USB3 device, so no capture card is needed but comments above suggest this could still be slower to respond than SDI+Thunderbolt capture with Black Syphon…
I see that cv.jit needs a mono image, so perhaps a mono camera could save me a few cycles too – or with modern GPUs is that something I can ignore? Since I’m capturing hands, I wonder if IR sensitivity is something I should consider. Lots of options to consider! Thanks again for any opinions shared.
I wouldn’t worry about RGB2Luma transforms, this can be done quite simply in a shader. There are, however, several things for you to think about here:
– cv.jit operates on Jitter matrices, not textures in OpenGL. So if you’re going with the recommended Black Syphon workflow you’ll have to do a readback from the Syphon texture to a matrix, which you can do with jit.asyncread. This isn’t complex, but can add some overhead to your patch that you’ll have to account for.
– while cv.jit will adequately allow you to track hands under ideal conditions, you may run into image contrast issues and/or occlusion issues (when the hands pass in front of the body or each other). This will depend highly on where you place the camera, and the amount of control you have over the stage lighting. Without knowing more about your proposed setup it’s hard to comment.
Have you considered using the Kinect 2 instead? This has several advantages, including being impervious to most lighting issues given that it’s a time-of-flight sensor. It also has a robust skeleton model and an available Max implementation in dp.kinect, although it’s not free. I’ve seen very good hand tracking done with this and can recommend it. However, the camera’s size and the cable length limitations of USB3 may make installing on a stage difficult. There are ways around the latter but they do add to the cost of the setup. The Kinect 2 runs on Windows only as far as I know.
Others on this forum have been working with Leap Motion for hand tracking and are having some success there. I have one but haven’t worked with it much since it first came out. You’d have to work out the placement of the sensor, either on the body or (perhaps) on a music stand?
Thanks, Jesse. I did start off intending to use Kinect, but the first couple of people I approached steered me away citing latency issues. In theory I can get away with much less rich a model than Kinect delivers (no need for depth data, skeletal model, etc.) which all things being equal should mean quicker response. But based on what you’ve said I think I will do my own testing to confirm Kinect really is not fast enough.
The case I’m most interested in is the conductor’s hand movements, which is also fairly easy vision-wise as the hands are almost never obscured. And the background will be a darkened auditorium. I would love to pick up the tip of the baton but that might be tricky unless I add an IR emitter to it.
Sadly, I don’t believe the range on a Leap Motion – about two feet – is sufficient for most conductors, which is a shame because otherwise it would be nearly ideal.
If anyone has knowledge of how low latency on Kinect can get, please chime in.
regarding the chameleon 3: i currently can’t check, but from what i remember latency seemed low… actually i would just call point grey and ask them about it, they have amazing support.
what is important, however, is that you will have to run the camera in yuv mode, so it only delivers a little above 90fps. the background is that for the full framerate (and least latency) the camera has to run in raw mode, where it delivers un-debayered images which then get debayered on the host machine by a special driver. unfortunately this driver debayering does not work properly with max (at least i did not manage to make it work), so you have to do the debayering in-camera – which costs framerate and probably introduces some latency too. from what i remember, with the directshow drivers i found no configuration that would exceed 105fps out of max, plus the debayering looked bad – which is why i used yuv.
having said that, the camera itself could get even faster than 150fps by using pixel binning (at the cost of resolution)…
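to illustrate what the debayering step involves, here is a crude half-resolution sketch in python (RGGB pattern assumed on my part – real drivers interpolate neighbours rather than averaging 2x2 cells):

```python
import numpy as np

def debayer_rggb_crude(raw):
    """Collapse each 2x2 RGGB cell (R G / G B) into one rgb pixel.
    Half resolution, but it shows the per-frame work that raw mode
    pushes onto the host (or, in yuv mode, onto the camera itself)."""
    r = raw[0::2, 0::2]
    g = (raw[0::2, 1::2].astype(np.uint16) + raw[1::2, 0::2]) // 2
    b = raw[1::2, 1::2]
    return np.dstack([r, g.astype(np.uint8), b])

raw = np.array([[10, 20],
                [30, 40]], dtype=np.uint8)
print(debayer_rggb_crude(raw))   # one pixel: r=10, g=25, b=40
```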
hope that helps
Thanks, Karl. I think I will give Point Grey a call, as you suggest.
In related latency news:
I tested the built-in webcam on a MacBook Pro and found the time-to-display for a video signal to be ~85ms, i.e. ~5 frames when counting at 60fps. This is slower than I expected – certainly a noticeable delay. Although it does correspond to about 100 feet of sound travel time, so it would look OK to some of the crowd in a larger performance space…
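For anyone wanting to redo those conversions, the arithmetic is simple enough to sketch (the speed of sound value of ~1125 ft/s is my assumption):

```python
def latency_as_frames_and_feet(latency_ms, fps=60, sound_ftps=1125.0):
    """Express a latency both as display frames at a given rate and as
    the distance sound travels in the same time."""
    frames = latency_ms / (1000.0 / fps)          # ms per frame at fps
    feet = sound_ftps * (latency_ms / 1000.0)     # sound travel distance
    return frames, feet

frames, feet = latency_as_frames_and_feet(85)
print(f"~{frames:.1f} frames at 60fps, ~{feet:.0f} ft of sound travel")
```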
After a great deal of almost-scientific testing I believe the best solution for my realtime application (visuals derived from conductor’s hand motions) is going to be Kinect rather than by-hand video capture and processing. It’s hard to say anything definitive without developing more realistic tests for both approaches, but given that low latency is going to be critical, Kinect does seem like it’s going to be the quickest. Here’s a summary of what I found:
+ On a mid-range Windows laptop* dp.kinect2 will display skeleton and hand motion in Max that lags reality by about 70ms.
+ On a 2015 MacBook Pro, captured video can be displayed by Max that lags reality by about 80ms using the Thunderbolt based BMD UltraStudio MiniRecorder connected to an HD-SDI 720p60 camera.
+ The AJA U-TAP SDI USB 3 capture card had 85ms of latency through Max on my Windows laptop, but this rose to 115ms on the Mac.
This is not an apples to apples comparison because the Mac/video scenario is burdened with displaying 60fps video in jit.window which may or may not be comparable to the visualization work I eventually do based on the input video. And the 70ms lag seen on Kinect already includes the detection of hands, deriving positions in 3D, and then rendering a simple 3D model.
The other factor which points towards Kinect is CPU usage. Rendering Kinect’s 3D skeletal display only required 11% of the CPU – leaving plenty for visualizations – whereas the Mac/video scenarios required a great deal more.
Using HDMI video sources resulted in more latency than SDI, as expected.
The Kinect skeletal model certainly feels closer to realtime than the video, but as the difference is only ~10-15ms, that’s mostly a perceptual issue.
* Boring details:
+ The MacBook Pro used is a 2015 15" Quad Core Intel i7 at 2.8GHz with 16GB RAM and an AMD Radeon R9 M370X GPU with 2GB VRAM.
+ The Windows laptop used is an Asus UX303LN with Dual Core i7 at 2GHz with 16GB RAM and NVIDIA 840M GPU.
+ Microsoft requires a USB 3 controller be dedicated to the Kinect, but as I’m using a laptop it’s shared with the trackpad. (I disabled other USB devices including the webcam.)
+ For the video test I output jit.qt.grab to a jit.window on an external monitor with sync and double-buffer disabled.
+ I tested the additional rgb2luma step separately; it came at about a 10% CPU penalty, which seemed high to me.
Next steps:
+ Develop more realistic tests on Kinect, and perhaps camera-based too.
+ Double-check there weren’t optimizations I missed that could have lessened video latency. Anyone have any suggestions?
thanks much for sharing your findings!
You’re very welcome.
One additional note: I did try the Black Syphon trick that Jesse and Bill suggested upthread, and this did indeed reduce CPU usage as advertised, but for me at least it added up to 35ms of latency over the straight jit.qt.grab approach. As someone who’s new to Max, I worry there are optimizations I could have made in my test patches that would change the results, so this is only a first pass.
And if anyone’s wondering how I did the timings:
For video capture:
+ I used a camera flash to create a brief pulse of light, pointed at a 60Hz monitor showing the Max output window.
+ I recorded the scene on an iPhone at 240fps.
+ I counted the frames recorded between the original flash and the playback shown in the Max window. (Averaged over several cases to minimize impact of recording timing errors and 60Hz granularity of the monitor display.)
For Kinect:
+ I video recorded a scene that included my hand being raised and lowered in front of the sensor while being rendered by a dp.kinect2 Max patch.
+ Looking at the 240fps video, I found the apex of the real hand movement and of the rendered hand, then counted the frames between them. (I averaged over >10 samples to minimize errors.)
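The arithmetic for both methods is the same: each frame of 240fps footage is ~4.17ms, so the latency is just the averaged frame count times that interval. As a sketch:

```python
def latency_ms_from_counts(frame_counts, record_fps=240):
    """Average several per-trial frame counts and convert to milliseconds."""
    frame_ms = 1000.0 / record_fps          # ~4.17ms per frame at 240fps
    return frame_ms * sum(frame_counts) / len(frame_counts)

# e.g. three trials where the event reappeared 20, 21 and 19 frames later
print(round(latency_ms_from_counts([20, 21, 19]), 1))  # -> 83.3
```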
Thx for the info.
I’m curious which camera and monitor you used for the testing?
I’ve been able to get latency down to 2-2.5 frames using Black Syphon as the input.
Which version of Max? Were you using native BMD capture within jit.grab?
Sorry for all the questions!
Hi Bill, are you counting frames at 60Hz, in which case 2-2.5 frames would be 33-40ms, or 30fps which would be 66-83ms?
the setup was as follows:
For Mac video capture:
Max 7.3.1 (32-bit, apparently?)
Asus PA279Q 60Hz monitor on HDMI
Marshall CV150-M SDI camera at 720p60
BMD Thunderbolt MiniRecorder native gave a latency of 80ms, but via BlackSyphon this went up to 115ms.
For Kinect:
Kinect for Xbox One with USB 3 adapter
Current Kinect SDK
Max 7.3.1 64-bit
Current Windows 10
I just used the laptop’s 60Hz display. I might redo the timings on the Asus HDMI display just in case that’s a factor.
Well, it was a good idea to redo the Windows timings outputting to the same monitor as the Mac tests, because they gained 35ms – both the Kinect and AJA U-TAP capture timings increased to 120ms. Turning off double-buffer and sync didn’t change the timings.
This is interesting because I originally switched to timing the Mac video capture on an external monitor as that showed quicker response times than with jit.pwindow in the patch on the laptop’s LCD.
This means that the BMD MiniRecorder case on Mac is now more noteworthy as it will show video on an external monitor with just 80ms of lag.
Years ago Imaging Source used to make a digitizer for the mac that was blazing fast. Unfortunately they discontinued it and now only have it available for PC. There are two versions: SDI and RCA inputs. Output is USB.
I’m curious if anyone with a PC has used this product.
It turns out that intrinsic display latency is a factor that has to be considered so I did some tests on the Asus PA279Q I have been using:
It consistently lags the Windows laptop’s LCD by 20ms over HDMI. That leaves another 15ms of lag in my external-monitor timings, which I guess we can attribute to the overhead of driving a larger screen area?
For my MacBook Pro, the Asus display lags the laptop by 25ms over HDMI. Over DisplayPort it’s fractionally less, at 22ms. Interestingly, the DisplayPort timings were noticeably more variable than the HDMI ones.
So the timings I reported above need to be read with this in mind. In theory, overall system latency could be reduced below the figures I show by choosing an output device with an intrinsic latency under 20-25ms.
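If latencies simply add (a rough assumption, but close enough here), the display’s share can be subtracted out to estimate what the capture/processing chain itself contributes:

```python
def chain_latency_ms(measured_end_to_end_ms, display_ms):
    """Estimate capture + processing latency by removing display lag."""
    return measured_end_to_end_ms - display_ms

# e.g. 80ms measured end-to-end via a display with ~25ms intrinsic lag
print(chain_latency_ms(80, 25))  # -> 55
```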
While all of this is interesting, it seems to me you may be losing the thread of what you’re actually trying to measure. I’d say that a measurement of input –> usable gestural datastream would be more appropriate and edifying for you since that’s what you’ve said you’re interested in. Of course in camera-based testing that would require you to build out a cv.jit test rig, but something simple shouldn’t take too long.
The initial tests you posted suggest that the Kinect2 would be a more appropriate choice given 1) apparent latency, 2) CPU load, 3) availability of tracking data, and 4) resilience of the system given variable stage lighting conditions. I also think field of view/resolution might be an issue for visible light cameras unless the conductor faces forward consistently and constrains their movements. As I said in my earlier post, with occlusion typical CV strategies will fail – and this may be unavoidable if your conductor turns or gestures as they might typically.
Keep in mind also that many consumer projectors have significant frame buffering that can add to observed latency, and cable lengths/topologies also can add to this. This won’t impact your testing for now, but would certainly impact total system performance.
Hi Jesse, you are of course correct that a more useful test will be one that is closer to the end result, including some reacting visual elements. Unfortunately I’m brand new to Max, so I started with the modest goal of reassuring myself that ‘instantaneous reactive visuals’ are probably achievable, and of seeing if there are any clear winners or dead-ends in terms of technical approach. I’m now sufficiently optimistic that I’m going to create some better tests. (Although if ‘instantaneous’ means within 100ms, there’s not going to be much time for graphical creativity once you allow for input and projector display.)
The trouble is my beginner Max patches are not likely to be very efficient. I may have found a collaborator fluent in Max, which will be a big help.
Some updates and corrections to timings upthread:
Latency of video captured and displayed in a Max window from the BMD Mini Recorder via Black Syphon, when NOT shown on an external monitor, is ~45ms. (I think the much slower time upthread was due to the external monitor latency, plus my forgetting to turn off sync and double-buffer.)
CPU usage on my Mac was 5% for this setup.
If the native BlackMagic input is used, latency was 60ms, and CPU usage was 10%.
Some random display device latency measurements: consumer Toshiba TV: 60ms. Panasonic projector: 90ms!
A useful site: http://www.displaylag.com/display-database/
And I was shocked to find that my 25ft HDMI cable adds 8ms of lag over the 6ft cable!
Next up: Timings for patches that are more representative of the final application.
According to the manufacturer of the cam you are using, it has 1-2 frames of delay. I assume depending on frame rate.
Hi, Adrian. Since you’re interested in this subject, here’s an excellent article from anandtech.com dealing with input lag (from input to output). Despite the different application scenario (mouse and keyboard for videogames), you get a sense of the many variables in play from initial input to final output. And since we’re at it, that doesn’t even reflect our own perceptual latency for visual stimuli…