sylvain

For a project I want to analyze live television images, find any subtitles in the image (if present), and convert them to text. Does anyone know of any Max/MSP/Jitter or Java solutions? The frame rate is not important; it will run in a separate thread anyway :p

subtitles-to-text-ocr


Wow, good luck with that... as I understand it, OCR is very tough. I highly doubt there's a straightforward way to do it in Max or Java, unless there's a trick workaround which takes the image and processes it using a separate scanner/OCR application. This "sending-processing-getting results" *might* be possible, but I'm not sure.

It's a bit like speech recognition: it takes a ton of pretty complex programming to get it to work well, as there's so much "fuzziness" in the reading/sampling and analyzing. If you've visited aka's page (wiimote etc. objects, which rock!), he has an object for speech recognition, but it doesn't do the recognition by itself; it's an interface to the built-in Mac speech recognition capabilities. Maybe something like that would be possible for OCR, especially if there are free or built-in apps which can handle the heavy lifting.

As far as getting the video and messing with it goes, Max/Jitter is definitely your friend :) Maybe you'll find other things you want to do with it if the OCR thing falls through. You *could* try some basic recognition on simple text by comparing individual images (a submatrix of the main image) to stored sample letters/numbers (in their own matrices) and analyzing the difference in levels using jit.op @op - followed by abs, but you'd need to decide what to do when things aren't identical (which will be all the time). Tolerance? Fuzziness? Similar levels but wrong shape? Different sizes of letters to analyze, and how to scale them to fit? These are tough issues ... but it would definitely be cool if you got something working!
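
To make the "tolerance" question a bit more concrete, here is a rough Java sketch of the same subtract-and-abs idea done outside Jitter. Everything in it is an assumption for illustration: the class name GlyphMatcher, the tiny 3x3 "letters", and the tolerance value are made up, and it assumes the glyph has already been cut out and scaled to the template size, which is exactly the hard part mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: classify a small grayscale glyph by mean absolute difference against stored templates. */
class GlyphMatcher {

    /** Mean absolute difference between two equally sized grayscale images (values 0..255). */
    static double meanAbsDiff(int[][] a, int[][] b) {
        if (a.length != b.length || a[0].length != b[0].length) {
            throw new IllegalArgumentException("glyph and template must have the same size");
        }
        long sum = 0;
        for (int y = 0; y < a.length; y++) {
            for (int x = 0; x < a[0].length; x++) {
                sum += Math.abs(a[y][x] - b[y][x]);
            }
        }
        return (double) sum / (a.length * a[0].length);
    }

    /** Returns the label of the closest template, or null if even the closest one exceeds the tolerance. */
    static Character match(int[][] glyph, Map<Character, int[][]> templates, double tolerance) {
        Character best = null;
        double bestScore = Double.MAX_VALUE;
        for (Map.Entry<Character, int[][]> e : templates.entrySet()) {
            double score = meanAbsDiff(glyph, e.getValue());
            if (score < bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return bestScore <= tolerance ? best : null;
    }

    public static void main(String[] args) {
        // Hypothetical 3x3 templates; real letters would need far larger matrices.
        Map<Character, int[][]> templates = new LinkedHashMap<>();
        templates.put('I', new int[][] {{0, 255, 0}, {0, 255, 0}, {0, 255, 0}});
        templates.put('L', new int[][] {{255, 0, 0}, {255, 0, 0}, {255, 255, 255}});

        // A slightly noisy 'I': not identical to the template, but close.
        int[][] noisy = {{10, 240, 0}, {0, 250, 20}, {5, 255, 0}};
        System.out.println(match(noisy, templates, 30.0));
    }
}
```

Returning null when nothing is close enough is one way to handle "things aren't identical"; it does nothing for the "similar levels but wrong shape" problem, which would need a smarter measure than a plain average.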



I know OCR is very tough. But I have seen programs which implement it, EverNote for instance (although they use a server, I think). It doesn't really matter if it is Java or Max or anything else. If I can start OCR processing from the command line, I can also start it within an MXJ object. Server-side OCR can be handled this way too. So the actual question is: is there a free/cheap OCR application available that can be controlled through the command line / AppleScript / HTTP requests etc.?
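
For what it's worth, the "start it from the command line inside Java" part could look roughly like the sketch below (Java 11+). The tool is an assumption: Tesseract is used only as an example of a free OCR program whose command line prints the recognized text when given `stdout` as the output name. It is not something recommended earlier in this thread, and the class name CommandLineOcr is made up. Any other command-line OCR could be swapped in by changing buildCommand.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

/** Sketch: run an external OCR program on an image file and capture its text output. */
class CommandLineOcr {

    /** Builds the argument list. "<binary> <image> stdout" is the assumed Tesseract-style invocation. */
    static List<String> buildCommand(String ocrBinary, Path image) {
        return List.of(ocrBinary, image.toString(), "stdout");
    }

    /** Runs the command, returns whatever it wrote to standard output, and fails on a non-zero exit. */
    static String run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore the tool's diagnostics
                .start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("OCR command exited with status " + exit);
        }
        return out.trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: CommandLineOcr <image>");
            return;
        }
        // Assumes a "tesseract" binary is installed and on the PATH.
        System.out.println(run(buildCommand("tesseract", Path.of(args[0]))));
    }
}
```

Since the call blocks until the OCR program finishes, it would need to run on its own thread when called from an MXJ object, which fits the separate-thread plan from the first post.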


There is SubRip (Windows, with source) and D-Subtitler (Mac), which are specifically for
subtitle extraction from videos. But they expect canned videos, need training,
and are meant for very specific usage.

I think the best method is to find an OCR program and feed it the subtitled area
from your grabbed video feed as images.
I don't know about Mac, but for Windows there are many good ones, like OmniPage.
The better ones can be trained and automated.
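
A minimal Java sketch of the "feed it the subtitled area as images" step, under some loud assumptions: the subtitles are assumed to sit in the bottom strip of the frame, the 25% fraction is a guess, the class name SubtitleCrop is made up, and getting a Jitter frame into a BufferedImage is not shown (a blank image stands in for the grabbed frame).

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Sketch: cut the (assumed) subtitle strip out of a frame and save it for an OCR program. */
class SubtitleCrop {

    /** Returns the bottom part of the frame; fraction is the share of the height to keep (0 < fraction <= 1). */
    static BufferedImage cropBottom(BufferedImage frame, double fraction) {
        if (fraction <= 0 || fraction > 1) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        int h = Math.max(1, (int) Math.round(frame.getHeight() * fraction));
        // getSubimage shares pixel data with the original frame, so no copy is made here.
        return frame.getSubimage(0, frame.getHeight() - h, frame.getWidth(), h);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a grabbed video frame.
        BufferedImage frame = new BufferedImage(720, 576, BufferedImage.TYPE_INT_RGB);
        BufferedImage strip = cropBottom(frame, 0.25);

        // Write the strip to a temporary PNG that an external OCR program could read.
        File out = File.createTempFile("subtitle-strip", ".png");
        ImageIO.write(strip, "png", out);
        System.out.println(strip.getWidth() + "x" + strip.getHeight() + " -> " + out);
    }
}
```

Cropping first keeps the OCR program away from the rest of the picture, which it would otherwise try to read as text.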

