# [ann] Computer Vision Segmentation Patch for download

Hey everyone,

I’ve been working on a little computer vision patch implementing an interesting paper I read a few weeks ago. To that end, I made 2 new externals along the way. One calculates the median of an image. If given a 3D buffer, it calculates the temporal median at each pixel, providing a nice and simple way to acquire the background image of a scene. The other external implements a segmentation algorithm called the "A Contrario Method".
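To make the temporal median idea concrete, here is a minimal sketch in plain Python. It is not the external's code: the function name `temporal_median` and the nested-list frame layout are assumptions made for illustration. The point is only that taking the median over time at each pixel rejects objects that pass through briefly.

```python
from statistics import median

def temporal_median(frames):
    """Per-pixel median over a list of equally sized 2D grayscale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]

# Three 2x2 frames: the background value is 10, and a bright "object" (200)
# occupies a different pixel in each frame.
frames = [
    [[10, 10], [10, 200]],
    [[10, 200], [10, 10]],
    [[200, 10], [10, 10]],
]
background = temporal_median(frames)  # the moving object is voted out
```

Because the object covers each pixel in only one of the three frames, the median at every pixel is the background value.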

It uses a statistical distance measure to find the most meaningful portions of an image by comparing the gradient of the background image with the gradient of a frame. I’ve implemented most of the algorithm here but haven’t done the final refinement step. Also, the code is highly beta and is thus very unoptimized.

In the download package, I’ve provided the new externals compiled for OSX as well as the patch and source code for the algorithm’s external. There’s also a schematic of the patch/source code and a PDF of the paper. If anyone wants a Windoze version, contact me and if I get bugged enough, I’ll make one.

Here’s the link: http://www.mat.ucsb.edu/~whsmith/vision.html . The download link is at the bottom of the page.

cheers,

wes

looks promising

please consider that windows port

i was a bit quick to post.

this subject is very close to my heart.

a lot of times i come across papers like http://www.google.co.il/search?l&q=background+moving+segmentation but i never could figure out what level of mathematics is needed to translate most of those algorithms to machine language.

please, if you can, tell me a bit about your level of math and, even better, the process of tackling such a task.

hope it’s not too much

yair

Hi Yair,

I too am often frustrated by computer vision and graphics papers. Often (especially in journal papers), they leave so much out. It may not seem so when you read the paper, but when you go to implement it, there are many seemingly small decisions that are actually implicit assumptions of the paper. These can be really tricky to sort out.

Something that really annoys me about the 2 fields above is the lack of shared code. Sharing it would make things so much better, as more people could gain access to the ideas through seeing their implementation.

For this paper in particular, I was having quite a time figuring out the statistical measure of significance. At first, I was going off of the IEEE version of the paper. Fortuitously, I went to one of the authors’ webpages and he had an extended version of the paper with a simplified and approximated significance measure. This saved my ass and I was able to proceed.

That said, my math skills are quite good as I have a degree in electrical engineering. I’m mostly limited by obscure notation and lack of details in a paper, and somewhat by a lack of thorough background in the field, although this is changing as I read and implement more. Basically, it requires banging your head against the screen for many months.

As far as the maths are concerned, you should know a bit of numerical analysis and statistics. One thing that’s really useful to know is how to take the gradient of an image.
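For anyone who has not taken the gradient of an image before, here is a minimal sketch using central differences, written in plain Python. It is not the code from the externals: the function name `image_gradient` and the choice to fall back to one-sided differences at the borders are assumptions made for illustration.

```python
import math

def image_gradient(img):
    """Return (magnitude, angle) images for a 2D grayscale image.

    Interior pixels use central differences; border pixels fall back to
    one-sided differences by clamping the neighbour index to the image.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            # Divide by the actual pixel spacing so borders are scaled correctly.
            gx = (img[y][x1] - img[y][x0]) / (x1 - x0) if x1 > x0 else 0.0
            gy = (img[y1][x] - img[y0][x]) / (y1 - y0) if y1 > y0 else 0.0
            mag[y][x] = math.hypot(gx, gy)   # gradient strength
            ang[y][x] = math.atan2(gy, gx)   # gradient direction in radians
    return mag, ang

# A horizontal ramp: brightness increases by 1 per pixel from left to right.
ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
mag, ang = image_gradient(ramp)
```

On the ramp, every pixel has magnitude 1 and direction 0, including the border pixels, which is the kind of sanity check that helps when the edge handling is in doubt.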

best,

wes

PS: I didn’t mention this in my original email, but I was quite lazy in handling the boundary conditions of the gradient function in the xray.jit.probsegment code. The way I did it is quite wrong, but I was just trying to get the algorithm working in the first place and didn’t really care about the edge pixels too much.

wes

Hey,

I just finished a spatial 3×3 median filter external for jitter as well. I have run across this filter many times while reading CV journals. It often runs just after thresholding to remove small salt/pepper noise. A close operator is then often applied.
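The threshold-then-median step can be sketched in a few lines of plain Python. This is only an illustration, not the external's implementation: the function names `threshold` and `median3x3`, the threshold level, and the decision to copy border pixels unchanged are all assumptions.

```python
from statistics import median

def threshold(img, level):
    """Binary mask: 1 where the pixel is brighter than level, else 0."""
    return [[1 if v > level else 0 for v in row] for row in img]

def median3x3(img):
    """Spatial 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

# A dark 5x5 image with a single bright speck of "salt" noise in the middle.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 255
mask = threshold(img, 128)   # the speck survives thresholding
cleaned = median3x3(mask)    # the median removes the isolated pixel
```

An isolated pixel never forms the majority of any 3×3 window, so the filter removes it while larger blobs survive.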

It runs quite quickly.

I also have a Hough transform (for line finding) for jitter if anyone is interested in trying it out.

If anyone is interested in testing these, please let me know. More are on the way.

Christopher

I suspected from your previous emails that you might’ve made a median filter. The Hough transform filter sounds quite interesting. I implemented a really really slow one as a patcher using GL render to a matrix and accumulating the rendered curves. I love how the images look from the Hough transform. Can’t wait to see what else you’ve got in store for us.
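For readers who have not met the Hough transform, here is a minimal sketch of the accumulation idea in plain Python. It is neither the patcher nor the external mentioned in this thread: the function name `hough_lines`, the one-degree angle step, and the integer rounding of rho are assumptions. Each edge point votes along the curve rho = x·cos(theta) + y·sin(theta), and points lying on one line pile their votes into the same bin.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (theta, rho) space for a list of (x, y) points."""
    diag = int(math.ceil(math.hypot(width, height)))
    # rho can be negative, so bins are shifted by diag to index the list.
    acc = [[0] * (2 * diag + 1) for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[t][rho + diag] += 1
    return acc, diag

# Ten points on the horizontal line y = 5 in a 10x10 image.
points = [(x, 5) for x in range(10)]
acc, diag = hough_lines(points, 10, 10)
votes = acc[90][5 + diag]  # bin for theta = 90 degrees, rho = 5
```

All ten points vote for the bin at theta = 90 degrees and rho = 5. With bins this coarse, neighbouring angles can collect just as many votes, so a real line finder would still need some peak-picking strategy.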

wes

Have you done any work with jit.gl.slab and computer vision? A median filter, dilate, erode, etc. were included for jit.gl.slab. A nice start. I’ve seen optical flow, correspondence, Sobel, Canny, and a host of other standard algorithms ported to the GPU using Cg etc. (the OpenVIDIA project, for example). Some interesting implementations of computer vision algos have also been ported to Quartz Composer, which is an interesting piece of software.

I have only implemented rudimentary convolution-type algos as pixel shaders. This is definitely an interesting way to go, although ofttimes I want to use the resulting data for control signals, which means bringing something back into software…not a very efficient thing right now. Plus, the data is usually floating point, so having floating point textures more widely supported would be great as well.

wes

Remember this? For those interested, I’ve updated the source to use a lookup table for calculating the arctangent used in finding the gradient of the video. It’s a really crude piecewise linear LUT, but it’s accurate enough and it gives about 3 fps better performance on 320×240 video. The earl is http://www.mat.ucsb.edu/~whsmith/vision.html . The new source is linked to at the bottom of the page. For now, only OSX is recompiled, so if you want Windows, download the OSX package and compile it with Cygwin or Visual Studio.
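As an illustration of the lookup-table idea, here is a minimal sketch of a piecewise linear arctangent in plain Python. It is not the LUT from the updated source: the table size of 64 and the names `ATAN_LUT`, `atan_lut` and `atan2_lut` are assumptions. The table covers ratios from 0 to 1, and symmetry extends it to all four quadrants.

```python
import math

LUT_SIZE = 64  # number of linear segments; an arbitrary choice for this sketch
ATAN_LUT = [math.atan(i / LUT_SIZE) for i in range(LUT_SIZE + 1)]

def atan_lut(r):
    """Approximate atan(r) for 0 <= r <= 1 by linear interpolation."""
    pos = r * LUT_SIZE
    i = min(int(pos), LUT_SIZE - 1)  # clamp so that i + 1 stays in the table
    frac = pos - i
    return ATAN_LUT[i] + frac * (ATAN_LUT[i + 1] - ATAN_LUT[i])

def atan2_lut(y, x):
    """Approximate math.atan2(y, x) using the table and quadrant symmetry."""
    if x == 0 and y == 0:
        return 0.0
    ax, ay = abs(x), abs(y)
    # Keep the ratio in [0, 1]; use atan(r) = pi/2 - atan(1/r) for steep angles.
    a = atan_lut(ay / ax) if ay <= ax else math.pi / 2 - atan_lut(ax / ay)
    if x < 0:
        a = math.pi - a
    return -a if y < 0 else a

angle = atan2_lut(1.0, 1.0)  # close to pi / 4
```

In Python the table will not be faster than `math.atan2`; the sketch only shows the structure. In a C external the same layout replaces a library call per pixel with one table read and one multiply-add.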

best

wes