
View Full Version : Disparity maps: Not smart enough



jwatte
05-04-2014, 11:27 PM
So, while waiting for UPS ground to slowly crawl the Onyx Fire II boards into the Amazon warehouses, I'm turning back to my stereo vision project for my rover.

Here's a screen shot of two calibration images and the disparity map calculated from them:

[Attachment 5530: left/right calibration images and the computed disparity map]

Thinking about how disparity maps work, I'm not sure that's actually such a good way to do 3D/depth reconstruction. I want to use this mostly for "navigation" and "obstacle detection," which means there will be a known ground plane that I could almost hard-code (or at least estimate from an IMU.) A better reconstruction algorithm would start from the theory that there is flat, unobstructed ground ahead, and then build up areas of evidence for that theory (passable) and areas of contrary evidence (blocked) based on what it already knows of the calibrated camera projection. The current OpenCV methods just use the projection to rectify the images, and then go back to matching pixels to each other along scanlines, which seems dumb.
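For concreteness, the scanline matching I'm complaining about boils down to something like this naive pure-NumPy sketch (not my rover code; OpenCV's StereoSGBM does the same 1D search with a smarter cost and smoothness terms):

```python
import numpy as np

def scanline_disparity(left, right, max_disp=16, block=5):
    """Naive 1D block matching along rectified scanlines -- the core of
    what stock disparity methods do, minus cost aggregation/smoothing."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):
                # Candidate patch shifted left by d in the right image.
                cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = float(np.abs(patch - cand).sum())
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

Note how the search is purely horizontal and purely per-pixel-color: nothing in it knows about a ground plane or any other scene structure.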

Ideas? Suggestions? Are there working algorithms like this implemented somewhere already?

Hey, what do you know? http://graphics.stanford.edu/~pmerrell/Pollefeys_UrbanReconstruction07.pdf

Will UrbanScape become open source any time soon? It *is* my tax money paying for it after all...

Gertlex
05-11-2014, 01:38 PM
I got nothing, but the other day I did randomly see your question on OpenCV's self-hosted stack overflow regarding this :D


A better reconstruction algorithm would start from the theory that there is flat, unobstructed ground ahead, and then build up areas of evidence for that theory (passable) and areas of contrary evidence (blocked) based on what it already knows of the calibrated camera projection.

I've yet to do any reading on calibration images and disparity maps and stereoscopic-y things. How would you handle the transition from e.g. grass to pavement? Or would you in the Robomagellan case desire to stay on grass, and thus default to treating the pavement as an "obstacle"?

jwatte
05-12-2014, 11:20 AM
How would you handle the transition from e.g. grass to pavement?

So, with stereo, the important part is whether the left and right cameras see the same color for the same projected world coordinate or not. As long as the pavement and the grass are level, the stereo should still get the same view.
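As a sketch of that ground-plane test: for rectified images, a plane's disparity is just linear in the image row, so you can shift each right scanline by the expected amount and look at the residual. The coefficients a and b here are placeholders that would come from calibration (or the IMU):

```python
import numpy as np

def ground_plane_residual(left, right, a, b):
    """For rectified stereo, the ground plane induces a disparity that is
    linear in the image row: d(y) = a*y + b.  Shift each right scanline
    by d(y) and compare colors; a small residual is evidence for 'flat
    ground here', a large one is evidence for an obstacle."""
    h, w = left.shape
    resid = np.full((h, w), np.nan, dtype=np.float32)
    for y in range(h):
        d = int(round(a * y + b))
        if 0 <= d < w:
            # Columns [0, d) have no counterpart; leave them NaN.
            resid[y, d:] = np.abs(left[y, d:] - right[y, :w - d])
    return resid
```

Thresholding that residual map (with some spatial pooling) would give the passable/blocked evidence directly, without ever computing a full disparity map.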

One assumption might be that the ground will have some unevenness, so you have to search the image for a "nearby" correlation. The delta of that correlation compared to the canonical ground-plane disparity would tell you how much higher/lower that patch is. For 3" grass, that would probably work fine!
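That "nearby correlation" search could look like this hypothetical helper, which probes a few pixels of disparity around the canonical ground value and returns the offset of the best match:

```python
import numpy as np

def height_hint(left, right, y, x, d_ground, search=3, half=2):
    """Search a small disparity window around the canonical ground-plane
    disparity d_ground at (y, x); the offset of the best match hints at
    how far above/below the ground plane this patch sits.
    (Hypothetical helper, not from the actual rover code.)"""
    patch = left[y - half:y + half + 1, x - half:x + half + 1]
    best_cost, best_delta = None, 0
    for delta in range(-search, search + 1):
        d = d_ground + delta
        cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
        cost = float(np.abs(patch - cand).sum())
        if best_cost is None or cost < best_cost:
            best_cost, best_delta = cost, delta
    return best_delta  # > 0 roughly means nearer/above the ground plane
```

For 3" grass the search window can stay tiny, which is much cheaper than the full disparity sweep.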

tician
05-12-2014, 12:53 PM
There was an independent stereo matching and disparity estimation library (http://www.cvlibs.net/software/libelas/) that we briefly experimented with in the lab a few years ago. Unfortunately, we were trying to use multi-megapixel images (full-res from Fuji stereo cameras) and it was crapping out on anything more than one megapixel. IIRC, I suspected the feature/reference point detection grid was too small relative to the image size, so it would find too many reference points and rarely be able to match them correctly. Not sure if that belief was correct, or if the multi-megapixel issue has been fixed since then.

jwatte
05-12-2014, 01:27 PM
Hmm. That still looks disparity based. And grayscale, to boot. Also, one second for a megapixel image? I'm looking for 10 Hz at 640x360...

Rectifying both images in real time currently takes about 30% of one CPU core at 30 Hz. That's software-only on an i5, and it's converting from YUYV to RGB at the same time. I can probably optimize that a little. I'm happy to throw one or two CPU cores at the "find stereo" problem.
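The YUYV-to-RGB part is just the standard 4:2:2 unpack; a vectorized NumPy sketch of it, using approximate BT.601 full-range coefficients (in real code you'd fuse this into the rectification remap so there's only one pass over the pixels):

```python
import numpy as np

def yuyv_to_rgb(frame, width, height):
    """Unpack a packed YUYV (YUV 4:2:2) frame to RGB.  Each 4-byte group
    [Y0, U, Y1, V] encodes two pixels sharing one U/V pair."""
    raw = frame.reshape(height, width // 2, 4).astype(np.float32)
    y = raw[:, :, (0, 2)].reshape(height, width)    # per-pixel luma
    u = np.repeat(raw[:, :, 1], 2, axis=1) - 128.0  # shared chroma U
    v = np.repeat(raw[:, :, 3], 2, axis=1) - 128.0  # shared chroma V
    r = y + 1.402 * v
    g = y - 0.344 * u - 0.714 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```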

I made some progress last night, using my "assume there's a ground plane" algorithm.

A red/blue visualization of the input data post rectification and ground plane reprojection:

[Attachment 5579: red/blue overlay of the rectified, ground-plane-reprojected input]

Note that things "in the ground plane" converge; things above that plane diverge.
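The visualization itself is just the two views dropped into different color channels; where they agree the overlay reads as gray, and disagreement shows as red/blue fringes that widen with height above the plane:

```python
import numpy as np

def red_blue_overlay(left_gray, right_gray):
    """Red channel = left view, blue channel = right view, green empty.
    In-plane features land on top of each other; off-plane features split."""
    h, w = left_gray.shape
    rgb = np.zeros((h, w, 3), dtype=left_gray.dtype)
    rgb[:, :, 0] = left_gray
    rgb[:, :, 2] = right_gray
    return rgb
```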
I'm considering 2D feature detection/matching to find "true spans" of matches/mismatches, rather than just the color-based 1D matching that disparity maps do.