Friday, May 11, 2007

DARPA Urban Challenge: Obstacle avoidance - Isolating important cues

Several parties have asked me for an overview of the obstacle avoidance schemes that look most promising. We (Pegasus) are still evaluating some of them, so this entry should not be read as one of our algorithm unveiling entries but rather as a general overview we put together a while back. The main point of this entry is to define a hardware + software solution for localizing the cues that will later be learned/categorized. One large component of an obstacle avoidance solution is the machine learning/statistical device used to rapidly decide whether these cues are problematic for the autonomous vehicle. This is not unlike what the human cortex does (see reference section [1] [2] [3]).

In the case of the Urban Challenge, we are facing not only stationary obstacles but moving ones as well. The moving obstacles have behaviors that one needs to learn too. In other words, in the case of vision, a lot of the work boils down to producing a small number of cues/features from a very large set of data (the pixels of an image). In some areas of computer science this is called dimensionality reduction.
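
As a rough illustration of what dimensionality reduction means here, the sketch below uses principal component analysis (PCA) to turn 256-pixel patches into 3 features each. PCA is only one generic choice and is not a scheme we have committed to; the synthetic patches, the sizes and the function name pca_reduce are all assumptions made for the example.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X (n_samples x n_pixels) onto the top-k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    features = Xc @ components.T
    return features, components, mean

# Synthetic stand-in for image patches (assumption): 200 patches of 16x16 pixels
# that really depend on only 3 hidden factors, plus a little noise.
rng = np.random.default_rng(0)
n_samples, n_pixels, n_latent = 200, 256, 3
latent = rng.normal(size=(n_samples, n_latent))
basis = rng.normal(size=(n_latent, n_pixels))
patches = latent @ basis + 0.01 * rng.normal(size=(n_samples, n_pixels))

features, components, mean = pca_reduce(patches, k=3)

# Check how much is lost when each patch is described by 3 numbers instead of 256.
reconstruction = features @ components + mean
rel_error = np.linalg.norm(patches - reconstruction) / np.linalg.norm(patches)
print(features.shape, rel_error)
```

Real road imagery will not sit this neatly in a low-dimensional subspace, which is why the choice of cues (and of the sensor producing them) matters so much.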

  1. Stereo-imaging:
    1. The fly algorithm, a robust stereo algorithm that runs a genetic algorithm in real time (yes, there is such a thing as a real-time genetic algorithm!). It has been tried on cars, specifically to avoid people and other objects. The initial thesis on the algorithm is in French. Improvements since the thesis have focused on the car driving experience.
    2. There are also numerous commercial solutions, as listed by the folks at V2_lab, who discuss each of them. I found this entry pretty revealing about the state of affairs with regard to stereovision; you have to look at the comment section:
      For most stereo matching algorithms the Firewire cameras produce higher quality uncompressed images that do not wreak havoc on sensitive feature detectors. I use the Unibrain Fire-I board camera http://www.unibrain.com/index.html with the CMU 1394 Digital Camera API (Windows), which gives you very complete control and works with just about any Firewire camera, because they all use the same standard interface. http://www.cs.cmu.edu/~iwan/1394/ . When I read the technical reports for the 2005 DARPA Grand Challenge almost every report showed pictures of vehicles equipped with stereo pairs of cameras, but at the race just about all of them had been removed, presumably because of basic issues such as camera synchronization.


  2. Monocular evaluation of the distance field. There are two approaches that caught our attention:
    1. Vision-based Motion Planning for an Autonomous Motorcycle on Ill-Structured Road, Dezhen Song, Hyun Nam Lee, Jingang Yi and Anthony Levandowski from the bike entry at the last Grand Challenge, and
    2. Depth Estimation Using Monocular and Stereo Cues / 2197, Ashutosh Saxena, Jamie Schulte, Andrew Y. Ng, and Learning Depth from Single Monocular Images, Ashutosh Saxena, Sung Chung, and Andrew Y. Ng. In NIPS 18, 2006. [ps, pdf]
    3. (With regard to monocular information, we should not forget the excellent Mono-SLAM: this website has a Matlab implementation of SLAM using monocular vision. There is also a timely thesis on the subject here, which appears to use two cameras, both running the MonoSLAM algorithm.)
  3. A Random Lens Imager: a hardware implementation of a totally new concept in data processing known as compressed sensing (don't ask anybody around you about it because it is too new). It needs only one camera, but much of the work goes into the calibration.
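
To make the compressed sensing idea in the last item a bit more concrete, here is a minimal sketch: a sparse signal is measured through a random matrix with fewer rows than unknowns, then recovered with orthogonal matching pursuit (OMP). This is a generic textbook setup, not the Random Lens Imager's actual reconstruction method. The Gaussian matrix A stands in for whatever the calibration step would estimate, and the sizes and the omp function are assumptions made for the example.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: look for a k-sparse x such that A @ x is close to y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with what is still unexplained.
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        # Re-fit all selected columns jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 128, 64, 3             # 128 unknowns, 64 measurements, 3 nonzeros

# Hypothetical measurement matrix; in a real imager this comes from calibration.
A = rng.normal(size=(m, n))
A /= np.linalg.norm(A, axis=0)   # unit-norm columns

x = np.zeros(n)
x[[5, 40, 99]] = [2.0, -1.5, 1.0]  # the sparse scene
y = A @ x                          # what the sensor records

x_hat = omp(A, y, k)
print(np.flatnonzero(np.abs(x_hat) > 1e-6))
```

The point of the sketch is that recovery depends entirely on knowing A accurately, which is consistent with most of the effort going into calibration.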

References:

[1] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio. Object recognition with cortex-like mechanisms. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 29 (3), pp. 411-426, 2007

[2] T. Serre. Learning a dictionary of shape-components in visual cortex: Comparison with neurons, humans and machines, PhD Thesis, CBCL Paper #260/MIT-CSAIL-TR #2006-028, Massachusetts Institute of Technology, Cambridge, MA, April 2006
(Pages 154-163 have the model parameters.)

[3] T. Serre, L. Wolf and T. Poggio. Object recognition with features inspired by visual cortex. In: Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), IEEE Computer Society Press, San Diego, June 2005

Extended paper: [CBCL Paper #243/AI Memo #2004-026] and the code is here.

1 comment:

Anonymous said...

We actually have an implementation meant for obstacle avoidance in cars at:
http://ai.stanford.edu/~asaxena/rccar/

I was planning to upload the code as well.
