Introduction.- State-of-the-Art.- World Model Representation.- Methods for Robust and Accurate Image Acquisition.- Environmental Visual Object Recognition.- Visual Uncertainty Model of Depth Estimation.- Global Visual Localization.- Conclusion and Future Work.- Bibliography.
This book provides an overview of model-based environmental visual perception for humanoid robots. The visual perception of a humanoid robot creates a bidirectional bridge connecting sensor signals with internal representations of environmental objects. The objective of such perception systems is to answer two fundamental questions: What & where is it? To answer these questions using a sensor-to-representation bridge, coordinated processes are conducted to extract and exploit cues matching robot’s mental representations to physical entities. These include sensor & actuator modeling, calibration, filtering, and feature extraction for state estimation. This book discusses the following topics in depth:
• Active Sensing: Robust probabilistic methods for optimal, high dynamic range image acquisition are suitable for use with inexpensive cameras. This enables ideal sensing in arbitrary environmental conditions encountered in human-centric spaces. The book quantitatively shows the importance of equipping robots with dependable visual sensing.
• Feature Extraction & Recognition: Parameter-free, edge extraction methods based on structural graphs enable the representation of geometric primitives effectively and efficiently. This is done by eccentricity segmentation providing excellent recognition even on noisy & low-resolution images. Stereoscopic vision, Euclidean metric and graph-shape descriptors are shown to be powerful mechanisms for difficult recognition tasks.
• Global Self-Localization & Depth Uncertainty Learning: Simultaneous feature matching for global localization and 6D self-pose estimation are addressed by a novel geometric and probabilistic concept using intersection of Gaussian spheres. The path from intuition to the closed-form optimal solution determining the robot location is described, including a supervised learning method for uncertainty depth modeling based on extensive ground-truth training data from a motion capture system.
The methods and experiments are presented in self-contained chapters with comparisons and the state of the art. The algorithms were implemented and empirically evaluated on two humanoid robots: ARMAR III-A & B. The excellent robustness, performance and derived results received an award at the IEEE conference on humanoid robots and the contributions have been utilized for numerous visual manipulation tasks with demonstration at distinguished venues such as ICRA, CeBIT, IAS, and Automatica.