To achieve significant levels of autonomy, legged robot behaviors require perceptual awareness of both the terrain for traversal and the structures and objects in their surroundings for planning, obstacle avoidance, and high-level decision making. In this work, we present a perception engine for legged robots that extracts the information needed to develop semantic, contextual, and metric awareness of their surroundings. Our custom sensor configuration consists of (1) an active depth sensor, (2) two monocular cameras looking sideways, (3) a passive stereo sensor observing the terrain, (4) a forward-facing active depth camera, and (5) a rotating 3D LIDAR with a large vertical field-of-view (FOV). The mutual overlap in the sensors' FOVs allows us to redundantly detect and track both dynamic and static objects. We fuse class masks generated by a semantic segmentation model with LIDAR and depth data to accurately identify and track individual instances of dynamically moving objects. In parallel, active depth and passive stereo streams of the terrain are fused to map the terrain using the on-board GPU. We evaluate the engine using two different humanoid behaviors, (1) look-and-step and (2) track-and-follow, on the Boston Dynamics Atlas.
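The abstract does not spell out how class masks are fused with LIDAR and depth data, so the following is only a minimal sketch of one common approach, not the authors' pipeline. It assumes the 3D points have already been transformed into the camera frame, that the camera follows an ideal pinhole model with intrinsics `K`, and that the segmentation model outputs a per-pixel class-id image. The function names `label_points_with_mask` and `class_centroid` are hypothetical.

```python
import numpy as np


def label_points_with_mask(points_cam, K, class_mask):
    """Assign a class id to each 3D point by projecting it into a class mask.

    Assumptions (not from the paper): points_cam is (N, 3) in the camera
    frame with z pointing forward, K is a (3, 3) pinhole intrinsics matrix,
    and class_mask is an (H, W) integer image of per-pixel class ids.
    Returns an (N,) integer array; -1 marks points behind the camera or
    outside the image.
    """
    points_cam = np.asarray(points_cam, dtype=float)
    h, w = class_mask.shape
    labels = np.full(len(points_cam), -1, dtype=int)

    # Only points in front of the camera can be projected.
    idx = np.nonzero(points_cam[:, 2] > 0.0)[0]
    pts = points_cam[idx]

    # Pinhole projection: [u*z, v*z, z]^T = K [x, y, z]^T
    uvw = pts @ np.asarray(K, dtype=float).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep projections that land inside the image bounds.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels[idx[inside]] = class_mask[v[inside], u[inside]]
    return labels


def class_centroid(points_cam, labels, class_id):
    """Mean 3D position of all points labeled class_id, or None if there are none."""
    points_cam = np.asarray(points_cam, dtype=float)
    selected = labels == class_id
    if not np.any(selected):
        return None
    return points_cam[selected].mean(axis=0)


# Toy example: a 100x100 mask with a single class-1 region in the center.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
mask = np.zeros((100, 100), dtype=int)
mask[40:60, 40:60] = 1
points = np.array([[0.0, 0.0, 2.0],    # projects to the mask center
                   [1.0, 0.0, 2.0],    # projects outside the image
                   [0.0, 0.0, -1.0],   # behind the camera
                   [0.4, 0.0, 2.0]])   # projects onto background
labels = label_points_with_mask(points, K, mask)
centroid = class_centroid(points, labels, class_id=1)
```

A centroid like this could seed a per-object tracker; a real system would also need extrinsic calibration between sensors, time synchronization, and clustering to separate individual instances of the same class.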
Details
Title
Perception Engine Using a Multi-Sensor Head to Enable High-level Humanoid Robot Behaviors
Publication Details
2022 International Conference on Robotics and Automation (ICRA), pp. 9251-9257
Resource Type
Conference proceeding
Conference
International Conference on Robotics and Automation (ICRA) (Philadelphia, PA, USA, 05/23/2022–05/27/2022)
Publisher
IEEE
Grant note
N00014-19-1-2023 / ONR (10.13039/100000006)
80NSSC20M0197 / NASA (10.13039/100000104)