Autonomous mobile robots operating in realistic, populated environments require powerful perceptual capabilities to safely navigate through their surroundings, detect potential users, and interact with them. Our research group is equipped with a MetraLabs Scitos G5 mobile platform. It is mainly used in student projects to introduce motivated students to the challenging fields of computational perception, human-machine interaction, and autonomous robot control.
Previous related project groups (Partybot, CampusGuide) developed software for the Scitos platform, including a robotic assistant that uses web-based services such as OpenStreetMap for navigation, and a service-based attention architecture realized with OSGi.
When persons interact, they use non-verbal cues to direct each other's attention towards objects of interest. Achieving joint attention this way is an important aspect of natural communication. Most importantly, it makes it possible to couple verbal descriptions with the visual appearance of objects, provided the referred-to object is indicated non-verbally. In this contribution, we present a system that utilizes bottom-up saliency and pointing gestures to efficiently identify pointed-at objects. Furthermore, the system focuses the visual attention by steering a pan-tilt-zoom camera towards the object of interest and thus provides a suitable model view for SIFT-based recognition and learning. We demonstrate the practical applicability of the proposed system through experimental evaluation in different environments with multiple pointers and objects.
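To illustrate the core idea of combining bottom-up saliency with a pointing gesture, the following is a minimal sketch, not the actual implementation: it assumes a precomputed 2D saliency map and a pointing ray estimated in image coordinates, weights the saliency by a Gaussian cone around the ray, and selects the best-supported pixel as the pointed-at object hypothesis. All names and parameters (e.g. `sigma_deg`) are illustrative assumptions.

```python
import numpy as np

def select_pointed_target(saliency, origin, direction, sigma_deg=15.0):
    """Weight a bottom-up saliency map by angular proximity to a
    pointing ray and return the (row, col) of the winning location.

    saliency  : 2D array of saliency values (higher = more salient)
    origin    : (row, col) of the pointing hand in image coordinates
    direction : (drow, dcol) vector of the pointing direction
    sigma_deg : width of the Gaussian cone around the ray (assumed)
    """
    h, w = saliency.shape
    rows, cols = np.mgrid[0:h, 0:w].astype(float)
    # Vectors from the pointing hand to every pixel.
    vr, vc = rows - origin[0], cols - origin[1]
    norm = np.hypot(vr, vc) + 1e-9
    d = np.asarray(direction, dtype=float)
    d /= np.hypot(*d)
    # Angle between the pointing ray and each pixel direction.
    cos_a = (vr * d[0] + vc * d[1]) / norm
    ang = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    # Gaussian cone: pixels far off the ray (or behind it) score ~0.
    cone = np.exp(-0.5 * (ang / sigma_deg) ** 2)
    score = saliency * cone
    return np.unravel_index(np.argmax(score), score.shape)
```

In the full system, the selected image location would then drive the pan-tilt-zoom camera so that the object fills the view for SIFT-based recognition; that control step is omitted here.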