Gaze is known to be a dominant modality for conveying spatial information, and it has been used for grounding in human-robot dialogues. In this work, we present the prototype of a gaze-supported multi-modal dialogue system that enhances two core tasks in human-robot collaboration:
- our robot is able to learn new objects and their location from user instructions involving gaze, and
- it can instruct the user to move objects and passively track this movement by interpreting the user’s gaze.
We performed a user study to investigate the impact of different eye trackers on user performance. In particular, we compare a head-worn device and an RGB-based remote eye tracker. Our results show that the head-mounted eye tracker outperforms the remote device in terms of task completion time and the required number of utterances due to its higher precision.