What are they looking at? Automatic Simultaneous Dyadic Gaze Detection from Videos

Creative Commons Attribution (CC BY) 4.0 license
Abstract

Gaze is central to interaction: it allows us to engage visually with our environment and to infer the attention and intent of others. Gaze also plays a critical role in human-robot interaction (HRI) tasks such as object recognition and manipulation, since a robot can use gaze to direct its attention to specific objects or areas of interest in its environment. In this study, we automate gaze estimation for various types of gaze behaviour (such as turn-taking, joint attention, gaze following, gaze aversion, and mutual attention) in natural dyadic interaction, using only video and no wearable cameras or eye trackers, so the approach can be implemented on a robot. No single existing dataset covers all of the gaze and scene combinations addressed in this paper. We propose a model that leverages manual annotations of gaze targets in a natural dialogue setting and generates simultaneous gaze predictions for both parties in the video, together with attention heatmaps that localize the target object of interest in the scene; the model also produces out-of-scene gaze predictions. Our model outperforms existing baseline methods, and the generated data is available for the different categories of gaze.
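To make the prediction interface concrete, below is a minimal, hypothetical PyTorch sketch of what a simultaneous dyadic gaze predictor's inputs and outputs might look like: for each video frame, one gaze-target heatmap and one in-frame probability per person in the dyad. All names here (DyadicGazePredictor, the layer sizes, the 64x64 heatmap resolution) are illustrative assumptions for the abstract's description, not the authors' architecture.

import torch
import torch.nn as nn

class DyadicGazePredictor(nn.Module):
    """Toy stand-in: per-person gaze heatmap plus in-frame probability."""

    def __init__(self, heatmap_size: int = 64):
        super().__init__()
        # Small shared image encoder (placeholder for a real backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # One heatmap channel and one in/out-of-frame logit per person.
        self.heatmap_head = nn.Conv2d(32, 2, kernel_size=1)
        self.inframe_head = nn.Linear(32, 2)
        self.heatmap_size = heatmap_size

    def forward(self, frame: torch.Tensor):
        feats = self.backbone(frame)                     # (B, 32, H/4, W/4)
        # Attention heatmaps over the scene, one per party in the dyad.
        heatmaps = torch.sigmoid(
            nn.functional.interpolate(
                self.heatmap_head(feats),
                size=(self.heatmap_size, self.heatmap_size),
                mode="bilinear", align_corners=False,
            )
        )                                                # (B, 2, 64, 64)
        pooled = feats.mean(dim=(2, 3))                  # global average pool -> (B, 32)
        # Probability that each person's gaze target lies inside the scene;
        # low values correspond to out-of-scene gaze predictions.
        in_frame = torch.sigmoid(self.inframe_head(pooled))  # (B, 2)
        return heatmaps, in_frame

model = DyadicGazePredictor()
frame = torch.rand(1, 3, 224, 224)                       # one RGB video frame
heatmaps, in_frame = model(frame)
print(heatmaps.shape, in_frame.shape)                    # (1, 2, 64, 64) and (1, 2)

Predicting both parties jointly, rather than running a single-person gaze model twice, is what allows simultaneous outputs per frame, which is the behaviour the abstract describes; the out-of-scene head covers gaze targets that never appear in the camera's view.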
