THESIS
2020
1 online resource (xvi, 109 pages) : illustrations (chiefly color)
Abstract
How humans look at their visual environment conveys rich information about them:
their attention, intent, and mental state. Eye gaze provides a spatiotemporal measure
of cognitive state and the visual processes that guide human behavior in various environments.
In this thesis, we first focus on humans and investigate how visual behavior
interacts with physiological state. We then consider how human gaze, which, as an indicator
of attention, reveals human strategies for selecting information in many tasks, can
benefit learning in artificial intelligence systems.
We collected measurements of eye gaze while human subjects viewed scenes from different
viewpoints. Frontal views are the most common view for humans. Eye gaze collected from
frontal views can reveal subjects' mental and cognitive state during interaction with the
environment. For example, the way they explore a scene with their eyes may reflect their
mood. The gaze of human experts may also provide rich information that can be exploited
by imitation learning. Bird's-eye views are less common: the human views a scene from
overhead, much like a player in many video games. Gaze collected from such views can
nevertheless prove useful, as humans are experts in understanding social rules and behaviors. Thus,
gaze can be used as an indicator of human strategy, especially in predicting interactions
between humans in dynamic environments.
We suggest several ways to utilize gaze information in artificial intelligence systems.
For vision-based autonomous driving, we propose to incorporate the gaze map using
gaze-modulated dropout, which de-emphasizes task-irrelevant information. For robot
navigation in crowds and the closely related problem of pedestrian trajectory prediction, we
use attention weights estimated from human gaze data to modulate interactions between
humans in graph-based models. We find that tasks that humans are experts in, like driving,
crowd navigation, and trajectory prediction, can benefit from the incorporation of
eye gaze models. Though eye gaze is not directly related to task performance, it provides
a way to incorporate auxiliary information about human strategies, which enables the
deep networks we describe to outperform the state of the art.
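
The abstract does not give the formulation of gaze-modulated dropout, so the following is only a minimal sketch of the general idea, under stated assumptions: each feature unit is dropped with a probability that decreases as the gaze density at its location increases, so that regions humans rarely look at are de-emphasized. The function name, the `max_drop` parameter, the linear mapping from gaze density to drop probability, and the 2D list layout are all hypothetical and are not taken from the thesis.

```python
import random


def gaze_modulated_dropout(features, gaze_map, max_drop=0.5, rng=None):
    """Zero out feature units, dropping low-gaze locations more often.

    features, gaze_map: 2D lists of equal shape (hypothetical layout).
    max_drop: drop probability where gaze density is zero (assumed parameter).
    """
    rng = rng or random.Random()
    # Peak gaze density, used to normalize the map to the range [0, 1].
    peak = max(max(row) for row in gaze_map)
    output = []
    for feature_row, gaze_row in zip(features, gaze_map):
        output_row = []
        for value, gaze in zip(feature_row, gaze_row):
            # An all-zero gaze map is treated as "no gaze anywhere".
            weight = gaze / peak if peak > 0 else 0.0
            # Assumed linear rule: drop probability falls as gaze rises.
            drop_prob = max_drop * (1.0 - weight)
            keep = rng.random() >= drop_prob
            output_row.append(value if keep else 0.0)
        output.append(output_row)
    return output


features = [[1.0, 2.0], [3.0, 4.0]]
gaze_map = [[1.0, 0.0], [0.0, 1.0]]
dropped = gaze_modulated_dropout(features, gaze_map, max_drop=1.0)
```

With `max_drop=1.0`, units at the gaze peak are always kept and units with zero gaze are always dropped; smaller values make the effect softer. This sketch does not rescale the surviving units, which standard dropout implementations typically do during training.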