THESIS
2020
xii, 103 pages : illustrations ; 30 cm
Abstract
Biological vision systems are often active and rely on a number of eye movements to sense and interact with the environment through the perception-action cycle. Remarkably, these vision systems have the ability to actively explore the environment and autonomously self-calibrate, but the underlying mechanisms are still poorly understood. In this thesis, we apply the active efficient coding framework, a model of the joint development of both perception and behavior, to model the learning of eye movements during the saccade-fixation cycle.
First, we integrate saliency-driven saccades and vergence learning under the active efficient coding framework. We propose a binocular saliency model, called BAIM (Binocular Attention based on Information Maximization), to drive saccades based on learned binocular feature extractors, which simultaneously encode both depth and texture information. Our results show that saliency-driven saccades lead to better vergence performance and faster learning than random saccades.
Second, we investigate the simultaneous learning of vergence and saccadic control. We propose to apply the active efficient coding framework to learn the saccadic control strategy, rather than using a hand-crafted saliency-driven saccadic control policy. Our results show that the learned saccadic control method leads to a policy similar to the saliency-driven policy, but relies on fewer prior assumptions.
Third, we extend the active efficient coding framework to model the developmental process of torsional eye movements, which control the rotation of the eyes around the lines of sight. The learned torsion control policies are consistent with biological findings. In particular, the learned eye torsion control follows Listing’s law. The learning system can also overcome an orientation misalignment between the left and right cameras, eliminating the need for manual calibration.
Finally, we combine the learning of all the eye movements investigated in this thesis, i.e., vergence, torsion, and saccades. The unified framework has one common perceptual representation, which is used as the input to the networks controlling all of these eye movements. This perceptual representation also determines the reward signal optimized during the reinforcement learning of these controllers. The learned representation and the learned policies are similar to those in the separate learning cases, which demonstrates that all these eye movements can be learned simultaneously under the active efficient coding framework. To our knowledge, this is the first model that simultaneously accounts for the learning of vergence, torsion, and saccades using a unified framework.