Building and leveraging implicit models for policy gradient methods

HKUST Electronic Theses

by Zachary William Wellmer

THESIS 2019

M.Phil. Computer Science and Engineering

ix, 56 pages : illustrations ; 30 cm

Abstract

In this thesis, we study Policy Prediction Network and Policy Tree Network, both of which are deep reinforcement learning architectures that offer ways to improve sample complexity and performance on continuous control problems. Furthermore, Policy Tree Network offers the ability to trade extra computation at test time for improved performance via decision-time planning. Performance gains are still observed even without decision-time planning (i.e., at no extra computation cost relative to the model-free baseline). Our approach combines model-free and model-based reinforcement learning. Policy Prediction Network is the first to introduce an implicit model-based approach to Policy Gradient algorithms in continuous action spaces. Policy Tree Network is the first t...

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: M.Phil.
Department: Computer Science and Engineering
Authors: Wellmer, Zachary William
Subjects: Reinforcement learning; Machine learning; Implicit learning
Language: English
Call number: Thesis CSED 2019 Wellme
DOI: 10.14711/thesis-991012757568703412
