Decoupling deep learning and reinforcement learning for stable and efficient deep policy gradient algorithms
Abstract
This thesis explores the emerging field of deep reinforcement learning (Deep RL), which combines well-known reinforcement learning algorithms with recently developed deep learning techniques. With Deep RL it is possible to train agents that perform well in their environment without any prior knowledge. Deep RL agents learn solely from the low-level percepts, such as vision and sound, that they observe while interacting with the environment.
Combining deep learning and reinforcement learning is not an easy task, and many different methods have been proposed. In this thesis I explore a novel method for combining these two techniques that matches the performance of a state-of-the-art deep reinforcement learning algorithm on the Atari game Pong, while requiring fewer samples.