Prediction Error and Actor-Critic Hypotheses in the Brain.- Reviewing on-policy / off-policy critic learning in the context of Temporal Differences and Residual Learning.- Reward Function Design in Reinforcement Learning.- Exploration Methods In Sparse Reward Environments.- A Survey on Constraining Policy Updates Using the KL Divergence.- Fisher Information Approximations in Policy Gradient Methods.- Benchmarking the Natural gradient in Policy Gradient Methods and Evolution Strategies.- Information-Loss-Bounded Policy Optimization.- Persistent Homology for Dimensionality Reduction.- Model-free Deep Reinforcement Learning — Algorithms and Applications.- Actor vs Critic.- Bring Color to Deep Q-Networks.- Distributed Methods for Reinforcement Learning.- Model-Based Reinforcement Learning.- Challenges of Model Predictive Control in a Black Box Environment.- Control as Inference?
Boris Belousov is a Ph.D. student at Technische Universität Darmstadt, Germany, advised by Prof. Jan Peters. He received his M.Sc. degree from the University of Erlangen-Nuremberg, Germany, in 2016, supported by a DAAD scholarship for academic excellence. Boris is now working toward combining optimal control and information theory with applications to robotics and reinforcement learning.
Hany Abdulsamad is a Ph.D. student at the TU Darmstadt, Germany. He graduated with a Master’s degree in Automation and Control from the faculty of Electrical Engineering and Information Technology at the TU Darmstadt. His research interests range from optimal control and trajectory optimization to reinforcement learning and robotics. Hany’s current research focuses on learning hierarchical structures for system identification and control.
After graduating with a Master’s degree in Autonomous Systems from the Technische Universität Darmstadt, Pascal Klink pursued his Ph.D. studies at the Intelligent Autonomous Systems Group of the TU Darmstadt, where he developed methods for reinforcement learning in unstructured, partially observable real-world environments. Currently, he is investigating curriculum learning methods and how to use them to facilitate learning in these environments.
Simone Parisi joined Prof. Jan Peters’ Intelligent Autonomous Systems lab in October 2014 as a Ph.D. student. Before pursuing his Ph.D., Simone completed his M.Sc. in Computer Science Engineering at the Politecnico di Milano, Italy, and at the University of Queensland, Australia, under the supervision of Prof. Marcello Restelli and Dr. Matteo Pirotta. Simone is currently working to develop reinforcement learning algorithms that can achieve autonomous learning in real-world tasks with little to no human intervention. His research interests include, among others, reinforcement learning, robotics, dimensionality reduction, exploration, intrinsic motivation, and multi-objective optimization. He has collaborated with Prof. Emtiyaz Khan and Dr. Voot Tangkaratt of RIKEN AIP in Tokyo, and his work has been presented at universities and research institutes in the US, Germany, Japan, and Holland.
Jan Peters is a Full Professor of Intelligent Autonomous Systems at the Computer Science Department of the Technische Universität Darmstadt and an adjunct senior research scientist at the Max-Planck Institute for Intelligent Systems, where he heads the Robot Learning Group (combining the Empirical Inference and Autonomous Motion departments). Jan Peters has received numerous awards, most notably the Dick Volz Best US PhD Thesis Runner Up Award, the Robotics: Science & Systems - Early Career Spotlight Award, the IEEE Robotics & Automation Society’s Early Career Award, and the International Neural Networks Society’s Young Investigator Award.
This book reviews research developments in diverse areas of reinforcement learning such as model-free actor-critic methods, model-based learning and control, information geometry of policy searches, reward design, and exploration in biology and the behavioral sciences. Special emphasis is placed on advanced ideas, algorithms, methods, and applications.
The contributed papers gathered here grew out of a lecture course on reinforcement learning taught by Prof. Jan Peters in the winter semester of 2018/2019 at Technische Universität Darmstadt.
The book is intended for reinforcement learning students and researchers with a firm grasp of linear algebra, statistics, and optimization. Nevertheless, all key concepts are introduced in each chapter, making the content self-contained and accessible to a broader audience.