Continuous Action Reinforcement Learning Automata, an RL technique for controlling production machines

Title: Continuous Action Reinforcement Learning Automata, an RL technique for controlling production machines
Publication Type: Thesis
Year of Publication: 2013
Authors: Rodriguez, A
University: Vrije Universiteit Brussel
City: Brussels, Belgium

Reinforcement learning (RL) has been used as an alternative to model-based techniques for learning optimal controllers. The central theme in reinforcement learning research is the design of algorithms that learn control policies solely from transition samples or trajectories, which are collected through online interaction with the system. Such controllers have become part of our daily life, being present in almost everything from home appliances to highly complex industrial machines. This expansion of the frontiers of RL applications has in turn required learning techniques to face increasingly complex learning problems.

This dissertation centers on the changes needed for an existing simple RL algorithm, the continuous action reinforcement learning automaton (CARLA), to control production machines. To introduce the results, we first present background information on the general framework for bandit applications, Markov decision processes (MDPs), and RL for solving problems with discrete and continuous state-action spaces. The changes to the algorithm aim to reduce the computational and sampling cost and to improve local and global convergence. The standard CARLA method cannot deal with state information. In this dissertation we therefore also propose a method that takes state information into account in order to solve more complex problems.
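For readers unfamiliar with CARLA, the basic stateless idea can be illustrated with a minimal sketch. It is not the implementation from the dissertation. It assumes the commonly described scheme in which the learner keeps a probability density over a bounded action interval, samples an action from it, and reinforces the density around that action with a Gaussian bump scaled by a reward in [0, 1] before renormalizing. The grid discretization, the class name GridCarla, the parameters bins, spread and rate, and the toy reward in run_demo are all assumptions made for illustration.

```python
import math
import random


class GridCarla:
    """Grid-based sketch of a CARLA-style learner on [lo, hi] (illustrative only)."""

    def __init__(self, lo, hi, bins=200, spread=0.05, rate=1.0, seed=0):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.width = (hi - lo) / bins                  # width of one grid cell
        self.centers = [lo + (i + 0.5) * self.width for i in range(bins)]
        self.density = [1.0 / (hi - lo)] * bins        # start from a uniform density
        self.sigma = spread * (hi - lo)                # assumed width of the bump
        self.rate = rate                               # assumed height factor of the bump
        self.rng = random.Random(seed)

    def sample(self):
        """Draw an action from the current piecewise-constant density."""
        weights = [d * self.width for d in self.density]
        i = self.rng.choices(range(self.bins), weights=weights)[0]
        # Jitter uniformly inside the chosen cell so actions stay continuous.
        return self.centers[i] + (self.rng.random() - 0.5) * self.width

    def update(self, action, reward):
        """Add a reward-scaled Gaussian bump at `action`, then renormalize."""
        reward = min(1.0, max(0.0, reward))            # clip reward to [0, 1]
        raised = [
            d + self.rate * reward * math.exp(-0.5 * ((c - action) / self.sigma) ** 2)
            for c, d in zip(self.centers, self.density)
        ]
        total = sum(r * self.width for r in raised)    # numerical integral on the grid
        self.density = [r / total for r in raised]


def run_demo(steps=500):
    """Run the learner on a hypothetical reward that peaks at action 0.7."""
    learner = GridCarla(0.0, 1.0)
    for _ in range(steps):
        action = learner.sample()
        reward = max(0.0, 1.0 - 4.0 * abs(action - 0.7))
        learner.update(action, reward)
    return learner
```

Each update in this sketch touches every grid cell and the learner carries no notion of state, which gives a rough sense of why computational cost, sampling cost and state information are the aspects the paragraph above singles out.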

After introducing these theoretical results, we present a number of problems related to production machines and the appropriate way to use the described method to solve them. The applications vary in complexity in terms of stationarity, centralization and state information.