Population-based Reinforcement Learning

Context

Current multi-agent learning approaches are limited to relatively small, semi-static systems, in which the number of agents is small and fixed and the environment changes slowly relative to the time the agents need to adapt their behaviour. These limitations are at odds with what is normally expected of an adaptive system. The goal of this thesis is to develop a new model for learning in autonomous multi-agent systems that can handle large-scale, highly dynamic systems.

The approach we want to explore is population-based reinforcement learning, which combines current knowledge on multi-agent reinforcement learning with the population-oriented theories of evolutionary game dynamics and cultural evolution. Both theories should allow the system to exploit the common knowledge and previously learned information inherent in multi-agent systems, speeding up the learning process and, we hope, leading to swifter responses to environmental changes.

Goal

Different projects are possible, but the main goal is the following.

We want to develop a hybrid learning approach in which agents can decide to apply either social learning, i.e. learning by observing other agents, or individual reinforcement learning. Ideally, the agents should discover how to balance the two ways of learning so that they learn as efficiently as possible as a team.
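
To make the idea concrete, the following is a minimal sketch of such a hybrid learner, not the model the thesis is expected to deliver. Everything in it is an assumption made for illustration: the task is a stateless multi-armed bandit, the function name run_population and all its parameters are hypothetical, social learning is reduced to copying the currently preferred action of a random peer, and the balance between the two modes is a fixed probability p_social rather than something the agents learn.

```python
import random


def run_population(n_agents=20, n_arms=3, steps=500, p_social=0.3,
                   epsilon=0.1, alpha=0.1, seed=0):
    """Illustrative hybrid learner on a stateless bandit (requires n_arms >= 2)."""
    rng = random.Random(seed)
    # Assumed reward probabilities: they rise linearly, so the last arm is best.
    arm_probs = [0.2 + 0.6 * a / (n_arms - 1) for a in range(n_arms)]
    # One table of action-value estimates per agent.
    q = [[0.0] * n_arms for _ in range(n_agents)]

    def greedy(values):
        # Pick an action with the highest estimate, breaking ties at random.
        best = max(values)
        return rng.choice([a for a, v in enumerate(values) if v == best])

    for _ in range(steps):
        for i in range(n_agents):
            if n_agents > 1 and rng.random() < p_social:
                # Social learning: imitate the preferred action of a random peer.
                peer = rng.choice([j for j in range(n_agents) if j != i])
                action = greedy(q[peer])
            elif rng.random() < epsilon:
                # Individual learning, exploration step.
                action = rng.randrange(n_arms)
            else:
                # Individual learning, exploitation step.
                action = greedy(q[i])
            reward = 1.0 if rng.random() < arm_probs[action] else 0.0
            # Incremental value update; the agent always learns from its own reward.
            q[i][action] += alpha * (reward - q[i][action])
    return q
```

Calling run_population with p_social=0.0 gives purely individual learners, while values close to 1.0 give a population that mostly imitates. Comparing the two ends of that range is one simple way to study the trade-off that the agents would ideally learn to manage themselves.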

Contact

Interested students should contact Tom Lenaerts for more information.

Promotors: Ann Nowé and Tom Lenaerts.