MARL with RNN in Smart Grids


The Smile-IT project aims to develop a multi-agent reinforcement learning (MARL) framework for studying and managing modern distributed networked systems, such as telecom networks, smart grids, and traffic networks. These systems contain a large number of entities, or agents, both machine and human, each striving to achieve its own objectives. The framework developed within the project will train these agents to make decisions that lead to system-wide optimal behaviour, despite diverging and incompatible personal goals and limited information.


Reinforcement Learning (RL) is a machine learning approach in which an agent learns through trial and error by interacting with its environment. A key assumption generally made in RL is that the agent can fully perceive all the information it needs to decide on its next action. In real-world settings this assumption rarely holds, so the agent typically faces a partially observable environment, formalised as a Partially Observable Markov Decision Process (POMDP). One possible way of dealing with this issue is to equip the agents with memory-like capacities. This can be done, for example, by using Recurrent Neural Networks (RNNs), which carry a hidden state from one time step to the next and can therefore model the temporal behaviour of an agent.
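To make the idea of memory more concrete, the following is a minimal sketch of a recurrent policy in plain Python. It is not the project's implementation: the class name RecurrentPolicy, the helper run_episode, the layer sizes, and the random untrained weights are all assumptions chosen for illustration. It only shows how a hidden state lets the same current observation lead to different action probabilities depending on what the agent observed earlier.

```python
import math
import random


class RecurrentPolicy:
    """Hypothetical Elman-style policy: h_t = tanh(W_in o_t + W_rec h_{t-1} + b)."""

    def __init__(self, obs_size, hidden_size, n_actions, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            # Random, untrained weights; a real agent would learn these.
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = matrix(hidden_size, obs_size)
        self.w_rec = matrix(hidden_size, hidden_size)
        self.bias = [0.0] * hidden_size
        self.w_out = matrix(n_actions, hidden_size)

    def initial_state(self):
        # The agent starts an episode with an empty memory.
        return [0.0] * self.hidden_size

    def step(self, obs, hidden):
        """Consume one observation; return action probabilities and the new memory."""
        new_hidden = []
        for i in range(self.hidden_size):
            total = self.bias[i]
            total += sum(w * o for w, o in zip(self.w_in[i], obs))
            total += sum(w * h for w, h in zip(self.w_rec[i], hidden))
            new_hidden.append(math.tanh(total))

        logits = [sum(w * h for w, h in zip(row, new_hidden)) for row in self.w_out]
        # Numerically stable softmax over the action logits.
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        norm = sum(exps)
        return [e / norm for e in exps], new_hidden


def run_episode(policy, observations):
    """Feed a sequence of observations and return the final action probabilities."""
    hidden = policy.initial_state()
    probs = None
    for obs in observations:
        probs, hidden = policy.step(obs, hidden)
    return probs


policy = RecurrentPolicy(obs_size=2, hidden_size=4, n_actions=3)
# Two histories that end in the same observation but differ in the past.
probs_a = run_episode(policy, [[1.0, 0.0], [0.5, 0.5]])
probs_b = run_episode(policy, [[0.0, 1.0], [0.5, 0.5]])
print(probs_a)
print(probs_b)
```

Both histories end with the observation [0.5, 0.5], yet the two printed distributions differ because the hidden state summarises the earlier observation. A memoryless policy that maps only the current observation to an action could not make this distinction, which is the limitation that partial observability exposes.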


For this thesis, you will train RNN-based agents and investigate whether they are able to learn how to perform complex tasks in a smart grid setting.