Hierarchical Multi-Agent Reinforcement Learning


The Smile-IT project aims to develop a multi-agent framework for studying and managing modern distributed networked systems, such as telecom networks, smart grids and traffic networks. These systems contain a large number of entities or agents, both machine and human, each pursuing its own objectives.


One of the challenges of cooperative multi-agent systems is to develop a coordination mechanism that can guide these entities, through either direct control or incentives, towards system-wide optimal behaviour. Such a mechanism must satisfy global objectives and respect the system’s operational constraints even when the agents' personal goals diverge or are mutually incompatible.


We propose a socially stratified multi-agent reinforcement learning approach as a possible coordination mechanism. At the first level, independent learners each try to optimize their own reward. At the higher levels, agents learn how to guide those learners towards an optimal global outcome, through either actions or rewards. The approach can be validated in a typical tragedy-of-the-commons setting.
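
As an illustration only, the two-level scheme can be sketched on a toy commons game. Everything in this sketch is an assumption made for the example and not the project's actual design: the payoff function, the three tax levels, the tabular epsilon-greedy learners, and all names (commons_payoffs, global_welfare, EpsGreedyLearner, run) are hypothetical. First-level learners choose whether to over-harvest a shared resource, while a single higher-level coordinator learns which penalty on over-harvesting yields the best global welfare, which corresponds to guidance through rewards.

```python
import random


def commons_payoffs(actions, tax):
    """Per-agent rewards; actions[i] is 1 (over-harvest) or 0 (restrain).

    Assumed toy payoffs: the shared resource degrades with the fraction of
    over-harvesters, over-harvesting gives a private gain of 0.5, and the
    coordinator's tax is charged on over-harvesting only.
    """
    n = len(actions)
    depletion = sum(actions) / n
    return [(1.0 - depletion) + 0.5 * a - tax * a for a in actions]


def global_welfare(actions):
    """System-wide objective: total payoff before any tax is applied."""
    return sum(commons_payoffs(actions, 0.0))


class EpsGreedyLearner:
    """Tabular epsilon-greedy learner with a running-average value update."""

    def __init__(self, n_states, n_actions, rng, alpha=0.1, eps=0.1):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.rng, self.alpha, self.eps = rng, alpha, eps

    def act(self, state):
        row = self.q[state]
        if self.rng.random() < self.eps:
            return self.rng.randrange(len(row))  # explore
        return row.index(max(row))               # exploit

    def update(self, state, action, reward):
        self.q[state][action] += self.alpha * (reward - self.q[state][action])


def run(n_agents=5, tax_levels=(0.0, 0.3, 0.6), steps=5000, seed=0):
    rng = random.Random(seed)
    # First level: independent learners that observe the current tax level.
    learners = [EpsGreedyLearner(len(tax_levels), 2, rng) for _ in range(n_agents)]
    # Higher level: one coordinator choosing a tax level (a stateless bandit).
    coordinator = EpsGreedyLearner(1, len(tax_levels), rng)
    welfare_log = []
    for _ in range(steps):
        level = coordinator.act(0)
        actions = [agent.act(level) for agent in learners]
        rewards = commons_payoffs(actions, tax_levels[level])
        for agent, a, r in zip(learners, actions, rewards):
            agent.update(level, a, r)      # each learner sees only its own reward
        welfare = global_welfare(actions)
        coordinator.update(0, level, welfare)  # coordinator sees the global outcome
        welfare_log.append(welfare)
    return coordinator, learners, welfare_log
```

Under these assumed payoffs, over-harvesting is individually attractive without a tax but lowers total welfare, which is the tragedy-of-the-commons structure. Calling run() returns the coordinator, the learners and a per-step welfare log that can be inspected to see which tax level the coordinator comes to prefer. Guidance through direct actions instead of rewards would need a different interface between the levels and is not shown here.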