Online multi-criteria reinforcement learning

1 Jan 2014
31 Dec 2017

The goal of this project is to develop multi-criteria (or multi-objective) reinforcement learning (MORL) algorithms for the online setting, so that they can be used efficiently on multi-criteria control problems arising from industrial applications. Whereas standard reinforcement learning (RL) algorithms learn an optimal policy that maximizes a scalar long-term reward, multi-objective reinforcement learning algorithms search for a set of equally good policies, called the Pareto set of policies, that optimize a long-term reward vector.
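To make the notion of "equally good" policies concrete, the following is a minimal sketch of Pareto dominance, assuming each policy is summarized by a vector of long-term returns and that every objective is maximized. The function names and the example return vectors are illustrative assumptions, not part of the project's software.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` in every objective
    and strictly better in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_set(returns: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the return vectors that no other vector dominates."""
    return [p for p in returns if not any(dominates(q, p) for q in returns)]


# Hypothetical long-term return vectors of five policies (two objectives).
candidates = [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0), (2.0, 3.0), (0.5, 0.5)]
front = pareto_set(candidates)
print(front)  # [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0)]
```

The three surviving vectors are mutually incomparable: improving one objective always costs something in the other, which is why a MORL algorithm returns a set rather than a single policy.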

The focus is on designing and analyzing MORL algorithms that efficiently explore and exploit the Pareto set of optimal policies. We investigate combinations of current techniques from multi-objective optimization, such as decomposition and transformation of the multi-objective search space, with reinforcement learning techniques such as single-state and multi-state RL. We are interested in all aspects of algorithmic design: theoretical and experimental analysis, and testing of the designed algorithms on practical applications, both to validate the proposed models and to generate new design requirements for the algorithms.
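As an illustration of decomposition in the single-state setting, the sketch below uses linear scalarization, one common decomposition scheme, to turn a reward vector into a scalar with a weight vector and then picks the best arm for each weight. It assumes the mean reward vector of each arm is already known; the arm values and weight grid are hypothetical, and an online algorithm would have to estimate the means while exploring.

```python
from typing import List, Sequence, Tuple


def linear_scalarize(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the objectives; larger is better."""
    return sum(w * r for w, r in zip(weights, reward))


def best_arm(mean_rewards: List[Tuple[float, ...]], weights: Sequence[float]) -> int:
    """Index of the arm with the highest scalarized mean reward."""
    return max(
        range(len(mean_rewards)),
        key=lambda i: linear_scalarize(mean_rewards[i], weights),
    )


# Hypothetical mean reward vectors of three arms (two objectives).
mean_rewards = [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0)]

# Each weight vector defines one scalar subproblem of the decomposition.
weight_grid = [(0.1, 0.9), (0.6, 0.4), (0.9, 0.1)]
chosen = [best_arm(mean_rewards, w) for w in weight_grid]
print(chosen)  # [0, 1, 2]
```

Sweeping the weights recovers a different Pareto-optimal arm for each subproblem here. Linear scalarization can only reach points on the convex part of the Pareto front, which is one reason to also consider other transformations of the objective space.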

Multi-objective control applications in energy saving are provided and developed together with the Flanders Mechatronic Technology Center (FMTC) and the industrial partners in ongoing IWT projects such as PERTETUAL and LeCoPro. The project aims to deliver: i) high-impact research, and ii) social impact through the deliverables (i.e. software) provided to partners.

Involved members: