Section: Computer Science and Engineering
Tutor: GATTI NICOLA
Advisor: RESTELLI MARCELLO
Major Research topic: Risk-Averse Reinforcement Learning in Partially Observable Environments
Abstract:
Reinforcement Learning has recently proved highly successful in a variety of tasks, from robotic control to video games, matching and sometimes surpassing human performance through a learning process that does not rely on domain-specific knowledge. Indeed, its general framework can be applied, in principle, to any context in which each interaction returns as feedback the state of the environment and a reward that reflects performance with respect to the goal. However, in real problems it can be difficult to observe the exact state of a system, which may have hidden features or complex time dependencies. In the financial setting, for example, only a small subset of the variables that influence the economic environment can be observed (prices, volatilities, etc.), while other important factors, such as political decisions or unexpected events, cannot easily be embedded in a state. Furthermore, maximizing a reward does not always capture human goals, which often also involve minimizing or bounding some risk measure. In an autonomous navigation setting, for example, it is essential for a vehicle to reach the target position, but it is also extremely important to avoid any failure that could damage its motion capabilities. While some work has addressed both partial observability (POMDP framework, Recurrent Neural Networks) and risk constraints (on variance, VaR, CVaR, etc.), current approaches typically propose novel algorithms tailored to the specific case, which often do not match the performance of state-of-the-art algorithms. The subject of this thesis is the development of risk-averse Reinforcement Learning techniques for partially observable environments, in order to ease the direct application of state-of-the-art Reinforcement Learning algorithms to a wider range of real problems.