TIRINZONI ANDREA
Cycle: XXXIII
Section: Computer Science and Engineering
Tutor: GATTI NICOLA
Advisor: RESTELLI MARCELLO

Major Research topic: Exploiting Structure for Transfer in Reinforcement Learning

Abstract:
Recent advancements have allowed reinforcement learning (RL) algorithms to achieve impressive results in a wide variety of complex sequential decision-making problems, ranging from playing board and video games to the control of sophisticated robotic systems. However, current RL techniques are still very inefficient, in the sense that they require a huge number of experience samples to be collected before learning a nearly-optimal policy. The main limitation is that RL algorithms learn every new task from scratch, without considering any knowledge that the agent might have gathered from its previous experience in learning other tasks. Humans, on the other hand, never learn from scratch: whenever we face a new task, we implicitly reuse the knowledge collected throughout our lives, thus quickly adapting to the new task. Bridging this gap between human learning and machine learning is fundamental to building fully autonomous RL agents that can solve complex real-world tasks.
A possible solution is transfer learning, which focuses on reusing the knowledge the agent has obtained while solving a set of source tasks to speed up the learning process on a similar, but different, target task. Several algorithms have been proposed in the literature to transfer different elements involved in the learning process. Unfortunately, existing algorithms still suffer from severe limitations. First, many approaches require a human expert to be involved in some part of the transfer process, typically by helping select what knowledge should be reused or by providing mappings between tasks. Although helpful, having a human in the loop dramatically limits the autonomy (and thus the applicability) of the approach. Furthermore, algorithms that do not require human intervention typically make strong assumptions on the similarity between the tasks involved so as to allow efficient transfer. Second, even though transferring information alleviates the huge sample requirements of RL, such requirements are merely shifted to 'task complexity', i.e., how many tasks need to be observed before the agent can successfully transfer knowledge and quickly learn the target task. Finally, current state-of-the-art approaches are well-motivated heuristic procedures, but they are hardly ever theoretically analyzed. Since a naive transfer procedure can even hamper the learning process instead of benefiting it (the so-called negative transfer), providing theoretical guarantees in these settings is even more important than in plain RL.
This work aims to take a step forward in overcoming the above-mentioned limitations. In particular, our goal is to design general transfer algorithms that are applicable to a wide variety of problems and are able to autonomously decide what and when to transfer, without ever requiring human intervention. Furthermore, we intend to provide a thorough theoretical analysis of all our proposed approaches. In particular, we want to formally guarantee that our algorithms are (i) beneficial for solving the target task under general assumptions, (ii) efficient in terms of both sample and task complexity, and (iii) robust to negative transfer. Finally, we intend to empirically evaluate all our approaches on both synthetic problems and realistic domains. We believe that designing algorithms satisfying these requirements will constitute important progress toward the applicability and deployment of transfer-based RL agents in the real world.