SABBIONI LUCA | Cycle: XXXV |
Section: Computer Science and Engineering
Advisor: RESTELLI MARCELLO
Tutor: GATTI NICOLA
Major Research topic:
Meta Reinforcement Learning for Hyperparameter Tuning
Abstract:
Deep Reinforcement Learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker. An RL agent interacts with its environment by observing the current state, choosing an action, and obtaining a reward. The ultimate goal is to learn a policy that maximizes the sum of rewards. Policy gradient methods are among the best techniques for solving complex control problems, but their application has several drawbacks. First, the algorithms are often parametric, so their performance depends on the values of several hyperparameters (for example, the step size in stochastic gradient ascent). To optimize the results, practitioners have to tune the settings of the algorithms manually through a trial-and-error procedure. This procedure can be cast as a Meta Decision Process, in which learning itself is treated as the reward for an RL algorithm. A second main concern with standard algorithms is the ability of the learnt models to generalize to different tasks. Meta Learning aims to design models that can learn new skills and rapidly adapt to new environments from few training examples. Current solutions for Meta Learning rely on a distance metric between tasks (Matching Networks, Context Embeddings), on recurrent neural architectures with an explicit storage buffer or with different update speeds (Meta-RL, RL^2, MANN), or on an External Learning Optimizer that considers the search directions seen in different tasks (MAML, LSTM, Reptile). While these works are mainly devoted to the ability to generalize, they are seldom used for hyperparameter tuning and learning optimization (Meta-SGD, Autonomous Optimization), and theoretical research on the convergence properties of these methods is scarce.
The goal of this thesis is to develop Meta Reinforcement Learning techniques that learn the best way to learn (and generalize) with strong theoretical guarantees.
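To make the step-size sensitivity mentioned in the abstract concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes a hypothetical one-step problem with reward r(a) = -(a - target)^2 and a Gaussian policy with mean theta and fixed standard deviation sigma. For simplicity it uses the exact gradient of the expected return rather than a stochastic estimate, so the updates are deterministic. All function names and constants are illustrative assumptions.

```python
def expected_return(theta, target=1.0, sigma=0.5):
    """Expected reward of a Gaussian policy N(theta, sigma^2) when r(a) = -(a - target)^2."""
    return -((theta - target) ** 2 + sigma ** 2)


def policy_gradient(theta, target=1.0):
    """Exact gradient of expected_return with respect to the policy mean theta."""
    return -2.0 * (theta - target)


def run_gradient_ascent(step_size, n_updates=50, theta0=0.0):
    """Apply theta <- theta + step_size * gradient and return the final theta."""
    theta = theta0
    for _ in range(n_updates):
        theta += step_size * policy_gradient(theta)
    return theta


if __name__ == "__main__":
    # Compare a too-small, two reasonable, and a too-large step size.
    for step_size in (0.01, 0.1, 0.5, 1.1):
        theta = run_gradient_ascent(step_size)
        print(f"step_size={step_size:<5} final theta={theta:.4g} "
              f"expected return={expected_return(theta):.4g}")
```

In this toy problem each update multiplies the distance to the optimum by (1 - 2 * step_size), so the iteration converges only for step sizes between 0 and 1: a very small step size learns slowly, while a step size above 1 diverges. Choosing this value by hand for every task is the trial-and-error procedure that the thesis proposes to replace with a learned meta-level decision.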