Current students


Section: Computer Science and Engineering

Major research topic:
Reinforcement Learning for Trading

Reinforcement Learning (RL) provides a framework for finding optimal policies for a wide range of sequential decision-making tasks whose dynamics are modelled by a Markov Decision Process (MDP). The MDP is defined by its state space, action space, state-transition function and reward function. The usual goal of the RL agent is to find a policy maximizing the expected discounted sum of rewards.
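The objective just described can be written out explicitly. A sketch in standard notation, assuming states s_t, actions a_t drawn from a policy π, a reward function r and a discount factor γ ∈ [0, 1):

```latex
J(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right],
\qquad \pi^{*} \;=\; \arg\max_{\pi} \; J(\pi)
```

The discount factor γ trades off immediate against future rewards; the agent seeks the policy π* maximizing this expected return.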
Clearly, not every sequential decision-making task satisfies all the classical assumptions of this framework, and therefore not every task can be solved efficiently by current RL methods as they stand. Trading is one such task. In financial mathematics, and for trading in particular, finding optimal strategies and automating their execution have been research topics for many years, and RL has naturally been applied in this context. However, due to the particular nature of financial markets, current RL approaches face many limitations. The results from the state of the art are promising, but there is room for improvement, particularly, I believe, by extending the RL framework to handle the peculiarities of trading. An agent acting in a financial market faces the following, non-exhaustive, list of problems:
 - Non-stationarity of the data
 - Delays in reward collection, state observation or action execution
 - Partial observability of the state
 - Risk aversion of the agent
 - Lifelong learning: there is only one realization of the past for each asset
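To make one of these problems concrete, consider delayed reward collection: the profit or loss of a trade is often only known some steps after the action that caused it. The toy wrapper below illustrates this by buffering rewards for a fixed number of steps. The class names and the minimal `reset`/`step` interface are illustrative assumptions for this sketch, not a standard API.

```python
from collections import deque


class CountingEnv:
    """Trivial environment whose reward at step t equals t; episode ends at t = 5."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, float(self.t), self.t >= 5


class DelayedRewardWrapper:
    """Delay the reward signal by `delay` steps to mimic delayed collection."""

    def __init__(self, env, delay):
        self.env = env
        self.delay = delay
        self._buffer = deque()

    def reset(self):
        self._buffer.clear()
        return self.env.reset()

    def step(self, action):
        state, reward, done = self.env.step(action)
        self._buffer.append(reward)
        # Release a reward only once `delay` steps have elapsed; until then
        # the agent observes 0, even though the true reward was already earned.
        observed = self._buffer.popleft() if len(self._buffer) > self.delay else 0.0
        return state, observed, done
```

With `delay=2`, the true reward sequence 1, 2, 3, 4, 5 is observed by the agent as 0, 0, 1, 2, 3: standard RL algorithms that assume immediate rewards must be adapted to credit the right past action.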

The aim of this research is to extend the RL framework and its results to handle some of the conditions of a real trading environment, in order to narrow the gap between theory and practice.