LEVENI FILIPPO | Cycle: XXXV |
Section: Computer Science and Engineering
Advisor: BORACCHI GIACOMO
Tutor: ALIPPI CESARE
Major Research topic:
Data-driven models for anomaly detection and clustering
Abstract:
Anomaly Detection (AD) in data streams is a fundamental problem in Data Science that raises both theoretical and practical challenges when the monitored streams are high-dimensional and non-stationary. My research focuses on designing new data-driven AD methods that can handle high-dimensional data and, at the same time, adapt to changes in the process generating normal (i.e., anomaly-free) data. I am particularly interested in unsupervised techniques, which are practical in real-world scenarios but deserve further investigation from an algorithmic perspective.
My research in the AD domain is supported by a collaboration between Politecnico di Milano and Cleafy, a company providing systems for monitoring and assessing transactional risk factors. In this collaboration I am exploring and applying tree-based AD techniques, with particular emphasis on the Isolation Forest approach, to detect threats in online web sessions.
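To make the tree-based idea concrete, the following is a minimal sketch of the textbook Isolation Forest scoring scheme, written with the Python standard library only. It is not the method developed in the collaboration; the function names, the parameters (`n_trees`, `sample_size`), and the toy data in the usage note are assumptions chosen for illustration. The intuition is that anomalies are separated from the rest of the data by fewer random splits, so a shorter average path length maps to a score closer to 1.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup on n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split the points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # nothing to split along this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node):
    """Depth at which x lands in a leaf, plus a correction for unsplit leaf points."""
    depth = 0
    while "dim" in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def anomaly_scores(data, queries, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1] per query; higher means more anomalous."""
    if len(data) < 2:
        raise ValueError("need at least two training points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    return [
        2.0 ** (-sum(path_length(q, t) for t in trees) / (n_trees * norm))
        for q in queries
    ]
```

As a usage example under these assumptions, fitting on a dense 2-D Gaussian cloud plus one distant point and then scoring the origin and the distant point should give the distant point the higher score. A streaming or non-stationary setting would additionally require updating or rebuilding the trees over time, which this sketch does not attempt.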
My research also concerns the investigation of new Clustering techniques, which are closely related to the AD field. In fact, anomaly-free data are often identified as those belonging to a certain group or satisfying a particular model. Clustering is a widely studied problem in pattern recognition and data mining, typically in the data exploration phase, for which there is nevertheless no universal solution. Among the possible Clustering techniques, I am interested in the Multi-Model Fitting (MMF) approach, which assumes that data belonging to the same cluster satisfy the equation of some parametric model.
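The MMF assumption, and its link to AD, can be illustrated with a small sketch. Here the parametric models are 2-D lines given in the form a·x + b·y + c = 0, and each point is assigned to the line with the smallest residual, or marked as an outlier when no line explains it. The function names and the `inlier_threshold` value are hypothetical, and the candidate models are assumed to be known in advance; an actual MMF method must also estimate the models from the data.

```python
import math


def line_residual(point, line):
    """Perpendicular distance from a 2-D point to the line a*x + b*y + c = 0."""
    a, b, c = line
    x, y = point
    return abs(a * x + b * y + c) / math.hypot(a, b)


def assign_to_models(points, lines, inlier_threshold=0.1):
    """Label each point with the index of its best-fitting line, or -1 if none fits."""
    labels = []
    for p in points:
        residuals = [line_residual(p, line) for line in lines]
        best = min(range(len(lines)), key=residuals.__getitem__)
        # A point that no model explains within the threshold is treated as anomalous.
        labels.append(best if residuals[best] <= inlier_threshold else -1)
    return labels
```

For instance, with the two lines y = 0 and x = 0, a point near the x-axis receives label 0, a point near the y-axis receives label 1, and a point far from both receives -1.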
In particular, I have been investigating an extended version of an MMF algorithm for the case where multiple families of parametric models need to be employed. I am currently investigating how to combine MMF and Ensemble Clustering approaches based on the construction of Random Trees through Locality-Sensitive Hashing, a technique grounded in the probabilistic approximation of distance functions. A particularly interesting problem that I will address is the clustering of data structures characterized by highly variable density.
In this direction, I will investigate new probabilistic strategies to improve AD and Clustering performance.
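The phrase "probabilistic approximation of distance functions" used above for hashing can be made concrete with one classic hashing family, sign random projections, which approximates the angle between two vectors. This is only an illustrative sketch: the hashing family actually used to build the Random Trees may differ, and all function names and parameters here are assumptions. Each hash bit records on which side of a random hyperplane a vector falls; two vectors disagree on a bit with probability equal to their angle divided by π, so the fraction of mismatching bits yields an estimate of the angle.

```python
import math
import random


def random_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(x, planes):
    """One bit per hyperplane: 1 if x lies on its non-negative side, else 0."""
    return [
        1 if sum(w * xi for w, xi in zip(plane, x)) >= 0 else 0
        for plane in planes
    ]


def estimated_angle(sig_a, sig_b):
    """Estimate the angle (radians) from the fraction of mismatching bits."""
    mismatches = sum(a != b for a, b in zip(sig_a, sig_b))
    return math.pi * mismatches / len(sig_a)


def true_angle(x, y):
    """Exact angle between two nonzero vectors, for comparison."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (norm_x * norm_y))))
```

With a few thousand bits, the estimate for two vectors at 45 degrees should land close to π/4, and the accuracy improves as more bits are used. Points that share many bits can be grouped into the same bucket, which is the property that makes such hashes usable as random splitting rules.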