
Section: Computer Science and Engineering

Major research topic:
Novel malware analysis techniques

As malicious programs have become more resilient to traditional detection and analysis approaches, researchers have turned their attention to Machine Learning as a potential (and powerful) alternative to the techniques applied so far. Among other tasks, Machine Learning has been used to classify malicious binaries and to automatically detect malware.

Clustering of “behaviors” is one of the most promising applications of Machine Learning in this context. It consists of grouping together samples that share similar run-time behaviors, with the ultimate goal of highlighting similarities among different samples, which can then be used to identify the so-called “families” of malware. Behaviors are most often defined as sequences of API and system calls, recorded while the sample executes in an instrumented environment (such as a sandbox). These sequences are then embedded into a high-dimensional space and used as features for a clustering algorithm.
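As a minimal sketch of this pipeline, the snippet below maps call traces to sets of n-grams (each distinct n-gram is one dimension of a sparse, high-dimensional feature space) and compares samples with Jaccard distance. The traces and API names are hypothetical illustrations, not output of any real sandbox:

```python
def ngrams(calls, n=2):
    """Map a call sequence to its set of n-grams: each distinct n-gram
    is one dimension of the (sparse, high-dimensional) feature space."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

def jaccard_dist(a, b):
    """Jaccard distance between two n-gram sets:
    0 = identical behavior, 1 = no shared n-grams."""
    return 1.0 - len(a & b) / len(a | b)

# Hypothetical traces: two droppers that share file/registry behavior,
# and a keylogger that shares none of it.
dropper_a = ["NtCreateFile", "NtWriteFile", "RegSetValue", "CreateProcess"]
dropper_b = ["NtCreateFile", "NtWriteFile", "RegSetValue", "Sleep"]
keylogger = ["SetWindowsHookEx", "GetAsyncKeyState", "Sleep", "Send"]

print(jaccard_dist(ngrams(dropper_a), ngrams(dropper_b)))  # 0.5
print(jaccard_dist(ngrams(dropper_a), ngrams(keylogger)))  # 1.0
```

A clustering algorithm run over these distances would group the two droppers into one family and leave the keylogger on its own.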

Unfortunately, it has been shown that clustering algorithms can be defeated through poisoning attacks, one of the three main classes of Adversarial Machine Learning techniques. In particular, it has been shown that only a few “adversarial samples” are needed to completely invalidate the clustering process performed by these malware analysis systems.
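A toy version of such a poisoning attack can be sketched against single-linkage clustering over bigram features: a single crafted “bridge” sample whose trace mixes the behaviors of two families chains their clusters into one. All traces, thresholds, and API names below are hypothetical illustrations:

```python
def bigrams(calls):
    """Set of consecutive call pairs used as behavioral features."""
    return {tuple(calls[i:i + 2]) for i in range(len(calls) - 1)}

def jaccard_dist(a, b):
    return 1.0 - len(a & b) / len(a | b)

def single_linkage(samples, threshold=0.7):
    """Merge samples transitively whenever ANY pair across two clusters
    is within `threshold` Jaccard distance (single-linkage chaining)."""
    feats = {k: bigrams(v) for k, v in samples.items()}
    clusters = [{k} for k in samples]
    changed = True
    while changed:
        changed = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(jaccard_dist(feats[a], feats[b]) <= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] |= clusters.pop(j)
                    changed = True
                    break
            if changed:
                break
    return clusters

# Hypothetical traces from two distinct families.
family_a = {"a1": ["NtCreateFile", "NtWriteFile", "RegSetValue"],
            "a2": ["NtCreateFile", "NtWriteFile", "CreateProcess"]}
family_b = {"b1": ["connect", "send", "recv"],
            "b2": ["connect", "send", "Sleep"]}

clean = single_linkage({**family_a, **family_b})
print(len(clean))  # 2: the families are correctly separated

# One adversarial sample blending both behaviors bridges the clusters.
bridge = {"bridge": ["NtCreateFile", "NtWriteFile", "RegSetValue",
                     "connect", "send", "recv"]}
poisoned = single_linkage({**family_a, **family_b, **bridge})
print(len(poisoned))  # 1: everything collapses into a single cluster
```

The single-linkage criterion is what makes the attack so cheap here: one sample close to both families is enough to merge them.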

My research will focus on showing that these systems can also be defeated through evasion attacks. In this context, evasion is much more challenging than poisoning, as it requires actually modifying the run-time behavior of a malicious program. This has to be done in a way that makes the sample appear to belong to a different family, while also guaranteeing that it still performs the functions it was originally designed to carry out. For this reason, my research will also touch on themes that are central to other research areas, such as Program Verification. The attacks will be carried out against both open-source systems, which are known to employ an approach based on clustering of behaviors, and proprietary ones, whose underlying implementation is not publicly known. Therefore, I will evaluate the efficacy of my attacks in both “white-box” and “black-box” scenarios.
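The core tension of evasion, under the n-gram feature model sketched earlier, can be illustrated with a toy example: appending semantically inert calls borrowed from a target family shifts a sample's feature profile toward that family while leaving the original call sequence (and hence its functionality) intact. The traces and the assumption that the appended calls are side-effect-free are hypothetical:

```python
def ngrams(calls, n=2):
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

def jaccard_dist(a, b):
    return 1.0 - len(a & b) / len(a | b)

# Hypothetical traces: the payload we want to disguise and a trace
# representative of the family we want it to resemble.
malicious = ["NtCreateFile", "NtWriteFile", "RegSetValue"]
target    = ["GetSystemTime", "Sleep", "GetSystemTime", "Sleep"]

def evade(trace, target_trace):
    """Append calls copied from the target family AFTER the real
    payload. The original trace survives as a prefix, so the sample
    still performs its intended functions."""
    return trace + target_trace

disguised = evade(malicious, target)
d_before = jaccard_dist(ngrams(malicious), ngrams(target))
d_after  = jaccard_dist(ngrams(disguised), ngrams(target))
print(d_before, d_after)  # 1.0 0.6: distance to the target family drops
```

Verifying that such transformations never alter the sample's semantics is exactly where Program Verification techniques become relevant.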

Exploitation of these systems will then be followed by the definition of new techniques and algorithms that can make these malware analysis systems more resilient to Adversarial Machine Learning attacks. In particular, this will involve exploring two research paths:
  1. The first will be concerned with defining new methods for sanitizing the behaviors extracted from malicious programs;
  2. The second will focus on creating new security-oriented clustering algorithms, which can be safely used in a highly adversarial setting, such as that of malware analysis. This will eventually lead to the definition of a better and safer clustering-of-behaviors approach.

Lastly, as Deep Learning has recently been employed to perform malware classification based on run-time behaviors, I also intend to experiment with evasion attacks, similar to those that I have been researching for clustering algorithms, against Deep Neural Networks.