Current students


D'ONGHIA MARIO
Cycle: XXXVI

Section: Computer Science and Engineering
Advisor: ZANERO STEFANO
Tutor: GATTI NICOLA

Major Research topic:
Novel malware analysis techniques

Abstract:
As malicious programs have become more resilient to traditional detection and analysis approaches, researchers have turned their attention to Machine Learning as a potential (and powerful) alternative to the techniques that have been applied so far. In particular, Machine Learning has been applied, among other tasks, to classifying malicious binaries and to detecting malware automatically.

Clustering of “behaviors” is one of the most promising applications of Machine Learning in this context. It consists of grouping together samples that share similar run-time behaviors, with the ultimate goal of highlighting similarities among different samples, which can then be used to identify the so-called “families” of malware. Behaviors are most often defined as sequences of API and system calls, recorded while the sample executes in an instrumented environment (such as a sandbox). These sequences are then embedded in a high-dimensional space and used as features for a clustering algorithm.
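As a rough illustration of this pipeline, the sketch below turns recorded call sequences into feature sets and groups them. Everything in it is an assumption made for the example: the sample names and traces are invented, bigrams of consecutive calls stand in for the high-dimensional feature space, and a simple threshold-based single-linkage grouping under Jaccard distance stands in for whichever clustering algorithm a real system uses.

```python
from itertools import combinations


def ngrams(calls, n=2):
    """Return the set of n-grams of consecutive calls in one trace."""
    return frozenset(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


def jaccard_distance(a, b):
    """Distance in [0, 1]: 0 for identical feature sets, 1 for disjoint ones."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def cluster(traces, threshold=0.5, n=2):
    """Group samples by single linkage: any two samples whose distance is
    at most `threshold` end up in the same cluster (connected components)."""
    names = list(traces)
    feats = {name: ngrams(traces[name], n) for name in names}
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links up to the representative of x's cluster.
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard_distance(feats[a], feats[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical traces: two file/registry-oriented and two network-oriented samples.
traces = {
    "sample_a": ["CreateFile", "WriteFile", "CloseHandle", "RegSetValue"],
    "sample_b": ["CreateFile", "WriteFile", "CloseHandle", "RegSetValue", "Sleep"],
    "sample_c": ["socket", "connect", "send", "recv", "closesocket"],
    "sample_d": ["socket", "connect", "send", "recv"],
}
families = cluster(traces)
print(families)
```

In this toy data set the two file-oriented traces share most of their bigrams, as do the two network-oriented ones, while the two groups have no bigrams in common, so the grouping recovers two candidate “families”.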

Unfortunately, it has been shown that clustering algorithms can be defeated through poisoning attacks, one of the three main classes of Adversarial Machine Learning techniques. In particular, it has been shown that only a few “adversarial samples” are needed to completely invalidate the clustering process performed by these malware analysis systems.
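To make the idea of poisoning concrete, here is a toy sketch of one well-known style of attack on single-linkage clustering, in which a handful of injected “bridge” samples connect two otherwise separate clusters. It is not the method of any specific system or paper referred to above: the traces, the bigram features, the Jaccard distance, the 0.5 threshold, and the helper `bridge_traces` are all assumptions chosen for illustration.

```python
from itertools import combinations


def ngrams(calls, n=2):
    """Return the set of n-grams of consecutive calls in one trace."""
    return frozenset(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


def jaccard_distance(a, b):
    """Distance in [0, 1]: 0 for identical feature sets, 1 for disjoint ones."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def single_linkage(traces, threshold=0.5):
    """Samples within `threshold` of each other, directly or through a
    chain of intermediate samples, fall into the same cluster."""
    names = list(traces)
    feats = {name: ngrams(traces[name]) for name in names}
    parent = {name: name for name in names}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard_distance(feats[a], feats[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


def bridge_traces(trace_a, trace_b, width=5):
    """Hypothetical helper: slide a window over the concatenated traces so
    that each window overlaps heavily with the previous one."""
    joined = trace_a + trace_b
    return [joined[i:i + width] for i in range(len(joined) - width + 1)]


# Invented clean data set with two well-separated families.
clean = {
    "a1": ["CreateFile", "WriteFile", "CloseHandle", "RegSetValue"],
    "a2": ["CreateFile", "WriteFile", "CloseHandle", "RegSetValue", "Sleep"],
    "b1": ["socket", "connect", "send", "recv"],
    "b2": ["socket", "connect", "send", "recv", "closesocket"],
}
# Adversarial samples that gradually morph a family-A trace into a family-B trace.
poison = {f"p{i}": t for i, t in enumerate(bridge_traces(clean["a1"], clean["b1"]))}

clean_clusters = single_linkage(clean)
poisoned_clusters = single_linkage({**clean, **poison})
print(len(clean_clusters), len(poisoned_clusters))
```

Under these assumptions, four injected samples are enough to chain the two families into a single cluster, because each bridge sample lies within the threshold of its neighbour. Real attacks and real clustering pipelines are considerably more involved.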

My research will focus on showing that these systems can also be defeated through evasion attacks. In this context, evasion is much more challenging than poisoning, as it requires actually modifying the run-time behavior of a malicious program. The modification must make the sample appear to belong to a different family, while guaranteeing that the sample still performs the functions it was originally designed to carry out. For this reason, my research will also touch on themes that are central to other research areas, such as Program Verification. The attacks will be carried out against both open-source systems, which are known to employ an approach based on clustering of behaviors, and proprietary ones, whose underlying implementation is not publicly known. Therefore, I will evaluate the efficacy of my attacks in both “white-box” and “black-box” scenarios.

Exploitation of these systems will then be followed by the definition of new techniques and algorithms that can make this type of malware analysis system more resilient to Adversarial Machine Learning attacks. In particular, this will involve exploring two research paths:

  1. The first will be concerned with defining new methods for sanitizing the behaviors extracted from malicious programs;
  2. The second, instead, will focus on creating new security-oriented clustering algorithms, which can be safely used in a highly adversarial setting, such as that of malware analysis. This will eventually lead to the definition of a better and safer approach to clustering behaviors.

Lastly, as Deep Learning has recently been employed to perform malware classification based on run-time behaviors, I also intend to experiment with evasion attacks against Deep Neural Networks, similar to those that I have been researching for clustering algorithms.