Section: Computer Science and Engineering

Major Research topic:
A framework for the aided design of High Performant Genome Analysis applications on heterogeneous architectures

Technological innovation and the declining cost of next-generation sequencing (NGS) have driven an explosion in the volume of genomic research. However, analyzing the massive quantities of sequencing data that this research generates poses a great computational challenge: the rate of data generation in genomics is outpacing the rate at which the data can be computationally processed. For instance, the GenBank database, one of the largest public repositories of DNA sequences, has been doubling in size nearly every year since 1990. CPU performance is not following the same trend, as it is becoming extremely difficult to integrate more transistors on a chip. As we approach the end of the scaling predicted by Moore’s Law, computer architects are forced to search for new architectural solutions. Consequently, High Performance Computing (HPC) applications such as genomic algorithms demand more than current processors can deliver. One promising solution is to use hardware accelerators to offload compute-intensive tasks from the main processor. Examples of such accelerators are Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), which have shown promising results in the genomic field. To fully exploit the computing power of GPUs and FPGAs, many genomic applications need to be developed from the ground up targeting these architectures. Moreover, developing high-performance heterogeneous applications still requires both domain-specific knowledge and the expertise to leverage the architecture effectively. State-of-the-art solutions such as OpenCL only partially solve this problem: they provide the end user with familiar C/C++ APIs, but at the expense of performance. There is a clear need for a unified platform that automates parts of the development process and assists the programmer in developing applications for heterogeneous platforms.

For this reason, the purpose of my Ph.D. is to design a framework for the development of genomic applications that exploit high performance computing on heterogeneous hardware architectures. The objective of this work is both to provide a customizable, fully hardware-accelerated genomic pipeline and to help the end user develop genomic applications. Users will be able to exploit the full hardware-accelerated pipeline, use only parts of it, or implement their own algorithms with the aid of the framework itself. The key focus of this work is flexibility. First, the integrated pipeline will provide multiple implementations for each pipeline stage. Second, because genomic algorithms evolve rapidly, genomic pipelines need to be updated frequently; for this reason, the framework will enable users to implement their own applications. Finally, applications need to be flexible as well and cannot be limited to executing on a single architecture. Therefore, the algorithms implemented within the framework will also be compatible with a wide range of GPUs and FPGAs.
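
The idea of a pipeline with several interchangeable implementations per stage, plus user-supplied stages, can be illustrated with a minimal Python sketch. This is not the framework's actual interface: the names `REGISTRY`, `register`, and `run_pipeline`, the stage names, and the variant labels are all hypothetical, and plain CPU code stands in for what would be GPU or FPGA kernels.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: (stage name, variant label) -> implementation.
# In a real framework the variants would be GPU or FPGA kernels;
# here plain Python functions stand in for them.
Stage = Callable[[List[str]], List[str]]
REGISTRY: Dict[Tuple[str, str], Stage] = {}


def register(stage: str, variant: str):
    """Decorator that records one implementation of a pipeline stage."""
    def decorator(fn: Stage) -> Stage:
        REGISTRY[(stage, variant)] = fn
        return fn
    return decorator


@register("reverse_complement", "table")
def revcomp_table(reads: List[str]) -> List[str]:
    # Variant 1: translation table, then reverse each read.
    table = str.maketrans("ACGT", "TGCA")
    return [r.translate(table)[::-1] for r in reads]


@register("reverse_complement", "loop")
def revcomp_loop(reads: List[str]) -> List[str]:
    # Variant 2: explicit per-base lookup; same result, different code path.
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return ["".join(comp[b] for b in reversed(r)) for r in reads]


@register("filter", "min_length_4")
def filter_short(reads: List[str]) -> List[str]:
    # A user-defined stage: drop reads shorter than 4 bases.
    return [r for r in reads if len(r) >= 4]


def run_pipeline(reads: List[str], plan: List[Tuple[str, str]]) -> List[str]:
    """Run the stages in order, using the variant chosen for each stage."""
    for stage, variant in plan:
        impl = REGISTRY.get((stage, variant))
        if impl is None:
            raise KeyError(f"no implementation {variant!r} for stage {stage!r}")
        reads = impl(reads)
    return reads


result = run_pipeline(
    ["ACGT", "AC", "GGATC"],
    [("filter", "min_length_4"), ("reverse_complement", "table")],
)
print(result)
```

Swapping `"table"` for `"loop"` in the plan changes which implementation runs without touching the rest of the pipeline, which is the kind of flexibility described above; choosing among real hardware back ends would need a device-aware dispatch layer that this sketch does not attempt.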