Section: Computer Science and Engineering
Tutor: PERNICI BARBARA
Advisor: CERI STEFANO
Major Research topic: Computational methods for the inference of DNA folding mechanisms: from data management to machine learning
Abstract:
The human genome is approximately two meters long, and it must be folded to fit inside the cell nucleus, which is at the micrometer scale. To remain accessible for gene transcription, the folded genome must be properly organized; its architecture cannot be random.
Since 2009, interest in 3D genome architecture has grown rapidly because of its potential involvement in major biological regulation processes and in disease formation. The ability to model, engineer and predict genome folding mechanisms is envisioned as a key step towards the treatment of genetic diseases such as cancer.
Sequencing data gathered from the analysis of 3D genome interactions pose hard challenges for data management and analysis, owing both to their size, which can easily reach the terabyte scale for a single experiment, and to their complexity, which stems from intrinsic noise. For these reasons, computational scientists are actively working both on the design of efficient data management solutions and on models for the prediction, denoising and analysis of 3D genome data.
My research focuses on both of these aspects. In the context of the Genomic Computing ERC project, I am extending the GMQL genomic data management system towards scalable interactive computation on top of Spark, thereby enabling bioinformatics researchers to access and explore genomic data interactively without losing scalability. I am then applying this software stack to complex biological problems, with a particular interest in computational models for the simulation and prediction of the 3D genomic architecture, with the objective of understanding its main drivers. I am actively working with international collaborators on this second phase of my project and exploring the application of recent machine learning techniques to this problem.