Developing methods and tools for biological data analysis

Given the enormous and growing amount of biological data, methods and tools that facilitate its collection, management, and analysis have become necessary. Among the biological sources of greatest interest, I have specialized in SARS-CoV-2 viral sequences, which are produced daily. ViruSurf, hosted at Politecnico di Milano, is the largest database worldwide of curated viral sequences and variants, integrated from the deposition repositories COG-UK, GenBank, and GISAID. Several tools are being developed for this project:
  1. CoV2K is a manually curated knowledge base that provides structured information about SARS-CoV-2 variants, extracted from the scientific literature. It features a taxonomy of variant impacts organized into three main categories (protein dynamics and kinetics, epidemiology, and immunology), with a level for each effect (higher, lower, null) derived from a coherent interpretation of research articles. The integration of CoV2K with ViruSurf allows the variants documented in CoV2K to be searched and statistically analyzed over large volumes of nucleotide and amino acid sequences.
  2. EpiSurf is a Web application for selecting viral populations of interest through user-defined queries over viral meta-information, and for analyzing how their amino acid changes are distributed within epitope ranges. Epitopes are specific segments of the SARS-CoV-2 sequences with high affinity to antibodies; they are used in vaccines and in COVID-19 testing. EpiSurf supports user-friendly analyses of epitope conservancy within selected populations, which can be relevant for designing vaccines, drugs, and serological assays.
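
The kind of analysis described for EpiSurf can be illustrated with a minimal sketch. This is not EpiSurf's implementation: the epitope names, the coordinates, the function names, and the simple definition of conservancy used here (the fraction of sequences in a population with no amino acid change inside the epitope range) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical epitope ranges on a single protein: name -> (start, end),
# 1-based and inclusive. The coordinates are invented for this example.
EPITOPES: Dict[str, Tuple[int, int]] = {
    "epitope_A": (10, 20),
    "epitope_B": (18, 30),
}


def changes_per_epitope(
    population: List[List[int]], epitopes: Dict[str, Tuple[int, int]]
) -> Dict[str, int]:
    """Count the amino acid changes of a population that fall in each range.

    `population` holds one list per sequence, containing the positions of
    that sequence's amino acid changes.
    """
    counts = {name: 0 for name in epitopes}
    for changed_positions in population:
        for pos in changed_positions:
            for name, (start, end) in epitopes.items():
                if start <= pos <= end:
                    counts[name] += 1
    return counts


def epitope_conservancy(
    population: List[List[int]], epitope: Tuple[int, int]
) -> float:
    """Fraction of sequences with no change inside the epitope range
    (an assumed, simplified notion of conservancy)."""
    if not population:
        raise ValueError("population must contain at least one sequence")
    start, end = epitope
    conserved = sum(
        1
        for changed_positions in population
        if not any(start <= pos <= end for pos in changed_positions)
    )
    return conserved / len(population)


# Toy population of four sequences, each given by its changed positions.
population = [[12, 40], [19], [5], []]
print(changes_per_epitope(population, EPITOPES))
print(epitope_conservancy(population, EPITOPES["epitope_A"]))
```

In practice, the population would be the result of a metadata query (for example by lineage, country, or collection date) rather than a hard-coded list.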
At the moment, these tools focus on viral sequences, in response to the current global pandemic. The broader vision of this research is to connect human genomic data with viral sequences, amplifying the usefulness of this work by also considering the genomes of the humans hosting the sequenced viruses.