GIACOMELLO EDOARDO
Cycle: XXXIV
Section: Computer Science and Engineering
Tutor: TANCA LETIZIA
Advisor: LOIACONO DANIELE
Major Research topic: Privacy constrained design in distributed learning for e-health applications
Abstract:
The adoption of Machine Learning techniques in Healthcare applications is an increasingly active research field, thanks to the massive amount of data that each center is able to collect. A possible approach to developing assistive tools for clinicians is to exploit data coming from different centers, creating a Distributed Machine Learning (DML) model in which every center can benefit from data shared between institutions. Distributed Machine Learning is an essential tool for overcoming classical limitations of Machine Learning, such as the availability of data and computational costs, but its application is limited by the heterogeneity of procedures and by the different data policies that regulate the sharing of data between institutions.
Although many institutes are collaborating to produce publicly available datasets, this is not a generally adopted practice, and most healthcare centers could benefit from adopting a Machine Learning framework that is specialized for their own data and complies with their privacy policies.
In this research, we study the design space of a DML model and the relative impact of possible privacy policies on model performance, using different datasets. For each policy, we propose one or more design solutions and techniques for exploiting the largest amount of information available under the data sharing constraints. For the most common cases, we also introduce an in-depth analysis of complementary techniques, such as prediction negotiation in the case of strict privacy regulations, domain adaptation to compensate for scarce or missing data, and visualization techniques to provide a better understanding of the outcomes generated by a DML model.
A limited propensity to share clinical data, due to complex data regulations and business motivations, often limits the diffusion of new knowledge that is fundamental to developing increasingly innovative models. The approach we propose aims to mitigate this negative effect, allowing all the nodes to benefit from a collaborative perspective on machine learning while preserving each node's policy.