Current students


Section: Computer Science and Engineering

Major Research topic:
Resource Allocation and Scheduling Problems in Computing Continua for Artificial Intelligence applications

The importance and pervasiveness of Artificial Intelligence (AI) and Deep Learning (DL) have increased dramatically in recent years. The Cloud Computing paradigm has led this growth, giving access to ideally unlimited computational and storage power under pay-as-you-go pricing models. However, the accelerated migration towards mobile computing and the Internet of Things (IoT) is now driving an evolution of AI and big data applications: data are generated by widespread end devices, so placing computing resources at the periphery of the network reduces latency and bandwidth requirements while improving energy efficiency and privacy protection. Edge Computing answers this need, and its application to AI gave rise to the notion of “Edge Intelligence”. This paradigm produces a fragmented scenario in which computing and storage power are distributed among devices with highly heterogeneous capacities. In this setting, the development of multilevel edge-to-cloud AI applications, together with efficient component placement, resource allocation and scheduling algorithms, is crucial to best orchestrate the physical resources of the computing continuum while meeting DL model accuracy, application performance, security and privacy constraints.

Moreover, Deep Neural Networks can be partitioned so that training and inference are performed partly at the edge and partly in the cloud. Network partitioning, resource allocation and scheduling should therefore be treated as integrated tasks, accounting both for the performance and accuracy constraints on the forward-pass operations and for the resource and privacy constraints, which are strongly affected by the decision to deploy a portion of the model in the cloud or at the edge. The research work will focus on the optimal scheduling and deployment of AI applications in the computing continuum, considering both the training and the inference phase.
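As an illustration of the partitioning problem mentioned above, the choice of a split point on a chain of layers can be framed as minimizing end-to-end inference latency. The sketch below is a minimal example under simplifying assumptions (per-layer edge and cloud latencies and intermediate tensor sizes are known in advance, there is a single split point, and the transfer of the final result back to the device is neglected); all names and figures are hypothetical.

```python
def best_partition(edge_ms, cloud_ms, size_mb, bandwidth_mbps):
    """Pick the split index minimizing total latency.

    Layers [0, split) run at the edge, layers [split, n) in the cloud;
    split == n means fully edge, split == 0 fully cloud.
    edge_ms[i] / cloud_ms[i]: latency of layer i on each tier (ms);
    size_mb[i]: size of the tensor entering layer i (size_mb[0] is the input).
    """
    n = len(edge_ms)
    best_split, best_latency = 0, float("inf")
    for split in range(n + 1):
        # Transfer the activation feeding the first cloud layer (none if fully edge).
        transfer = 0.0 if split == n else size_mb[split] * 8.0 / bandwidth_mbps * 1000.0
        latency = sum(edge_ms[:split]) + transfer + sum(cloud_ms[split:])
        if latency < best_latency:
            best_split, best_latency = split, latency
    return best_split, best_latency


# Hypothetical profile: later layers are heavy for the edge device,
# activations shrink as the network deepens, 100 Mbps uplink.
split, latency = best_partition(
    edge_ms=[5.0, 20.0, 40.0],
    cloud_ms=[2.0, 2.0, 2.0],
    size_mb=[4.0, 1.0, 0.25],
    bandwidth_mbps=100.0,
)
# → split after layer 2: early layers stay at the edge, the heavy tail moves to the cloud.
```

Real formulations would add accuracy, energy, privacy and resource-capacity constraints on top of this latency objective, turning the one-dimensional scan into a combinatorial optimization problem.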
Efficient scheduling algorithms should be developed to support the training process in heterogeneous environments characterized by the availability of disaggregated resources, where GPUs can be accessed remotely from different points of the network. The inference phase needs to be supported by effective component placement solutions in the computing continuum, both at design time and at runtime, where decisions must be carefully adapted to account for load variations.
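A toy version of the placement decision can convey the flavor of such solutions: edge capacity is scarce but offers low latency, so a heuristic may keep the heaviest components that fit at the edge and spill the rest to the cloud. The first-fit-decreasing sketch below is only an illustration under strong assumptions (a single aggregate edge capacity, one resource dimension, cloud capacity treated as unlimited); the function and its inputs are hypothetical, not part of any proposed method.

```python
def place_components(demands, edge_capacity):
    """Greedy first-fit-decreasing placement sketch.

    demands[i]: resource demand of component i (e.g. CPU cores);
    edge_capacity: total capacity available at the edge tier.
    Returns (edge, cloud): sorted lists of component indices per tier.
    """
    order = sorted(range(len(demands)), key=lambda i: demands[i], reverse=True)
    edge, cloud, free = [], [], edge_capacity
    for i in order:
        if demands[i] <= free:       # keep it at the edge while capacity allows
            edge.append(i)
            free -= demands[i]
        else:                        # spill to the cloud (assumed unlimited)
            cloud.append(i)
    return sorted(edge), sorted(cloud)


edge, cloud = place_components(demands=[4, 2, 3, 1], edge_capacity=5)
# → components 0 and 3 at the edge, 1 and 2 in the cloud.
```

A runtime variant would re-invoke such a procedure (or an incremental version of it) when monitored load drifts from the profile used at design time, which is precisely the design-time/runtime interplay the research topic targets.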