Current students


PEVERELLI FRANCESCO
Cycle: XXXVII

Section: Computer Science and Engineering

Major Research topic:
Heterogeneous Architectures for Spatial Computing

Abstract:
Spatial computation [Rong et al. 2017, Valiant et al. 1990, Becker et al. 2016, Budiu et al. 2004], which refers in this context to a class of computer architectures focused on the spatial distribution of computation and data, is a promising direction for keeping pace with ever-increasing performance requirements, particularly in HPC applications, and for overcoming the limitations of classical von Neumann architectures. This computational pattern consists of a distributed, parallel structure in which several Processing Elements (PEs), each coupled with dedicated Memory Elements (MEs), operate concurrently without centralized control. This specialization provides notable performance and energy-efficiency benefits over traditional solutions such as CPUs and GPUs [Thompson et al. 2021, Nourian et al. 2017].
In this research proposal, I aim to explore the design of novel architectural models that embody the concept of spatial computation. Rather than implementing an architecture for each application, which limits the longevity of many hardware-accelerated solutions, I aim to identify the common patterns that applications of interest share and to determine which computing model is best suited to executing those patterns efficiently. If we consider classic HPC benchmarks like the "13 Dwarfs" [Asanovic et al. 2006], we can see that similar patterns recur across benchmarks. Some Dwarfs involve a high degree of parallelism, while others contain data-dependent parallelism and indirect memory accesses. Similarly, state-of-the-art libraries like GraphBLAS [Kepner et al. 2016] perform a wide range of operations through a small number of operators. This offers an opportunity to explore a specialized yet flexible architecture that can effectively accelerate a heterogeneous set of related tasks.
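
As an illustration of this point, the following minimal Python sketch (not the GraphBLAS API; the adjacency representation and function names are hypothetical) expresses breadth-first search entirely through one GraphBLAS-style operator, sparse matrix-vector multiplication over the Boolean (OR, AND) semiring:

# Hypothetical sketch: level-synchronous BFS built on a single GraphBLAS-style
# operator, sparse matrix-vector multiplication over the Boolean semiring.
# 'adj' is an adjacency structure stored as a dict of neighbour sets (illustrative only).

def spmv_bool(adj, frontier):
    """One mxv step: OR-AND product of the adjacency matrix with the frontier vector."""
    result = set()
    for v in frontier:
        result |= adj.get(v, set())   # AND with a Boolean frontier reduces to a lookup
    return result

def bfs_levels(adj, source):
    """Breadth-first search written purely in terms of the spmv_bool operator."""
    levels = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        frontier = {v for v in spmv_bool(adj, frontier) if v not in levels}
        for v in frontier:
            levels[v] = depth
    return levels

graph = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}}
print(bfs_levels(graph, 0))   # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}

Many other graph kernels (shortest paths, connected components, PageRank) follow the same structure, changing only the semiring, which is precisely the kind of operator reuse an architecture could exploit.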
This research aims to identify a set of architectural templates that capture different computational patterns and to build one or more architecture prototypes able to instantiate these templates. We define an architectural template as an arrangement of PEs and MEs that realizes a given computation model. Examples of architectural templates are systolic arrays, static dataflow architectures, and vector processors. Architectural templates are an effective way to abstract the system organization at a chosen level of granularity, as already shown in the literature [Cheng et al. 2016, Rabozzi et al. 2017]. In this way, the resulting architectures can execute target workloads efficiently while remaining general enough to support future workloads that expose similar patterns.
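
As a rough illustration of one such template, the Python sketch below models the behaviour of a 2-D systolic array for matrix multiplication under simplifying assumptions (operand skewing and pipelining are omitted; names are illustrative): each PE holds a local accumulator, its ME, and updates it using only locally arriving operands, with no centralized control.

# Hypothetical behavioural model of a systolic-array template for matrix multiplication.
# At step t, PE (i, j) consumes A[i][t] arriving from its left neighbour and
# B[t][j] arriving from its upper neighbour, and updates its local accumulator.

def systolic_matmul(A, B):
    n = len(A)                              # assume square n x n matrices for brevity
    acc = [[0] * n for _ in range(n)]       # per-PE local memory (accumulators)
    for t in range(n):
        for i in range(n):
            for j in range(n):
                acc[i][j] += A[i][t] * B[t][j]   # purely local update in PE (i, j)
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))   # [[19, 22], [43, 50]]

The same arrangement of PEs and MEs can serve other workloads with regular, nearest-neighbour data movement, which is what makes a template reusable beyond the application it was first designed for.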