DEL SOZZO EMANUELE
Cycle: XXXI
Section: Computer Science and Engineering
Tutor: BOLCHINI CRISTIANA
Advisor: SANTAMBROGIO MARCO DOMENICO
Major Research topic: A unified environment for hardware acceleration of stencil computations
Abstract:
Stencil computations are used in many different fields, from image processing to seismic simulations, from numerical methods to physical modelling. In such a computation, a series of sweeps is performed over a regular grid, and each point is updated according to a fixed nearest-neighbor pattern. Thanks to their regular computational structure, stencil algorithms are often described by means of Domain Specific Languages (DSLs). The state of the art in this field is Halide, a DSL and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. One of the main features of Halide, with respect to other DSLs, is a separate scheduling co-language for expressing when and where to perform the computation.
The same regular structure makes stencil computations ideal candidates for automatic optimization and hardware acceleration on GPUs (as Halide does), FPGAs and ASICs. In particular, FPGAs are well suited to such computations, since they provide a good trade-off between performance (higher than CPUs and similar to GPUs) and power consumption (lower than CPUs and GPUs), as well as a high level of flexibility (with respect to ASICs). Moreover, the possibility of implementing a custom architecture (for instance, in terms of data precision) makes it possible to tailor the design to the target computation. However, the main drawback of FPGAs is their steep learning curve: although High Level Synthesis (HLS) tools may ease the designer's work, developing an efficient FPGA implementation remains complex.
This work proposes FROST, a framework for the acceleration of stencil computation pipelines on FPGA. The framework is designed as a backend for Halide and for any other DSL based on the Halide Intermediate Representation (IR). Indeed, FROST is based on an extension of the Halide IR, and it is fully compatible with Halide.
Unlike other frameworks or DSLs for the acceleration of stencil computations on FPGA, FROST, just like Halide, exploits a scheduling co-language to express the different types of optimizations (e.g. loop pipelining, array partitioning, vectorization), which can be combined with Halide scheduling. FROST takes as input the Abstract Syntax Tree generated by Halide, and analyzes and manipulates it in order to apply FPGA-oriented transformations and optimizations. FROST then generates a C/C++ implementation of the initial Halide code suitable for HLS tools. The output of the HLS phase is then synthesized and implemented on the target FPGA using the Xilinx SDAccel toolchain.
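
To make the notion of a stencil sweep concrete, the following is a minimal C++ sketch of a 5-point averaging stencil applied repeatedly over a small regular grid. It is not FROST output and does not use the Halide API: the grid size, the 0.2 weights, the border handling and the names `Grid`, `stencil_sweep` and `run_stencil` are all assumptions chosen for illustration. The comment in the inner loop marks where an HLS flow would typically attach a loop pipelining directive.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical grid dimensions, chosen only for illustration.
constexpr std::size_t H = 8;
constexpr std::size_t W = 8;
using Grid = std::array<std::array<float, W>, H>;

// One sweep of a 5-point averaging stencil: each interior point becomes
// the mean of itself and its four nearest neighbors. Border points are
// copied unchanged (an assumed boundary policy).
void stencil_sweep(const Grid& in, Grid& out) {
    for (std::size_t y = 0; y < H; ++y) {
        for (std::size_t x = 0; x < W; ++x) {
            // In an HLS flow, a directive such as
            // "#pragma HLS PIPELINE II=1" would typically be placed here.
            const bool border = (y == 0 || y == H - 1 || x == 0 || x == W - 1);
            if (border) {
                out[y][x] = in[y][x];
            } else {
                out[y][x] = 0.2f * (in[y][x] + in[y - 1][x] + in[y + 1][x] +
                                    in[y][x - 1] + in[y][x + 1]);
            }
        }
    }
}

// Applies `sweeps` successive sweeps, copying the result back into the
// working grid after each one.
Grid run_stencil(Grid grid, int sweeps) {
    assert(sweeps >= 0);
    Grid tmp{};
    for (int s = 0; s < sweeps; ++s) {
        stencil_sweep(grid, tmp);
        grid = tmp;
    }
    return grid;
}
```

Calling `run_stencil(g, 1)` on a grid that is zero everywhere except for a single interior point spreads that value over the point and its four neighbors. The fixed access pattern and the absence of data-dependent control flow in the interior are what make loops of this shape amenable to pipelining and array partitioning.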