INDIRLI FABRIZIO | Cycle: XXXVI
Section: Computer Science and Engineering
Advisor: SILVANO CRISTINA
Tutor: ALIPPI CESARE
Major Research topic:
Hardware Acceleration of Neural Networks through In-Memory Computing
Abstract:
In recent years, Deep Neural Networks have become leading-edge solutions for a broad variety of computational tasks, such as image classification and speech recognition. The rapid proliferation of pervasive IoT devices and ubiquitous cognitive-computing applications is pushing the industry towards performing Machine Learning inference on edge devices. These embedded platforms pose stringent constraints on power consumption, latency, and memory footprint, which have driven the development of new training and mapping techniques for TinyML applications.
Furthermore, the scaling of performance and efficiency of traditional hardware architectures is being hindered by the slowdown of Moore's Law; thus, the research focus in this field is shifting towards novel methodologies, such as Heterogeneous platforms and Neuromorphic computing. In this context, hardware accelerators based on FPGA platforms and ASICs (such as Google's TPU) are emerging as promising solutions. However, these devices often have limited I/O bandwidths, which amplifies the cost of data transfers between the processing units and the external memory. To overcome this problem, the In-Memory Computing (IMC) paradigm performs the calculations in the same cells in which the data is stored, reducing accesses to the off-chip RAM. This approach can be particularly beneficial for neural network inference, where the same weights or activations are reused across several Multiply-And-Accumulate (MAC) operations.
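The weight reuse described above can be illustrated with a minimal digital model (an assumption for illustration, not part of the thesis): an IMC crossbar stores a weight matrix as cell conductances, and applying the input activations as row voltages yields all column outputs of y = Wᵀx in a single analog step, so every MAC reuses the stored weights without off-chip memory traffic.

```python
import numpy as np

def crossbar_mvm(weights, activations):
    """Digital stand-in for one analog crossbar matrix-vector multiply.

    Each output column accumulates activation * conductance products,
    mimicking how Kirchhoff's current law sums the per-cell currents.
    """
    return weights.T @ activations

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # 4 inputs x 3 outputs, held in the array
x = rng.standard_normal(4)        # input activations applied as row voltages
y = crossbar_mvm(W, x)            # all 12 MACs performed "in place"
```

In a real device the 4x3 products would be computed concurrently inside the memory array; here the matrix product only models the result, not the analog parallelism.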
Further advancements in computational efficiency can be achieved by integrating analog elements inside IMC accelerators, leveraging novel memristive technologies such as ReRAM or Phase-Change Memory (PCM). These devices, however, pose new challenges because of their lower resolution and higher noise sensitivity, and they require ad-hoc training, quantization and deployment methodologies for optimal results.
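A first-order model of those device limitations can be sketched as uniform symmetric weight quantization plus relative Gaussian noise on the programmed values; the bit-width and noise level below are illustrative assumptions, not device data.

```python
import numpy as np

def quantize_weights(w, bits=4):
    """Map float weights onto 2**bits - 1 symmetric integer levels."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale, scale

def add_device_noise(w, sigma=0.02, rng=None):
    """Perturb programmed weights with relative Gaussian noise
    (a crude stand-in for memristive conductance variability)."""
    rng = rng or np.random.default_rng(0)
    return w + sigma * np.abs(w).max() * rng.standard_normal(w.shape)

w = np.linspace(-1.0, 1.0, 9)         # toy weight tensor
wq, scale = quantize_weights(w, bits=4)
wn = add_device_noise(wq)             # weights as "seen" by the array
```

Quantization-aware training would inject this forward-pass model during training so the network learns to tolerate both effects.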
This PhD thesis research aims to develop novel techniques and tools to optimally train and deploy neural networks on Heterogeneous and Neuromorphic hardware accelerators, in collaboration with the STMicroelectronics SRA department (Cornaredo). In particular, the following contributions are proposed:
- Selection and study of applicable neuromorphic and/or heterogeneous hardware accelerators for neural networks;
- Identification of neural networks suitable for low-precision IMC acceleration, and development of ad-hoc quantization techniques and topological transformations to prepare the input models for lowering on the target hardware;
- Exploration of different scheduling and binding strategies to map the proposed models on the target neuromorphic accelerators;
- Comparison of the results of different hardware accelerators and of several mapping techniques, in terms of latency, accuracy, precision and power consumption;
- Design of novel acceleration kernels for specific nodes and prototyping on FPGAs.