#### Marco Vitone

Development of innovative techniques and methodologies for analysis and testing of Storage Systems interfaces based on System on Chip

Tutor: Prof. Nicola Petra

co-Tutor: Eng. Claudio Giaccio

Year: 3

Cycle: XXXVI

# **Background information**

- MSc degree in Electronics Engineering at University of Naples Federico II
- Research group/laboratory: VLSI Electronics
- PhD start and end dates: 1/11/2020 - 31/10/2023
- Scholarship type: funded by Micron Semiconductor Italia S.R.L.
- Cooperation: Micron Hardware Validation Tool Team

# Summary of study activities

- Ad hoc courses: Digital Forensics' methods practices and tools, Data Science for Patient Record Analysis, Scientific Programming and Visualization with Python, Statistical Data Analysis for Science and Engineering Research, Imprenditorialità Accademica, Cambridge English Preliminary (PET), Strategic Orientation for STEM Research & Writing.
- MSc courses: FPGA per l'elaborazione dei segnali, Dispositivi e Sistemi Fotovoltaici.
- **PhD School:** "Electronics for IoT", SIE Associazione Società Italiana di Elettronica, Trieste, 5-7/7/2021.

#### Credits Summary:

| PhD Year | Courses | Seminars | Research | Tutoring / Supplementary Teaching |
|-----------------|---------|----------|----------|-----------------------------------|
| 1<sup>st</sup> | 32.9 | 10.7 | 23.6 | 0 |
| 2<sup>nd</sup> | 15 | 7.1 | 37.9 | 0 |
| 3<sup>rd</sup> | 0 | 0 | 60 | 0 |

### Research area

- A System on Chip (SoC) is a complex IC that integrates a CPU, on-chip memory, and programmable logic (FPGA) on a single chip.
- Applications: signal processing, communication, networking, automotive, storage system management.

#### Problems:

- Validation: system-level simulations, protocol interface analysis, etc.
- Design: hardware accelerators, firmware development, peripherals management, etc.

### Research results

- My research can be divided into two main activities:
- Validation and debug of complex SoC designs:
  - Implementation of a simulation environment for complex SoC devices adopting the Universal Verification Methodology (UVM).
  - Development of an innovative hardware emulation technique that introduces hardware in the loop into the validation of complex systems based on SoC platforms.
- Design of hardware accelerators for SoC devices:
  - Implementation of a novel Fast FIR Algorithm (FFA) for efficient hardware implementation of convolution.
  - Development of a hardware datapath for the acceleration of Convolutional Neural Networks (CNN).

# Research products

| | M. Vitone, N. Petra |
|------|-----------------------------------------------------------------------------------|
| [C1] | Reconfigurable Datapath for Hardware Acceleration of Convolutional Neural Network |
| | 52nd Annual Meeting of Associazione Società Italiana di Elettronica (SIE) |
| | Trieste, Italy, July 2021 |

# PhD thesis overview: validation and debug

#### Problem statement:

Validation of the Micron system (digital, complex, SoC + ASIC)

#### Objective:

- Validation using the Universal Verification Methodology (UVM)
- UVM architecture with reduced execution time
- Emulation architecture with hardware in the loop

Micron Architecture: **PC + SoC + ASIC + UFS**

#### **UVM ARCHITECTURE**

**Compliant with the UVM standard**

**Reduced simulation time**

- A novel save and restart technique for the Micron system:

| | No. of scenarios | No. of tests | Regression time | Simulation time saved (%) | Tests reduced (%) |
|----------|------------------|--------------|-----------------|---------------------------|-------------------|
| Micron | 82 | 82 | 14 h 39 m | - | - |
| Proposed | 82 | 58 | 11 h 48 m | 19.5 | 29.3 |

### PhD Thesis: emulation

#### Motivations:

- **Target coverage**: reachable by evaluating 20K scenarios.
- **Estimated simulation time**: 5 minutes for each scenario.
- Emulation allows achieving both goals.
- Experimental results at Micron Technology for the evaluation of 20K different scenarios:

| | No. of scenarios | Execution time | Target coverage |
|--------------------|------------------|----------------|-----------------|
| Software approach | 20 K | ~ 7 days | UNREACHABLE |
| Emulation approach | 20 K | ~ 5 minutes | ACHIEVED |

# PhD thesis overview: hardware accelerator for NN

Main operation: finite impulse response linear computation

$$y(n) = \sum_{i=0}^{k-1} x(n-i)h(i)$$

- Hardware implementation: multiply-accumulate (MAC)
- A typical neural network requires a few million MACs
- Parallel computation and reuse of partial values

### PhD thesis: hardware accelerator for NN

Alphabet chosen for shared partial values:

$$P_{\mathrm{typeI}}\left(\delta,\gamma,i\right) = \sum_{j_{1}=\delta}^{\delta+\gamma} x\left(k \cdot i - 1 - j_{1}\right) \cdot \sum_{j_{2}=k-(\delta+\gamma)-1}^{k-\delta-1} h\left(j_{2}\right)$$

$$P_{\mathrm{typeII}}\left(\delta,\gamma,i\right) = \left[x(k\cdot i-1-\delta) + x(k\cdot i-1-(\delta+\gamma))\right]\cdot \left[h(k-(\delta+\gamma)-1) + h(k-\delta-1)\right]$$

$$\delta, \gamma \in [0, k-1]$$

- Each partial value requires 1 multiplication.
- Several alphabets are possible: an algorithmic search finds a reduced-size alphabet for each filter size.

### PhD Thesis: hardware accelerator for NN

- Hardware implementation of the Fast FIR Algorithm applied to AlexNet.
- **Test chip** fabricated in TSMC 28 nm CMOS technology.
- Reconfigurable hardware accelerator: the same accelerator serves different network layers.
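
As an illustration of how reusing partial values cuts the multiplication count, the following minimal Python sketch implements the classic 2-parallel fast FIR decomposition. It is not the alphabet-based algorithm of the thesis. The function names `fir_direct` and `fir_ffa2`, the zero-padding convention for negative sample indices, and the way multiplications are counted are all assumptions made for this example.

```python
def fir_direct(x, h):
    """Direct form y(n) = sum_i x(n-i) h(i), assuming x(n) = 0 for n < 0.

    Returns the output samples and the number of multiplications issued
    (one per tap per output, as a fixed MAC array would perform).
    """
    y, mults = [], 0
    for n in range(len(x)):
        acc = 0
        for i, hi in enumerate(h):
            sample = x[n - i] if n - i >= 0 else 0
            acc += sample * hi
            mults += 1
        y.append(acc)
    return y, mults


def fir_ffa2(x, h):
    """2-parallel fast FIR (hypothetical helper for this sketch).

    Splits input and filter into even/odd phases and computes two outputs
    per step from three half-length sub-filters instead of four.
    """
    if len(x) % 2 or len(h) % 2:
        raise ValueError("this sketch assumes even-length x and h")
    x0, x1 = x[0::2], x[1::2]          # even and odd input phases
    h0, h1 = h[0::2], h[1::2]          # even and odd filter phases
    xs = [a + b for a, b in zip(x0, x1)]
    hs = [a + b for a, b in zip(h0, h1)]

    a, ma = fir_direct(x0, h0)         # H0 * X0
    b, mb = fir_direct(x1, h1)         # H1 * X1
    c, mc = fir_direct(xs, hs)         # (H0 + H1) * (X0 + X1)

    y = []
    for m in range(len(x0)):
        b_prev = b[m - 1] if m > 0 else 0
        y.append(a[m] + b_prev)        # y(2m)   = H0*X0 + delayed H1*X1
        y.append(c[m] - a[m] - b[m])   # y(2m+1) = H0*X1 + H1*X0
    return y, ma + mb + mc


if __name__ == "__main__":
    x = [3, -1, 4, 1, -5, 9, 2, -6]
    h = [2, 7, -1, 8]
    y_ref, m_ref = fir_direct(x, h)
    y_fast, m_fast = fir_ffa2(x, h)
    print("outputs match:", y_fast == y_ref)
    print("multiplications (direct, fast):", m_ref, m_fast)
```

Under this counting convention the fast version uses three half-length sub-filters where the direct form effectively needs four, so it issues 3/4 of the multiplications at the cost of a few extra additions. The products `a` and `b` are the shared partial values: each is computed once and reused for both output phases.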
# PhD Thesis: Test chip layout

| Technology | TSMC CMOS 28 nm |
|------------------------|-----------------|
| Area (mm²) | 0.152 |
| Power dissipation (mW) | 68.24 |
| Frequency (MHz) | 500 |

| Layer | Kernel | Multiplications required by standard convolution | Multiplications required by proposed algorithm | Proposed datapath latency (clock cycles) |
|-------|--------|--------------------------------------------------|------------------------------------------------|------------------------------------------|
| 1 | 11×11 | 105×10<sup>6</sup> | 76×10<sup>6</sup> | 0.8×10<sup>6</sup> |
| 2 | 5×5 | 223×10<sup>6</sup> | 143×10<sup>6</sup> | 1.75×10<sup>6</sup> |
| 3 | 3×3 | 149×10<sup>6</sup> | 115×10<sup>6</sup> | 1.35×10<sup>6</sup> |
| 4 | 3×3 | 112×10<sup>6</sup> | 86×10<sup>6</sup> | 1×10<sup>6</sup> |
| 5 | 3×3 | 74×10<sup>6</sup> | 57×10<sup>6</sup> | 0.7×10<sup>6</sup> |

### Thanks for the attention!