THESIS
2022
1 online resource (xi, 637 pages) : color illustrations
Abstract
Guided by optimization directives, high-level synthesis (HLS), the part of the EDA
flow that compiles behavioral specifications written in high-level languages into
register-transfer-level structures, eases the design of hardware accelerators of
ever-growing functional and structural complexity. To deploy large modern applications
while maintaining chip yield in fabrication, larger FPGAs composed of multiple dies have
emerged. However, two factors
impede the implementation of a high-performance accelerator on a multi-die FPGA. On
the one hand, wires crossing die boundaries incur excessively long delays and reduce the
achievable frequency. On the other hand, traditional automatic design-space exploration
(DSE) of high-performance directive settings for an HLS design ignores the device
topology, whereas a multi-die FPGA imposes separate resource constraints on each die.
In this work, we propose floorplan-aware DSE, a directive-floorplan co-searching framework
for efficiently finding the minimum-latency directive settings and the corresponding
floorplan constraints that guide the placement and routing of an HLS design.
We first formulate the directive-floorplan co-searching problem as a variant of the
multi-choice multi-dimensional bin-packing problem. Then, we model the latency of
dataflow and non-dataflow functions in HLS designs and propose a latency-bottleneck-guided
greedy algorithm for directive DSE. To legalize the result of directive DSE, we
implement an efficient incremental floorplan-updating algorithm, which applies the
best-fit strategy from online bin packing, together with an offline repacking algorithm
that compacts the floorplan, followed by several heuristics in DSE.
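The best-fit step mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: the resource names (`LUT`, `DSP`), the slack metric (sum of normalized leftover capacity), and the function names `fits`, `best_fit_die`, and `place` are all assumptions chosen for illustration. Each die is treated as a multi-dimensional bin, and a function is assigned to the feasible die that would be left with the least slack.

```python
from typing import Dict, List, Optional

# Hypothetical resource vector, e.g. {"LUT": 30, "DSP": 2}.
Resources = Dict[str, int]


def fits(demand: Resources, free: Resources) -> bool:
    """True if every demanded resource is available on the die."""
    return all(amount <= free.get(kind, 0) for kind, amount in demand.items())


def slack_after(demand: Resources, free: Resources, capacity: Resources) -> float:
    """Assumed metric: leftover capacity after placement, normalized per resource."""
    return sum((free[kind] - demand.get(kind, 0)) / capacity[kind] for kind in capacity)


def best_fit_die(demand: Resources, free_per_die: List[Resources],
                 capacity: Resources) -> Optional[int]:
    """Index of the feasible die with the least leftover slack, or None."""
    best_index: Optional[int] = None
    best_slack = float("inf")
    for index, free in enumerate(free_per_die):
        if not fits(demand, free):
            continue
        slack = slack_after(demand, free, capacity)
        if slack < best_slack:
            best_index, best_slack = index, slack
    return best_index


def place(demand: Resources, free_per_die: List[Resources],
          capacity: Resources) -> Optional[int]:
    """Place one function incrementally and update the chosen die's free resources."""
    die = best_fit_die(demand, free_per_die, capacity)
    if die is not None:
        for kind, amount in demand.items():
            free_per_die[die][kind] -= amount
    return die


if __name__ == "__main__":
    capacity = {"LUT": 100, "DSP": 10}
    free = [{"LUT": 100, "DSP": 10}, {"LUT": 40, "DSP": 4}]
    print(place({"LUT": 30, "DSP": 2}, free, capacity))
    print(free)
```

A `None` result corresponds to the case in which no die can host the function under its current directive setting. In a co-search of the kind the abstract describes, that is presumably where repacking or a different directive choice would be tried, although this sketch does not model either.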
On three assembled benchmarks, floorplan-aware DSE, assisted by incremental
floorplanning, reaches or nearly reaches the minimum achievable latency on the Pareto frontier
164x to 597x faster than mixed-integer linear programming (MILP), while
yielding a similar maximum achievable frequency after implementation on a multi-die
FPGA.