THESIS
2022
1 online resource (xii, 132 pages) : illustrations (some color)
Abstract
Most electronic design automation (EDA) solutions are designed for general
scenarios, so when designers aim for the best possible power, performance and area
(PPA) of a specific design, the general solutions might be insufficient. Moreover,
optimizing a practical design requires inter-step techniques. Therefore, in this
thesis, we propose a chain of open-source frameworks, ranging from high-level synthesis
to mixed-size FPGA placement, that provide optimization solutions for specific
applications. These frameworks realize local optimization techniques throughout the
flow and pave the way for comprehensive integration in the future.
During the early stage of large-scale digital circuit design, high-level synthesis
(HLS) tools allow designers to describe hardware designs in high-level languages,
e.g., C/C++, for fast system evaluation, where the parameters of the system, e.g.,
local memory bandwidth, computation parallelism and clock frequency, can be tuned to
realize design goals. One typical type of digital design is the dataflow accelerator,
which consists of modules processing data in a task-level pipeline. Targeting these
applications, we develop an HLS framework that integrates a compiler for application
analysis and automatically chooses the parameters of each system module, including
clock frequencies, to maximize throughput.
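To make the tuning goal concrete, the following is a minimal sketch of why per-module parameters matter in a task-level pipeline: under the simplifying assumption that each module accepts one item every fixed number of cycles, the steady-state throughput is bounded by the slowest module. This is not the thesis's model; the `Module` class, its fields and the example numbers are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    """Hypothetical description of one pipeline module (illustration only)."""
    name: str
    clock_mhz: float        # clock frequency assigned to this module
    cycles_per_item: float  # cycles between two consecutive items


def module_rate(m: Module) -> float:
    """Items processed per microsecond, i.e. millions of items per second."""
    return m.clock_mhz / m.cycles_per_item


def pipeline_throughput(modules: List[Module]) -> float:
    """Steady-state throughput is limited by the slowest module."""
    return min(module_rate(m) for m in modules)


def bottleneck(modules: List[Module]) -> Module:
    """Return the module that limits the pipeline."""
    return min(modules, key=module_rate)


# Made-up example values, not taken from the thesis.
pipeline = [
    Module("load", clock_mhz=200.0, cycles_per_item=1.0),
    Module("compute", clock_mhz=300.0, cycles_per_item=4.0),
    Module("store", clock_mhz=100.0, cycles_per_item=1.0),
]
```

In this made-up example, `compute` limits the pipeline, so raising the clock of `load` or `store` would not help; a tuner would instead raise the parallelism or the clock of `compute`.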
In high-level designs, we observe many recurring design patterns introduced
by both human engineers and EDA tools. For these frequently occurring local circuits,
commonly used standard cell libraries might be insufficient to reach the optimal PPA.
Accordingly, we present AutoCellLibX, a circuit pattern mining engine for post-logic-synthesis
netlists that determines whether custom standard cells can replace some local
circuits and minimize the area demand of a design. The mining engine includes
a frequent-subgraph mining algorithm that accounts for the constraints of using
custom standard cells, and an effective evaluation flow that identifies promising
candidates of custom standard cells for a given digital design. The experiments show
that AutoCellLibX can generate a library extension for 31 benchmark designs and that the
resultant extension of the standard cell library reduces design area by 4.49% on average.
However, the benefits of custom standard cells depend heavily on the back-end design
flow, including placement and routing. In this thesis, we take FPGAs as our initial target
for the related placement problems, because the netlists generated by commercial FPGA
logic synthesis tools for the latest FPGA architectures contain many mixed-size instances
with shape constraints, which cannot be handled by previous works. Therefore,
we develop AMF-Placer 1.0, the first open-source wirelength-driven FPGA placer
that can efficiently place the complex FPGA netlists with mixed-size
elements generated by modern commercial tools. Based on a set of the latest large
open-source benchmarks from various domains for Xilinx UltraScale FPGAs, experimental
results indicate that AMF-Placer 1.0 can improve half-perimeter wirelength (HPWL) by
20.4%-89.3% and reduce runtime by 8.0%-84.2%, compared to the baseline.
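For readers unfamiliar with the wirelength metric reported above, the following is a minimal sketch of the standard half-perimeter wirelength (HPWL) definition: for each net, the width plus the height of the bounding box of its pins, summed over all nets. It is not code from AMF-Placer; the function names, the data layout and the toy coordinates are assumptions made for illustration, and each cell is treated as a single point.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def net_hpwl(pins: List[Point]) -> float:
    """Half perimeter of the bounding box enclosing the pins of one net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Dict[str, List[str]], placement: Dict[str, Point]) -> float:
    """Sum of per-net HPWL; nets with fewer than two cells contribute nothing."""
    return sum(
        net_hpwl([placement[cell] for cell in cells])
        for cells in nets.values()
        if len(cells) > 1
    )


# Toy placement and netlist (made-up values, not from the thesis benchmarks).
placement = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
nets = {"n1": ["a", "b"], "n2": ["a", "b", "c"]}
```

A wirelength-driven placer searches for cell locations that reduce this sum while respecting legality constraints such as the shape constraints of mixed-size instances.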
Furthermore, although clock frequency is a core parameter of a digital design,
optimizing it poses many challenges for real-world application designs and large-scale FPGA
architectures. Accordingly, we propose AMF-Placer 2.0, a timing-driven mixed-size FPGA
placer built upon AMF-Placer 1.0. AMF-Placer 2.0 is equipped with
a series of new techniques from global placement to detailed placement. Experimental
results indicate that the critical path delays realized by AMF-Placer 2.0 are on average 2.3%
and 0.69% higher than those achieved by the commercial tool Xilinx Vivado 2020.2 and 2021.2,
respectively, with about 10% lower runtime. AMF-Placer 2.0 is the first open-source FPGA
placer that can handle the timing-driven mixed-size placement of practical complex
designs with various FPGA resources and achieve quality competitive with
the latest commercial tools.