Models and Methods for the Design of Massively Parallel (VLSI) Systems:
The PARO Design System Project
It is a well-known observation that 90% of the execution time of
high-performance applications is spent in nested loop programs, which offer
tremendous potential for acceleration due to their inherent parallelism.
Numerous applications from fields such as signal processing, medical imaging,
and financial computing require high-performance computing. FPGAs offer huge
amounts of resources for the realization of massively parallel hardware accelerators.
The major goal of the PARO tool developed
at the University of Erlangen-Nuremberg is the automatic generation of
hardware accelerators for FPGAs from algorithm descriptions (especially nested loops).
The methodology for hardware generation is based on the intuitive and
efficient parallelization in the polytope model.
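As an illustration of scheduling in the polytope model, the sketch below treats
the iterations of a two-dimensional loop nest as integer points in a rectangular
iteration space and assigns each point a start time with a linear schedule vector.
It is not PARO code: the dependence vectors, the loop bounds, and the function
names `is_valid_schedule` and `wavefronts` are assumptions chosen for this example.
A schedule vector is accepted if every dependence is delayed by at least one time
step; iterations that share a start time can run in parallel.

```python
from collections import defaultdict
from itertools import product

# Assumed uniform dependence vectors of a 2-D loop nest (illustrative only):
# iteration (i, j) needs results from (i, j-1), (i-1, j) and (i-1, j-1).
DEPENDENCES = [(0, 1), (1, 0), (1, 1)]


def is_valid_schedule(lam, dependences):
    """A linear schedule lam is valid if lam . d >= 1 for every dependence d."""
    return all(
        sum(l * d for l, d in zip(lam, dep)) >= 1 for dep in dependences
    )


def wavefronts(lam, bounds):
    """Group all points of the rectangular iteration space by start time lam . x."""
    fronts = defaultdict(list)
    for point in product(*(range(b) for b in bounds)):
        start_time = sum(l * x for l, x in zip(lam, point))
        fronts[start_time].append(point)
    return dict(sorted(fronts.items()))


if __name__ == "__main__":
    lam = (1, 1)
    print("valid:", is_valid_schedule(lam, DEPENDENCES))
    for t, points in wavefronts(lam, (4, 3)).items():
        print(f"t={t}: {points}")
```

With the schedule vector (1, 1), the 12 iterations of the 4 x 3 example are
covered in 6 time steps, whereas (1, 0) is rejected because it would start
dependent iterations at the same time.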
The design trajectory of PARO is shown below.
Starting from an algorithm description, a sequence of high-level
source-to-source transformations, scheduling, and RTL generation is applied to
obtain a hardware accelerator in the form of a processor array.
The novelty of the PARO tool design flow is summarized as follows:
- For design entry, a new language (PAULA) is used to describe algorithms in a
dataflow-based style as communicating nested loops. The language can be used for
both behavioral and architecture description.
- New multilevel partitioning techniques for balancing memory hierarchies and
communication requirements. Several other advanced transformations such as
localization, as well as standard compiler optimizations such as common
subexpression elimination, are also available in the high-level transformation
toolbox for obtaining algorithm descriptions that are amenable in terms of
data reuse and resource usage.
- Scheduling based on Mixed Integer Linear Programming (MILP) with modeling possibilities
of resource constraints, speculative execution, software and functional
- Functional simulation and ModelSim simulation at different levels of the design flow
- The PARO backend produces an intermediate RTL
representation which is retargeted to VHDL. Also an automatic testbench generation
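The localization transformation named in the list above can be sketched on a
small FIR filter. The first function computes each output with a SUM-style
reduction in which every iteration reads the shared inputs directly. The second
rewrites it so that each iteration (i, j) only reads values from neighbouring
iterations, which is the kind of form that maps onto a processor array with local
interconnect. This is a hand-written Python illustration under assumed names
(`fir_reduction`, `fir_localized`); it is neither PAULA syntax nor output of PARO.

```python
def fir_reduction(a, u):
    """Reference form: y[i] = SUM over j of a[j] * u[i - j] (zero for i - j < 0)."""
    taps = len(a)
    return [
        sum(a[j] * u[i - j] for j in range(taps) if i - j >= 0)
        for i in range(len(u))
    ]


def fir_localized(a, u):
    """Localized form: every value is propagated between neighbouring iterations."""
    n, taps = len(u), len(a)
    A = [[0] * taps for _ in range(n)]  # coefficient, propagated along i
    U = [[0] * taps for _ in range(n)]  # input sample, propagated along (1, 1)
    Y = [[0] * taps for _ in range(n)]  # partial sum, propagated along j
    for i in range(n):
        for j in range(taps):
            # Coefficient enters at the border i == 0, then moves to i + 1.
            A[i][j] = a[j] if i == 0 else A[i - 1][j]
            # Sample enters at the border j == 0, then moves diagonally.
            if j == 0:
                U[i][j] = u[i]
            elif i == 0:
                U[i][j] = 0  # sample with negative index, treated as zero
            else:
                U[i][j] = U[i - 1][j - 1]
            # Partial sum accumulates along j.
            previous = 0 if j == 0 else Y[i][j - 1]
            Y[i][j] = previous + A[i][j] * U[i][j]
    # The completed sums leave the array at the border j == taps - 1.
    return [Y[i][taps - 1] for i in range(n)]


if __name__ == "__main__":
    coefficients, samples = [1, 2, 3], [1, 0, 0, 2]
    print(fir_reduction(coefficients, samples))
    print(fir_localized(coefficients, samples))
```

Both functions return the same result. In the localized version the only data
dependences are the constant vectors (1, 0), (1, 1), and (0, 1), so no iteration
needs global access to `a` or `u`.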
State-of-the-art medical imaging applications from industry have been successfully
synthesized with PARO.
The advantages of the tool compared to other high-level synthesis tools are as follows:
- The PAULA language enables a compact and intuitive representation of a given
algorithm (big operators/reductions such as SUM, PROD, MIN, MAX).
- Optimal hardware generation in terms of performance leveraging the available
parallelism in the algorithm with respect to resource constraints.
- Automatic generation of an I/O interface for the hardware accelerator for
integration into a System-on-Chip on an FPGA.
Department of Computer Science
University of Erlangen-Nuremberg
91058 Erlangen, Germany