
Models and Methods for the Design of Massively Parallel (VLSI) Systems:

The PARO Design System Project


Abstract

It is well known that 90% of the execution time of high-performance applications is spent in nested loop programs, which offer tremendous potential for acceleration due to their inherent parallelism. Numerous applications from fields such as signal processing, medical imaging, and financial computing require high-performance computing. FPGAs offer huge amounts of resources for realizing massively parallel hardware accelerators. The major goal of the PARO tool, developed at the University of Erlangen-Nuremberg, is the automatic generation of hardware accelerators for FPGAs from algorithm descriptions (especially nested loops). The methodology for hardware generation is based on intuitive and efficient parallelization in the polytope model.
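To make the polytope-model view concrete, the following minimal sketch treats a nested-loop FIR filter as a set of integer points inside a polytope described by linear inequalities. It is an illustration only and not PARO or PAULA code; the function names (`fir_reference`, `fir_iteration_space`, `fir_over_polytope`) and the choice of the FIR filter as an example are assumptions made here.

```python
from itertools import product

def fir_reference(x, w):
    """Plain nested-loop FIR filter: y[i] = sum over j of w[j] * x[i - j]."""
    n, taps = len(x), len(w)
    y = [0] * n
    for i in range(n):
        for j in range(taps):
            if i - j >= 0:
                y[i] += w[j] * x[i - j]
    return y

def in_polytope(point, constraints):
    """True if row . point <= bound holds for every (row, bound) pair."""
    return all(sum(a * p for a, p in zip(row, point)) <= bound
               for row, bound in constraints)

def fir_iteration_space(n, taps):
    """Integer points (i, j) of the polytope 0 <= j < taps, j <= i < n."""
    constraints = [
        ((-1, 1), 0),        # j - i <= 0, i.e. x[i - j] exists
        ((1, 0), n - 1),     # i <= n - 1
        ((0, -1), 0),        # j >= 0
        ((0, 1), taps - 1),  # j <= taps - 1
    ]
    box = product(range(n), range(taps))
    return [p for p in box if in_polytope(p, constraints)]

def fir_over_polytope(x, w):
    """Same computation, driven by the iteration-space points."""
    y = [0] * len(x)
    for i, j in fir_iteration_space(len(x), len(w)):
        y[i] += w[j] * x[i - j]
    return y
```

Each point of the iteration space stands for one multiply-accumulate operation. Because the result only depends on which points are visited, not on the order of the `for` loops, the points can in principle be distributed over many processing elements.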

The design trajectory of PARO is shown below.

[Figure: PARO design flow]
Starting from an algorithm description, a sequence of high-level source-to-source transformations, scheduling, and RTL generation is applied to obtain a hardware accelerator in the form of a processor array.

The novelty of the PARO tool design flow is summarized as follows:

  • For design entry, a new language (PAULA) is used to describe algorithms in a dataflow-based manner as communicating nested loops. The language can be used for both behavioral and architecture description.
  • New multilevel partitioning techniques balance memory hierarchies and communication requirements. Further advanced transformations such as localization, as well as standard compiler optimizations such as common subexpression elimination, are available in the high-level transformation toolbox to obtain algorithm descriptions that are favorable in terms of data reuse and resource usage.
  • Scheduling is based on Mixed Integer Linear Programming (MILP) and can model resource constraints, speculative execution, and software and functional pipelining.
  • Functional simulation and ModelSim simulation are available at different levels of the design flow for validation.
  • The PARO backend produces an intermediate RTL representation, which is retargeted to VHDL. Automatic testbench generation is also included.
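The scheduling step listed above can be illustrated with a small sketch of a linear schedule, where iteration point I starts at time lambda . I. The sketch searches a small range of integer schedule vectors by brute force instead of solving an MILP, so it is not PARO's formulation; the function names, the rectangular iteration space, the unit delay per dependence, and the search bound are all assumptions made for illustration.

```python
from itertools import product

def schedule_latency(lam, extents):
    """Time steps needed to sweep a rectangular iteration space
    0 <= I_k < extents[k] when point I starts at time lam . I."""
    return 1 + sum(abs(l) * (n - 1) for l, n in zip(lam, extents))

def is_causal(lam, dependences, delay=1):
    """Every dependence vector d must satisfy lam . d >= delay, so that
    a value is produced before the iteration that consumes it starts."""
    return all(sum(l * d for l, d in zip(lam, dep)) >= delay
               for dep in dependences)

def best_linear_schedule(dependences, extents, bound=3):
    """Exhaustively search schedule vectors with entries in [-bound, bound]
    and return a causal one with minimal latency, or None if none exists."""
    candidates = product(range(-bound, bound + 1), repeat=len(extents))
    feasible = [lam for lam in candidates if is_causal(lam, dependences)]
    if not feasible:
        return None
    # Ties are broken lexicographically to keep the result deterministic.
    return min(feasible, key=lambda lam: (schedule_latency(lam, extents), lam))
```

For an 8 x 4 iteration space with dependence vectors (1, 0) and (0, 1), the call `best_linear_schedule([(1, 0), (0, 1)], (8, 4))` yields the schedule vector (1, 1), which needs 11 time steps. An MILP formulation can add resource constraints to the same causality conditions, which the brute-force search here does not model.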

The advantages of the tool compared to other high-level synthesis tools lie in:

  • The PAULA language enables a compact and intuitive representation of a given algorithm (big operators/reductions such as SUM, PROD, MIN, MAX).
  • Optimal hardware generation in terms of performance, leveraging the parallelism available in the algorithm subject to resource constraints.
  • Automatic generation of an I/O interface for the hardware accelerator for integration into a System-on-Chip on an FPGA.
State-of-the-art medical imaging applications from industry have been successfully synthesized with PARO.
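
As an illustration of what a big operator such as SUM abbreviates, the sketch below writes a matrix-vector product once as a compact reduction and once as an expanded recurrence in which every intermediate value is assigned exactly once. This is plain Python and not PAULA syntax; the function names and the matrix-vector example are assumptions chosen for illustration.

```python
def matvec_reduction(a, x):
    """Compact form: one SUM reduction per output element."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def matvec_single_assignment(a, x):
    """Expanded form: each partial sum s[i][j] is written exactly once
    and depends only on its neighbour s[i][j - 1]."""
    rows, cols = len(a), len(x)
    s = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            previous = s[i][j - 1] if j > 0 else 0
            s[i][j] = previous + a[i][j] * x[j]
    # The last partial sum of each row is the reduction result.
    return [s[i][cols - 1] for i in range(rows)]
```

Both functions compute the same result for a non-empty `x`. The expanded form makes the chain of dependences between neighbouring iterations explicit, which is the kind of information a mapping onto a processor array has to work with.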


Contact

Frank Hannig
Hardware/Software Co-Design
Department of Computer Science
University of Erlangen-Nuremberg
Cauerstr. 11
91058 Erlangen, Germany

Publications

2016
98 V. Bhadouria, A. Tanase, M. Schmid, F. Hannig, J. Teich and D. Ghoshal.
A Novel Image Impulse Noise Removal Algorithm Optimized for Hardware Accelerators.
Journal of Signal Processing Systems, 2016. ©1
[doi>10.1007/s11265-016-1187-5]
97 M. Witterauf, A. Tanase, F. Hannig and J. Teich.
Modulo Scheduling of Symbolically Tiled Loops for Tightly Coupled Processor Arrays.
In Proceedings of the 27th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 58-66, IEEE, London, United Kingdom, July 6-8, 2016. ©3
[doi>10.1109/ASAP.2016.7760773]
96 D. Koch, F. Hannig and D. Ziener.
FPGAs for Software Programmers.
327 pages, Springer, 2016, ISBN 978-3-319-26406-6. ©1
[doi>10.1007/978-3-319-26408-0]
95 F. Hannig.
A Quick Tour of High-Level Synthesis Solutions for FPGAs.
In Dirk Koch, Frank Hannig, and Daniel Ziener, editors, FPGAs for Software Programmers, chapter 3, pp. 49-59. Springer, 2016. ©1
[doi>10.1007/978-3-319-26408-0_3]
94 A. Tanase, M. Witterauf, E. Sousa, V. Lari, F. Hannig and J. Teich.
LoopInvader: A Compiler for Tightly Coupled Processor Arrays.
Presentation at the University Booth at Design, Automation and Test in Europe (DATE), Dresden, Germany, March 14-18, 2016. ©1
2015
93 V. Lari, J. Teich, A. Tanase, M. Witterauf, F. Khosravi and B. Meyer.
Techniques for On-Demand Structural Redundancy for Massively Parallel Processor Arrays.
Journal of Systems Architecture (JSA), 61(10):615–627, 2015. ©1
[doi>10.1016/j.sysarc.2015.10.004]
92 A. Tanase, M. Witterauf, J. Teich and F. Hannig.
Symbolic Loop Parallelization for Balancing I/O and Memory Accesses on Processor Arrays.
In Proceedings of the 13th ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE), pp. 188-197, IEEE, Austin, TX, USA, September 21-23, 2015. ©1
[doi>10.1109/MEMCOD.2015.7340486]
91 F. Hannig, D. Koch and D. Ziener.
Proceedings of the Second International Workshop on FPGAs for Software Programmers (FSP 2015).
104 pages, London, United Kingdom, 2015. arXiv: 1508.06320 [cs.AR]. ©1
90 A. Tanase, M. Witterauf, J. Teich, F. Hannig and V. Lari.
On-Demand Fault-Tolerant Loop Processing on Massively Parallel Processor Arrays.
In Proceedings of the 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 194-201, Toronto, Canada, July 27-29, 2015. ©1
[doi>10.1109/ASAP.2015.7245734]
89 M. Witterauf, A. Tanase, J. Teich, V. Lari, A. Zwinkau and G. Snelting.
Adaptive Fault Tolerance through Invasive Computing.
In Proceedings of the 2015 NASA/ESA Conference on Adaptive Hardware and Systems, pp. 1-8, Montreal, Canada, June 15-18, 2015. ©1
[doi>10.1109/AHS.2015.7231155]
88 V. Lari, A. Tanase, J. Teich, M. Witterauf, F. Khosravi, F. Hannig and B. Meyer.
A Co-Design Approach for Fault-Tolerant Loop Execution on Coarse-Grained Reconfigurable Arrays.
In Proceedings of the 2015 NASA/ESA Conference on Adaptive Hardware and Systems, pp. 1-8, Montreal, Canada, June 15-18, 2015. ©1
[doi>10.1109/AHS.2015.7231157]
2014
87 A. Tanase, M. Witterauf, J. Teich and F. Hannig.
Symbolic Inner Loop Parallelisation for Massively Parallel Processor Arrays.
In Proceedings of the 12th ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE), pp. 219-228, Lausanne, Switzerland, October 19-21, 2014. ©3
[doi>10.1109/MEMCOD.2014.6961865]
86 J. Teich, A. Tanase and F. Hannig.
Symbolic Mapping of Loop Programs onto Processor Arrays.
Journal of Signal Processing Systems, 77(1-2):31-59, 2014. ©1
[doi>10.1007/s11265-014-0905-0]
85 F. Hannig, V. Lari, S. Boppu, A. Tanase and O. Reiche.
Invasive Tightly-Coupled Processor Arrays: A Domain-Specific Architecture/Compiler Co-Design Approach.
ACM Transactions on Embedded Computing Systems (TECS), 2014. ©1
[doi>10.1145/2584660]
84 M. Schmid, A. Tanase, V. Bhadouria, F. Hannig, J. Teich and D. Ghoshal.
Domain-Specific Augmentations for High-Level Synthesis.
In Proceedings of the 25th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 173-177, IEEE, Zurich, Switzerland, Jun. 18-20, 2014. ©1
[doi>10.1109/ASAP.2014.6868653]
83 M. Schmid, F. Hannig, A. Tanase and J. Teich.
High-Level Synthesis Revised - Generation of FPGA Accelerators from a Domain-Specific Language using the Polyhedron Model.
In Parallel Computing: Accelerating Computational Science and Engineering (CSE), volume 25 of Advances in Parallel Computing, pp. 497-506, IOS Press, 2014. ©1
[doi>10.3233/978-1-61499-381-0-497]
2013
82 M. Schmid, M. Blocherer, F. Hannig and J. Teich.
Real-Time Range Image Preprocessing on FPGAs.
In Proceedings of the International Conference on Reconfigurable Computing and FPGAs (ReConFig), pp. 1-8, Cancun, Mexico, Dec. 9-11, 2013. ©1
81 F. Hannig.
High-Level Synthesis Revised: Generation of FPGA Accelerators from a Domain-Specific Language using the Polyhedron Model.
Keynote at Mini-Symposium on Parallel Computing with FPGAs (ParaFPGA) in conjunction with International Conference on Parallel Computing (ParCo), Munich, Germany, Sep. 10, 2013. ©1
80 E. Sousa, A. Tanase, F. Hannig and J. Teich.
A Prototype of an Adaptive Computer Vision Algorithm on MPSoC Architecture.
Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP), pp. 361-362, Cagliari, Italy, Oct. 8-10, 2013. ©1
79 E. Sousa, A. Tanase, F. Hannig and J. Teich.
Accuracy and Performance Analysis of Harris Corner Computation on Tightly-Coupled Processor Arrays.
Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP), pp. 88-95, Cagliari, Italy, Oct. 8-10, 2013. ©1
78 J. Teich, A. Tanase and F. Hannig.
Symbolic Parallelization of Loop Programs for Massively Parallel Processor Arrays.
Proceedings of the 24th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 9 p., IEEE, Washington, D.C., USA, Jun. 5-7, 2013 [Best Paper Award]. ©3
77 S. Boppu, F. Hannig and J. Teich.
Loop Program Mapping and Compact Code Generation for Programmable Hardware Accelerators.
Proceedings of the 24th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 10–17. IEEE, Washington, D.C., USA, Jun. 5-7, 2013. ©3
76 F. Hannig, M. Schmid, V. Lari, S. Boppu and J. Teich.
System Integration of Tightly-Coupled Processor Arrays using Reconfigurable Buffer Structures.
Proceedings of the ACM International Conference on Computing Frontiers (CF), Ischia, Italy, May 14-16, 2013. ©1
75 E. Sousa, A. Tanase, V. Lari, F. Hannig, J. Teich, J. Paul, W. Stechele, M. Kröhnert and T. Asfour.
Acceleration of Optical Flow Computations on Tightly-Coupled Processor Arrays.
Proceedings of the 25th Workshop on Parallel Systems and Algorithms (PARS), pp. 80-89, Erlangen, Germany, April 11-12, 2013, volume 30 of Mitteilungen - Gesellschaft für Informatik e. V., Parallel-Algorithmen und Rechnerstrukturen, Gesellschaft für Informatik e. V., 2013. ©1
74 F. Hannig.
Resource-Aware Computing on Domain-Specific Accelerators.
Keynote Talk, In Proceedings of the 10th Workshop on Optimizations for DSP and Embedded Systems (ODES), 35 p., Shenzhen, China, Feb. 24, 2013. ©1
[doi>10.1145/2443608.2443616]
2011
73 J. Wasza, S. Bauer, S. Haase, M. Schmid, S. Reichert and J. Hornegger.
RITK: The Range Imaging Toolkit – A Framework for 3-D Range Image Stream Processing.
In Proceedings of the International Workshop on Vision, Modeling and Visualization (VMV), pp. 57-64, Berlin, Oct. 2011. ©1
72 J. Cavallaro, M. Ercegovac, F. Hannig, P. Ienne, E. Swartzlander, Jr. and A. Tenca.
Proceedings of the 22nd IEEE International Conference on Application-specific Systems, Architectures, and Processors (ASAP).
IEEE Computer Society, 2011, ISBN 978-1-4577-1292-0. ©1
71 J. Teich, J. Henkel, A. Herkersdorf, D. Schmitt-Landsiedel, W. Schröder-Preikschat and G. Snelting.
Invasive Computing: An Overview.
In Multiprocessor System-on-Chip, M. Hübner and J. Becker (Eds.), Chapter 11, pages 241-268, Springer, 2011. ©1
2010
70 F. Hannig, M. Schmid, J. Teich and H. Hornegger.
A Deeply Pipelined and Parallel Architecture for Denoising Medical Images.
In Proceedings of the IEEE International Conference on Field Programmable Technology (FPT), pp. 485-490, Beijing, China, December 8-10, 2010. ©3
69 F. Hannig.
Communication Synthesis of Loop Accelerator Pipelines.
Talk, Workshop on Compiler-Assisted System-On-Chip Assembly (CASA), Embedded Systems Week (ESWEEK), Scottsdale, AZ, USA, October 28, 2010. ©1
68 F. Hannig.
Retargetable Mapping of Loop Programs on Coarse-grained Reconfigurable Arrays.
Talk, International Conference on Hardware-Software Codesign and System Synthesis (CODES+ISSS), Scottsdale, AZ, USA, October 26, 2010. ©1
67 T. Vander Aa, P. Raghavan, S. Mahlke, B. De Sutter, A. Shrivastava and F. Hannig.
Compilation Techniques for CGRAs: Exploring All Parallelization Approaches.
In Proceedings of the International Conference on Hardware-Software Codesign and System Synthesis (CODES+ISSS), pp. 185-186, Scottsdale, AZ, USA, October 24-29, 2010. ©1
66 J. Teich.
Invasive Computing - Basic Concepts and Foreseen Benefits.
Artist Network of Excellence on Embedded System Design Summer School Europe 2010, Autrans, France, September 7, 2010, Invited Tutorial. ©1
65 J. Teich.
Invasive Computing - An Overview.
Invited Talk, The University of Sydney, Australia, August 9, 2010. ©1
64 J. Teich.
Invasive Computing - A Novel Paradigm for Parallel Computing.
Presentation at the School of Computing, National University of Singapore (NUS), Singapore, August 6, 2010. ©1
63 F. Charot, F. Hannig, J. Teich and C. Wolinski.
Proceedings of the 21st IEEE International Conference on Application-specific Systems, Architectures, and Processors (ASAP).
IEEE Computer Society, 2010, ISBN 978-1-4244-6967-3. ©3
62 H. Dutta, F. Hannig, M. Schmid and J. Keinert.
Modeling and Synthesis of Communication Subsystems for Loop Accelerator Pipelines.
In Proceedings of the 21st IEEE International Conference on Application-specific Systems, Architectures, and Processors (ASAP), pp. 125-132, Rennes, France, July 7-9, 2010. ©2
61 J. Teich.
Invasive Computing - A Novel Parallel Computing Paradigm.
Invited Talk, Workshop Multiprocessor System-On-Chip (MPSOC): Programmability, Run-Time Support and Hardware Platforms for High Performance Applications, 47th Design Automation Conference (DAC), Anaheim, USA, June 13, 2010. ©1
60 J. Teich.
Invasives Rechnen.
Invited talk, 30th meeting of the steering committee of the special interest group RSS (Rechnergestützter Schaltungs- und Systementwurf), VDE, Frankfurt am Main, April 9, 2010. ©1
59 H. Dutta, F. Hannig and J. Teich.
PARO - A Design Tool for Synthesis of Hardware Accelerators for SoCs.
Tool Presentation at the University Booth at Design, Automation and Test in Europe (DATE), Dresden, Germany, March 8-12, 2010. ©1
2009
58 F. Hannig.
Scheduling Techniques for High-Throughput Loop Accelerators.
Dissertation, Hardware/Software Co-Design, Department of Computer Science, University of Erlangen-Nuremberg, Germany, August, 2009. ISBN 978-3-86853-220-3, Verlag Dr. Hut, Munich, Germany. ©1
57 H. Dutta, J. Zhai, F. Hannig and J. Teich.
Impact of Loop Tiling on the Controller Logic of Hardware Acceleration Engines.
Proceedings of 20th IEEE International Conference on Application-specific Systems, Architectures, and Processors (ASAP), pp. 161-168, Boston, MA, USA, July 7-9, 2009. ©1
56 J. Keinert, H. Dutta, F. Hannig, C. Haubelt and J. Teich.
Model-Based Synthesis and Optimization of Static Multi-Rate Image Processing Algorithms.
Proceedings of Design, Automation and Test in Europe (DATE 2009), IEEE Computer Society, Nice, France, April 20-24, 2009, pp. 135-140. ©1
55 F. Hannig, H. Dutta and J. Teich.
Parallelization Approaches for Hardware Accelerators - Loop Unrolling versus Loop Partitioning.
In Proceedings of the 22nd International Conference on Architecture of Computing Systems (ARCS), Delft, The Netherlands, pp. 16-27, March 10-13, 2009. ©1
54 H. Dutta, F. Hannig and J. Teich.
Performance Matching of Hardware Acceleration Engines for Heterogeneous MPSoC using Modular Performance Analysis.
In Proceedings of the 22nd International Conference on Architecture of Computing Systems (ARCS), Delft, The Netherlands, pp. 233-245, March 10-13, 2009. ©1
53 H. Dutta, D. Kissler, F. Hannig, A. Kupriyanov, J. Teich and B. Pottier.
A Holistic Approach for Tightly Coupled Reconfigurable Parallel Processors.
Microprocessors and Microsystems, 33(1):53-62, 2009. ©1
2008
52 J. Teich.
Invasive Algorithms and Architectures.
it - Information Technology, http://it-information-technology.de, Oldenbourg Wissenschaftsverlag, vol. 50(5):300-310, 2008. ©1
51 C. Wolinski, K. Kuchcinski, J. Teich and F. Hannig.
Area and Reconfiguration Time Minimization of the Communication Network in Regular 2D Reconfigurable Architectures.
Proceedings of the International Conference on Field Programmable Logic and Applications (FPL), pp. 391-396, Heidelberg, Germany, September 8-10, 2008. ©1
50 C. Wolinski, K. Kuchcinski, J. Teich and F. Hannig.
Communication Network Reconfiguration Overhead Optimization in Programmable Processor Array Architectures.
Proceedings of the 11th Euromicro Conference on Digital System Design (DSD), pp. 345-352, Parma, Italy, September 3-5, 2008. ©1
49 R. Schaffer, R. Merker, F. Hannig and J. Teich.
Utilization of all Levels of Parallelism in a Processor Array with Subword Parallelism.
Proceedings of the 11th Euromicro Conference on Digital System Design (DSD), pp. 391-398, Parma, Italy, September 3-5, 2008. ©1
48 H. Dutta, F. Hannig and J. Teich.
PARO: A Design Tool for Automatic Generation of Hardware Accelerators.
In Proceedings of ACACES 2008 Poster Abstracts: Advanced Computer Architecture and Compilation for Embedded Systems. ©1
47 C. Wolinski, K. Kuchcinski, J. Teich and F. Hannig.
Optimization of Routing and Reconfiguration Overhead in Programmable Processor Array Architectures.
In Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 306-309, Palo Alto, CA, USA, April 14-15, 2008. ©1
46 J. Teich.
Invasion - A New Parallel Computing and Architecture Paradigm.
Dagstuhl Seminar No. 08141, Organic Computing - Controlled Self-organization, IBFI, March 31- April 4, 2008. ©1
45 F. Hannig, H. Ruckdeschel, H. Dutta and J. Teich.
PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications.
Proceedings of the Fourth International Workshop on Applied Reconfigurable Computing (ARC), Lecture Notes in Computer Science (LNCS), pp. 287-293, Springer, London, United Kingdom, March 26-28, 2008. ©1
44 J. Teich, F. Hannig, H. Dutta, D. Kissler and M. Hartl.
Domain-Specific Reconfigurable MPSoC-Systems - Challenges and Trends.
Talk at Friday Workshop, Reconfigurable Hardware: Emerging Trade-Offs through Granularity, Heterogeneity and Mixed-Signal Capability in Actual and Future Architectures, Design, Automation and Test in Europe (DATE), Munich, Germany, March 10-14, 2008. ©1
43 H. Dutta, F. Hannig and J. Teich.
The PARO Design Tool for Automatic Generation of Hardware Accelerators.
Interactive Presentation at Friday Workshop, The New Wave of the High-Level Synthesis, Design, Automation and Test in Europe (DATE), Munich, Germany, March 10-14, 2008. ©1
42 F. Hannig, H. Ruckdeschel and J. Teich.
The PAULA Language for Designing Multi-Dimensional Dataflow-Intensive Applications.
In Proceedings of the GI/ITG/GMM-Workshop - Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen, pp. 129-138, Freiburg, Germany, March 3-5, 2008. ©1
41 F. Hannig, H. Dutta, H. Ruckdeschel and J. Teich.
Quantitative Evaluation of Behavioral Synthesis Approaches for Reconfigurable Devices.
In Proceedings of the 2nd HiPEAC Workshop on Reconfigurable Computing (WRC), pp. 73-82, Gothenburg, Sweden, January 27, 2008. ©1
2007
40 H. Dutta, F. Hannig, A. Kupriyanov, D. Kissler, J. Teich, R. Schaffer, S. Siegel, R. Merker and B. Pottier.
Massively Parallel Processor Architectures: A Co-design Approach.
Proceedings of the 3rd International Workshop on Reconfigurable Communication Centric System-on-Chips (ReCoSoC), pp. 61-68, Montpellier, France, June 18-20, 2007. ©1
39 J. Teich, F. Hannig, H. Ruckdeschel, H. Dutta, D. Kissler and A. Stravet.
A Unified Retargetable Design Methodology for Dedicated and Re-Programmable Multiprocessor Arrays: Case Study and Quantitative Evaluation.
In Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA), Invited paper, pp. 14-24, Las Vegas, NV, USA, June 25-28, 2007. ©1
38 H. Dutta, F. Hannig, H. Ruckdeschel and J. Teich.
Efficient Control Generation for Mapping Nested Loop Programs onto Processor Arrays.
In Journal of Systems Architecture, 53(5-6):300-309, 2007. ©1
2006
37 S. Siegel, R. Merker, F. Hannig and J. Teich.
Communication-conscious Mapping of Regular Nested Loop Programs onto Massively Parallel Processor Arrays.
In Proceedings of the 18th International Conference on Parallel and Distributed Computing and Systems (PDCS), pp. 71-76, Dallas, TX, USA, November 13-15, 2006. ©1
36 H. Dutta, F. Hannig and J. Teich.
Hierarchical Partitioning for Piecewise Linear Algorithms.
In Proceedings of the 5th International Symposium on Parallel Computing in Electrical Engineering (PARELEC), pp. 153-159, Bialystok, Poland, September 13-17, 2006. ©1
35 H. Dutta, F. Hannig, J. Teich, B. Heigl and H. Hornegger.
A Design Methodology for Hardware Acceleration of Adaptive Filter Algorithms in Image Processing.
In Proceedings of IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 331-337, Steamboat Springs, CO, USA, September 11-13, 2006. ©1
34 F. Hannig, H. Dutta and J. Teich.
Mapping a Class of Dependence Algorithms to Coarse-grained Reconfigurable Arrays: Architectural Parameters and Methodology.
In International Journal of Embedded Systems, Vol. 2, Nos. 1/2, pp. 114-127, 2006. ©1
33 H. Dutta, F. Hannig and J. Teich.
A Formal Methodology for Hierarchical Partitioning of Piecewise Linear Algorithms.
Technical Report 04-2006, University of Erlangen-Nuremberg, Department of CS 12, Hardware-Software-Co-Design, Am Weichselgarten 3, 91058 Erlangen, Germany, April 2006. ©1
32 H. Dutta, F. Hannig and J. Teich.
Controller Synthesis for Mapping Partitioned Programs on Array Architectures.
In Proceedings of the 19th International Conference on Architecture of Computing Systems (ARCS), Frankfurt/Main, Germany, pp. 176-191, March 13-16, 2006. ©1
31 H. Dutta, F. Hannig and J. Teich.
Mapping of Nested Loop Programs onto Massively Parallel Processor Arrays with Memory and I/O Constraints.
In Friedhelm Meyer auf der Heide and Burkhard Monien, editors, Proceedings of the 6th International Heinz Nixdorf Symposium, New Trends in Parallel & Distributed Computing, volume 181 of HNI-Verlagsschriftenreihe, pp. 97-119, Paderborn, Germany, January 17-18, 2006. ©1
2005
30 H. Dutta, F. Hannig and J. Teich.
Control Path Generation for Mapping Partitioned Dataflow-dominant Algorithms onto Array Architectures.
Technical Report 03-2005, University of Erlangen-Nuremberg, Department of CS 12, Hardware-Software-Co-Design, Am Weichselgarten 3, 91058 Erlangen, Germany, November 2005. ©1
29 H. Ruckdeschel, H. Dutta, F. Hannig and J. Teich.
Automatic FIR Filter Generation for FPGAs.
In Proceedings of the International Workshop on Embedded Computer Systems, Architectures, Modeling, and Simulation (SAMOS), Samos, Greece, pp. 51-61, July 18-20, 2005. ©1
28 F. Hannig and J. Teich.
Output Serialization for FPGA-based and Coarse-grained Processor Arrays.
In Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA), Las Vegas, NV, USA, pp. 78-84, June 27-30, 2005. ©1
27 F. Hannig, H. Dutta, A. Kupriyanov, J. Teich, R. Schaffer, S. Siegel, R. Merker, R. Keryell, B. Pottier, D. Chillet, D. Ménard and O. Sentieys.
Co-Design of Massively Parallel Embedded Processor Architectures.
In Proceedings of the first ReCoSoC Workshop. Montpellier, France, June 27-29, 2005. ©1
2004
26 M. Bednara.
Design Automation for Massively Parallel Processor Arrays: Transforming Regular Algorithms to Reconfigurable Hardware.
PhD Thesis, University of Erlangen-Nuremberg, Department of Computer Science 12, Erlangen, Germany, 2004. ©1
25 F. Hannig and J. Teich.
Resource Constrained and Speculative Scheduling of an Algorithm Class with Run-Time Dependent Conditionals.
In Proceedings of the 15th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2004), pp. 17-27, Galveston, TX, USA, September 27-29, 2004. ©1
24 A. Kupriyanov, F. Hannig and J. Teich.
Automatic and Optimized Generation of Compiled High-Speed RTL Simulators.
In Proceedings of the Workshop on Compilers and Tools for Constrained Embedded Systems (CTCES 2004). Washington, DC, U.S.A., September 22, 2004. ©1
23 F. Hannig and J. Teich.
Dynamic Piecewise Linear/Regular Algorithms.
In Proceedings of the Fourth International Conference on Parallel Computing in Electrical Engineering (PARELEC 2004), pp. 79-84, Dresden, Germany, September 7-10, 2004. ©1
22 A. Kupriyanov, F. Hannig and J. Teich.
High-Speed Event-Driven RTL Compiled Simulation.
In Proceedings of the International Workshop on Systems, Architectures, Modeling and Simulation (SAMOS'04), Samos, Greece, July 19-21, published as Springer Lecture Notes in Computer Science (LNCS), volume 3133, pages 519-529, 2004. ©1
21 F. Hannig and J. Teich.
Resource Constrained and Speculative Scheduling of Dynamic Piecewise Regular Algorithms.
Technical Report 01-2004, University of Erlangen-Nuremberg, Department of CS 12, Hardware-Software-Co-Design, Am Weichselgarten 3, 91058 Erlangen, Germany, June 2004. ©1
20 F. Hannig, H. Dutta and J. Teich.
Regular Mapping for Coarse-grained Reconfigurable Architectures.
In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Vol. V, pp. 57-60, Montréal, Quebec, Canada, May 17-21, 2004. ©1
19 F. Hannig, H. Dutta and J. Teich.
Mapping of Regular Nested Loop Programs to Coarse-grained Reconfigurable Arrays -- Constraints and Methodology.
In Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), Santa Fe, NM, USA, April 26-30, 2004. ©1
18 F. Hannig and J. Teich.
Energy Estimation and Optimization for Piecewise Regular Processor Arrays.
In Shuvra S. Bhattacharyya, Ed F. Deprettere and Jürgen Teich (eds.). Chapter 6 in Domain-Specific Processors: Systems, Architectures, Modeling, and Simulation, pages 107-126. Number 20 in Signal Processing and Communication Series, Marcel Dekker, New York, U.S.A., 2004. ©1
2002
17 F. Hannig and J. Teich.
Energy Estimation of Nested Loop Programs.
Proceedings 14th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA 2002), Winnipeg, Manitoba, Canada, August 10-13, 2002. ©1
16 F. Hannig and J. Teich.
Energy Estimation for Piecewise Regular Processor Arrays.
In Proceedings of the Second International Samos Workshop on Systems, Architectures, Modeling, and Simulation (SAMOS 2002). Island of Samos, Greece, July 22-25, 2002. ©1
15 M. Bednara and J. Teich.
Interface Synthesis for FPGA Based VLSI Processor Arrays.
In Proc. of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA02), Las Vegas, Nevada, U.S.A., June 24-27, 2002. ©1
14 J. Teich and L. Thiele.
Exact Partitioning of Affine Dependence Algorithms.
In Embedded Processor Design Challenges, E. Deprettere, J. Teich, and S. Vassiliadis, editors, Lecture Notes in Computer Science (LNCS), Vol. 2268, pp. 135-151, Springer, Berlin, Germany, March 2002. ©1
13 M. Bednara, F. Hannig and J. Teich.
Generation of Distributed Loop Control.
In Embedded Processor Design Challenges, E. F. Deprettere, J. Teich, and S. Vassiliadis, editors, Lecture Notes in Computer Science (LNCS), Vol. 2268, pp. 154-170, Springer, Berlin, Germany, March 2002. ©1
2001
12 M. Bednara, F. Hannig and J. Teich.
Boundary Control: A new Distributed Control Architecture for Space-Time Transformed (VLSI) Processor Arrays.
Proc. 35th IEEE Asilomar Conf. on Signals, Systems and Computers, Pacific Grove, California, USA, November 2001. ©1
11 F. Hannig and J. Teich.
Design Space Exploration for Massively Parallel Processor Arrays.
In Proc. of the Sixth International Conference on Parallel Computing Technologies (PaCT-2001), Novosibirsk, Russia, September 3-7, 2001. ©1
10 J. Teich.
Exact Partitioning of Affine Dependence Algorithms.
Proc. SAMOS - Systems, Architectures, Modeling and Simulation Workshop, Island of Samos, Greece, July 13-16, 2001. ©1
9 M. Bednara and J. Teich.
Synthesis of FPGA Implementations from Loop Algorithms.
In Proc. of the First International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA'01), pp. 1-7, Las Vegas, Nevada, U.S.A., June 25-28, 2001. ©1
8 M. Bednara, O. Beyer, J. Teich and R. Wanka.
Hardware Supported Sorting: Design and Tradeoff Analysis.
In System Design Automation, R. Merker and W. Schwarz, editors, Kluwer Academic Publishers, pp. 97-107, 2001. ©1
2000
7 F. Cieslok, H. Esau and J. Teich.
EXPLORA - Generic Design Space Exploration During Embedded System Synthesis.
Proc. DIPES 2000, Int. IFIP Workshop on Distributed and Parallel Embedded Systems. Schloss Eringerfeld, Germany, October 2000. In Architecture and Design of Distributed Embedded Systems, B. Kleinjohann, editor, Kluwer Academic Publishers, pp. 215-225, June 2001. ©1
6 M. Bednara, O. Beyer, J. Teich and R. Wanka.
Tradeoff Analysis and Architecture Design of a Hybrid Hardware/Software Sorter.
Proc. ASAP'00, the Int. Conf. on Application Specific Systems, Architectures, and Processors, pp. 299-308, Boston, MA, U.S.A. IEEE Computer Society Press, July 2000. ©1
5 S. Bhattacharyya, J. Teich and E. Zitzler.
Optimizing the Efficiency of Parameterized Local Search within Global Search:.
Proc. of CEC'2000, the Int. Conf. on Evolutionary Computation, La Jolla, CA, U.S.A., pp. 365-372, July 2000. ©1
4 J. Teich.
Embedded System Synthesis and Optimization.
Invited paper, Proc. Workshop on System Design Automation - SDA 2000, pp. 9-22, Rathen, Germany. VDE-Verlag, March 2000. ©1
3 C. Böke, C. Ditze, W. Hardt, B. Kleinjohann, F. Rammig, A. Rettberg, J. Stroop and J. Teich.
IP-based System Design with the PARADISE Design Environment.
Accepted, J. Euromicro, March 2000. ©1
2 M. Bednara, W. Hardt, A. Rettberg and J. Teich.
Automated Design Space Exploration on System Level for Embedded Systems.
Proc. Ninth Annual International HDL Conference and Exhibition (HDL Conf. 2000), San Jose, CA, U.S.A., March 2000. ©1
1 M. Bednara, O. Beyer, J. Teich and R. Wanka.
Hardware-Supported Sorting: Design and Tradeoff Analysis.
Workshop on System Design Automation - SDA 2000, pp. 37-44, Rathen, Germany. VDE-Verlag, March 2000. ©1

Student Research Projects (Studienarbeiten) and Diploma Theses

2009
23 J. Zhai.
Compiler-based Application Engine Synthesis from Dataflow Description.
Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, July 2009. ©1
2008
22 S. Malipatlolla.
Automatic Interface Generation for Integration of Hardware Accelerators.
Hardware/Software Co-Design, Dept of Computer Science-12, University of Erlangen-Nuremberg. ©1
21 J. Zhai.
Konzeption und Implementierung einer Eingabe/Ausgabe-Schnittstelle für dedizierte parallele Prozessorfelder.
Student research project (Studienarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, June 2008. ©1
2007
20 A. Stravet.
Konzeption und Implementierung einer leistungsoptimierten Rekonfigurationssteuerung für schwachprogrammierbare Prozessorfelder.
Diploma thesis (Diplomarbeit), Department of Computer Science, University of Erlangen-Nuremberg, Germany, December 2007. ©1
19 A. Stravet.
Digitale Signalverarbeitung auf parallelen Prozessorfeldern.
Student research project (Studienarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, October 2007. ©1
18 S. Nehls.
Zwischencodeoptimierungen auf Schleifenprogrammen in Single-Assignment-Code-Sprachen.
Dept. of Computer Science 12, Hardware-Software-Co-Design, University of Erlangen-Nuremberg. ©1
2006
17 H. Ruckdeschel.
Prozessorfeld-Synthese für partitionierte verschachtelte Schleifenprogramme.
Diploma thesis (Diplomarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, October 2006. ©1
16 A. Gnezdilov.
Evaluierung eines Werkzeugs zur System-Synthese am Beispiel eines Audiodekoders.
Dept. of Computer Science 12, Hardware-Software-Co-Design, University of Erlangen-Nuremberg. ©1
15 N. Tchoukio.
Conception and Implementation of a Reference Imaging Pipeline on an FPGA.
Chair of Computer Science 12, Hardware/Software Co-Design. ©1
14 T. Tchoukio.
Conception and Implementation of a Reference Medical Imaging Pipeline on an FPGA.
Student research project (Studienarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, June 2006. ©1
2005
13 H. Ruckdeschel.
Entwicklung eines VHDL-Codegenerators für hierarchisch partitionierte FIR-Filter.
Student research project (Studienarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, November 2005. ©1
12 S. Kerschbaum.
Ein Simulator für stückweise lineare Algorithmen mit laufzeitabhängigen Konditionalen.
Student research project (Studienarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, August 2005. ©1
11 M. Mitzlaff.
Instruktionssatzentwurf und Codegenerierung für schwachprogrammierbare Prozessoren.
Student research project (Studienarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, August 2005. ©1
10 H. Ruckdeschel.
Entwicklung eines VHDL-Codegenerators für hierarchisch partitionierte FIR-Filter.
Dept. of Computer Science 12, Hardware-Software-Co-Design, University of Erlangen-Nuremberg. ©1
2004
9 H. Dutta.
Mapping of Hierarchically Partitioned Regular Algorithms onto Processor Arrays.
Diploma thesis (Diplomarbeit), Chair of Hardware/Software Co-Design, University of Erlangen-Nuremberg, October 2004. ©1
2002
8 O. Beyer.
Implementierung eines Verfahrens zur Parallelisierung geschachtelter C-Schleifenprogramme.
Diploma thesis (Diplomarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, March 2002. ©1
7 M. Daldrup.
Entwurf eines FPGA-basierten Koprozessors zur Kryptographie mit Elliptischen Kurven.
Diploma thesis (Diplomarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, January 2002. ©1
2001
6 D. Konstantinidis.
Automatische Generierung von Platzierungsdaten für massiv parallele Prozessorfelder.
Diploma thesis (Diplomarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, June 2001. ©1
5 C. Grabbe.
Synthese regelmäßiger Strukturen für FPGAs mittels der JBits-API.
Diploma thesis (Diplomarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, June 2001. ©1
2000
4 F. Hannig.
Exploration von Raum- und Zeittransformationen für Algorithmen mit uniformen Datenabhängigkeiten.
Diploma thesis (Diplomarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, October 2000. ©1
3 D. Konstantinidis.
Implementierung eines VHDL-Testbenchgenerators als Erweiterung des Entwurfssystems PARO.
Student research project (Studienarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, July 2000. ©1
2 O. Beyer.
Sortieren in Hard- und Software - Analyse und Synthese.
Student research project (Studienarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, July 2000. ©1
1 H. Esau.
Design and implementation of a program for generic solution space exploration.
Diploma thesis (Diplomarbeit), Datentechnik, Department of Electrical Engineering and Information Technology, University of Paderborn, May 2000. ©1

  Last updated: 27 March 2015.   F.H.