A Quick Introduction to the ReCoBus Technology
ReCoBus Technology in a Nutshell
- Highly optimized communication architecture for FPGAs
- Supports buses (shared memory) and module-to-module communication
- High bus throughput (more than one gigabyte per second possible)
- Finest demonstrated placement grid (for reducing internal fragmentation)
- Suitable for two-dimensional placement, which improves block RAM and multiplier utilization
- Low resource overhead
- Simple system generation with the tool ReCoBus-Builder
Flexible Module Placement
Up to now, almost all reconfigurable systems have followed an island-style placement, in which only
one reconfigurable module can be integrated per island at any point in time.
In contrast, the ReCoBus technology allows multiple modules to share one reconfigurable resource area
at the same time, and these modules may be placed in a 1D (slot-style) or even 2D (grid-style) fashion.
As the figure shows, more sophisticated reconfiguration styles substantially reduce the
waste of FPGA resources, because bounding boxes can be fitted more tightly to a module's actual resource requirements.
An outstanding property of the ReCoBus technology is its capability to provide very narrow resource slots
(as narrow as one CLB column on Xilinx FPGAs), which increases flexibility and reduces the effects of internal fragmentation.
Even with this level of placement flexibility, the latency and the logic cost of the communication
between the static part and the reconfigurable modules can compete with static-only systems.
Note that ReCoBus-based systems can deal efficiently with dedicated resources, such as block RAMs or multipliers.
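To make the effect of slot width on internal fragmentation concrete, the following sketch counts the CLB columns left unused when a module is rounded up to whole resource slots. It covers the 1D (slot-style) case only. The function names and the 11-column example module are illustrative assumptions, not figures from the ReCoBus documentation.

```python
def slots_needed(module_columns: int, slot_width: int) -> int:
    """Number of resource slots a module occupies (rounded up to whole slots)."""
    return -(-module_columns // slot_width)  # integer ceiling division


def wasted_columns(module_columns: int, slot_width: int) -> int:
    """CLB columns inside the module's bounding box that the module does not use."""
    return slots_needed(module_columns, slot_width) * slot_width - module_columns


# Hypothetical module that needs 11 CLB columns, placed on three slot widths.
for width in (8, 4, 1):
    print(width, slots_needed(11, width), wasted_columns(11, width))
```

Under these assumptions, eight-column slots leave five columns unused, four-column slots leave one, and one-column slots leave none.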
Buses for Runtime Reconfigurable Systems on FPGAs
Buses and point-to-point links are sufficient for providing virtually any kind of communication in an FPGA-based system.
All bus signals can be classified into four groups, according to whether a signal
is dedicated or shared among multiple modules and whether it is used for read or write operations:
- Dedicated Write: chip select signals or bus grants from an arbiter
- Shared Write: write data, the address, or shared control signals (like write enable)
- Dedicated Read: interrupt requests or bus requests (from a master to an arbiter)
- Shared Read: read data or the address (from a master within the reconfigurable area to access a slave)
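The four groups above can be written down as a small lookup table. This is only a sketch: the signal names are hypothetical labels for the examples given in the list, and the direction comments reflect how those examples read (write signals travel towards a module, read signals come from it).

```python
from enum import Enum


class Sharing(Enum):
    DEDICATED = "dedicated"  # one connection per module
    SHARED = "shared"        # one signal shared among multiple modules


class Direction(Enum):
    WRITE = "write"  # towards the module, judging by the examples in the text
    READ = "read"    # from the module, judging by the examples in the text


# Hypothetical signal names, one entry per example from the list above.
SIGNAL_GROUPS = {
    "chip_select": (Sharing.DEDICATED, Direction.WRITE),
    "bus_grant": (Sharing.DEDICATED, Direction.WRITE),
    "write_data": (Sharing.SHARED, Direction.WRITE),
    "address": (Sharing.SHARED, Direction.WRITE),
    "write_enable": (Sharing.SHARED, Direction.WRITE),
    "interrupt_request": (Sharing.DEDICATED, Direction.READ),
    "bus_request": (Sharing.DEDICATED, Direction.READ),
    "read_data": (Sharing.SHARED, Direction.READ),
    "master_address": (Sharing.SHARED, Direction.READ),
}


def signals_in_group(sharing: Sharing, direction: Direction) -> list:
    """Return the names of all example signals that fall into one group."""
    return sorted(
        name for name, group in SIGNAL_GROUPS.items()
        if group == (sharing, direction)
    )


print(signals_in_group(Sharing.DEDICATED, Direction.READ))
```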
If we now want to integrate reconfigurable modules at runtime, we have to
assign a certain region of the FPGA for the dynamic part of the system.
This dynamic region has to be tiled into resource slots, and a module may occupy multiple consecutive
slots, as shown in the following figure.
We have to provide a predefined communication architecture as we cannot perform
any routing at runtime.
Comparable to a backplane bus on a PCB, the connections to a specific signal must
be accessible within all resource slots at exactly the same relative position.
Furthermore, the routing of the communication architecture has to be arranged
in a uniform manner as well.
This helps prevent conflicts between the communication architecture and the routing of the partial modules.
The following figure illustrates how a typical bus multiplexer (let's say for read data line D0)
can be implemented in such a regular manner.
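The regular multiplexer structure described above can be modelled in a few lines. The sketch assumes that every resource slot contributes one AND-OR stage (one LUT) to the chain of a single read data line. That stage function is an assumption made for illustration, since the text only states that the multiplexer is built in a regular, chained manner.

```python
def read_chain(slot_data, slot_select):
    """Model one read data line (e.g. D0) as a chain with one stage per slot.

    slot_data[i]   -- bit offered by the module in slot i (0 or 1)
    slot_select[i] -- 1 if slot i is currently selected, else 0
    Returns the resulting bit and the number of chained stages (logic depth).
    """
    value = 0
    lut_depth = 0
    for data, select in zip(slot_data, slot_select):
        value |= data & select  # assumed AND-OR stage: pass on or inject data
        lut_depth += 1
    return value, lut_depth


# Eight slots, slot 5 selected and driving a 1.
print(read_chain([0, 0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0]))
```

In this model the logic depth equals the number of resource slots, so the combinatorial path grows with every additional slot.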
As we want to fit the modules into tight bounding boxes, the resource slots should be narrow
to reduce internal fragmentation.
As a consequence, the resource slot count will rise.
But implementing one long chain across all of these slots would lead to an enormous
logic overhead and, even worse, to a long and therefore slow combinatorial path.
To overcome this issue, a ReCoBus can interleave multiple independent chains, as shown next.
In this simple ReCoBus example, two modules located within the reconfigurable area are connected
to a bus containing four independent interleaved multiplexer chains
(highlighted in red, blue, green, and brown).
Despite the fact that the system provides eight resource slots, the logic depth within the reconfigurable
sub-system is just two LUTs deep.
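The depth of two LUTs follows from simple arithmetic, sketched below. The sketch assumes that slots are assigned to chains in round-robin order (slot s uses chain s mod N). This matches the description of interleaved chains, but the exact assignment is an assumption, and the function names are illustrative.

```python
def chain_of_slot(slot: int, n_chains: int) -> int:
    """Chain that has its logic stage in the given slot (assumed round-robin)."""
    return slot % n_chains


def chain_depth(n_slots: int, n_chains: int) -> int:
    """Number of logic stages on the longest chain (integer ceiling division)."""
    return -(-n_slots // n_chains)


# Eight resource slots and four interleaved chains, as in the example.
layout = [chain_of_slot(s, 4) for s in range(8)]
print(layout)
print(chain_depth(8, 4))
```

With a single chain, the same eight slots would give a depth of eight stages in this model.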
For a better understanding, let us assume that each chain represents a set of eight signal lines.
As these chains are independent, the four chains together form a 32-bit-wide bus.
In order to access all 32 signal lines, a module must be at least N resource slots wide when
the ReCoBus is built upon N interleaved chains.
The figure illustrates this for Module 1, which is four resource slots wide in a system containing a ReCoBus
with an interleaving factor of N=4.
In other words, the interface grows with the size (=complexity) of the module.
In order to allow a free module placement, we have to adjust the four chain ends according to
the present position of the module that is currently selected.
This is achieved with the help of an additional alignment multiplexer.
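One way to picture the alignment step is as a rotation of the chain ends by the module's start position. The sketch below is a behavioural model under that assumption, not the actual macro logic. The function names are hypothetical, and each chain is treated as carrying one value (for example, one byte).

```python
def drive_chains(module_bytes, start_slot, n_chains):
    """Place the module's k-th byte on the chain that passes slot start_slot + k.

    Assumes round-robin interleaving and a module at most n_chains slots wide.
    """
    chains = [0] * n_chains
    for k, byte in enumerate(module_bytes):
        chains[(start_slot + k) % n_chains] = byte
    return chains


def align(chains, start_slot):
    """Rotate the chain ends so that the module's first slot maps to byte 0."""
    offset = start_slot % len(chains)
    return chains[offset:] + chains[:offset]


data = [0x11, 0x22, 0x33, 0x44]
chains = drive_chains(data, start_slot=2, n_chains=4)
print(chains)            # chain order seen by the static part before alignment
print(align(chains, 2))  # byte order restored after alignment
```

In this model the static part sees the same byte order regardless of where the module is placed.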
The scaling of the interface with the number of occupied slots is an ideal match for many practical systems.
For example, a simple UART is most often connected to the system bus by an 8-bit-wide interface,
while a more complex Ethernet core typically demands a 32-bit-wide interface.
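The relation between module width and interface width can be sketched as follows, using the example values from the text (four interleaved chains with eight signal lines each). Mapping the UART to one slot and the Ethernet core to four slots is a hypothetical illustration.

```python
def interface_width_bits(module_slots: int, n_chains: int = 4,
                         bits_per_chain: int = 8) -> int:
    """Bus width a module can access, limited by the number of chains."""
    return min(module_slots, n_chains) * bits_per_chain


print(interface_width_bits(1))  # e.g. a one-slot UART
print(interface_width_bits(4))  # e.g. a four-slot Ethernet core
```

A module wider than N slots gains no additional bus width in this model, because all N chains are already reachable.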
Interleaving is only one aspect of the highly optimized ReCoBus architecture.
Further techniques deal with efficient methodologies for providing dedicated signals from or to a module.
The tool ReCoBus-Builder can generate such complex hard macros in a highly parametrizable fashion.
These macros contain most of the bus logic as well as completely homogeneous routing from resource slot to resource slot.
The tool is designed to hide most low-level FPGA details from you.
Interested in more? Then proceed to our download
I/O Bars: The Flexible Way for Providing Dedicated Links in Reconfigurable Systems
Besides buses, the ReCoBus-Builder tool can build dedicated communication macros for
connecting I/O pins as well as for providing dedicated module-to-module channels.
Consequently, virtually all systems may be designed in a modular fashion suitable
for partial runtime reconfiguration.
The following figure gives an example of a system containing a ReCoBus and two separate I/O bars -
one I/O bar for video data and another one for audio data.
The camera and the microphone are connected to the right-hand side of the system, and the monitor and the speaker
to the left-hand side.
In all resource slots, the I/O bar is either implemented by a bypass primitive or by a connection primitive.
The latter allows the connected module to directly access the I/O bar for read and/or write operations.
Read and write access is available at the same time.
Consequently, a module may read an input stream, modify it, and then send it further along the same I/O bar
to another module that performs further modifications on the stream.
This is ideal for distributing the data streams of multimedia or network processing systems.
Note that partial reconfiguration can be used to selectively bypass or connect a specific resource slot or module at runtime.
This ensures that the I/O bar stream is not interrupted during the reconfiguration process of a module.
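The behaviour of an I/O bar can be sketched as a pipeline in which each resource slot either passes the stream through unchanged (bypass primitive) or lets a module read and rewrite it (connection primitive). This is a purely behavioural model. The slot list, the sample values, and the two example stages are hypothetical.

```python
from typing import Callable, Optional, Sequence

# None models a bypass primitive; a function models a connected module.
Stage = Optional[Callable[[int], int]]


def run_io_bar(slots: Sequence[Stage], sample: int) -> int:
    """Send one sample along the I/O bar, slot by slot."""
    for stage in slots:
        if stage is not None:
            sample = stage(sample)  # connection: module reads, modifies, writes
        # bypass: the sample passes through unchanged
    return sample


# Two hypothetical modules in slots 1 and 3; slots 0 and 2 are bypassed.
slots = [None, lambda x: x + 1, None, lambda x: x * 2]
print(run_io_bar(slots, 3))

# Switching slot 1 to bypass leaves the rest of the stream path intact.
slots[1] = None
print(run_io_bar(slots, 3))
```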
The bypass primitive is implemented directly within the switching resources of the FPGA,
so no extra LUTs are needed in bypassed slots.
Only the connection primitive incurs a small overhead.
However, the ReCoBus-Builder allows the highest possible logic utilization by independently using the LUTs
and the flip-flops for the connections.
With these optimizations, the ReCoBus-Builder is able to provide an input access port to an I/O bar together with an output port,
with up to eight signal bits on each port at the same time, within a single CLB of a Xilinx Virtex FPGA.
See also our FAQs for further information on the ReCoBus technology