Fifos and Ring Buffers

First draft, 9-December-2001, Chuck Benz, updated 15-jan-2002

The fifo is an often used logic/memory structure with broad utility. It is often used to pass wide data between asynchronous clock domains, and also as a storage element within a single clock domain.

A similar, related structure is a ring buffer used for the specific purposes of delaying data by a fixed number of clock cycles, or for passing data between two clock domains that are frequency locked but not phase locked. As clock frequencies increase, source clocking, also called clock-forwarding is often used between chips, and that is often a case in which the frequency is locked, but the phase may vary.

A basic fifo design consists of a memory array (usually a RAM with two independent ports – one write and one read port, or an array of flops or latches) and read and write indexes (pointers) to array elements. (A rare alternative would be two or more single ported RAMs – hyperlink to reference ATM PHY design example). As a first-in-first-out function, the fifo is a memory element with no visible addressing – the indexes are local.

Usually there is empty and/or full indication that reflects the indexes indirectly.

Most fifos are designed to operate synchronously, either on a single clock, or two asynchronous clocks for the read and write ports. The write functionality is generally that a write enable is asserted in the same cycle as the write data. The read operation may be either FWFT (first word fall through) in which data written to an empty fifo appears on the read port before or concurrent with any non-empty indication, or a read enable may need to be asserted to read the first data from what was an empty fifo.

Verilog code of a typical fifo is shown below. This single clock design uses a memory that allows asynchronous reads – any change on the read address propagates immediately to the read data. The design allows up to 16 entries in the fifo, and operates as a FWFT fifo.

module basicfifo (clk, reset_l, wren, wrdata, rden,

rddata, empty, full) ;

input clk ; // clock

input reset_l ; // asynchronous reset, low to reset

input wren ; // assert high to write data

input rden ; // assert low to read data

input [7:0] wrdata ; // data to write port - use 1 byte width for this example

output [7:0] rddata ; // data on read port

output empty ;

output full ;

reg full, empty ; // full/empty indications are flopped.

// locally used flops

reg [3:0] wrptr, rdptr ; // write and read pointers

wire [3:0] rd_addr ; // read address to memory

wire ram_wren ; // write enable to memory

// basic 2 port memory – for this small size, assume read is asynch

vendorXY_2portram_16words_8bits URAM

(.wrclk(clk), .wraddr(wrptr), .wren(ram_wren), .wrdata(wrdata),

.rdaddr(rdptr), .rddata(rddata)) ;

assign ram_wren = wren & ! full ;

assign new_wrptr = ram_wren ? (wrptr + 1) : wrptr ;

assign new_rdptr = (rden & ! empty) ? (rdptr + 1) : rdptr ;

always @ (posedge clk or negedge reset_l)

if ( ! reset_l) begin // async reset

wrptr <= 0 ;

rdptr <= 0 ;

full <= 0 ;

empty <= 1 ;

end

else begin

wrptr <= new_wrptr ;

rdptr <= new_rdptr ;

if (ram_wren & (new_wrptr == new_rdptr))

full <= 1 ;

else if (rden)

full <= 0 ;

if (rden & (new_wrptr == new_rdptr))

empty <= 1 ;

else if (wren)

empty <= 0 ;

end

endmodule

The boundary conditions should be explained. If wren is asserted when the fifo is full, the attempt to write will be ignored and the pointers kept as they were. If rden is asserted when the fifo is empty, the read pointer will be kept as it was. This design allows that the read and write pointers will be equal when the fifo is full or empty – the full/empty flags actually are state variables to differentiate the fifo state in that condition. Often designers will sacrifice one position in the fifo, so that condition only occurs when the fifo is empty and then full/empty flags need not be used as state variables.

There are several ways that the design could vary:

Separate (asynchronous) write and read clocks: the full/empty flags become more complicated. Instead of simple binary counters, grey code counters are used so that the pointers can be passed safely through synchronizers to propagate to full and empty status. Link to asynchronous fifo examples… Synchronizers add latency to full/empty status and similarly can result in smaller effective capacity. The use of synchronizers requires that metastability is possible and the probability of failure should be calculated.
Not FWFT – require read enable be asserted for the first read. Reset values would be 1 for the write pointer and 0 for the read pointer. Data could still be available in the same cycle as the read enable is asserted as long as the memory used has asynchronous reads. If no read enable, old data continues to be read from RAM, so cannot be overwritten – this constrains writes to not use location, so just 15 entries if memory is 16 words. Link to example.
Memory with synchronous read – this constrains how the read port can operate to be either FWFT, or there must be a 1 cycle delay for access after read enable asserts. Alternately, adding flops and logic after memory would allow immediate access. Link to examples. Note that empty flag logic must ensure that any propagation through memory and also any flops+logic.
Some write data may be written tentatively then committed later – an example would be a network packet: if the checksum at the end of the packet is incorrect, then it’s desirable to backup. This can be done by saving the write pointer at the start of the packet, and restoring it if the backup is needed. The read side’s empty flag should use the saved pointer. That use of a saved pointer will prevent reads before the packet is complete, which is sometimes desired even when the backup function is not needed.

Ring buffers differ from fifos. Both terms are often exchanged and confused. As defined for this discussion, a ring buffer is like a fifo, except that it is written and read in every clock cycle (a fifo is a memory without addresses; a ring buffer is a fifo without enables). The read and write pointers of a ring buffer are offset (at reset) and maintain that offset (more or less, depending on variations between the clocks – that variation being one of the main reasons to use a ring buffer).

Some designers fail to recognize that synchronizers can not be used with clocks that are frequency locked but not phase locked. They will work most of the time, but the failure probability calculation for synchronizers presumes independent clocks (random edge location, in a sense), and instead these clocks are closely related. If the phase difference approaches the synchronizer resolution window (within the setup/hold window of the flop), then the clocks will continue to “dance around” that window, dangerously. A ring buffer provides an alternative that has no failure mechanism as long as the phase difference can be somehow bounded (allowed to be many times the clock period). A less perfect alternative is to introduce a third truly asynchronous clock domain between the frequency locked clocks.

The expected phase drift that is possible with the two clocks determines the depth of a ring buffer. To detect whether the phase ever changes by more than the depth of the fifo, a “phase” bit can be added to the fifo data. One each pass through the fifo, the phase value flips. The read side should not check the phase until it has completed one pass, as the initial contents of the ring buffer would be indeterminate. The phase bit value is like an extra bit of address. If the phase is checked for read address+1 and read address-1 then any phase failure will be detected before it is close enough to cause corrupt data. Many circumstances may be adequately served by just checking the phase at the current (read) address.

There are some particular circumstances in which two clocks may be frequency locked, or may just be very close. SONET is one application in which clocks may be all locked to a single source, or may be independent but with no more than 20 (or 40?) ppm difference. A hybrid between a fifo and a ring buffer may be applied to this case. Ordinarily, it would operate as a ring buffer, but the phase detection described above could be used to introduce slips – most readily during SONET overhead intervals. The detection must use synchronizers (when the drift occurs, it will “slide” past the danger point of metastability). A hybrid of this sort would need slip mechanisms on both read and write sides of the buffer/fifo – deleting/skipping entries on the write side usually requires something like SONET overhead (with an indication of the skip, perhaps in an extra bit) or inter-packet idles. Otherwise the read side may require a method to handle two entries in one cycle. Another way to detect the phase drift would be to grey code encode the pointers and use compare logic (passing the final result bit through synchronizers; and the grey code values must come from flops to be glitchless).

Another useful hybrid construct is where a fifo is needed at the boundary that calls for a ring buffer – the fifo can be constructed so that instead of having grey code pointers that are passed through synchronizers (unsafe), the write and read pointers are passed to the opposite side using a pair of side ring buffers. Or instead, the side ring buffers might just pass write and read indications.