Quick Start with Xilinx EDK and XPS for verilog users

Xilinx and IBM have made it VERY easy to put a processor subsystem in your FPGA design.  Their interest is in selling silicon, so the development tools are quite inexpensive and include a lot of pieces that you can quickly assemble. Even better, I'll be trying to come up with a way that you can work with it without buying the tools, by preparing a simple processor subsystem and showing how to apply GNU tools to target your code to it. Update: The EDK is built on GNU tools, which might be available separately under just GPL, but I haven't yet found a non-EDK distribution.

The EDK is biased towards VHDL, but it's very manageable to integrate an EDK processor subsystem into a verilog design without needing a mixed mode simulator.

I got started using EDK without taking a class, and with these notes, I hope you can be up and running even more quickly than I did! I will show you how to make a simple processor design that runs C code to blink LEDs, then I will show how to add a block of registers and logic.

In a little more detail, the steps we'll go through will define a simple processor subsystem with a simple C application, build verilog code for the subsystem, put a simple design around it, simulate it, synthesize and implement it, then change the C code and rebuild it.

It's good to work with a development board. Even a $79 XESS SpartanII-100 board can be used for a MicroBlaze (with tiny memory). Enterpoint and Digilent are also good sources of inexpensive Xilinx development platforms, though I haven't tried any from Digilent yet.

To begin, install Xilinx ISE and EDK. I'll assume that you're running on a PC under Windows.  If you haven't purchased EDK, but have ISE, check back here in May (2005) - I hope to have a subsystem you can use and instructions on how to install the same Cygwin/GNU tools that Xilinx includes in the EDK (the EDK has much more than those tools to make its very reasonable price worthwhile, though). Update: I've realized that to get a copy of MicroBlaze, you have to accept Xilinx's license terms, so I can only provide a subsystem to users that have accepted a Xilinx license to it, for example, the license you have to go through to get the files for Xilinx App note 575, for the "UltraController" which is based on the PowerPC.  An example of that license is (currently) at http://www.xilinx.com/bvdocs/appnotes/Design_License.pdf.  If you can show me that you've registered for the MicroBlaze or UltraController lounge, then I can share any of my subsystems with you. Email me at chuckbenz.com, username chuck.

1. Defining the processor subsystem

Xilinx Platform Studio (XPS) is an integrated tool for specifying and building both hardware and software embedded development. You will complete the embedded processor part of your design under XPS, then use the Xilinx ISE tools to implement the design in the FPGA. Start XPS (Start -> Programs -> Xilinx Platform Studio -> Xilinx Platform Studio).

1.1 Base System Builder to make the starting definition

Create a new project using "Base System Builder", choosing an appropriate work directory. The default is likely the desktop, which wouldn't be a good idea; instead, use something like c:\edk_example\mb_subsys.

You will be presented with a choice of development boards - you can choose any of them (even if you're going to target a SpartanII device, you can choose a VirtexII Pro board). Your choice of board is used just to select which peripherals you'll be shown initially, as well as default pin assignments.

Next, choose MicroBlaze as our processor and proceed to the next page. The next choices will be processor specifics; take the defaults (do make sure that the local data and instruction memory is set to the minimum, 8KB).

Now you will see a choice of IO interfaces. Keep just the RS232 UART, deselect all of the others (you may see several pages). Don't add any extra devices, and take the defaults when the Software Configuration menu is presented. Accept the next few pages presented, and a basic system will be defined.

We will change the default settings for the project - under the Options pulldown menu, select Project Options. On the device tab, choose the target device. On the Hierarchy tab, select that this will be a submodule and name the instance "system". On the HDL tab, select Structural for the simulation models (at the bottom) then select Verilog.

The hardware definition of the processor subsystem is now in a file system.mhs - you might want to make a copy of this under another name so you can see what changes the next step makes.

1.2 Adding the LEDs

We'll add a GPIO peripheral for the LEDs (I'm assuming every development board has some LEDs we can flash). From the Project pulldown menu, select Add/Edit Cores.
  1. We start with the Peripherals tab, choosing opb_gpio from the list of peripherals on the right, and clicking on "Add". The added block will appear at the end of the list at left. Change the base address, perhaps to 0x90000000 (it is a 32 bit value). Change the high address to 0x01ff higher.  If your target is actually the Spartan2-100, also reduce the bram address ranges for the dlmb and ilmb to 0x00000fff (4KB instead of 8KB).
  2. On the Bus Connection tab, connect the opb_gpio instance to the mb_opb bus by clicking in the box at the intersection of the mb_opb column and the opb_gpio row. A small s will appear, similar to other peripherals, indicating that it is a slave on the bus.
  3. On the Ports tab, find the OPB_clk signal for the GPIO instance in the list on the right side, and click on "Add". Do the same for GPIO_d_out. Change the signal name for that OPB_clk in the table on the left (you will have to scroll down) to sys_clk_s (add the _s), and change the Scope to Internal.  Click on "Add Port" to add a port named "leds_out" as an output, changing "connected to" to "leds_out". After adding the port, add [0:7] to the range column for the port in the table on the left.  Change the signal name for GPIO_d_out to leds_out and add [0:7] to the range column for that connection.
  4. On the Parameters tab, select the C_GPIO_WIDTH parameter for opb_gpio, click on "<< Add", then change the value in the left table to 8.
  5. Select OK to finish.
You can compare the system.mhs file now to the version you saved.
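
The address arithmetic in step 1 can be sketched in C. This is only an illustration: the helper name range_high is hypothetical (it is not part of the EDK), and it assumes that each range is a power-of-two block aligned to its own size, which is what the values above suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: highest address of a block that starts at `base`
   and spans `size` bytes. Assumes `size` is a nonzero power of two and
   `base` is aligned to it. */
static uint32_t range_high(uint32_t base, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);  /* power of two */
    assert((base & (size - 1)) == 0);               /* aligned base */
    return base + (size - 1);
}
```

With the example base address, range_high(0x90000000, 0x200) gives 0x900001ff for the GPIO. For a 4KB local memory, assuming it starts at address 0, range_high(0, 0x1000) gives the 0x00000fff value mentioned for the Spartan2-100.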

1.3 Some simple C code

XPS created a simple C application for us, but it simply prints strings to the UART. We'll replace that with some toggling of the LEDs.  The basic application is found in TestApp\src\TestApp.c.  We'll replace the main module with (you can cut and paste it from here):
volatile int * leds = (int *) 0x90000000 ;
int main (void) {
  int i = 0 ;
  int ledval = 0 ;
  while (1) {
    ledval = ~ledval ;
    leds[0] = ledval ;
    i = 0 ;
    while (i < 20)
      i++ ;
  }
  return 0 ;
}
If you didn't use 0x90000000 for the LEDs base address, substitute your value instead.
(Yes, this will toggle the led quite rapidly, but first we're going to simulate, so the high speed means quick simulation. Later we'll slow it down).
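
If you expect to change the LED address or the delay often, one iteration of the loop can be pulled out into a function that takes both as arguments. The following is a sketch only, not what XPS generates: the names blink_once, gpio and delay_count are made up for illustration, and the volatile loop counter is my own choice to keep an optimizing compiler from deleting the empty delay loop (it will also change the time each count takes).

```c
/* Sketch: one iteration of the blink loop, with the GPIO data register
   address and the delay count passed in rather than hard-coded.
   The names blink_once, gpio and delay_count are illustrative only. */
static int blink_once(volatile int *gpio, int ledval, long delay_count)
{
    volatile long i;            /* volatile keeps the empty loop from
                                   being optimized away (an assumption of
                                   this sketch; it changes loop timing) */

    ledval = ~ledval;           /* invert all LED bits */
    *gpio = ledval;             /* write to the GPIO data register */
    for (i = 0; i < delay_count; i++)
        ;                       /* busy-wait */
    return ledval;              /* caller keeps the new LED state */
}
```

On the target, main would then loop forever on ledval = blink_once((volatile int *) 0x90000000, ledval, 20). Because the register is just a pointer argument, the same function can also be exercised on a PC by passing the address of an ordinary int.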

1.4 Generating the subsystem

If you now select the Tools pulldown menu and click on Sim Model Generation, everything needed for the further steps will be created:

2 A simple design

Usually you would have significant logic to wrap around your processor subsystem. But since we're making this a simple demo that has just one GPIO port to drive LEDs, we'll just simply instantiate our subsystem, like so:

module edk_demo (clk, resetl, leds, rs232_tx, rs232_rx) ;
  input  clk, resetl, rs232_rx ;
  output [7:0] leds ;
  output rs232_tx ;
  wire clk_bufg ;
  BUFG CLKB (.I(clk), .O(clk_bufg)) ;
  system mb_subsys
     (.sys_clk(clk_bufg),
      .sys_rst(resetl),
      .leds_out(leds),
      .RS232_TX(rs232_tx),
      .RS232_RX(rs232_rx)) ;
`ifdef MEM_INIT
 `include "mem_init.v"
`endif
endmodule

The clock buffer is explicitly instantiated because the synthesis tool sees system as a black box, so there's no indication that the clk signal is actually a clock. The conditional include will be used just for simulation.  Note that the reset to the processor is asserted low, even though the sys_rst signal name does not have any level indication.

3 Simulation

The advantage of using MicroBlaze as our example processor is that we can simulate it with the free Icarus Verilog simulator. Designs using the PowerPC processor usually need to use the SmartModel PPC model from Synopsys.

A simple testbench to provide clock and reset to our example would look like this:
`timescale 1ns/1ns
module edk_demo_tb ;
  reg clk, resetl ;
  wire [7:0] leds ;
  initial clk = 1 ;
  always #5 clk = ~clk ;
  initial begin
    $dumpvars(0, edk_demo_tb);
    resetl = 1 ;
    #10 resetl = 0 ;
    #95 resetl = 1 ;  // wait long enough for Xilinx glbl GSR to deassert
    #3000 $finish ;
  end

  edk_demo dut
    (.clk(clk),
     .resetl(resetl),
     .leds(leds),
     .rs232_rx(1'b0),
     .rs232_tx()) ;
  always @ (leds) $display ("LEDS change to %h at %d", leds, $stime);
endmodule

To compile and run this under the Icarus Verilog simulator, the commands would be:
% grep defparam ../mb_subsys/simulation/structural/system_init.v > mem_init.v
% iverilog -D MEM_INIT -y ../rtl -y ../mb_subsys/simulation/structural \
  -y $XILINX/verilog/src/simprims -y $XILINX/verilog/src/unisims \
  edk_demo_tb.v $XILINX/verilog/src/glbl.v ../mb_subsys/hdl/system_module.v \
  ../mb_subsys/hdl/elaborate/*/hdl/verilog/*.v
% vvp a.out
This assumes that the current directory is a simulation area, that ../rtl is the relative path to the design rtl directory where edk_demo.v will be found, and that ../mb_subsys is the relative path to the directory where XPS was run to build the processor subsystem.

Doing so, we see that the LEDs toggle every 710ns. (This is with a 100 MHz processor clock).

Debug note: if nothing happens in simulation, check (1) the reset polarity, and (2) that the memory initialization is working by checking that the bram outputs in the MicroBlaze subsystem have non-zero data.

4 Synthesis and Implementation

The ISE tools suite can perform necessary synthesis and implementation tasks.  An example XST script might be a file edk_demo.xst:
run -ifn edk_demo_xst.v -ifmt VERILOG -ofn edk_demo -p xc2s100tq144-5 -hierarchy_separator /
which would be run as:
% xst -ifn edk_demo.xst
The verilog input file for XST, edk_demo_xst.v, would just include:
`include "../rtl/edk_demo.v"
`include "../mb_subsys/hdl/system.v"

The remaining process would be something like:
% cp ../mb_subsys/implementation/system_stub.bmm .
% ngdbuild -bm system_stub.bmm -sd ../mb_subsys/implementation -uc edk_demo.ucf edk_demo.ngc
% map -pr b edk_demo
% par edk_demo edk_demo
% bitgen edk_demo

But this would be the design without the application C code present in the block rams.  The tool that manages that is data2mem:
% data2mem -bm system_stub_bd.bmm -bd ../mb_subsys/TestApp/executable.elf -bt edk_demo.bit -o b edk_demo_code.bit
with the final bit file being the one with the target code present.

5 Incremental changes

The most common change will be to the target C code. When this is done, it's not necessary to run the full XPS process again. Instead, the compiler can be run in the mb_subsys directory under the Xilinx "Xygwin" shell (if your work area is c:\edk_example, then under Xygwin, cd to /xygdrive/c/edk_example/mb_subsys):
% make -f system.make program
Then the simulation mem_init.v file would be rebuilt with a data2mem command, run in the simulation directory:
% data2mem -bm ../mb_subsys/implementation/system_stub.bmm \
  -bd ../mb_subsys/TestApp/executable.elf -u -o v mem_init.v
And the bit file can be rebuilt using just the last command of the implementation process.

In simulation, we saw that the LEDs toggled every 710ns with the test being "i < 20". Changing the test to "i < 40", the interval increases by 600ns, so each count takes 30ns at 100 MHz.  With Spartan2, 25 MHz is a more realistic processor clock speed, so it would be 120ns per loop count.  A loop test of 4,000,000 will toggle every 480ms, so for a build for our demo board, we should use that value.
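
The arithmetic above can be written out as a few small C helpers that you could run on a PC to try other clock speeds or blink rates. This is a sketch under the same assumptions as the paragraph above: the function names are made up for illustration, the time per count is taken to scale inversely with clock frequency, and the small fixed overhead per toggle (710ns minus 20 counts of 30ns, or 110ns at 100 MHz) is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Time per loop count, from two simulated toggle intervals (t1_ns at
   loop limit n1, t2_ns at loop limit n2). Names are illustrative. */
static uint32_t ns_per_count(uint32_t t1_ns, uint32_t n1,
                             uint32_t t2_ns, uint32_t n2)
{
    assert(n2 > n1 && t2_ns >= t1_ns);
    return (t2_ns - t1_ns) / (n2 - n1);
}

/* Rescale a time measured at sim_mhz to a slower or faster target clock,
   assuming time per count is inversely proportional to frequency. */
static uint32_t scale_ns(uint32_t ns, uint32_t sim_mhz, uint32_t target_mhz)
{
    assert(target_mhz != 0);
    return ns * sim_mhz / target_mhz;
}

/* Loop limit needed for a given toggle interval in milliseconds,
   ignoring the fixed per-toggle overhead. */
static uint32_t counts_for_ms(uint32_t interval_ms, uint32_t ns_per_cnt)
{
    assert(ns_per_cnt != 0);
    return (uint32_t)((uint64_t)interval_ms * 1000000u / ns_per_cnt);
}
```

With the numbers from the simulation, the second interval is 710ns + 600ns = 1310ns, so ns_per_count(710, 20, 1310, 40) is 30, scale_ns(30, 100, 25) is 120, and counts_for_ms(480, 120) is 4000000.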

\chuck benz
April, 2005