Quick Start with Xilinx EDK and XPS for Verilog users
Xilinx and IBM have made it VERY easy to put a processor subsystem in
your FPGA design. Their interest is in selling silicon, so the
development tools are quite inexpensive and include a lot of pieces
that you can quickly assemble.
Even better, I'll be trying to come up with a way that you can work
with it without buying the tools, by preparing a simple processor
subsystem and showing how to apply GNU tools to target your code to it.
Update: The EDK is built on GNU tools, which might be available
separately under just GPL, but I haven't yet found a non-EDK
distribution.
The EDK is biased towards VHDL, but it's very manageable to integrate
an EDK processor subsystem into a verilog design without needing a mixed mode
simulator.
I got started using EDK without taking a class, and with these notes, I
hope you can be up and running even more quickly than I did! I will
show you how to make a simple processor design that runs C code to
blink LEDs, then I will show how to add a block of registers and logic.
In a little more detail, the steps we'll go through will define a
simple processor subsystem with a simple C application, build verilog
code for the subsystem, put a simple design around it, simulate it,
synthesize and implement it, then change the C code and rebuild it.
It's good to work with a development board. Even a $79 XESS
SpartanII-100 board can be used for a MicroBlaze (with tiny memory).
Enterpoint and Digilent are also good sources of inexpensive Xilinx
development platforms, though I haven't tried any from Digilent yet.
To begin, install Xilinx ISE and EDK. I'll assume that you're running
on a PC under Windows. If you haven't purchased EDK, but have ISE,
check back here in May (2005) - I hope to have a subsystem you can use
and instructions on how to install the same Cygwin/GNU tools that
Xilinx includes in the EDK (the EDK has much more than those tools to
make its very reasonable price worthwhile, though). Update: I've
realized that to get a copy of MicroBlaze, you have to accept Xilinx's
license terms, so I can only provide a subsystem to users that have
accepted a Xilinx license to it, for example, the license you have to
go through to get the files for Xilinx App note 575, for the
"UltraController" which is based on the PowerPC.
An example of that license is (currently) at http://www.xilinx.com/bvdocs/appnotes/Design_License.pdf.
If you can show me that you've registered for the MicroBlaze or
UltraController lounge, then I can share any of my subsystems with you.
Email me at chuckbenz.com, username chuck.
1. Defining the processor subsystem
Xilinx Platform Studio (XPS) is an integrated tool for specifying and
building both the hardware and the software of an embedded design. You
will complete the embedded processor part of your design under XPS,
then use the Xilinx ISE tools to implement the design in the FPGA.
Start XPS (Start -> Programs -> Xilinx Platform Studio -> Xilinx
Platform Studio).
1.1 Base System Builder to make the starting definition
Create a new project using "Base System Builder", choosing an
appropriate work directory (the default is likely the desktop, which
wouldn't be a good idea; instead, use something like
c:\edk_example\mb_subsys).
You will be presented with a choice of development boards - you can
choose any of them (even if you're going to target a SpartanII device,
you can choose a VirtexII Pro board). Your choice of board is used
just to select which peripherals you'll be shown initially, as well as
default pin assignments.
Next, choose MicroBlaze as our processor and proceed to the next page.
The next choices will be processor specifics; take the defaults (do
make sure that the local data and instruction memory is set to the
minimum, 8KB).
Now you will see a choice of IO interfaces. Keep just the RS232 UART,
deselect all of the others (you may see several pages). Don't add any
extra devices, and take the defaults when the Software Configuration
menu is presented. Accept the next few pages presented, and a basic
system will be defined.
We will change the default settings for the project - under the Options
pulldown menu, select Project Options. On the device tab, choose the
target device. On the Hierarchy tab, select that this will be a
submodule and name the instance "system". On the HDL tab,
select Structural for the simulation models (at the bottom) then select
Verilog.
The hardware definition of the processor subsystem is now in a file
system.mhs - you might want to make a copy of this under another name
so you can see what changes the next step makes.
1.2 Adding the LEDs
We'll add a GPIO peripheral for the LEDs (I'm assuming every development
board has some LEDs we can flash). From the Project pulldown menu,
select Add/Edit Cores.
- We start with the Peripherals tab, choosing opb_gpio from the
list of peripherals on the right, and clicking on "Add". The added
block will appear at the end of the list at left. Change the base
address, perhaps to 0x90000000 (it is a 32 bit value). Change the high
address to 0x01ff higher (0x900001ff for that base). If your target is
actually the Spartan2-100, also reduce the bram address ranges for the
dlmb and ilmb to 0x00000fff (4KB instead of 8KB).
- On the Bus Connection tab, connect the opb_gpio instance to the
mb_opb bus by clicking in the box at the intersection of the mb_opb
column and the opb_gpio row. A small s will appear, similar to other
peripherals, indicating that it is a slave on the bus.
- On the Ports tab, find the OPB_clk signal for the GPIO instance
in the list on the right side, and click on "Add". Do the same for
GPIO_d_out. Change the signal name for that OPB_clk in the table on
the left (you will have to scroll down) to sys_clk_s (add the _s),
and change the Scope to Internal. Click on "Add Port" to add a port
named "leds_out" as an output, changing "connected to" to "leds_out".
After adding the port, add [0:7] to the range column for the port in
the table on the left. Change the signal name for GPIO_d_out to
leds_out and add [0:7] to the range column for that connection.
- On the Parameters tab, select the C_GPIO_WIDTH parameter for
opb_gpio, click on "<< Add", then change the value in the left
table to 8.
- Select OK to finish.
You can compare the system.mhs file now to the version you saved.
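The address ranges entered on the Peripherals tab are inclusive, so a
little arithmetic is enough to sanity-check them before moving on. The
sketch below is only an illustration: the helper names range_size,
range_high and range_is_aligned are made up for this note and are not
part of the EDK, and the power-of-two alignment check is an assumption
on my part about what simple address decoding wants, not a quoted XPS
rule.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size in bytes of the inclusive range [base, high]. */
uint32_t range_size(uint32_t base, uint32_t high)
{
    assert(high >= base);
    return high - base + 1u;
}

/* Hypothetical helper: high address of a range of `size` bytes at `base`. */
uint32_t range_high(uint32_t base, uint32_t size)
{
    assert(size > 0u);
    return base + size - 1u;
}

/* Assumed sanity check: returns 1 when size is a power of two and
   base is a multiple of size, 0 otherwise. */
int range_is_aligned(uint32_t base, uint32_t size)
{
    return size != 0u
        && (size & (size - 1u)) == 0u
        && (base & (size - 1u)) == 0u;
}
```

With the values used above, range_high(0x90000000, 0x200) gives
0x900001ff for the GPIO, and range_size(0x00000000, 0x00000fff) gives
4096 bytes for the reduced bram range.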
1.3 Some simple C code
XPS created a simple C application for us, but it simply prints strings
to the UART. We'll replace that with some toggling of the
LEDs.
The basic application is found in TestApp\src\TestApp.c. We'll
replace the main module with (you can cut and paste it from here):
volatile int * leds = (int *) 0x90000000 ;
int main (void) {
  int i = 0 ;
  int ledval = 0 ;
  while (1) {
    ledval = ~ledval ;
    leds[0] = ledval ;
    i = 0 ;
    while (i < 20)
      i++ ;
  }
  return 0 ;
}
If you didn't use 0x90000000 for the LEDs base address, substitute your
value instead.
(Yes, this will toggle the led quite rapidly, but first we're going to
simulate, so the high speed means quick simulation. Later we'll slow it
down).
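If you want to try the toggle logic on your PC before it goes anywhere
near the FPGA, one option is to pass the register address in as a
parameter, so an ordinary variable can stand in for the GPIO. This is a
sketch under my own assumptions, not what XPS generates:
led_toggle_step and delay_counts are hypothetical names, and the delay
counter is declared volatile here so a compiler can't remove the empty
loop, which means the time per count would likely differ from a build
of the plain loop above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one pass of the blink loop. Inverts the
   current value, writes it to the register, and returns it. */
int led_toggle_step(volatile int *reg, int ledval)
{
    assert(reg != NULL);
    ledval = ~ledval;
    *reg = ledval;
    return ledval;
}

/* Hypothetical helper: busy-wait for `count` iterations. The counter
   is volatile so the loop is not optimized away. */
void delay_counts(unsigned long count)
{
    volatile unsigned long i = 0;
    while (i < count)
        i++;
}
```

On the target, the main loop would call led_toggle_step with
(volatile int *) 0x90000000 (or your own base address) followed by
delay_counts(20).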
1.4 Generating the subsystem
If you now select the Tools pulldown menu and click on Sim Model
Generation, everything needed for the further steps will be created:
- The files in simulation/structural and in
hdl/elaborate/*/hdl/verilog will be used for simulation.
- The files in implementation will be used for Xilinx ISE
implementation tools.
- The file hdl/system.v will be used for simulation, synthesis, and
implementation. Edit this file to separate the system module from the
shells defined for submodules: save all of the shells in one file
(system_boxes.v), and save the system module only in a separate file
(system_module.v - these names preserve the files so they aren't
overwritten except by repeating these edits).
2 A simple design
Usually you would have significant logic to wrap around your processor
subsystem. But since we're making this a simple demo that has just one
GPIO port to drive LEDs, we'll just simply instantiate our subsystem,
like so:
module edk_demo (clk, resetl, leds, rs232_tx, rs232_rx) ;
  input clk, resetl, rs232_rx ;
  output [7:0] leds ;
  output rs232_tx ;
  wire clk_bufg ;
  // explicit global clock buffer; its output clocks the subsystem
  BUFG CLKB (.I(clk), .O(clk_bufg)) ;
  system mb_subsys
    (.sys_clk(clk_bufg),
     .sys_rst(resetl),
     .leds_out(leds),
     .RS232_TX(rs232_tx),
     .RS232_RX(rs232_rx)) ;
`ifdef MEM_INIT
`include "mem_init.v"
`endif
endmodule
The clock buffer is explicitly instantiated because the synthesis tool
sees system as a black box, so there's no indication that the clk
signal is actually a clock. The conditional include will be used just
for simulation. Note that the reset to the processor is asserted low,
even though the signal name does not have any level indication.
3 Simulation
The advantage of using MicroBlaze as our example processor is that we
can simulate it with the free Icarus simulator. Designs using the
PowerPC processor usually need to use the SmartModel PPC model from
Synopsys.
A simple testbed to provide clock and reset to our example would be so:
`timescale 1ns/1ns
module edk_demo_tb ;
  reg clk, resetl ;
  wire [7:0] leds ;
  initial clk = 1 ;
  always #5 clk = ~clk ;
  initial begin
    $dumpvars(0);
    resetl = 1 ;
    #10 resetl = 0 ;
    #95 resetl = 1 ; // wait long enough for Xilinx glbl GSR to deassert
    #3000 $finish ;
  end
  edk_demo dut
    (.clk(clk),
     .resetl(resetl),
     .leds(leds),
     .rs232_rx(1'b0),
     .rs232_tx()) ;
  always @ (leds) $display ("LEDS change to %h at %d", leds, $stime);
endmodule
To compile and run this under the Icarus Verilog simulator, the
commands would be:
% grep defparam ../mb_subsys/simulation/structural/system_init.v > mem_init.v
% iverilog -D MEM_INIT -y ../rtl -y ../mb_subsys/simulation/structural \
    -y $XILINX/verilog/src/simprims -y $XILINX/verilog/src/unisims \
    edk_demo_tb.v $XILINX/verilog/src/glbl.v \
    ../mb_subsys/hdl/system_module.v \
    ../mb_subsys/hdl/elaborate/*/hdl/verilog/*.v
% vvp a.out
assuming that the current directory is a simulation area, ../rtl is the
relative path to the design rtl directory where edk_demo.v will be
found, and ../mb_subsys is the relative path to the directory where XPS
was run to build the processor subsystem.
Doing so, we see that the LEDs toggle every 710ns. (This is with a 100
MHz processor clock).
Debug note: if nothing happens in simulation, check (1) the reset
polarity, and (2) that the memory initialization is working by checking
that the bram outputs in the MicroBlaze subsystem have non-zero data.
4 Synthesis and Implementation
The ISE tools suite can perform necessary synthesis and implementation
tasks. An example XST script might be a file edk_demo.xst:
run -ifn edk_demo_xst.v -ifmt VERILOG -ofn edk_demo -p xc2s100tq144-5 -hierarchy_separator /
which would be run as:
% xst -ifn edk_demo.xst
The verilog input file for XST, edk_demo_xst.v, would just include:
`include "../rtl/edk_demo.v"
`include "../mb_subsys/hdl/system.v"
The remaining process would be something like:
% cp ../mb_subsys/implementation/system_stub.bmm .
% ngdbuild -bm system_stub.bmm -sd ../mb_subsys/implementation -uc edk_demo.ucf edk_demo.ngc
% map -pr b edk_demo
% par edk_demo edk_demo
% bitgen edk_demo
But this would be the design without the application C code present in
the block rams. The tool that manages that is data2mem:
% data2mem -bm system_stub_bd.bmm \
    -bd ../mb_subsys/TestApp/executable.elf -bt edk_demo.bit -o b edk_demo_code.bit
with the final bit file being the one with the target code present.
5 Incremental changes
The most common change will be to the target C code. When this is done,
it's not necessary to run the full XPS process again. Instead, the
compiler can be run in the mb_subsys directory under the Xilinx
"Xygwin" shell (if your work area is c:\edk_example, then under Xygwin,
cd to /xygdrive/c/edk_example/mb_subsys):
% make -f system.make program
Then the simulation mem_init.v file would be rebuilt with a data2mem
command, run in the simulation directory:
% data2mem -bm ../mb_subsys/implementation/system_stub.bmm \
    -bd ../mb_subsys/TestApp/executable.elf -u -o v mem_init.v
And the bit file can be rebuilt using just the last command of the
implementation process.
In simulation, we saw that the LEDs toggled every 710ns with the test
being "i < 20". Changing the test to "i < 40", the interval increases
by 600ns, so each count takes 30ns at 100 MHz. With Spartan2, 25 MHz is
a more realistic processor clock speed, so it would be 120ns per loop
count. A loop test of 4,000,000 will toggle every 480ms, so for a build
for our demo board, we should use that value.
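The arithmetic behind that loop count can be written out as a few small
functions, in case you want to redo it for a different clock or toggle
rate. This is just a sketch of the calculation in the paragraph above:
the function names are hypothetical, integer math is assumed to be
precise enough, and the fixed overhead per toggle is ignored since it
is tiny next to millions of counts.

```c
#include <assert.h>
#include <stdint.h>

/* Time per loop count, from two simulated toggle intervals measured
   at the same clock with different loop limits. */
uint32_t ns_per_count(uint32_t ns_a, uint32_t count_a,
                      uint32_t ns_b, uint32_t count_b)
{
    assert(count_b > count_a && ns_b >= ns_a);
    return (ns_b - ns_a) / (count_b - count_a);
}

/* Scale a time measured at sim_mhz to a board running at board_mhz. */
uint32_t scale_ns(uint32_t ns, uint32_t sim_mhz, uint32_t board_mhz)
{
    assert(board_mhz > 0u);
    return ns * sim_mhz / board_mhz;
}

/* Loop limit needed for a target toggle interval (overhead ignored). */
uint32_t counts_for_interval(uint64_t interval_ns, uint32_t ns_count)
{
    assert(ns_count > 0u);
    return (uint32_t)(interval_ns / ns_count);
}
```

Plugging in the numbers from the simulation: ns_per_count(710, 20,
1310, 40) is 30, scale_ns(30, 100, 25) is 120, and
counts_for_interval(480000000, 120) is 4000000.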
chuck benz
April, 2005