PCI Express/Linux FPGA Hacking

Needed an excuse to write a Linux device driver, so I’ve decided to have a crack at building a custom IO board using the Spartan 6 donated by @csirac2_ (thanks Paul). The board has a PCI Express edge connector, and I am hoping to get data exchange working between it and a host computer running Linux over the PCIe bus.

Have managed to create a CompactFlash card containing the Xilinx self test routines. Was able to run all self tests except the DVI test (no monitor handy). An excellent start.


" An excellent start" … indeed!

Thanks Spencer.

Have been making good progress, thanks to the Xilinx engineers. It was almost trivial to get the FPGA sample app built, loaded and running with the board inside the PC.

Linux is recognising the board:
sjdavies@mhv755:~$ sudo lspci -vs 01:00.0
01:00.0 RAM memory: Xilinx Corporation Default PCIe endpoint ID
Subsystem: Xilinx Corporation Default PCIe endpoint ID
Flags: fast devsel, IRQ 16
Memory at fe800000 (32-bit, non-prefetchable) [size=1M]
Capabilities: [40] Power Management version 3
Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [58] Express Endpoint, MSI 00
Capabilities: [100] Device Serial Number 00-00-00-01-01-00-0a-35

Have written a basic PCI driver that loads up ok. The hard part so far has been figuring out what functionality Xilinx put into the demo. The answer: an 8k RAM block that responds to read/write transactions.

More reading required to figure out how to map it into the OS space and test the read/write function.


PCI devices interact with the BIOS/OS through a set of configuration registers that are read during power up. This permits discovery of the resources required by the device, e.g. I/O memory and interrupts.
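As a rough illustration of what those configuration registers look like to software, here is a minimal sketch that decodes the ID fields at the start of the standard PCI configuration header. The struct and function names (`pci_ids`, `pci_parse_ids`) are made up for this example; the bytes could come from the sysfs `config` file for the device, but how you obtain them is left open.

```c
#include <stddef.h>
#include <stdint.h>

/* A few fields from the start of the standard PCI configuration header.
 * Multi-byte fields are stored little-endian. */
struct pci_ids {
    uint16_t vendor_id;   /* offset 0x00 */
    uint16_t device_id;   /* offset 0x02 */
    uint8_t  subclass;    /* offset 0x0a */
    uint8_t  class_code;  /* offset 0x0b, 0x05 = memory controller */
};

/* Read a 16-bit little-endian value from a byte buffer. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the ID fields from the first 16 bytes of config space.
 * Returns 0 on success, -1 if the buffer is too short. */
int pci_parse_ids(const uint8_t *cfg, size_t len, struct pci_ids *out)
{
    if (len < 16)
        return -1;
    out->vendor_id  = le16(cfg + 0x00);
    out->device_id  = le16(cfg + 0x02);
    out->subclass   = cfg[0x0a];
    out->class_code = cfg[0x0b];
    return 0;
}
```

For the board above, the vendor ID should come back as 0x10ee (Xilinx), and the "RAM memory" label printed by lspci corresponds to class 0x05, subclass 0x00.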

A device can have up to 6 base address registers (BAR0-5) which serve 2 purposes:

  1. Inform BIOS/OS about IO memory blocks (buffers, control registers etc.)
  2. BIOS/OS writes memory map values to BAR(s) during startup
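The two roles of a BAR can be sketched in a few lines of C. This is only an illustration for 32-bit memory BARs, with hypothetical helper names: the size is discovered by writing all ones to the BAR and looking at which address bits read back as writable, and the base address is whatever the BIOS/OS later writes there, minus the low flag bits.

```c
#include <stdint.h>

/* Bit 0 of a BAR selects I/O space (1) or memory space (0). */
int bar_is_io(uint32_t bar)
{
    return (int)(bar & 0x1u);
}

/* Base address held in a BAR with the low flag bits masked off:
 * 2 flag bits for an I/O BAR, 4 for a memory BAR. */
uint32_t bar_base(uint32_t bar)
{
    return bar_is_io(bar) ? (bar & ~0x3u) : (bar & ~0xFu);
}

/* Size of a 32-bit memory BAR, given the value read back after
 * writing 0xFFFFFFFF to it. Returns 0 if the BAR is unimplemented. */
uint32_t bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~0xFu;   /* drop the flag bits */
    if (mask == 0)
        return 0;
    return ~mask + 1u;                  /* lowest writable bit gives the size */
}
```

A device asking for 1 MiB would read back 0xFFF00000, which `bar_size` turns into 0x100000. 64-bit BARs and I/O BARs need extra handling that is left out here.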

Example - Xilinx Memory Board
Device has a single base address register (BAR0) configured. It indicates the device has a 1 MiB memory block to map into the processor's address space. Address space is allocated on startup and the base address is written to BAR0, making it available both to hardware on the device itself and to OS device drivers.

This allocation is readily visible in /proc/iomem:
fe800000-fe8fffff : PCI Bus 0000:01
fe800000-fe8fffff : 0000:01:00.0
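To check that the range really is 1 MiB, a small helper can parse the start and end addresses from one of these lines. This is a sketch only; the function name `iomem_range_size` is invented for the example and it expects a line in the format shown above.

```c
#include <stdint.h>
#include <stdio.h>

/* Parse the "start-end" hex range at the beginning of a /proc/iomem line.
 * Stores the bounds in *start and *end and returns the region size in
 * bytes, or 0 if the line could not be parsed. */
uint64_t iomem_range_size(const char *line, uint64_t *start, uint64_t *end)
{
    unsigned long long s, e;

    if (sscanf(line, " %llx-%llx", &s, &e) != 2 || e < s)
        return 0;
    *start = s;
    *end = e;
    return (uint64_t)(e - s + 1);   /* the end address is inclusive */
}
```

Fed the second line above, it reports a start of 0xfe800000 and a size of 0x100000 bytes, matching the [size=1M] shown by lspci.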

One thing I learned is that it is possible to read/write device memory without a device driver being present. You don’t need to write any kernel code. The device can be accessed directly from user space via the sysfs file system.

Linux uses a virtual file system (think memory disk) to make values in the kernel readily available to user space. Each attached device has a directory under /sys. For example, the Xilinx device is nested under /sys/bus/pci/devices/0000:01:00.0.

foo@mhv755:/sys/bus/pci/devices/0000:01:00.0$ ls -l
total 0
-rw-r--r-- 1 root root    4096 Nov 30 14:31 broken_parity_status
-r--r--r-- 1 root root    4096 Nov 30 14:31 class
-rw-r--r-- 1 root root    4096 Nov 30 10:16 config
-r--r--r-- 1 root root    4096 Nov 30 14:31 consistent_dma_mask_bits
-rw-r--r-- 1 root root    4096 Nov 30 14:31 d3cold_allowed
-r--r--r-- 1 root root    4096 Nov 30 14:31 device
-r--r--r-- 1 root root    4096 Nov 30 14:31 dma_mask_bits
-rw-r--r-- 1 root root    4096 Nov 30 14:31 driver_override
-rw-r--r-- 1 root root    4096 Nov 30 14:31 enable
-r--r--r-- 1 root root    4096 Nov 30 10:16 irq
-r--r--r-- 1 root root    4096 Nov 30 14:31 local_cpulist
-r--r--r-- 1 root root    4096 Nov 30 14:31 local_cpus
-r--r--r-- 1 root root    4096 Nov 30 14:31 modalias
-rw-r--r-- 1 root root    4096 Nov 30 14:31 msi_bus
-rw-r--r-- 1 root root    4096 Nov 30 14:31 numa_node
drwxr-xr-x 2 root root       0 Nov 30 14:31 power
--w--w---- 1 root root    4096 Nov 30 14:31 remove
--w--w---- 1 root root    4096 Nov 30 14:31 rescan
--w------- 1 root root    4096 Nov 30 14:31 reset
-r--r--r-- 1 root root    4096 Nov 30 14:31 resource
-rw------- 1 root root 1048576 Nov 30 14:31 resource0
lrwxrwxrwx 1 root root       0 Nov 30 10:15 subsystem -> ../../../../bus/pci
-r--r--r-- 1 root root    4096 Nov 30 14:31 subsystem_device
-r--r--r-- 1 root root    4096 Nov 30 14:31 subsystem_vendor
-rw-r--r-- 1 root root    4096 Nov 30 10:15 uevent
-r--r--r-- 1 root root    4096 Nov 30 10:15 vendor

Many of the files listed contain a single text value. The file ‘irq’ contains the text ‘16’, indicating that the device has been allocated IRQ line #16. Handy to know when you want to set up the interrupt handler.
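Reading one of these single-value files from C takes only a few lines. The helper below is a sketch with an invented name (`read_sysfs_long`); it uses base 0 in `strtol` so that it copes with both plain decimal values such as the irq file and 0x-prefixed hex values.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a single integer from a sysfs-style text file.
 * Accepts decimal ("16") or 0x-prefixed hex.
 * Returns 0 on success, -1 on any error. */
int read_sysfs_long(const char *path, long *value)
{
    char buf[64];
    char *end;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    errno = 0;
    long v = strtol(buf, &end, 0);
    if (end == buf || errno != 0)
        return -1;              /* nothing parsed, or out of range */
    *value = v;
    return 0;
}
```

Calling it with "/sys/bus/pci/devices/0000:01:00.0/irq" should return 16 on the machine described above.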

File ‘resource0’ represents the previously mentioned memory block. Rather than having to know its physical memory address we can access it by its sysfs logical filename.

The basic idea from user space is to:

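/* Needs <fcntl.h>, <sys/mman.h> and <unistd.h>; run as root. Error checks omitted. */
int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR | O_SYNC);
/* volatile so the compiler does not cache or drop accesses to the device */
volatile unsigned char *map = mmap(NULL, 16384, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

map[0] = 0x5a;

munmap((void *)map, 16384);
close(fd);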

This makes 16 KiB of the total 1 MiB available to the program as an array of unsigned char. The assignment map[0] = 0x5a; is writing to memory physically located on the Xilinx device.
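A slightly more defensive version of the same idea, with error checking, might look like the sketch below. The function name `poke_and_peek` and its parameters are invented for this example; it maps the start of whatever file it is given, so it can be tried on an ordinary file before pointing it at resource0.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first map_len bytes of the file at path, write value at the
 * given offset and read it back through the same mapping.
 * Returns 0 on success, -1 on any failure. */
int poke_and_peek(const char *path, size_t map_len, size_t offset,
                  unsigned char value, unsigned char *readback)
{
    if (offset >= map_len)
        return -1;

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    void *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    /* volatile so the compiler does not cache or drop device accesses */
    volatile unsigned char *map = p;
    map[offset] = value;
    *readback = map[offset];

    munmap(p, map_len);
    return 0;
}
```

Against the real device this would be called as poke_and_peek("/sys/bus/pci/devices/0000:01:00.0/resource0", 16384, 0, 0x5a, &rb) and needs root, since resource0 is only readable and writable by its owner. If no driver has enabled the device, you may also need to write 1 to the enable file in the same directory first.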

Pretty neat huh?


FYI,

An interesting board if a DIY co-processor is your kind of thing.

Pricey but in their defence the Element 14 list price for the FPGA alone is AUD484.


It was never unusual to have a CPU and an FPGA together. After all, each has different strengths and weaknesses. However, newer devices like the Xilinx Zynq have both a CPU and an FPGA in the same package. That means your design has to span hardware, FPGA configurations, and software. [Mitchell Orsucci] was using a Zynq device on an ArtyZ7-20 board and decided he wanted to use Linux to operate the ARM processor and provide user-space tools to interface with the FPGA and reconfigure it dynamically.

Hi Barry,
welcome to the MHV community.

I have an ArtyZ7-20 here somewhere. It’s a very big hammer in search of a nail. I originally bought it for my Hammond organ project. Found that I could get by with an Artix-7 and Microblaze processor so I never really investigated it fully.

Thanks for the link, there are some interesting projects listed.

I was under the impression that partial reconfiguration of the FPGA required a non-webpack licence. Had a quick look through his git project. Do you know how this was being done?

Cheers,
SteveD

I’m interested in using FPGAs as accelerators for computing, like CUDA/GPU computing. Basically, write a computer program and offload all the compute-heavy parts to an FPGA. From what I understand so far, they use direct memory access, which is similar to GPU computing.

Hi Smith,
can certainly do that sort of thing with an FPGA. You’re probably going to need a board with some sort of PCIe interface. You will likely run into IP licensing issues ($$$) depending on your application. It would be worth the research effort before buying a board.

You could look into smaller FPGA boards like the Aller mentioned above or the Arduino MKR Vidor 4000.

Cheers,
SteveD