FPGA Overview
General Overview
The project uses two FPGAs, each operating independently of the other. Each FPGA captures the raw pixel data in YUV422 QVGA format from the OV7670 camera and compares the Y value of every pixel against a user-defined threshold, producing a bitmasked frame with 1 bit per pixel. This bitmasked image is stored in an SPRAM frame buffer on the FPGA before being sent to the MCU over SPI.
System Architecture:
OV7670 Camera  →  Threshold Module       →  Frame Buffer  →  Serial Interface  →  MCU
(YUV422 QVGA)     (Y-only, 1-bit/pixel)     (SPRAM)          (SPI)
RTL Block Diagram
The block diagram below describes our RTL.

OV7670 Camera Pixel Capture and Bitmasking
YUV422 Format and Y-value (Luminance) Extraction
The OV7670 camera can output data in several formats. Our application only needs to distinguish two colours, the line and the track, so we use the YUV output format and keep only the Y value of each pixel, which represents luminance, while ignoring the U/V values, which encode chroma (colour). Because we test with a black line on a white background, the two surfaces have clearly different intensity levels, so the Y values alone are sufficient to tell them apart.
YUV422 Data Format:
In YUV422, the camera sends two bytes per pixel: an 8-bit Y value followed by an 8-bit U or V value, with U and V alternating from pixel to pixel. We therefore only need every other byte of the stream, which gives the Y values for all 76,800 pixels of a QVGA frame.
Data Stream: Y0 U0 Y1 V0 Y2 U1 Y3 V1 ...
Bytes Captured: Y0 -- Y1 -- Y2 -- Y3 -- ...
This also saves a lot of memory: keeping only a 1-bit thresholded Y value per pixel reduces the frame data by about 94%:
- Full YUV422: 76,800 pixels × 2 bytes = 153,600 bytes
- Y-only binary: 76,800 pixels × 1 bit = 9,600 bytes
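The figures above can be rechecked with a few lines of arithmetic. This is only a sanity check, assuming the standard QVGA resolution of 320 × 240; the variable names are ours and do not come from the RTL.

```python
# Frame-size arithmetic for QVGA (320 x 240), using the figures quoted above.
WIDTH, HEIGHT = 320, 240
pixels = WIDTH * HEIGHT            # 76,800 pixels per frame

yuv422_bytes = pixels * 2          # YUV422: 2 bytes per pixel
binary_bytes = pixels // 8         # thresholded: 1 bit per pixel, 8 pixels per byte

reduction = 1 - binary_bytes / yuv422_bytes

print(pixels, yuv422_bytes, binary_bytes)   # 76800 153600 9600
print(f"reduction: {reduction:.2%}")        # reduction: 93.75%
```

The exact reduction is 93.75%, which rounds to the 94% quoted above.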
Threshold Module Operation
The camera_capture_threshold module runs entirely in the camera pixel clock domain and processes the incoming YUV422 data in real time.
Key Signals:
// Inputs
cam_pclk: Pixel clock from OV7670
cam_vsync: Frame sync (LOW = active frame)
cam_href: Line valid (HIGH = pixel data valid)
cam_data: 8-bit YUV422 data bus
threshold: 8-bit comparison value (our chosen value 141)
// Outputs
wr_addr: 17-bit pixel address (0-76,799)
wr_data: 1-bit thresholded result (bright/dark)
wr_en: Write enable pulse
frame_done: Frame complete pulse
Operation:
The module uses a priority-based control structure with four operations:
- Frame Start (VSYNC falling edge) - Reset counters, begin capture
- Frame End (VSYNC rising edge) - Signal completion
- Pixel Capture (during frame with HREF valid) - Threshold and write Y bytes
- End of Line (HREF falling edge) - Reset line counters
Byte Selection Logic:
A simple toggle mechanism parses the YUV422 stream:
if (byte_select == 0) begin                     // Y byte
    wr_en       <= 1;
    wr_data     <= (cam_data > threshold);      // 1 = bright, 0 = dark
    byte_select <= 1;                           // Next is U/V
end else begin
    wr_en       <= 0;                           // No write on chroma bytes
    byte_select <= 0;                           // Skip U/V, next is Y
end
This alternates between capturing Y bytes (luminance) and skipping U/V bytes (chroma).
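The control priority and the byte toggle described above can be sketched as a small software reference model, of the kind that could serve as a golden model in a testbench. This is a sketch in Python, not the RTL: the function name, the one-sample-per-PCLK input format, and the reset of the byte toggle at end of line are our assumptions for illustration.

```python
def capture_frame(samples, threshold=141):
    """Model one pass of the capture logic.

    samples: iterable of (vsync, href, data) tuples, one per PCLK cycle.
    Returns (writes, frames_done), where writes is a list of (addr, bit).
    """
    writes, frames_done = [], 0
    prev_vsync, prev_href = 1, 0
    capturing, addr, byte_select = False, 0, 0

    for vsync, href, data in samples:
        if prev_vsync == 1 and vsync == 0:       # 1. frame start
            capturing, addr, byte_select = True, 0, 0
        elif prev_vsync == 0 and vsync == 1:     # 2. frame end
            if capturing:
                frames_done += 1
            capturing = False
        elif capturing and href:                 # 3. pixel capture
            if byte_select == 0:                 # Y byte: threshold and write
                writes.append((addr, int(data > threshold)))
                addr += 1
            byte_select ^= 1                     # alternate Y and U/V
        elif prev_href == 1 and href == 0:       # 4. end of line
            byte_select = 0                      # assumed line-level reset
        prev_vsync, prev_href = vsync, href
    return writes, frames_done


samples = [
    (1, 0, 0),                    # idle, VSYNC high
    (0, 0, 0),                    # VSYNC falls: frame start
    (0, 1, 200), (0, 1, 128),     # Y0 (bright), U0
    (0, 1, 30), (0, 1, 128),      # Y1 (dark), V0
    (0, 0, 0),                    # HREF falls: end of line
    (1, 0, 0),                    # VSYNC rises: frame end
]
print(capture_frame(samples))     # ([(0, 1), (1, 0)], 1)
```

The if/elif chain mirrors the priority order of the four operations: a frame boundary always wins over pixel capture, which in turn wins over the end-of-line reset.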
SPRAM Frame Buffer and SPI Transfer
Memory Architecture
The iCE40 UP5K provides 128 KB of SPRAM (Single-Port RAM), which we use for double-buffered frame storage:
- Each frame: 76,800 bits = 4,800 words (16 bits/word)
- Bank 0: Addresses 0x0000-0x12BF
- Bank 1: Addresses 0x12C0-0x257F
- Total usage: 9,600 words (15% of SPRAM capacity)
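The memory budget follows directly from the frame size and the SPRAM word width. The check below is a sketch using only the numbers quoted above (128 KB of SPRAM, 16-bit words, two banks); the constant names are ours.

```python
# SPRAM budget check for two 1-bit-per-pixel QVGA frames.
FRAME_BITS = 320 * 240                 # 76,800 bits per thresholded frame
WORD_BITS = 16                         # SPRAM word width
SPRAM_BYTES = 128 * 1024               # total SPRAM on the iCE40 UP5K

words_per_frame = FRAME_BITS // WORD_BITS       # words in one bank
spram_words = SPRAM_BYTES * 8 // WORD_BITS      # total 16-bit words available
used_words = 2 * words_per_frame                # double buffering

print(words_per_frame, hex(words_per_frame - 1))   # 4800 0x12bf
print(used_words, spram_words)                     # 9600 65536
print(f"{used_words / spram_words:.1%}")           # 14.6%
```

The last address of one bank, 0x12BF, matches the Bank 0 range, and 14.6% rounds to the 15% quoted above.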
Bit Packing
Since the threshold module outputs 1 bit per pixel, data is packed into 16-bit words:
Pixel Address [16:0]:
Bits [16:4] → Word address (0-4,799)
Bits [3:0] → Bit position (0-15)
When 16 bits accumulate in the camera domain, they’re latched and transferred to the system clock domain (48 MHz) for SPRAM write.
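The address split described above can be illustrated with a short Python sketch. The split itself follows the bit ranges given in the text; placing pixel n at bit position n mod 16 of its word (least significant bit first) is our assumption, and both function names are hypothetical.

```python
def split_pixel_address(pixel_addr):
    """Split a 17-bit pixel address into (word address, bit position)."""
    word_addr = pixel_addr >> 4       # bits [16:4]
    bit_pos = pixel_addr & 0xF        # bits [3:0]
    return word_addr, bit_pos


def pack_bits(bits):
    """Pack 1-bit pixels into 16-bit words (assumed order: pixel n at bit n % 16)."""
    words = [0] * ((len(bits) + 15) // 16)
    for addr, bit in enumerate(bits):
        word_addr, bit_pos = split_pixel_address(addr)
        words[word_addr] |= (bit & 1) << bit_pos
    return words


print(split_pixel_address(76_799))                         # (4799, 15)
print([hex(w) for w in pack_bits([1] * 16 + [1, 0, 1])])   # ['0xffff', '0x5']
```

The last pixel of the frame lands in bit 15 of word 4,799, so one frame fills exactly 4,800 words.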
Double Buffering (Ping-Pong)
The frame buffer uses ping-pong buffering to enable simultaneous camera write and MCU read:
- Camera writes to Bank A, MCU reads from Bank B
- On frame completion, banks swap roles
- Camera immediately starts writing to Bank B
- MCU continues reading completed Bank A
- Process repeats continuously
This ensures no frame drops at 6 fps, even if the MCU lags behind by one frame.
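The bank-swapping behaviour can be modelled in a few lines. This is a behavioural sketch only: the class and method names are hypothetical, and the model ignores the arbitration a single-port memory needs when both sides access it.

```python
class PingPongBuffer:
    """Two banks: the camera fills one while the MCU reads the other."""

    def __init__(self, words_per_bank):
        self.banks = [[0] * words_per_bank, [0] * words_per_bank]
        self.write_bank = 0                 # bank the camera is filling

    @property
    def read_bank(self):
        return self.write_bank ^ 1          # always the other bank

    def write(self, addr, value):
        self.banks[self.write_bank][addr] = value

    def read(self, addr):
        return self.banks[self.read_bank][addr]

    def frame_done(self):
        self.write_bank ^= 1                # swap roles on frame completion


buf = PingPongBuffer(words_per_bank=4)
buf.write(0, 0xAAAA)      # camera writes frame 1 into bank 0
buf.frame_done()          # swap: MCU now reads bank 0
buf.write(0, 0x5555)      # camera writes frame 2 into bank 1
print(hex(buf.read(0)))   # 0xaaaa
```

The reader keeps seeing the completed frame while the next one is being written, which is what lets the MCU lag by up to one frame.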
Clock Domain Crossing (CDC)
The system operates across two asynchronous clock domains:
- Camera domain: 2.5 MHz (from OV7670)
- System domain: 48 MHz (internal oscillator)
All cross-domain signals use 3-stage synchronizers to prevent metastability.
Data is latched in the source domain before the control signal toggles, ensuring stability during crossing.
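To show what a 3-stage synchronizer does to a control toggle, here is a cycle-level Python sketch. It assumes a toggle-based handshake with edge detection in the destination domain; that scheme, and the class name, are our assumptions, and the model cannot represent metastability itself, only the resulting latency.

```python
class ToggleSynchronizer:
    """Three flip-flop synchronizer followed by change detection on a toggle."""

    def __init__(self):
        self.stages = [0, 0, 0]     # three synchronizer flip-flops
        self.prev = 0               # previous synchronized value

    def clock(self, async_toggle):
        """Advance one destination clock edge; return True on a new event."""
        self.stages = [async_toggle, self.stages[0], self.stages[1]]
        event = self.stages[2] != self.prev
        self.prev = self.stages[2]
        return event


sync = ToggleSynchronizer()
events = [sync.clock(1) for _ in range(4)]   # source toggled 0 -> 1 once
print(events)                                # [False, False, True, False]
```

Each toggle in the source domain produces exactly one event in the destination domain, three destination clocks later. Data latched before the toggle has therefore been stable for several 48 MHz cycles by the time it is sampled.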
Serial Interface to MCU
The FPGA implements a SPI slave interface:
Protocol:
- FPGA asserts frame_ready when a new frame is available
- MCU generates SCK pulses (3-48 MHz recommended)
- FPGA outputs one bit per SCK falling edge
- After 76,800 bits, FPGA deasserts frame_ready
The module uses priority-based logic:
- New frame event - Reset address, load first word from SPRAM
- Load complete - Capture SPRAM data into shift register
- Shift on MCU clock - Output bits serially, fetch next word when needed
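The load-and-shift sequence can be sketched as a generator that yields one bit per SCK falling edge. This is a behavioural model under assumptions: the bit order (least significant bit first) is not stated above, and the function name is hypothetical.

```python
def spi_bitstream(words, total_bits):
    """Yield the bits a slave would present, one per SCK falling edge."""
    addr = 0
    shift_reg = words[addr]                 # new frame: load first word
    for n in range(total_bits):
        yield shift_reg & 1                 # assumed LSB-first output
        shift_reg >>= 1
        if n % 16 == 15 and n + 1 < total_bits:
            addr += 1
            shift_reg = words[addr]         # word exhausted: fetch the next one


frame = [0x0001, 0x8000]
bits = list(spi_bitstream(frame, 32))
print(bits.index(1), bits[31], sum(bits))   # 0 1 2
```

With LSB-first shifting, the pixel stored at bit 0 of word 0 is the first bit on the wire, so pixels arrive at the MCU in capture order under the same packing assumption.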
Timing:
At 10 MHz SCK: 76,800 bits / 10 MHz = 7.68 ms per frame
Camera frame period: 1 / 6 fps ≈ 166.7 ms per frame (at 6 fps)
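The timing budget can be worked out for any SCK frequency. The sketch below uses only the frame size and frame rate quoted in this document and ignores any per-transfer protocol overhead; the helper name is ours.

```python
FRAME_BITS = 320 * 240           # one bit per QVGA pixel
FPS = 6                          # camera frame rate quoted in this document


def transfer_ms(sck_hz):
    """Time to shift one frame out at the given SCK frequency, in ms."""
    return FRAME_BITS / sck_hz * 1e3


frame_period_ms = 1e3 / FPS      # time between camera frames
min_sck_hz = FRAME_BITS * FPS    # slowest SCK that keeps up on average

for sck_hz in (3e6, 10e6, 48e6):
    print(f"{sck_hz / 1e6:>4.0f} MHz: {transfer_ms(sck_hz):6.2f} ms")
print(f"frame period: {frame_period_ms:.1f} ms, minimum SCK: {min_sck_hz} Hz")
```

At 10 MHz the transfer takes 7.68 ms, a small fraction of the roughly 166.7 ms between frames, so the MCU has ample time to read each frame before the next bank swap.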
Pin Count
- Camera interface: 11 pins (pclk, vsync, href, data[7:0])
- MCU interface: 3 pins (sck, miso, frame_ready)
- System: 1 pin (reset)
- Total: 15 I/O pins
Performance Summary
Camera Capture:
- Frame rate: 6 fps
- Pixel rate: 0.46 Mpixels/sec
- Bit rate: 0.46 Mbits/sec (after thresholding)
Testing Results
The system has been validated with:
- Testbench to verify the module
- Continuous frame capture at QVGA resolution
- Correct YUV422 byte selection (Y-only)
- Threshold discrimination (black line detection)
- Bank swapping without corruption
- Correct serial readout over SPI
The hardware has been tested extensively and performs reliably in the line-following application.