Borg - European Graphics Processing Unit

Foundational workflow for an open-source GPU

The Borg (Bring yer Own GRaphics) project aims to establish the complete foundational workflow for an open-source GPU using entirely free and open Electronic Design Automation (EDA) tools. Full GPU development is highly complex, so the project takes advantage of recent low-cost chip manufacturing options, such as the Tiny Tapeout shuttle it targets, to make individual tape-outs feasible for small teams.

📖 Read the Borg GPU Book for detailed documentation.

Architecture

The design is a TinyQV RISC-V SoC with the Borg FP16 shader processor as a memory-mapped peripheral, targeting both iCE40 FPGAs (pico-ice) and ASIC (IHP SG13G2 via Tiny Tapeout).

Borg Shader Processor

A minimal programmable shading unit with:

FP16 Fused Multiply-Add (FMA) — IEEE-754 compliant HardFloat unit supporting ADD, MUL, FMA, FNEG, FSTEP, and FRCP operations
32 general-purpose FP16 registers (r0–r31, expanding to 64), MMIO-accessible from the CPU
32-word instruction memory for shader programs
Hardware FP16 reciprocal (RCP) — LUT + linear interpolation for perspective division
4-cycle pipeline with automatic halt-on-zero-instruction
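
The "LUT + linear interpolation" approach to the reciprocal can be illustrated in software. The following is a minimal sketch of the general technique, not the Borg hardware: the table size (`LUT_BITS`), the normalization to the range [1, 2), and the function name `rcp_lut` are all assumptions made for illustration.

```python
import math

LUT_BITS = 5                      # assumed table size: 32 segments over [1, 2)
LUT_SIZE = 1 << LUT_BITS
# One extra entry so segment i can interpolate between entries i and i + 1.
RECIP_LUT = [1.0 / (1.0 + i / LUT_SIZE) for i in range(LUT_SIZE + 1)]


def rcp_lut(x: float) -> float:
    """Approximate 1/x for finite non-zero x by table lookup plus linear interpolation."""
    if x == 0.0:
        raise ZeroDivisionError("reciprocal of zero")
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))     # abs(x) == m * 2**e with m in [0.5, 1)
    m *= 2.0                      # renormalize so m is in [1, 2)
    e -= 1
    pos = (m - 1.0) * LUT_SIZE    # position within the table, in [0, LUT_SIZE)
    idx = int(pos)                # segment index
    frac = pos - idx              # fractional offset inside the segment
    y = RECIP_LUT[idx] + frac * (RECIP_LUT[idx + 1] - RECIP_LUT[idx])
    return sign * math.ldexp(y, -e)   # 1/(m * 2**e) == (1/m) * 2**-e
```

With 32 segments the interpolation error stays below about 2.5e-4 relative, which is under half a unit in the last place of an FP16 mantissa (2^-11). A real implementation would work on the FP16 bit fields directly and would handle zero, infinity, and NaN according to its own rules.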

Rendering Pipeline

The firmware implements a full triangle rendering pipeline:

Vertex Shader — 4×4 MVP matrix multiply with hardware perspective division, executed as a single shader pass on the Borg FPU
Screen-Space Translation — NDC to pixel coordinates with configurable framebuffer resolution (up to 64×64)
Rasterization — Hardware-iterator driven edge evaluation with native FP16 coordinate expansion and FSM auto-chaining
Fragment Shader — Unified pass (compiled via linear scan allocator) performing barycentric interpolation for RGB, Z, and UV simultaneously
Z-Buffer — Per-pixel depth testing with texture mapping from PSRAM
Framebuffer Output — Results written to PSRAM, read by host (RP2040) for display
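
The stages above can be sketched in plain software. This is a floating-point illustration of the math only, not the firmware or the hardware iterator: the function names, the y-flip in the viewport mapping, and the "smaller z wins" depth comparison are assumptions, and UV interpolation and texture sampling are omitted.

```python
def mat_vec4(m, v):
    """Multiply a row-major 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def to_screen(mvp, pos, width, height):
    """MVP transform, perspective divide, then NDC to pixel coordinates."""
    x, y, z, w = mat_vec4(mvp, [pos[0], pos[1], pos[2], 1.0])
    inv_w = 1.0 / w                              # the reciprocal used for perspective division
    ndc_x, ndc_y, ndc_z = x * inv_w, y * inv_w, z * inv_w
    sx = (ndc_x * 0.5 + 0.5) * width
    sy = (1.0 - (ndc_y * 0.5 + 0.5)) * height    # assumed: row 0 is the top of the image
    return sx, sy, ndc_z


def edge(a, b, p):
    """Edge function: twice the signed area of the triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, colors, width, height, color_buf, depth_buf):
    """Fill one triangle with barycentric color and depth interpolation and a z-test."""
    area = edge(tri[0], tri[1], tri[2])
    if area == 0:
        return                                   # degenerate triangle, nothing to draw
    for py in range(height):
        for px in range(width):
            p = (px + 0.5, py + 0.5)             # sample at the pixel center
            # Dividing by the signed area makes the weights winding-independent.
            w0 = edge(tri[1], tri[2], p) / area
            w1 = edge(tri[2], tri[0], p) / area
            w2 = edge(tri[0], tri[1], p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue                         # pixel center lies outside the triangle
            z = w0 * tri[0][2] + w1 * tri[1][2] + w2 * tri[2][2]
            if z >= depth_buf[py][px]:
                continue                         # an earlier fragment is closer (assumed convention)
            depth_buf[py][px] = z
            color_buf[py][px] = tuple(
                w0 * colors[0][i] + w1 * colors[1][i] + w2 * colors[2][i]
                for i in range(3)
            )
```

To use it, pass each vertex through `to_screen`, initialize `depth_buf` to infinity and `color_buf` to a background value, then call `rasterize` once per triangle. Scanning the whole framebuffer keeps the sketch short; a practical rasterizer would restrict the loop to the triangle's bounding box.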

SPIR-B Shader Format

Shaders are compiled from GLSL-like source to a compact binary format (SPIR-B) and loaded at runtime from PSRAM — no firmware reflash needed to change shaders.
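
To make runtime loading concrete, here is a toy model of copying a shader binary into a 32-word instruction memory and executing it until a zero word is fetched. The actual SPIR-B layout is not described in this README, so everything about the encoding below is invented for illustration: the opcode numbers, the 5-bit register fields, the little-endian 32-bit words, and all function names are hypothetical.

```python
import struct

IMEM_WORDS = 32                     # instruction memory depth stated above
NUM_REGS = 32                       # r0..r31
OP_ADD, OP_MUL, OP_FMA = 1, 2, 3    # hypothetical opcodes; 0 is reserved so a zero word means halt


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def encode(op, rd, ra, rb, rc=0):
    """Pack one instruction into a word (hypothetical layout with 5-bit register fields)."""
    return (op << 20) | (rd << 15) | (ra << 10) | (rb << 5) | rc


def words_from_bytes(blob: bytes):
    """Split a binary blob into little-endian 32-bit words (assumed word format)."""
    if len(blob) % 4:
        raise ValueError("blob length must be a multiple of 4 bytes")
    return struct.unpack(f"<{len(blob) // 4}I", blob)


def load_program(words):
    """Copy a program into a zero-filled instruction memory image."""
    if len(words) > IMEM_WORDS:
        raise ValueError("program does not fit in instruction memory")
    return list(words) + [0] * (IMEM_WORDS - len(words))


def run(imem, regs):
    """Execute until an all-zero word is fetched or the memory ends; return the instruction count."""
    executed = 0
    for word in imem:
        if word == 0:               # halt on zero instruction
            break
        op = word >> 20
        rd, ra, rb, rc = ((word >> s) & 0x1F for s in (15, 10, 5, 0))
        if op == OP_ADD:
            result = regs[ra] + regs[rb]
        elif op == OP_MUL:
            result = regs[ra] * regs[rb]
        elif op == OP_FMA:
            # Computed in double precision and rounded once, approximating a fused operation.
            result = regs[ra] * regs[rb] + regs[rc]
        else:
            raise ValueError(f"unknown opcode {op}")
        regs[rd] = to_fp16(result)
        executed += 1
    return executed
```

Because unused instruction memory is zero-filled, a program shorter than 32 words stops on its own without an explicit halt instruction. Swapping shaders then only requires loading a different blob, which mirrors the "no firmware reflash" property described above.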

TinyQV CPU

The CPU is based on Michael Bell's TinyQV, a nibble-serial RISC-V core designed for Tiny Tapeout. The original Verilog was rewritten in Chisel and heavily modified; the result is an RV32I core with an expanded register file (RV32E → RV32I), an integrated Borg peripheral bus, and a pipeline adapted for QSPI flash/PSRAM and UART.
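
Nibble-serial processing means the datapath handles 4 bits per step, so a 32-bit operation is spread over 8 steps. The sketch below shows the general idea for addition. It is an illustration of the technique rather than a model of the TinyQV datapath, and the function name is an assumption.

```python
def nibble_serial_add(a: int, b: int) -> int:
    """Add two 32-bit values 4 bits at a time, carrying between steps."""
    result = 0
    carry = 0
    for step in range(8):                  # 8 nibbles make one 32-bit word
        shift = 4 * step
        na = (a >> shift) & 0xF            # current nibble of each operand
        nb = (b >> shift) & 0xF
        total = na + nb + carry            # at most 15 + 15 + 1 = 31
        result |= (total & 0xF) << shift   # low 4 bits become the result nibble
        carry = total >> 4                 # bit 4 carries into the next step
    return result                          # final carry is dropped: wraps modulo 2**32
```

The trade-off is area against latency: a 4-bit adder is much smaller than a 32-bit one, which matters on a Tiny Tapeout tile, at the cost of several clock cycles per instruction.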

Prerequisites

Building and Testing

Run all tests (Chisel + RTL cocotb)

make test-all

Individual test targets

make test-chisel-borg          # Borg FPU unit tests (Chisel)
make test-chisel-core          # TinyQV CPU tests (Chisel)
make test-cocotb-soc-core-rtl  # CPU SoC integration tests (cocotb)
make test-cocotb-soc-borg-rtl  # Borg peripheral tests (cocotb)

Cycle-Accurate C++ Simulation

Fast C++ simulators for RTL validation, rendering frames locally without an FPGA.

cd simulation/verilator    # or cd simulation/arcilator
make triangle              # Build simulator and render a triangle frame

FPGA (pico-ice)

Prerequisites: pico-ice FPGA + Raspberry Pi debug probe.

cd fpga
make burn           # Build bitstream and upload to FPGA
make triangle       # Run triangle rendering (vertex shader on FPGA, display on RP2040)

ASIC (Tiny Tapeout)

make gds            # Full RTL-to-GDS flow via LibreLane/OpenROAD

Milestones

Task	Status
FPU on software simulator (Chisel + cocotb)	✅ Done
FPU integrated into TinyQV SoC	✅ Done
Vertex shader on FPGA	✅ Done
Triangle rasterization + fragment shading	✅ Done
SPIR-B runtime shader loading	✅ Done
Per-vertex color interpolation	✅ Done
Dynamic framebuffer resolution	✅ Done
Tiny Tapeout TTIHP26a submission	✅ Submitted
32-bit RISC-V instructions & 32-entry register file	✅ Done
Hardware perspective projection (4×4 MVP shader)	✅ Done
Hardware FP16 reciprocal (FRCP)	✅ Done
Back-face culling & depth-correct vkcube	✅ Done
Hardware fragment interpolation	✅ Done
Cycle-accurate C++ simulation (Arcilator & Verilator)	✅ Done
Test manufactured chip	⏳ Pending
Vulkan driver	📋 Planned

Software Bill of Materials

Component	Description	License
Chisel	Hardware construction language (Scala → Verilog)	Apache-2.0
TinyQV	RV32I RISC-V CPU core (rewritten in Chisel)	Apache-2.0
Berkeley HardFloat	IEEE-754 floating-point units (FMA)	BSD-3-Clause
LibreLane	RTL-to-GDS ASIC flow orchestrator	Apache-2.0
Yosys	RTL synthesis	ISC
OpenROAD	Place and route	BSD-3-Clause
Magic	Layout tool, DRC, GDS export	MIT
KLayout	GDS viewer and DRC	GPL-2.0
IHP SG13G2 PDK	IHP 130nm process design kit	Apache-2.0
cocotb	Python-based RTL simulation and testing	BSD-3-Clause
Icarus Verilog	Verilog simulation (cocotb backend)	GPL-2.0
Verilator	Verilog linting and simulation	LGPL-3.0
nextpnr	FPGA place and route (iCE40)	ISC
IceStorm	iCE40 FPGA bitstream tools	ISC
Netgen	LVS (Layout vs. Schematic)	MIT
GCC	RISC-V cross-compiler (`riscv32-embedded`)	GPL-3.0
Mill	Scala build tool	MIT
Tiny Tapeout Tools	Build and submission orchestrator	Apache-2.0
Nix	Reproducible development environment	LGPL-2.1
CIRCT/firtool	Chisel → Verilog compiler (FIRRTL)	Apache-2.0 (LLVM)
Arcilator	Cycle-accurate FIRRTL C++ simulator	Apache-2.0 (LLVM)
OpenJDK	Java runtime for Chisel/Mill	GPL-2.0 + CE

