Commit 1e51ea0

feat: numerous tools additions + firm make support + fixes!
* tools: init pica disasm dvl/raw, now dvl (shbin's) and raw instruction streams can be disassembled.
* tools: init firm dump and firm make, firm make has been tested and has successfully built a Luma3DS firm (even booted!)
* tools/Ncch: more misc changes, still not finished.
* hardware/pica: finish shader assembler, only missing literally *1* diagnostic.
* horizon/fmt: finish dvl format description.
* docs/zpsh: document the zpsh format, still not final.
* ci: update to 0.15.2
1 parent 282b090 commit 1e51ea0

37 files changed

Lines changed: 2034 additions & 642 deletions

.github/workflows/cd.yaml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -22,7 +22,7 @@ jobs:
2222
- name: Setup zig
2323
uses: mlugg/setup-zig@v2
2424
with:
25-
version: 0.15.1
25+
version: 0.15.2
2626

2727
- name: Build docs
2828
run: zig build docs

.github/workflows/ci.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ jobs:
1818
- name: Setup zig
1919
uses: mlugg/setup-zig@v2
2020
with:
21-
version: 0.15.1
21+
version: 0.15.2
2222

2323
- name: Test
2424
run: zig build test

README.md

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,10 @@ Currently there are multiple examples in the `demo/` directory. To build them, y
8080
- [flappy](demo/flappy) is a simple fully functional flappy bird clone written entirely with software blitting.
8181
- [gpu](demo/gpu/) is a playground for [mango](src/mango.zig), bleeding edge features are tested there. Not really an example per-se.
8282

83+
---
84+
85+
You can (and are encouraged to) look at the `tools` directory, as it is a good example of how to use the APIs `zitrus` provides outside (and inside!) of a 3DS environment. Almost all tools are self-contained and span 50-300 LOC.
86+
8387
# Coverage
8488

8589
### Legend
@@ -109,8 +113,9 @@ Currently there are multiple examples in the `demo/` directory. To build them, y
109113
- 🟢 3dsx (tools/3dsx): Make / Dump
110114
- 🟢 Pica (tools/Pica): Assemble / Disassemble
111115
- 🟢 Assemble: Only **Z**itrus**P**ica**Sh**aders are implemented as an output format.
112-
- 🟢 Disassemble: Outputs **Z**itrus**P**ica**A**sse**m**bly, only ZPSH's can be disassembled currently.
113-
- 🟡 Firm (tools/Firm): Info
116+
- 🟢 Disassemble: Outputs **Z**itrus**P**ica**A**sse**m**bly. Either RAW instructions, ZPSH's or DVL's (.shbin) can be disassembled.
117+
- 🟢 Firm (tools/Firm): Make / Info / Dump
118+
- 🟢 Make: Confirmed to build (and boot!) Luma3DS from source; however, it needs more testing as the resulting firm is not 1:1.
114119
- 🟡 Ncch (tools/Ncch): Dump (/ Info)
115120
- 🟢 ExeFS (tools/ExeFs): Make / List / Dump
116121
- 🟢 RomFS (tools/RomFs): Make / List / Dump
@@ -197,7 +202,7 @@ Whether register bits are present and/or relevant tooling (assemblers, disassemb
197202

198203
## Why?
199204
I wanted to learn arm and always wanted to develop 3DS homebrew, also I searched and I haven't found any kind of zig package that doesn't use libctru, so I just started reading a lot and doing things. Furthermore, using only zig has a lot of advantages:
200-
- Really simplified and easy development. You don't need complex toolchains, you just need the `zig` executable, that's it! (However, obviously it is recommended that you use devkitPRO's tools as I'm sure you'll need them. You want to use gdb, don't you?)
205+
- Really simplified and easy development. You don't need complex toolchains, you just need the `zig` executable, that's it. The tools `zitrus` provides also have no dependencies, so they'll work on any platform that zig supports! You can still use devkitPRO's binutils if you need to.
201206
- Safety in `Debug` and `ReleaseSafe` modes. Zitrus currently uses the `ErrDisp` port to report panics and returned errors. The only missing thing is reporting return traces with debugging symbols (Currently only addresses are logged)
202207
- Really useful and simple build-system (as you've seen the example `build.zig` is really small and makefiles are really arcane)
203208

docs/pica/zpsh.md

Lines changed: 112 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,115 @@
1-
# Zitrus PICA200 shader
1+
# PICA200 shader
22

3-
This file format is very minimalist and is the default output of a zpsm when assembled by zitrus.
3+
**This document is marked as a DRAFT, it is NOT final**
4+
5+
This new file format is very minimalist and is the default output format for a file assembled by zitrus.
6+
7+
It has been designed from scratch with iterative changes based on real world use-cases (mango)
48

5-
## Format
9+
All fields are little endian unless explicitly stated otherwise, and all types are described with `zig` syntax.
10+
11+
## Header
12+
13+
The main header of the binary, starting at offset `0x00`
14+
15+
| Field | Type | Notes |
16+
|:---------:|:------------------------:|----------------------------|
17+
| `magic` | `[4]u8` | Must be `ZPSH` |
18+
| `shader` | `bitpacked struct(u32)` | Starting from the LSb: the first `u12` is the number of entrypoints; the next `u12` is the number of instructions **minus one** (a valid PICA200 shader must have at least one instruction, `end`); the last `u8` is the number of instruction operand descriptors, **which must be no larger than 128**|
19+
| `entry_string_table_size` | `u16` | Size **in `u32`s** of the entrypoint string table |
20+
| `flags` | `bitpacked struct(u8)` | Reserved |
21+
| `header_size` | `u8` | Real size **in `u32`s** of the **Header**, allows for *forward compatibility* |
22+
23+
#### `shader` bit layout
24+
25+
| 31...24 | 23...12 | 11...0 |
26+
|:-------:|:-------:|:------:|
27+
| descriptors (`u8`) | instructions_minus_one (`u12`) | entrypoints (`u12`) |
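
As an illustration of the header layout above, here is a minimal parsing sketch. The names `ShaderInfo`, `Header` and `parseHeader` are hypothetical and are not taken from `fmt/zpsh.zig`; the sketch only assumes the field order, sizes and limits stated in the tables.

```zig
const std = @import("std");

/// Hypothetical mirror of the `shader` field. Zig packed structs place the
/// first field at the least significant bits, matching the bit layout table.
const ShaderInfo = packed struct(u32) {
    entrypoints: u12,
    instructions_minus_one: u12,
    descriptors: u8,
};

/// Hypothetical decoded header; `magic` is validated and then dropped.
const Header = struct {
    shader: ShaderInfo,
    entry_string_table_size: u16,
    flags: u8,
    header_size: u8,
};

/// Decodes the 12 fixed header bytes (little endian) from the start of `bytes`.
fn parseHeader(bytes: []const u8) error{ TooShort, BadMagic, TooManyDescriptors }!Header {
    if (bytes.len < 12) return error.TooShort;
    if (!std.mem.eql(u8, bytes[0..4], "ZPSH")) return error.BadMagic;

    const shader: ShaderInfo = @bitCast(std.mem.readInt(u32, bytes[4..8], .little));
    // The document limits the operand descriptor count to 128.
    if (shader.descriptors > 128) return error.TooManyDescriptors;

    return .{
        .shader = shader,
        .entry_string_table_size = std.mem.readInt(u16, bytes[8..10], .little),
        .flags = bytes[10],
        .header_size = bytes[11],
    };
}
```

A reader that wants forward compatibility would then skip to `header_size * @sizeOf(u32)` rather than assuming the header is exactly 12 bytes.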
28+
29+
## Instruction & Operand Descriptors
30+
31+
Instructions (`shader.instructions_minus_one` + 1) and operand descriptors (`shader.descriptors`), starting at offset `header_size * @sizeOf(u32)`
32+
33+
## Entrypoint string table
34+
35+
This string table stores unique `0`-terminated `UTF-8` strings for entrypoint names, starting at offset `(header_size + shader.instructions_minus_one + 1 + shader.descriptors) * @sizeOf(u32)`
36+
37+
## Entrypoint Header
38+
39+
Entrypoints follow. They're dynamically sized, because each entrypoint's constants and output maps follow its header.
40+
41+
Starting at offset `(header_size + shader.instructions_minus_one + 1 + shader.descriptors + entry_string_table_size) * @sizeOf(u32)`
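
The three offset formulas given so far can be sketched as one helper. `Layout` and `sectionOffsets` are hypothetical names used only for this example; the arithmetic is copied from the formulas in this document.

```zig
const std = @import("std");

/// Byte offsets of each section, measured from the start of the file.
const Layout = struct {
    instructions: usize,
    string_table: usize,
    entrypoints: usize,
};

/// All header quantities are counted in `u32` words, so each offset is a
/// word count multiplied by `@sizeOf(u32)`.
fn sectionOffsets(
    header_size: u8,
    instructions_minus_one: u12,
    descriptors: u8,
    entry_string_table_size: u16,
) Layout {
    const word = @sizeOf(u32);
    // Instructions and operand descriptors are stored back to back.
    const code_words: usize = @as(usize, instructions_minus_one) + 1 + descriptors;
    const string_table = (@as(usize, header_size) + code_words) * word;

    return .{
        .instructions = @as(usize, header_size) * word,
        .string_table = string_table,
        .entrypoints = string_table + @as(usize, entry_string_table_size) * word,
    };
}
```

For example, a 3-word header with 5 instructions, 2 descriptors and a 3-word string table places the instructions at byte 12, the string table at byte 40 and the first entrypoint header at byte 52.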
42+
43+
| Field | Type | Notes |
44+
|:--------------------:|:------------------------:|------------------------------------------------------|
45+
| `name_string_offset` | `u32` | Offset of the name in the entrypoint string table |
46+
| `instruction_offset` | `u16` | Entry instruction offset, must be less than 4096 |
47+
| `info` | `bitpacked struct(u16)` | Shader type and parameters, see below for the layout |
48+
| `boolean_constant_mask` | `bitpacked struct(u16)` | Each bit represents the state of the constant boolean register `bX` |
49+
| `integer_constant_mask` | `bitpacked struct(u16)` | Each bit represents whether a constant for the integer register `iX` follows; the remaining 12 bits are reserved and must be zeroed. |
50+
| `floating_constant_mask` | `extern struct` | Each bit represents whether a constant for the floating register `fX` follows; the remaining bits are reserved and must be zeroed. |
51+
| `output_mask` | `bitpacked struct(u16)` | Each bit represents whether an output map for the output register `oX` follows. |
52+
53+
#### `boolean_constant_mask` bit layout
54+
55+
Each bit represents the constant value for the register at that bit index.
56+
57+
| 15 | ... | 0 |
58+
|:---:|:---:|:--:|
59+
| b15 | bN | b0 |
60+
61+
#### `integer_constant_mask` bit layout
62+
63+
Each bit represents whether a constant value for the register at that bit index follows after the header.
64+
65+
Constants are sorted for each bit, that is, if bits `0` and `3` are set, the first constant will be for register `i0`
66+
and the second for `i3`.
67+
68+
| 15...4 | 3 | 2 | 1 | 0 |
69+
|:--------:|:--:|:--:|:--:|:--:|
70+
| reserved | i3 | i2 | i1 | i0 |
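
The ordering rule above (constants are stored in ascending bit order) can be sketched as a lookup from "n-th stored constant" to register index. `nthSetBit` is a hypothetical helper written for this example, not part of zitrus.

```zig
const std = @import("std");

/// Returns the index of the `n`-th set bit of `mask` (counting from bit 0),
/// i.e. the register that the `n`-th stored constant belongs to, or `null`
/// if fewer than `n + 1` bits are set.
fn nthSetBit(mask: u16, n: usize) ?u4 {
    var seen: usize = 0;
    for (0..16) |i| {
        const bit: u4 = @intCast(i);
        if (((mask >> bit) & 1) != 0) {
            if (seen == n) return bit;
            seen += 1;
        }
    }
    return null;
}
```

With bits `0` and `3` set, constant 0 maps to `i0` and constant 1 maps to `i3`, which matches the example in the text.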
71+
72+
#### `floating_constant_mask` layout
73+
74+
Each bit represents whether a constant value for the register at that bit index follows after the header.
75+
76+
The `extern struct` has this layout:
77+
78+
| Field | Type | Notes |
79+
|:--------------------:|:------------------------:|------------------------------------------------------------|
80+
| `low` | `bitpacked struct(u32)` | Whether a constant value follows for registers `f0`-`f31` |
81+
| `mid` | `bitpacked struct(u32)` | Whether a constant value follows for registers `f32`-`f63` |
82+
| `high` | `bitpacked struct(u32)` | Whether a constant value follows for registers `f64`-`f95` |
83+
84+
Bit layout for each field follows the same structure as `boolean_constant_mask` and `integer_constant_mask`.
85+
86+
#### `output_mask` bit layout
87+
88+
Each bit represents whether an output map for the register at that bit index follows after the header.
89+
90+
| 15 | ... | 0 |
91+
|:---:|:---:|:--:|
92+
| o15 | oN | o0 |
93+
94+
### Entrypoint constants and output maps
95+
96+
Each entrypoint can be considered as a different `shader module` that shares instructions with others. As such, they have different sets
97+
of constants and output maps.
98+
99+
The layout is the following:
100+
- Entrypoint Header
101+
- Integer constants, one for each bit set in `integer_constant_mask`
102+
- Floating constants, one for each bit set in `floating_constant_mask`
103+
- Output maps, one for each bit set in `output_mask`
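
Since every trailing item corresponds to one set bit, the number of items after an entrypoint header can be derived with a population count. This is a sketch under the assumption that the three masks have already been read; `TrailingCounts` and `trailingCounts` are hypothetical names, and the floating mask is passed as its `low`, `mid` and `high` words.

```zig
const std = @import("std");

/// How many items of each kind follow one entrypoint header.
const TrailingCounts = struct {
    integer: u32,
    floating: u32,
    outputs: u32,
};

/// Counts the set bits in each mask; one item follows per set bit.
fn trailingCounts(integer_mask: u16, floating_mask: [3]u32, output_mask: u16) TrailingCounts {
    var floating: u32 = 0;
    for (floating_mask) |word| floating += @popCount(word);

    return .{
        .integer = @popCount(integer_mask),
        .floating = floating,
        .outputs = @popCount(output_mask),
    };
}
```

Item sizes are deliberately left out of the sketch: the integer and floating constant sizes follow from the sections below, while the output map size is only described as "the format required by the PICA200".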
104+
105+
#### Integer constant
106+
107+
A `[4]u8` representing a 4-component vector with layout `xyzw`.
108+
109+
#### Floating constant
110+
111+
A packed 4-component `F7_16` vector in the format required by the PICA200. It can be seen as storing all `F7_16` components packed in a `[3]u32` and then swapping the first and last `u32`.
112+
113+
#### Output map
6114

7-
You can see the format types in `fmt/zpsh.zig` and how it is written in `tools/pica/main.zig`.
8-
If you've never read zig, [here's the official reference](https://ziglang.org/documentation/master/)
115+
A map describing output semantics for each component of the specified output. They're in the format required by the PICA200.

docs/pica/zpsm.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,15 +23,15 @@ A `Limited source register` or `src_limited` refers to an `Input` or `Temporary`
2323

2424
A `Destination register` or `dst` refers to an `Output` or `Temporary` register.
2525

26+
A `Floating constant` register may be addressed at runtime relative to a base register `fX`. To do so, index it with `a.x`, `a.y` or `a.l`, e.g. `fX[a.x]`.
27+
2628
## Basic Syntax
2729

2830
Each basic unit is a line, operands must be in the same line as the mnemonic/directive,
2931
one exception to this rule is an instruction preceded by a label, as it is perfectly valid.
3032

3133
Comments start with ';', only single line comments are supported.
3234

33-
There's one TODO left, relative `Floating constant` source addressing with the address register.
34-
3535
### Directives
3636

3737
`.entry <label> <shader> [parameters]`

src/fmt.zig

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,3 +5,22 @@ pub const @"3dsx" = @import("fmt/3dsx.zig");
55

66
pub const zpsh = @import("fmt/zpsh.zig");
77
pub const z3ds = @import("fmt/z3ds.zig");
8+
9+
pub fn fixedArrayFromSlice(comptime T: type, comptime n: usize, slice: []const T) [n]T {
10+
std.debug.assert(slice.len <= n);
11+
var buf: [n]T = undefined;
12+
@memcpy(buf[0..slice.len], slice);
13+
@memset(buf[slice.len..], std.mem.zeroes(T));
14+
return buf;
15+
}
16+
17+
comptime {
18+
_ = firm;
19+
20+
_ = code;
21+
_ = @"3dsx";
22+
_ = zpsh;
23+
_ = z3ds;
24+
}
25+
26+
const std = @import("std");
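
Note on `fixedArrayFromSlice`: a small usage sketch, assuming the helper is used to zero-pad short names into fixed-size fields. The function body is copied from the diff above; the call in `paddedMagic` is illustrative and does not come from the commit.

```zig
const std = @import("std");

/// Copied from the diff: copies `slice` into a `[n]T` and zero-fills the rest.
/// Asserts that the slice fits.
pub fn fixedArrayFromSlice(comptime T: type, comptime n: usize, slice: []const T) [n]T {
    std.debug.assert(slice.len <= n);
    var buf: [n]T = undefined;
    @memcpy(buf[0..slice.len], slice);
    @memset(buf[slice.len..], std.mem.zeroes(T));
    return buf;
}

/// Illustrative call site: pads a 4-byte string into an 8-byte field.
fn paddedMagic() [8]u8 {
    return fixedArrayFromSlice(u8, 8, "ZPSH");
}
```

Because the length check is a `std.debug.assert`, an oversized slice is only caught in safe build modes; callers that handle untrusted input would need their own check.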

src/fmt/3dsx.zig

Lines changed: 46 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,21 @@
44

55
pub const magic = "3DSX";
66

7+
const Segment = enum {
8+
text,
9+
rodata,
10+
data,
11+
12+
pub fn fromCodeSegment(seg: code.Segment.Kind) Segment {
13+
return switch (seg) {
14+
.text => .text,
15+
.rodata => .rodata,
16+
.data => .data,
17+
else => unreachable,
18+
};
19+
}
20+
};
21+
722
pub const Header = extern struct {
823
magic: [magic.len]u8 = magic.*,
924
header_size: u16,
@@ -47,13 +62,25 @@ pub const MakeOptions = struct {
4762
/// Asserts that the `text` segment address is the base address and entrypoint, the segments are sequential,
4863
/// and that the only segment with differing file/memory sizes is `data`.
4964
pub fn make(writer: *std.Io.Writer, reader: *std.fs.File.Reader, info: code.Info, gpa: std.mem.Allocator, options: MakeOptions) !void {
50-
std.debug.assert(info.findNonSequentialSegment() == null);
51-
std.debug.assert(info.findNonDataSegmentWithBss() == null);
65+
std.debug.assert(info.findNonSequentialPhysicalSegment(.fromByteUnits(zitrus.horizon.heap.page_size)) == null);
5266

53-
// They may be sorted already but we never know...
54-
std.mem.sort(u32, info.relocations.items, {}, comptime std.sort.asc(u32));
67+
const segments = info.segments;
68+
const header_size: u16 = if (options.smdh != null or options.romfs != null) @sizeOf(Header) + @sizeOf(ExtendedHeader) else @sizeOf(Header);
5569

56-
var processed_relocations = try processRelocations(info, gpa);
70+
const base_address = segments[0].virtual_address;
71+
const text_size = segments[0].memory_size;
72+
const rodata_size, const data_size, const bss_size = switch (segments.len) {
73+
1 => .{ 0, 0, 0 },
74+
2 => switch (segments[1].kind) {
75+
.rodata => .{ segments[1].file_size, 0, 0 },
76+
.data => .{ 0, segments[1].memory_size, segments[1].memory_size - segments[1].file_size },
77+
else => unreachable,
78+
},
79+
3 => .{ segments[1].file_size, segments[2].memory_size, segments[2].memory_size - segments[2].file_size },
80+
else => unreachable, // NOTE: Cannot happen
81+
};
82+
83+
var processed_relocations = try processRelocations(info, rodata_size, (data_size - bss_size), gpa);
5784
defer {
5885
var it = processed_relocations.iterator();
5986

@@ -62,13 +89,6 @@ pub fn make(writer: *std.Io.Writer, reader: *std.fs.File.Reader, info: code.Info
6289
}
6390
}
6491

65-
const header_size: u16 = if (options.smdh != null or options.romfs != null) @sizeOf(Header) + @sizeOf(ExtendedHeader) else @sizeOf(Header);
66-
67-
const base_address = info.segments.get(.text).?.address;
68-
const text_size = info.segments.get(.text).?.memory_size;
69-
const rodata_size = if (info.segments.get(.rodata)) |rodata| rodata.memory_size else 0;
70-
const data_size, const bss_size = if (info.segments.get(.data)) |data| .{ data.memory_size, data.memory_size - data.file_size } else .{ 0, 0 };
71-
7292
try writer.writeStruct(Header{
7393
.header_size = header_size,
7494
.relocation_header_size = @sizeOf(RelocationHeader),
@@ -84,7 +104,7 @@ pub fn make(writer: *std.Io.Writer, reader: *std.fs.File.Reader, info: code.Info
84104
const executable_end: u32 = @sizeOf(Header) + @sizeOf(ExtendedHeader) + (3 * @sizeOf(RelocationHeader)) + text_size + rodata_size + (data_size - bss_size) + (tot_reloc: {
85105
var total: u32 = 0;
86106

87-
for (std.enums.values(code.Segment)) |segment| {
107+
for (std.enums.values(Segment)) |segment| {
88108
total += @intCast(@sizeOf(Relocation) * processed_relocations.get(segment).items.len);
89109
}
90110

@@ -102,21 +122,19 @@ pub fn make(writer: *std.Io.Writer, reader: *std.fs.File.Reader, info: code.Info
102122
}, .little);
103123
}
104124

105-
for (std.enums.values(code.Segment)) |segment| {
125+
for (std.enums.values(Segment)) |segment| {
106126
try writer.writeStruct(RelocationHeader{
107127
.absolute_relocations = @intCast(processed_relocations.get(segment).items.len),
108128
.relative_relocations = 0,
109129
}, .little);
110130
}
111131

112-
var info_rw = info;
113-
var segment_it = info_rw.segments.iterator();
114-
while (segment_it.next()) |seg| {
115-
const segment_relocs = processed_relocations.get(seg.key);
132+
for (segments) |seg| {
133+
const segment_relocs = processed_relocations.get(.fromCodeSegment(seg.kind));
116134

117135
var patched: usize = 0;
118136

119-
try reader.seekTo(seg.value.file_offset);
137+
try reader.seekTo(seg.file_offset);
120138
for (segment_relocs.items) |rc| {
121139
try reader.interface.streamExact(writer, rc.words_to_skip * @sizeOf(u32));
122140

@@ -129,10 +147,10 @@ pub fn make(writer: *std.Io.Writer, reader: *std.fs.File.Reader, info: code.Info
129147
patched += (rc.words_to_skip + @as(usize, rc.words_to_patch)) * @sizeOf(u32);
130148
}
131149

132-
try reader.interface.streamExact(writer, (seg.value.file_size - patched));
150+
try reader.interface.streamExact(writer, (seg.file_size - patched));
133151
}
134152

135-
for (std.enums.values(code.Segment)) |segment| {
153+
for (std.enums.values(Segment)) |segment| {
136154
for (processed_relocations.get(segment).items) |reloc| {
137155
try writer.writeStruct(reloc, .little);
138156
}
@@ -147,26 +165,26 @@ pub fn make(writer: *std.Io.Writer, reader: *std.fs.File.Reader, info: code.Info
147165
}
148166
}
149167

150-
fn processRelocations(info: code.Info, gpa: std.mem.Allocator) !std.EnumArray(code.Segment, std.ArrayList(Relocation)) {
151-
var processed: std.EnumArray(code.Segment, std.ArrayList(Relocation)) = .initFill(.empty);
168+
fn processRelocations(info: code.Info, rodata_file_size: u32, data_file_size: u32, gpa: std.mem.Allocator) !std.EnumArray(Segment, std.ArrayList(Relocation)) {
169+
var processed: std.EnumArray(Segment, std.ArrayList(Relocation)) = .initFill(.empty);
152170
errdefer {
153171
var it = processed.iterator();
154172
while (it.next()) |relocs| {
155173
relocs.value.deinit(gpa);
156174
}
157175
}
158176

159-
const text, const text_size = .{ info.segments.get(.text).?.address, info.segments.get(.text).?.memory_size };
160-
const rodata, const rodata_size = if (info.segments.get(.rodata)) |rodata| .{ rodata.address, rodata.memory_size } else .{ text + text_size, 0 };
161-
const data, const data_size = if (info.segments.get(.data)) |data| .{ data.address, data.memory_size } else .{ rodata + rodata_size, 0 };
162-
const top = data + data_size;
177+
const text, const text_size = .{ info.segments[0].virtual_address, info.segments[0].memory_size };
178+
const rodata = std.mem.alignForward(u32, text + text_size, zitrus.horizon.heap.page_size);
179+
const data = std.mem.alignForward(u32, rodata + rodata_file_size, zitrus.horizon.heap.page_size);
180+
const top = std.mem.alignForward(u32, data + data_file_size, zitrus.horizon.heap.page_size);
163181

164182
const base_addresses: []const u32 = &.{ text, rodata, data, top };
165183

166184
var last_relocation_address: u32 = text;
167185
var current_base: u8 = 0;
168186

169-
const relocs = info.relocations.items;
187+
const relocs = info.relocations;
170188

171189
// NOTE: relocations are already sorted
172190
var current_absolute: usize = 0;
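
Note on the new base address computation in `processRelocations`: the sketch below isolates the page alignment step. It assumes a 4 KiB page, since the value of `zitrus.horizon.heap.page_size` is not shown in this diff; `Bases` and `segmentBases` are hypothetical names.

```zig
const std = @import("std");

// Assumption: 4 KiB pages. The commit reads this from
// `zitrus.horizon.heap.page_size`, whose value does not appear in the diff.
const page_size: u32 = 0x1000;

const Bases = struct { text: u32, rodata: u32, data: u32, top: u32 };

/// Mirrors the diff: each segment starts at the next page boundary after the
/// end of the previous one.
fn segmentBases(text: u32, text_size: u32, rodata_file_size: u32, data_file_size: u32) Bases {
    const rodata = std.mem.alignForward(u32, text + text_size, page_size);
    const data = std.mem.alignForward(u32, rodata + rodata_file_size, page_size);
    const top = std.mem.alignForward(u32, data + data_file_size, page_size);
    return .{ .text = text, .rodata = rodata, .data = data, .top = top };
}
```

A missing segment (size 0) collapses onto the following base, so the four entries of `base_addresses` stay monotonically non-decreasing.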
