Merged
5 changes: 4 additions & 1 deletion CLAUDE.md
@@ -119,8 +119,11 @@ Six packages, one pipeline:
| `cs[i][j] ? x` | `x = <-cs[i][j]` (multi-dim channel index) |
| `[3][4]INT grid:` | `grid := make([][]int, 3)` + nested init loops |
| `grid[i][j] := 42` | `grid[i][j] = 42` (multi-dim array index) |
| `VAL [][2]INT x IS [[1,2]]:` | `var x [][]int = [][]int{{1, 2}}` (mixed-dim abbreviation) |
| `VAL [][]INT x IS [[1,2]]:` | `var x [][]int = [][]int{{1, 2}}` (multi-dim open abbreviation) |
| `PROC f([]CHAN OF INT cs)` | `func f(cs []chan int)` |
| `PROC f([][]CHAN OF INT cs)` | `func f(cs [][]chan int)` (multi-dim chan array) |
| `PROC f(VAL [][2]BYTE cfg)` | `func f(cfg [][]byte)` (mixed-dim param) |
| `PROC f([]CHAN OF INT cs?)` | `func f(cs []chan int)` (direction dropped for array params) |
| `PROC f([]CHAN OF INT cs!)` | `func f(cs []chan int)` (direction dropped for array params) |
| `PROC f(CHAN OF INT c?)` | `func f(c <-chan int)` (input/receive-only) |
@@ -191,7 +194,7 @@ Typical workflow for a new language construct:

## What's Implemented

Preprocessor (`#IF`/`#ELSE`/`#ENDIF`/`#DEFINE`/`#INCLUDE` with search paths, include guards, include-once deduplication, `#COMMENT`/`#PRAGMA`/`#USE` ignored), module file generation from SConscript (`gen-module` subcommand), SEQ, PAR, PRI PAR, IF, WHILE, CASE, ALT, PRI ALT (with guards, timer timeouts, multi-statement bodies with scoped declarations, and replicators using `reflect.Select`), SKIP, STOP, variable/array/channel/timer declarations, abbreviations (`VAL INT x IS 42:`, `INT y IS z:`, `VAL []BYTE s IS "hi":`, untyped `VAL x IS expr:`), assignments (simple and indexed), channel send/receive, channel arrays (`[n]CHAN OF TYPE` with indexed send/receive, `[]CHAN OF TYPE` proc params, and multi-dimensional `[n][m]CHAN`/`[n][m]TYPE`/`[][]CHAN`/`[][]TYPE`), PROC (with VAL, RESULT, reference, CHAN, []CHAN, open array `[]TYPE`, fixed-size array `[n]TYPE`, and shared-type params), channel direction restrictions (`CHAN OF INT c?` → `<-chan int`, `CHAN OF INT c!` → `chan<- int`, call-site annotations `out!`/`in?` accepted), multi-line parameter lists and expressions (lexer suppresses INDENT/DEDENT/NEWLINE inside parens/brackets and after continuation operators), FUNCTION (IS and VALOF forms with multi-statement bodies, including multi-result `INT, INT FUNCTION` with `RESULT a, b`), multi-assignment (`a, b := func(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`), KRoC-style colon terminators on PROC/FUNCTION (optional), INLINE function modifier (accepted and ignored), replicators on SEQ/PAR/IF/ALT (with optional STEP), arithmetic/comparison/logical/AFTER/bitwise operators, type conversions (`INT expr`, `INT16 expr`, `INT32 expr`, `INT64 expr`, `BYTE expr`, `BOOL expr`, `REAL32 expr`, `REAL64 expr`, including BOOL↔numeric via `_boolToInt` helper and `!= 0` comparison, and ROUND/TRUNC qualifiers for float↔int conversions), INT16/INT32/INT64 types, REAL32/REAL64 types, hex integer literals (`#FF`, `#80000000`), string literals, byte literals (`'A'`, `'*n'` 
with occam escape sequences), built-in print procedures, protocols (simple, sequential, and variant), record types (with field access via bracket syntax), SIZE operator, array slices (`[arr FROM n FOR m]` and shorthand `[arr FOR m]` with slice assignment), array literals (`[1, 2, 3]`), nested PROCs/FUNCTIONs (local definitions as Go closures), MOSTNEG/MOSTPOS (type min/max constants for INT, INT16, INT32, INT64, BYTE, REAL32, REAL64), INITIAL declarations (`INITIAL INT x IS 42:` — mutable variable with initial value), checked (modular) arithmetic (`PLUS`, `MINUS`, `TIMES` — wrapping operators), RETYPES (bit-level type reinterpretation: `VAL INT X RETYPES X :` for float32→int, `VAL [2]INT X RETYPES X :` for float64→int pair), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT — implemented as Go helper functions), CAUSEERROR (maps to `panic("CAUSEERROR")`).
Preprocessor (`#IF`/`#ELSE`/`#ENDIF`/`#DEFINE`/`#INCLUDE` with search paths, include guards, include-once deduplication, `#COMMENT`/`#PRAGMA`/`#USE` ignored), module file generation from SConscript (`gen-module` subcommand), SEQ, PAR, PRI PAR, IF, WHILE, CASE, ALT, PRI ALT (with guards, timer timeouts, multi-statement bodies with scoped declarations, and replicators using `reflect.Select`), SKIP, STOP, variable/array/channel/timer declarations, abbreviations (`VAL INT x IS 42:`, `INT y IS z:`, `VAL []BYTE s IS "hi":`, untyped `VAL x IS expr:`), assignments (simple and indexed), channel send/receive, channel arrays (`[n]CHAN OF TYPE` with indexed send/receive, `[]CHAN OF TYPE` proc params, and multi-dimensional `[n][m]CHAN`/`[n][m]TYPE`/`[][]CHAN`/`[][]TYPE`/`[][n]TYPE`), PROC (with VAL, RESULT, reference, CHAN, []CHAN, open array `[]TYPE`, fixed-size array `[n]TYPE`, and shared-type params), channel direction restrictions (`CHAN OF INT c?` → `<-chan int`, `CHAN OF INT c!` → `chan<- int`, call-site annotations `out!`/`in?` accepted), multi-line parameter lists and expressions (lexer suppresses INDENT/DEDENT/NEWLINE inside parens/brackets and after continuation operators), FUNCTION (IS and VALOF forms with multi-statement bodies, including multi-result `INT, INT FUNCTION` with `RESULT a, b`), multi-assignment (`a, b := func(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`), KRoC-style colon terminators on PROC/FUNCTION (optional), INLINE function modifier (accepted and ignored), replicators on SEQ/PAR/IF/ALT (with optional STEP), arithmetic/comparison/logical/AFTER/bitwise operators, type conversions (`INT expr`, `INT16 expr`, `INT32 expr`, `INT64 expr`, `BYTE expr`, `BOOL expr`, `REAL32 expr`, `REAL64 expr`, including BOOL↔numeric via `_boolToInt` helper and `!= 0` comparison, and ROUND/TRUNC qualifiers for float↔int conversions), INT16/INT32/INT64 types, REAL32/REAL64 types, hex integer literals (`#FF`, `#80000000`), string literals, byte literals 
(`'A'`, `'*n'` with occam escape sequences), built-in print procedures, protocols (simple, sequential, and variant), record types (with field access via bracket syntax), SIZE operator, array slices (`[arr FROM n FOR m]` and shorthand `[arr FOR m]` with slice assignment), array literals (`[1, 2, 3]`), nested PROCs/FUNCTIONs (local definitions as Go closures), MOSTNEG/MOSTPOS (type min/max constants for INT, INT16, INT32, INT64, BYTE, REAL32, REAL64), INITIAL declarations (`INITIAL INT x IS 42:` — mutable variable with initial value), checked (modular) arithmetic (`PLUS`, `MINUS`, `TIMES` — wrapping operators), RETYPES (bit-level type reinterpretation: `VAL INT X RETYPES X :` for float32→int, `VAL [2]INT X RETYPES X :` for float64→int pair), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT — implemented as Go helper functions), CAUSEERROR (maps to `panic("CAUSEERROR")`).

## Course Module Testing

2 changes: 1 addition & 1 deletion TODO.md
@@ -16,7 +16,7 @@
### Data Types & Declarations
- **INT, INT16, INT32, INT64, BYTE, BOOL, REAL, REAL32, REAL64** — Scalar types (INT16/32/64 map to int16/32/64, REAL/REAL64 map to float64, REAL32 maps to float32)
- **Variable declarations** — `INT x, y, z:`
- **Arrays** — `[n]TYPE arr:` with index expressions; multi-dimensional `[n][m]TYPE` with nested init loops
- **Arrays** — `[n]TYPE arr:` with index expressions; multi-dimensional `[n][m]TYPE` with nested init loops; mixed-dimension abbreviations `[][n]TYPE` and `[][]TYPE`
- **Channels** — `CHAN OF TYPE c:` with send (`!`) and receive (`?`); `CHAN BYTE` shorthand (without `OF`)
- **Channel arrays** — `[n]CHAN OF TYPE cs:` with indexed send/receive; multi-dimensional `[n][m]CHAN OF TYPE` with nested init loops; `[]CHAN`, `[][]CHAN`, etc. proc params
- **Channel direction** — `CHAN OF INT c?` (receive-only) and `CHAN OF INT c!` (send-only); direction annotations at call sites (`out!`, `in?`) accepted and ignored
15 changes: 7 additions & 8 deletions ast/ast.go
@@ -491,14 +491,13 @@ func (se *SliceExpr) TokenLiteral() string { return se.Token.Literal }

// Abbreviation represents an abbreviation: VAL INT x IS 42:, INT y IS z:, or INITIAL INT x IS 42:
 type Abbreviation struct {
-	Token        lexer.Token // VAL, INITIAL, or type token
-	IsVal        bool        // true for VAL abbreviations
-	IsInitial    bool        // true for INITIAL declarations
-	IsOpenArray  bool        // true for []TYPE abbreviations (e.g. VAL []BYTE)
-	IsFixedArray bool        // true for [n]TYPE abbreviations (e.g. VAL [8]INT)
-	Type         string      // "INT", "BYTE", "BOOL", etc.
-	Name         string      // variable name
-	Value        Expression  // the expression
+	Token         lexer.Token // VAL, INITIAL, or type token
+	IsVal         bool        // true for VAL abbreviations
+	IsInitial     bool        // true for INITIAL declarations
+	OpenArrayDims int         // number of [] dimensions (1 for []BYTE, 2 for [][2]BYTE or [][]INT, etc.)
+	Type          string      // "INT", "BYTE", "BOOL", etc.
+	Name          string      // variable name
+	Value         Expression  // the expression
 }

func (a *Abbreviation) statementNode() {}
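
The single `OpenArrayDims` counter replaces the two booleans because every dimension, open or fixed, ends up as one Go slice level. A minimal sketch of that mapping follows; the helper name `goSliceType` is hypothetical and not part of this PR, which inlines the `strings.Repeat` call in the code generator.

```go
package main

import (
	"fmt"
	"strings"
)

// goSliceType is a hypothetical helper: it prefixes a Go element type with
// one "[]" per array dimension, so open and fixed dimensions both become
// slice levels. A non-positive count leaves the element type unchanged.
func goSliceType(dims int, elem string) string {
	if dims <= 0 {
		return elem
	}
	return strings.Repeat("[]", dims) + elem
}

func main() {
	fmt.Println(goSliceType(0, "int"))  // scalar abbreviation
	fmt.Println(goSliceType(1, "byte")) // VAL []BYTE or VAL [8]BYTE
	fmt.Println(goSliceType(2, "int"))  // VAL [][2]INT or VAL [][]INT
}
```

The guard on `dims <= 0` is there because `strings.Repeat` panics on a negative count.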
42 changes: 36 additions & 6 deletions codegen/codegen.go
@@ -296,16 +296,18 @@ func (g *Generator) Generate(program *ast.Program) string {
 		g.write("\n")
 	} else {
 		goType := g.occamTypeToGo(abbr.Type)
-		if abbr.IsOpenArray || abbr.IsFixedArray {
-			goType = "[]" + goType
+		if abbr.OpenArrayDims > 0 {
+			goType = strings.Repeat("[]", abbr.OpenArrayDims) + goType
 		}
 		g.builder.WriteString("var ")
 		g.write(fmt.Sprintf("%s %s = ", goIdent(abbr.Name), goType))
 		// Wrap string literals with []byte() when assigned to []byte variables
-		if _, isStr := abbr.Value.(*ast.StringLiteral); isStr && abbr.IsOpenArray && abbr.Type == "BYTE" {
+		if _, isStr := abbr.Value.(*ast.StringLiteral); isStr && abbr.OpenArrayDims > 0 && abbr.Type == "BYTE" {
 			g.write("[]byte(")
 			g.generateExpression(abbr.Value)
 			g.write(")")
+		} else if al, isArr := abbr.Value.(*ast.ArrayLiteral); isArr && abbr.OpenArrayDims > 1 {
+			g.generateTypedArrayLiteral(al, goType)
 		} else {
 			g.generateExpression(abbr.Value)
 		}
@@ -1198,18 +1200,20 @@ func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
 	g.builder.WriteString(strings.Repeat("\t", g.indent))
 	if abbr.Type != "" {
 		goType := g.occamTypeToGo(abbr.Type)
-		if abbr.IsOpenArray || abbr.IsFixedArray {
-			goType = "[]" + goType
+		if abbr.OpenArrayDims > 0 {
+			goType = strings.Repeat("[]", abbr.OpenArrayDims) + goType
 		}
 		g.write(fmt.Sprintf("var %s %s = ", goIdent(abbr.Name), goType))
 	} else {
 		g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
 	}
 	// Wrap string literals with []byte() when assigned to []byte variables
-	if _, isStr := abbr.Value.(*ast.StringLiteral); isStr && abbr.IsOpenArray && abbr.Type == "BYTE" {
+	if _, isStr := abbr.Value.(*ast.StringLiteral); isStr && abbr.OpenArrayDims > 0 && abbr.Type == "BYTE" {
 		g.write("[]byte(")
 		g.generateExpression(abbr.Value)
 		g.write(")")
+	} else if al, isArr := abbr.Value.(*ast.ArrayLiteral); isArr && abbr.OpenArrayDims > 1 {
+		g.generateTypedArrayLiteral(al, strings.Repeat("[]", abbr.OpenArrayDims)+g.occamTypeToGo(abbr.Type))
 	} else {
 		g.generateExpression(abbr.Value)
 	}
@@ -2949,6 +2953,32 @@ func (g *Generator) generateArrayLiteral(al *ast.ArrayLiteral) {
 	g.write("}")
 }

+// generateTypedArrayLiteral emits a typed Go slice literal with the given Go type.
+// For nested arrays (e.g. [][]int), inner array literals use bare {e1, e2} syntax
+// (Go composite literal elision).
+func (g *Generator) generateTypedArrayLiteral(al *ast.ArrayLiteral, goType string) {
+	g.write(goType + "{")
+	for i, elem := range al.Elements {
+		if i > 0 {
+			g.write(", ")
+		}
+		if innerArr, ok := elem.(*ast.ArrayLiteral); ok {
+			// Inner array: use bare composite literal {e1, e2, ...}
+			g.write("{")
+			for j, inner := range innerArr.Elements {
+				if j > 0 {
+					g.write(", ")
+				}
+				g.generateExpression(inner)
+			}
+			g.write("}")
+		} else {
+			g.generateExpression(elem)
+		}
+	}
+	g.write("}")
+}
+
// generateRetypesDecl emits code for a RETYPES declaration.
// VAL INT X RETYPES X : — reinterpret float32/64 bits as int(s)
// When source and target share the same name (shadowing a parameter), the parameter
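
The typed-literal emission added in this file can be tried in isolation with a small stand-in for the AST. The sketch below is an assumption-laden illustration, not the PR's code: the `expr` type and the `num`, `arr`, `writeElided`, and `typedLiteral` names are hypothetical, and it recurses to any depth, whereas `generateTypedArrayLiteral` handles one inner level directly and delegates anything deeper to `generateExpression`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expr is a hypothetical stand-in for the AST: either an integer leaf or an
// array literal holding nested elements.
type expr struct {
	val   int
	elems []expr
	isArr bool
}

func num(v int) expr      { return expr{val: v} }
func arr(es ...expr) expr { return expr{elems: es, isArr: true} }

// writeElided writes an array as a bare {e1, e2} literal, relying on Go's
// composite-literal type elision for elements of a slice literal. Leaves are
// written as decimal integers.
func writeElided(b *strings.Builder, e expr) {
	if !e.isArr {
		b.WriteString(strconv.Itoa(e.val))
		return
	}
	b.WriteString("{")
	for i, el := range e.elems {
		if i > 0 {
			b.WriteString(", ")
		}
		writeElided(b, el)
	}
	b.WriteString("}")
}

// typedLiteral prefixes the outermost literal with its Go type.
// It assumes e is an array literal; callers are expected to check that.
func typedLiteral(goType string, e expr) string {
	var b strings.Builder
	b.WriteString(goType)
	writeElided(&b, e)
	return b.String()
}

func main() {
	m := arr(arr(num(1), num(2)), arr(num(3), num(4)))
	fmt.Println(typedLiteral("[][]int", m)) // [][]int{{1, 2}, {3, 4}}
}
```

Only the outermost literal carries a type; Go infers the element type of each inner `{...}` from the enclosing slice type.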
51 changes: 51 additions & 0 deletions codegen/e2e_array_test.go
@@ -405,3 +405,54 @@ SEQ
 		t.Errorf("expected %q, got %q", expected, output)
 	}
 }
+
+func TestE2E_MixedDimAbbreviation(t *testing.T) {
+	// [][2]INT abbreviation with nested array literals
+	occam := `SEQ
+  VAL [][2]INT pairs IS [[10, 20], [30, 40]]:
+  print.int(pairs[0][0])
+  print.int(pairs[0][1])
+  print.int(pairs[1][0])
+  print.int(pairs[1][1])
+`
+	output := transpileCompileRun(t, occam)
+	expected := "10\n20\n30\n40\n"
+	if output != expected {
+		t.Errorf("expected %q, got %q", expected, output)
+	}
+}
+
+func TestE2E_MixedDimProcParam(t *testing.T) {
+	// PROC with [][2]INT parameter
+	occam := `PROC print.pairs(VAL [][2]INT pairs)
+  SEQ i = 0 FOR SIZE pairs
+    SEQ
+      print.int(pairs[i][0])
+      print.int(pairs[i][1])
+:
+SEQ
+  VAL [][2]INT data IS [[10, 20], [30, 40], [50, 60]]:
+  print.pairs(data)
+`
+	output := transpileCompileRun(t, occam)
+	expected := "10\n20\n30\n40\n50\n60\n"
+	if output != expected {
+		t.Errorf("expected %q, got %q", expected, output)
+	}
+}
+
+func TestE2E_MultiDimOpenAbbreviation(t *testing.T) {
+	// [][]INT abbreviation
+	occam := `SEQ
+  VAL [][]INT matrix IS [[1, 2], [3, 4]]:
+  print.int(matrix[0][0])
+  print.int(matrix[0][1])
+  print.int(matrix[1][0])
+  print.int(matrix[1][1])
+`
+	output := transpileCompileRun(t, occam)
+	expected := "1\n2\n3\n4\n"
+	if output != expected {
+		t.Errorf("expected %q, got %q", expected, output)
+	}
+}
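
For orientation, here is a hand-written Go program of the shape `TestE2E_MixedDimProcParam` exercises, following the mappings listed in the CLAUDE.md table (`VAL [][2]INT x IS ...` to `var x [][]int = [][]int{...}`, and a `VAL [][2]INT` parameter to a `[][]int` parameter). It is a sketch only: the function name `formatPairs` and the use of `fmt.Sprintf` in place of `print.int` are assumptions, and the transpiler's actual output may differ in naming and structure.

```go
package main

import "fmt"

// formatPairs stands in for the occam PROC print.pairs. It returns the text
// that printing both elements of every pair, one per line, would produce.
func formatPairs(pairs [][]int) string {
	var out string
	// SEQ i = 0 FOR SIZE pairs
	for i := 0; i < len(pairs); i++ {
		out += fmt.Sprintf("%d\n%d\n", pairs[i][0], pairs[i][1])
	}
	return out
}

func main() {
	// VAL [][2]INT data IS [[10, 20], [30, 40], [50, 60]]:
	// The explicit type on the left mirrors the form shown in the table;
	// the fixed inner size 2 is not encoded in the Go type.
	var data [][]int = [][]int{{10, 20}, {30, 40}, {50, 60}}
	fmt.Print(formatPairs(data))
}
```

Because the inner dimension becomes an ordinary slice, Go does not enforce that each row has exactly two elements; that constraint exists only on the occam side.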
83 changes: 53 additions & 30 deletions parser/parser.go
@@ -327,31 +327,33 @@ func (p *Parser) parseAbbreviation() ast.Statement {

 	p.nextToken()

-	// Check for []TYPE (open array abbreviation)
-	isOpenArray := false
-	if p.curTokenIs(lexer.LBRACKET) && p.peekTokenIs(lexer.RBRACKET) {
-		isOpenArray = true
-		p.nextToken() // consume ]
-		p.nextToken() // move to type
-	}
-
-	// Check for [n]TYPE (fixed-size array, used in RETYPES)
-	isArray := false
+	// Count bracket dimensions: [] (open) and [n] (fixed) in any combination
+	// e.g. []BYTE = 1 dim, [][2]BYTE = 2 dims, [][]INT = 2 dims, [8]INT = 1 dim (fixed, for RETYPES)
+	dims := 0
+	isFixedArray := false
 	var arraySize ast.Expression
-	if !isOpenArray && p.curTokenIs(lexer.LBRACKET) {
-		// Could be [n]TYPE name RETYPES ...
-		isArray = true
-		p.nextToken() // move past [
-		arraySize = p.parseExpression(LOWEST)
-		if !p.expectPeek(lexer.RBRACKET) {
-			return nil
+	for p.curTokenIs(lexer.LBRACKET) {
+		if p.peekTokenIs(lexer.RBRACKET) {
+			// Open dimension: []
+			dims++
+			p.nextToken() // consume ]
+			p.nextToken() // past ]
+		} else {
+			// Fixed dimension: [n]
+			dims++
+			isFixedArray = true
+			p.nextToken() // past [
+			arraySize = p.parseExpression(LOWEST)
+			if !p.expectPeek(lexer.RBRACKET) {
+				return nil
+			}
+			p.nextToken() // past ]
 		}
-		p.nextToken() // move to type
 	}

 	// Check for untyped VAL abbreviation: VAL name IS expr :
 	// Detect: curToken is IDENT and peekToken is IS (no type keyword)
-	if !isOpenArray && !isArray && p.curTokenIs(lexer.IDENT) && p.peekTokenIs(lexer.IS) {
+	if dims == 0 && p.curTokenIs(lexer.IDENT) && p.peekTokenIs(lexer.IS) {
 		name := p.curToken.Literal
 		p.nextToken() // consume IS
 		p.nextToken() // move to expression
@@ -395,7 +397,7 @@ func (p *Parser) parseAbbreviation() ast.Statement {
 			Token:      token,
 			IsVal:      true,
 			TargetType: typeName,
-			IsArray:    isArray,
+			IsArray:    isFixedArray,
 			ArraySize:  arraySize,
 			Name:       name,
 			Source:     source,
@@ -417,13 +419,12 @@ func (p *Parser) parseAbbreviation() ast.Statement {
 	}

 	return &ast.Abbreviation{
-		Token:        token,
-		IsVal:        true,
-		IsOpenArray:  isOpenArray,
-		IsFixedArray: isArray,
-		Type:         typeName,
-		Name:         name,
-		Value:        value,
+		Token:         token,
+		IsVal:         true,
+		OpenArrayDims: dims,
+		Type:          typeName,
+		Name:          name,
+		Value:         value,
 	}
 }

@@ -2150,7 +2151,8 @@ func (p *Parser) parseProcParams() []ast.ProcParam {
 			p.nextToken()
 		}

-		// Check for []...CHAN OF <type>, []...TYPE (open array), or [n]TYPE (fixed-size array)
+		// Check for []...CHAN OF <type>, []...TYPE (open array), [n]TYPE (fixed-size array),
+		// or mixed [][n]TYPE (open+fixed dimensions)
 		if p.curTokenIs(lexer.LBRACKET) {
 			if p.peekTokenIs(lexer.RBRACKET) {
 				// Open array: [][]...CHAN OF TYPE or [][]...TYPE
@@ -2161,6 +2163,16 @@ func (p *Parser) parseProcParams() []ast.ProcParam {
 					p.nextToken() // consume ]
 					p.nextToken() // move past ]
 				}
+				// After open [] pairs, check for trailing [n] fixed dims (e.g. [][2]TYPE)
+				for p.curTokenIs(lexer.LBRACKET) && !p.peekTokenIs(lexer.RBRACKET) {
+					dims++
+					p.nextToken() // past [
+					// skip size expression tokens until ]
+					for !p.curTokenIs(lexer.RBRACKET) && !p.curTokenIs(lexer.EOF) {
+						p.nextToken()
+					}
+					p.nextToken() // past ]
+				}
 				if p.curTokenIs(lexer.CHAN) {
 					// []...CHAN OF <type> or []...CHAN <type> (channel array parameter)
 					param.IsChan = true
@@ -2189,7 +2201,8 @@ func (p *Parser) parseProcParams() []ast.ProcParam {
 					return params
 				}
 			} else {
-				// Fixed-size array: [n]TYPE
+				// Fixed-size array: [n]TYPE — mapped to open array (slice) param
+				dims := 1
 				p.nextToken() // move past [
 				if !p.curTokenIs(lexer.INT) {
 					p.addError(fmt.Sprintf("expected array size, got %s", p.curToken.Type))
@@ -2199,7 +2212,17 @@ func (p *Parser) parseProcParams() []ast.ProcParam {
 				if !p.expectPeek(lexer.RBRACKET) {
 					return params
 				}
-				p.nextToken() // move to type
+				p.nextToken() // move past ]
+				// Check for additional [n] dims after the first (e.g. [3][4]TYPE)
+				for p.curTokenIs(lexer.LBRACKET) && !p.peekTokenIs(lexer.RBRACKET) {
+					dims++
+					p.nextToken() // past [
+					for !p.curTokenIs(lexer.RBRACKET) && !p.curTokenIs(lexer.EOF) {
+						p.nextToken()
+					}
+					p.nextToken() // past ]
+				}
+				_ = dims // ArraySize already set; dims only relevant for open arrays
 				if isTypeToken(p.curToken.Type) {
 					param.Type = p.curToken.Literal
 				} else if p.curTokenIs(lexer.IDENT) && p.recordNames[p.curToken.Literal] {
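
The bracket-counting logic in both parser changes follows the same pattern: consume `[]` and `[n]` prefixes until a non-bracket token is reached, incrementing a dimension count for each. The sketch below restates that pattern over a plain slice of token strings. It is a simplified illustration under stated assumptions: `countDims` and its token representation are hypothetical, and unlike `parseAbbreviation` it skips the size expression instead of parsing it.

```go
package main

import "fmt"

// countDims is a hypothetical, simplified version of the bracket-counting
// loop. It walks a pre-split token slice and returns the number of leading
// dimensions, whether any of them was fixed ([n]), and the index of the
// first token after the brackets (normally the element type).
func countDims(toks []string) (dims int, hasFixed bool, next int) {
	i := 0
	for i < len(toks) && toks[i] == "[" {
		if i+1 < len(toks) && toks[i+1] == "]" {
			i += 2 // open dimension: []
		} else {
			// fixed dimension: skip the size tokens up to the closing ]
			hasFixed = true
			i++
			for i < len(toks) && toks[i] != "]" {
				i++
			}
			if i == len(toks) {
				// unterminated bracket: stop without counting it
				return dims, hasFixed, i
			}
			i++ // past ]
		}
		dims++
	}
	return dims, hasFixed, i
}

func main() {
	inputs := [][]string{
		{"[", "]", "BYTE"},               // []BYTE
		{"[", "]", "[", "2", "]", "INT"}, // [][2]INT
		{"[", "8", "]", "INT"},           // [8]INT
	}
	for _, toks := range inputs {
		dims, fixed, next := countDims(toks)
		fmt.Println(dims, fixed, toks[next])
	}
}
```

Returning the index of the element-type token keeps the caller's next step (type lookup) independent of how many bracket pairs were consumed.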