AGENTS.md

This document defines how automated coding assistants (“agents”) should interact with this repository.

1. Repository Overview

This repository contains the WurstScript compiler. Its main code lives in:

de.peeeq.wurstscript/

Other directories like WurstPack and HelperScripts exist but are largely deprecated and should not be modified unless explicitly requested.

Compiler layout

Inside de.peeeq.wurstscript:

  • src/main/antlr/de/peeeq/wurstscript/antlr/ Contains the ANTLR grammars (.g4) for Wurst and Jass. These produce concrete syntax trees (CSTs).

  • parserspec/ Contains .parseq grammars for abstractsyntaxgen (https://github.com/peterzeller/abstractsyntaxgen). These define the AST structure used by the compiler. Code is generated via the Gradle task:

    ./gradlew :gen
    
  • src/main/java/de/peeeq/ Main compiler sources:

    • Parsing and AST infrastructure
    • Type checking
    • Intermediate language (IM)
    • Jass and Lua backends
    • Interpreter for executing IM at compile time (used for specific compile-time evaluations)

Compilation pipeline (simplified)

  1. Parse Wurst/Jass with ANTLR → CST
  2. Convert CST → AST (AST classes are generated by abstractsyntaxgen)
  3. Transform AST → IM
  4. Optionally: run IM in the interpreter and/or optimize IM
  5. Transform IM → Backend (Jass or Lua)

Language and tooling

  • Java 25
  • Gradle (9.2.1)
  • Unit tests define many entry points and expected behaviors.

2. Agent Expectations

Changes made by agents must follow these principles:

Compatibility first

  • All existing tests must continue to pass.
  • If behavior changes intentionally, provide new tests that define the updated semantics.

Minimal, well-scoped edits

Agents should:

  • Fix concrete bugs with small, local patches.
  • Add missing null-checks, defensive checks, or diagnostics where appropriate.
  • Add tests when resolving issues or implementing requested features.

Agents should avoid:

  • Large refactors (renaming packages, structural moves, mass rewrites).
  • Modifying deprecated folders unless explicitly instructed.
  • Altering public semantics or language rules without tests demonstrating the intended outcome.

Test-driven

  • Any new behavior requires tests showing failure before the change and success after.
  • Use existing test style and harnesses.

Generated code

  • Do not modify files generated by :gen.

  • If modifying .parseq files or grammars, regenerate via:

    ./gradlew :gen
    

3. Coding Guidelines

Follow existing style

Use the conventions already present in the file you edit. Avoid introducing new patterns without reason.

Compiler structure expectations

  • The IM is the central intermediate representation.
  • Transformations should keep IM consistent and valid.
  • Backends (Jass/Lua) expect well-formed IM; avoid breaking invariants.
  • Interpreter should remain deterministic and side-effect free.

Error handling

  • Prefer explicit, descriptive diagnostic messages.
  • Avoid silent fallbacks or suppressed exceptions.
  • Don’t change the meaning of existing error messages unless required.

Performance

  • Avoid algorithmic regressions in parsing, type checking, or transforms.
  • Consider memory impact when manipulating large ASTs or IM graphs.

4. Allowed vs. Disallowed Changes

Allowed

  • Fix a crash or incorrect behavior in a specific compiler pass.
  • Add a regression test that demonstrates a known issue.
  • Improve clarity of error messages.
  • Add a small new feature when fully specified by the user and backed by tests.
  • Update Gradle/JDK usage only if part of a requested task.

Disallowed

  • Unsolicited rewrites of ANTLR grammars.
  • Modifying deprecated folders unless explicitly instructed.
  • Changing code generation semantics without explicit tests.
  • Changing IM behavior without test coverage.
  • Introducing new external dependencies unless requested.

5. How to Run Tests and Code Generation

Inside de.peeeq.wurstscript/, run:

Run all tests

./gradlew test

Run a specific test

./gradlew test --tests "tests.wurstscript.tests.GenericsWithTypeclassesTests.identity"

Generate AST code via ANTLR & abstractsyntaxgen

./gradlew :gen

Build the compiler

./gradlew build

6. Summary for Agents

  • Keep changes minimal, compatible, and tested.
  • The authoritative behavior is defined by the existing test suite.
  • The compiler architecture relies on CST → AST → IM → Backend; treat each stage carefully.
  • Never modify generated files; modify the sources that generate them instead.
  • New behavior must be documented through tests.

7. LSP Structure and Build Pipelines

This repository has multiple entry points that may trigger compilation/build behavior:

  • Language Server runtime de.peeeq.wurstio.languageserver.*
  • LSP build request de.peeeq.wurstio.languageserver.requests.BuildMap
  • CLI compiler entry point de.peeeq.wurstio.Main
  • CLI map build request de.peeeq.wurstio.languageserver.requests.CliBuildMap

LSP architecture (high-level)

  • WurstLanguageServer wires LSP protocol handlers.
  • LanguageWorker serializes requests and file-change reconciliation.
  • ModelManagerImpl owns project model state (wurst files, dependencies, diagnostics).
  • User actions like build/start/tests are implemented in languageserver.requests.*.

Build-map pipeline (centralized)

Map build behavior is centralized in:

  • MapRequest.executeBuildMapPipeline(...)

Both:

  • BuildMap (VSCode/LSP build command), and
  • CliBuildMap (CLI -build, used by grill)

must use that shared backend flow.

This pipeline handles:

  1. map/cached-map preparation
  2. script extraction/config application
  3. compilation (Jass/Lua)
  4. script + map data injection (including imports/w3i)
  5. final output map write + MPQ compression finalization

Lock handling policy

  • BuildMap (LSP/UI) may use interactive retry/rename behavior for locked output files.
  • CliBuildMap must fail fast with a clear error for locked files (non-interactive environments).
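
The difference between the two policies can be sketched as follows. This is an illustration only: LockPolicySketch, Mode, resolveOutput, and the isLocked predicate are hypothetical names rather than the repository's API, and the numbered-rename scheme (name-1.w3x, name-2.w3x, ...) is an assumption about what "rename" could look like.

```java
import java.util.function.Predicate;

final class LockPolicySketch {
    /** Hypothetical caller kinds: LSP/UI builds vs. non-interactive CLI builds. */
    enum Mode { INTERACTIVE, CLI }

    /**
     * Returns the path to write to. CLI callers fail fast on a locked target;
     * interactive callers fall back to a renamed output (assumed naming scheme).
     */
    static String resolveOutput(String target, Mode mode, Predicate<String> isLocked) {
        if (!isLocked.test(target)) {
            return target;
        }
        if (mode == Mode.CLI) {
            // Non-interactive environment: report clearly instead of retrying.
            throw new IllegalStateException("Output map is locked by another process: " + target);
        }
        int dot = target.lastIndexOf('.');
        String base = dot < 0 ? target : target.substring(0, dot);
        String ext = dot < 0 ? "" : target.substring(dot);
        for (int i = 1; i <= 10; i++) {
            String candidate = base + "-" + i + ext;
            if (!isLocked.test(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No writable output name found for: " + target);
    }
}
```

In this sketch only the lock decision differs between the two callers. Everything else would still go through the shared pipeline.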

Agent guardrails for future changes

  • Do not reintroduce separate build-map logic in Main or other call sites.
  • If map build behavior changes, update the shared MapRequest pipeline first, then keep wrappers thin.
  • Ensure CLI and LSP builds remain behaviorally aligned unless a difference is explicitly required and tested.

8. Backend Parity and Lua Guardrails

Recent fixes established additional rules for backend work. Follow these for all future changes:

Jass/Lua feature parity

  • New language/compiler features must be validated for both Jass and Lua backends.
  • Behavior should be as close as possible across backends.
  • If behavior differs, treat it as intentional only when:
    • the reason is backend/runtime-specific, and
    • the difference is documented in tests.

Error behavior parity expectations

  • Prefer matching Jass behavior semantically in Lua output.
  • State explicitly (in tests or comments) where Lua is stricter at runtime than Jass, which may silently default or swallow invalid operations.
  • Do not rely on Lua strictness as a substitute for correct lowering/translation.

Lua inliner safety: callback/function-reference boundaries

  • On the Lua target, do not inline across callback/function-reference-heavy sites (callees whose IM contains ImFuncRef).
  • This avoids breaking callback context semantics (e.g. wrapper/xpcall/callback-native interactions such as force/group enum callbacks).
  • This is a structural rule, not a name-based exclusion.
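
A structural check of this kind might look like the sketch below. The Expr, Const, Call, and FuncRef types are tiny stand-ins invented for illustration. The real IM classes are generated and differ. FuncRef plays the role of ImFuncRef here, and mayInline is a hypothetical helper, not the inliner's actual entry point.

```java
import java.util.List;

final class InlineGuardSketch {
    /** Minimal stand-in for IM expressions (hypothetical, not the generated IM classes). */
    interface Expr {}
    record Const(int value) implements Expr {}
    record Call(String callee, List<Expr> args) implements Expr {}
    record FuncRef(String target) implements Expr {} // stands in for ImFuncRef

    /** True if the expression tree contains a function reference anywhere. */
    static boolean containsFuncRef(Expr e) {
        if (e instanceof FuncRef) {
            return true;
        }
        if (e instanceof Call c) {
            for (Expr arg : c.args()) {
                if (containsFuncRef(arg)) {
                    return true;
                }
            }
        }
        return false;
    }

    /**
     * Structural rule: on Lua, a callee whose body holds a function reference is not inlined.
     * The decision looks at the tree shape only, never at function names.
     * For other targets this guard does not object; other inlining criteria still apply.
     */
    static boolean mayInline(List<Expr> calleeBody, boolean luaTarget) {
        if (!luaTarget) {
            return true;
        }
        for (Expr e : calleeBody) {
            if (containsFuncRef(e)) {
                return false;
            }
        }
        return true;
    }
}
```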

Lua locals limit fallback (>200 locals)

  • Lua has a hard limit of 200 local variables per function.
  • When a function exceeds the safe local threshold, rewrite locals to a locals-table fallback.
  • Requirements for fallback correctness:
    • locals-table declaration must be at function top before first use,
    • rewritten accesses must target the declared table (no global fallback),
    • nested block local initializations must be preserved,
    • use deterministic numeric slot indices (tbl[1], tbl[2], ...) rather than string keys.
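
The slot-assignment part of these requirements can be sketched as below. LocalsTableSketch and its methods are hypothetical names used for illustration, and the emitted Lua strings are an assumed output shape, not the compiler's actual printer.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LocalsTableSketch {
    /** Assigns 1-based slots in declaration order so output is stable across runs. */
    static Map<String, Integer> assignSlots(List<String> localsInDeclarationOrder) {
        Map<String, Integer> slots = new LinkedHashMap<>();
        for (String name : localsInDeclarationOrder) {
            slots.putIfAbsent(name, slots.size() + 1);
        }
        return slots;
    }

    /** Declaration emitted once at the top of the function, before any use. */
    static String declaration(String tableName) {
        return "local " + tableName + " = {}";
    }

    /** Rewrites an access to a local into a numeric index on the declared table. */
    static String access(String tableName, Map<String, Integer> slots, String local) {
        Integer slot = slots.get(local);
        if (slot == null) {
            // No global fallback: an unknown local is treated as a compiler bug.
            throw new IllegalArgumentException("Unknown local: " + local);
        }
        return tableName + "[" + slot + "]";
    }
}
```

Numeric slots taken from declaration order keep the output independent of hash iteration order, which also supports the determinism requirement for Lua emission.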

Regression testing requirements

  • Any backend parity fix must add/adjust regression tests in tests.wurstscript.tests.*.
  • Include tests that check:
    • generated backend output shape for the affected backend,
    • no behavioral regression in the other backend when relevant,
    • known fragile cases (dispatch binding, inlining boundaries, locals spilling).

9. Virtual Slot Binding and Determinism (New Generics + Lua)

Recent regressions showed that virtual-slot binding can silently degrade to base/no-op implementations in generated Lua while still compiling. Follow these rules for all related changes:

Root-slot correctness is mandatory

  • For FSM-style dispatch (currentState.<rootSlot>(...)), each concrete subclass must bind that same root slot to its own most-specific implementation.
  • Never accept mappings where a subclass has its own update method but the dispatched root slot still points to NoOpState_* (or another base implementation).
  • When verifying generated Lua, always inspect both:
    • the slot invoked at call-site (FSM_*update), and
    • class table assignments for each sibling state class.

Override-chain integrity (wrapper/bridge cases)

  • If override wrappers/bridges are created, preserve transitive override links (wrapper -> real override) so deeper subclasses remain reachable during slot/name normalization.
  • Avoid transformations that disconnect root methods from concrete overrides in the method union graph.

Deterministic Lua emission requirements

  • Lua output must be deterministic for identical input (same input -> byte-identical output in test harness).
  • Any iteration over methods/supertypes/union groups used for naming or table assignment must be deterministic (stable ordering).
  • If multiple candidate methods exist for the same slot in a class, selection must be deterministic and must prefer the most specific non-abstract implementation for that class.
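
One way to make such a selection deterministic is sketched below. The Candidate record and its depth field (distance from the root class within the inheritance chain of the class being emitted) are assumptions for illustration. The compiler's real method and class representations differ.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class SlotSelectionSketch {
    /** Hypothetical candidate: owning class, its depth in the hierarchy, and abstractness. */
    record Candidate(String owner, int depth, boolean isAbstract) {}

    /** Picks the deepest non-abstract candidate; ties break by owner name for stability. */
    static Optional<Candidate> select(List<Candidate> candidates) {
        return candidates.stream()
            .filter(c -> !c.isAbstract())
            .sorted(Comparator.comparingInt(Candidate::depth).reversed()
                .thenComparing(Candidate::owner))
            .findFirst();
    }
}
```

Because the comparator is a total order over the fields used, the result does not depend on the order in which candidates were collected.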

Required regression tests for slot fixes

  • Add a repro with:
    • State<T:>, NoOpState<T:>, FSM<T:>,
    • at least 4 sibling NoOpState<Owner> subclasses,
    • early constant state instantiation,
    • root-slot call through State<T>.
  • In generated Lua assertions:
    • extract the actual dispatched slot name from FSM_*update call-site,
    • assert each concrete sibling class binds that slot to its own implementation,
    • assert no sibling binds that dispatched slot to NoOpState_*.
  • Add a compile-twice determinism assertion for the same repro input.
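
The assertions above could be expressed with helpers along these lines. This is a sketch under stated assumptions: the Lua shapes it matches (a call written as currentState:<slot>(...) and class table assignments written as Class.slot = impl) are guesses at the generated output, and SlotAssertionSketch and its methods are hypothetical, not part of the existing test harness.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SlotAssertionSketch {
    /** Extracts the slot invoked on currentState (assumed call shape: currentState:<slot>(...)). */
    static String dispatchedSlot(String lua) {
        Matcher m = Pattern.compile("currentState:(\\w+)\\(").matcher(lua);
        if (!m.find()) {
            throw new AssertionError("no dispatch call-site found");
        }
        return m.group(1);
    }

    /** Returns the implementation bound to the slot (assumed shape: Class.slot = impl). */
    static String boundImpl(String lua, String className, String slot) {
        Pattern p = Pattern.compile("(?m)^\\s*" + Pattern.quote(className) + "\\."
            + Pattern.quote(slot) + "\\s*=\\s*(\\w+)");
        Matcher m = p.matcher(lua);
        if (!m.find()) {
            throw new AssertionError(className + " does not bind " + slot);
        }
        return m.group(1);
    }

    /** A sibling is bound correctly if the slot points at its own implementation, not NoOpState_*. */
    static boolean bindsOwnImpl(String lua, String className, String slot) {
        String impl = boundImpl(lua, className, slot);
        return impl.startsWith(className + "_") && !impl.startsWith("NoOpState_");
    }

    /** Compile-twice check: the same input must yield identical output. */
    static boolean deterministic(Function<String, String> compile, String input) {
        return compile.apply(input).equals(compile.apply(input));
    }
}
```

In a real regression test the Lua text would come from the existing harness, and the compile function passed to deterministic would wrap the actual Lua backend run.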