This document provides a high-level overview of the Aburiscript compiler's architecture, outlining how source code is transformed into an executable or object file.
The compiler follows a traditional frontend pipeline design, lowering source code through several distinct stages before generating LLVM IR.
- Parses command-line arguments.
- Determines compilation mode (e.g., `-E`, `-S`, `-c`, or full compilation/linking).
- Configures language options (`-std`, `-x`) and target ABIs.
- Routes input files: C/C++ files go to the frontend, object files go to the linker, and unsupported languages can be passed through to the system compiler.
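
As a minimal sketch of the mode-selection step, the following shows one way the stage flags could be mapped to a compilation mode. The names `Mode` and `selectMode` are hypothetical and do not come from the Aburiscript sources, and the rule that the earliest-stopping flag wins is an assumption modeled on how common C compiler drivers behave.

```cpp
#include <string>
#include <vector>

// Hypothetical names: the real driver's types are not shown in this document.
// Enumerators are ordered from the earliest-stopping stage to the latest.
enum class Mode { PreprocessOnly, EmitAssembly, CompileOnly, CompileAndLink };

// Scan the arguments and keep the earliest-stopping stage that was requested.
// Assumption: -E beats -S, which beats -c; with no stage flag we compile and link.
Mode selectMode(const std::vector<std::string>& args) {
    Mode mode = Mode::CompileAndLink;
    for (const std::string& arg : args) {
        Mode requested;
        if (arg == "-E") requested = Mode::PreprocessOnly;
        else if (arg == "-S") requested = Mode::EmitAssembly;
        else if (arg == "-c") requested = Mode::CompileOnly;
        else continue;  // not a stage flag (input file, -std, -x, ...)
        if (requested < mode) mode = requested;
    }
    return mode;
}
```

With this sketch, `selectMode({"-c", "main.c"})` yields `Mode::CompileOnly`, and an argument list without any stage flag falls through to `Mode::CompileAndLink`.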
- Resolves `#include` directives.
- Handles conditional compilation (`#ifdef`, `#if`, etc.).
- Expands macros.
- Converts source text into a stream of tokens for the parser.
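
To illustrate what "a stream of tokens" means here, the sketch below splits source text into identifiers, decimal numbers, and single-character punctuators. `TokenKind`, `Token`, and `tokenize` are hypothetical names chosen for illustration. A real C/C++ lexer also has to handle keywords, multi-character operators, literals, and comments, all of which are omitted.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token model: not the compiler's actual data structures.
enum class TokenKind { Identifier, Number, Punctuator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split source text into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };
    while (i < src.size()) {
        if (std::isspace(at(i))) { ++i; continue; }
        std::size_t start = i;
        TokenKind kind;
        if (std::isalpha(at(i)) || src[i] == '_') {
            // Identifier: letter or underscore, then letters, digits, underscores.
            kind = TokenKind::Identifier;
            while (i < src.size() && (std::isalnum(at(i)) || src[i] == '_')) ++i;
        } else if (std::isdigit(at(i))) {
            // Number: a run of decimal digits.
            kind = TokenKind::Number;
            while (i < src.size() && std::isdigit(at(i))) ++i;
        } else {
            // Anything else is treated as a one-character punctuator.
            kind = TokenKind::Punctuator;
            ++i;
        }
        tokens.push_back({kind, src.substr(start, i - start)});
    }
    return tokens;
}
```

For the input `int x = 42;` this produces five tokens: `int`, `x`, `=`, `42`, and `;`.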
- Parser: Consumes the token stream and builds a Clang-like Abstract Syntax Tree (AST) using recursive descent. It uses tentative parsing to resolve ambiguities (especially prevalent in C++).
- Sema (`Collect`): Tightly coupled with the parser, it performs on-the-fly semantic checks, type resolution, scoping, and AST node creation. If there is a semantic error, compilation halts with diagnostic messages.
- AST Layer (`ast/*`): Owns the AST node definitions, clone helpers, side tables, and the tightly-coupled type/symbol support used by parsing, semantic analysis, and codegen.
- Deep Dive: Read Parser and Collect Internals for a detailed look at scope management, rollback, and C++ resolution.
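
Tentative parsing with rollback can be sketched as a guard object that remembers the token position and rewinds unless the attempt is committed. Everything below (`TokenCursor`, `TentativeParse`, `parseDeclaration`) is a hypothetical illustration and not the compiler's actual API. The real parser presumably also has to undo scope and symbol-table changes, which this sketch ignores.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cursor over an already-lexed token list.
struct TokenCursor {
    std::vector<std::string> tokens;
    std::size_t pos = 0;

    bool atEnd() const { return pos >= tokens.size(); }

    // Consume the next token only if it equals `text`.
    bool accept(const std::string& text) {
        if (!atEnd() && tokens[pos] == text) { ++pos; return true; }
        return false;
    }
};

// RAII guard: rewinds the cursor on destruction unless commit() was called.
class TentativeParse {
public:
    explicit TentativeParse(TokenCursor& cursor)
        : cursor_(cursor), saved_(cursor.pos) {}
    ~TentativeParse() { if (!committed_) cursor_.pos = saved_; }
    void commit() { committed_ = true; }

private:
    TokenCursor& cursor_;
    std::size_t saved_;
    bool committed_ = false;
};

// Try to read "<typeName> <name> ;" as a declaration. On failure the cursor is
// rewound so the caller can re-parse the same tokens as an expression.
// Simplification: any single token is accepted as the declarator name.
bool parseDeclaration(TokenCursor& cursor, const std::string& typeName) {
    TentativeParse attempt(cursor);
    if (!cursor.accept(typeName)) return false;
    if (cursor.atEnd()) return false;
    ++cursor.pos;  // declarator name
    if (!cursor.accept(";")) return false;
    attempt.commit();
    return true;
}
```

The guard keeps every early `return false` safe: no failure path has to restore the position by hand, which matters once a grammar has many bail-out points.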
- Evaluates compile-time expressions (e.g., array sizes, `_Static_assert` predicates, enum values).
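
Compile-time evaluation can be pictured as a recursive fold over an expression tree that either yields a value or reports that the expression is not constant. The miniature `Expr` type and the helpers below are hypothetical and much simpler than a real AST. The sketch requires C++17 for `std::optional` and ignores integer overflow and type rules.

```cpp
#include <memory>
#include <optional>
#include <utility>

// Hypothetical miniature AST: not the node types from the compiler's AST layer.
struct Expr {
    enum class Kind { Literal, Add, Mul, NonConstant };
    Kind kind = Kind::Literal;
    long long value = 0;             // used by Literal
    std::unique_ptr<Expr> lhs, rhs;  // used by Add and Mul
};

std::unique_ptr<Expr> lit(long long v) {
    auto e = std::make_unique<Expr>();
    e->value = v;
    return e;
}

std::unique_ptr<Expr> binary(Expr::Kind kind, std::unique_ptr<Expr> l,
                             std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->kind = kind;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Stands in for anything not known at compile time, such as a runtime variable.
std::unique_ptr<Expr> nonConstant() {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Kind::NonConstant;
    return e;
}

// Returns the folded value, or std::nullopt if any operand is not constant.
std::optional<long long> evaluate(const Expr& e) {
    switch (e.kind) {
    case Expr::Kind::Literal:
        return e.value;
    case Expr::Kind::Add:
    case Expr::Kind::Mul: {
        auto l = evaluate(*e.lhs);
        auto r = evaluate(*e.rhs);
        if (!l || !r) return std::nullopt;
        return e.kind == Expr::Kind::Add ? *l + *r : *l * *r;
    }
    case Expr::Kind::NonConstant:
        return std::nullopt;
    }
    return std::nullopt;
}
```

An array size written as `4 * (2 + 3)` would fold to 20 under this sketch, while an expression containing a non-constant operand yields `std::nullopt`, which a caller could turn into a diagnostic.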
- Translates the typed, semantic AST into LLVM Intermediate Representation (IR).
- Manages target-specific ABI lowering, including struct layouts, bitfields, and mangling conventions (targeting the Itanium C++ ABI).
- Lowers complex control flow (such as exceptions and `switch` statements) into LLVM basic blocks.
- Deep Dive: Read Code Generation Internals for a detailed look at the C++ object model lowering, vtable emission, and exception handling.
- Passes the generated LLVM IR to LLVM's optimization passes (depending on the `-O` flag).
- Emits target assembly (`.s`) or object code (`.o`).
- Invokes the system linker (e.g., `/usr/bin/cc`) to produce the final executable, unless disabled.