The parser component is responsible for transforming source code text into an Abstract Syntax Tree (AST). It is implemented using the nom parser combinator library and follows a modular design pattern, breaking down the parsing logic into several specialized modules.
The parser is organized into the following modules:
- parser.rs: The main entry point that coordinates the parsing process
- parser_common.rs: Common parsing utilities and shared functions
- parser_expr.rs: Expression parsing functionality
- parser_type.rs: Type system parsing
- parser_stmt.rs: Statement and control flow parsing
The main parser module that provides the entry point for parsing complete programs:
pub fn parse(input: &str) -> IResult<&str, Vec<Statement>>
Common parsing utilities used across other modules:
pub fn is_string_char(c: char) -> bool
pub fn separator<'a>(sep: &'static str) -> impl FnMut(&'a str) -> IResult<&'a str, &'a str>
pub fn keyword<'a>(kw: &'static str) -> impl FnMut(&'a str) -> IResult<&'a str, &'a str>
pub fn identifier(input: &str) -> IResult<&str, &str>
Expression parsing functionality:
pub fn parse_expression(input: &str) -> IResult<&str, Expression>
pub fn parse_actual_arguments(input: &str) -> IResult<&str, Vec<Expression>>
Type system parsing:
pub fn parse_type(input: &str) -> IResult<&str, Type>
Statement and control flow parsing:
pub fn parse_statement(input: &str) -> IResult<&str, Statement>
The parser supports various types of statements:
- Variable declarations and assignments
- Control flow (if-else, while, for)
- Function definitions
- Assert statements
- ADT (Algebraic Data Type) declarations
For loops:
- Syntax: for <identifier> in <expression>: <block> end
- Supported iterables at parse time (resolved at runtime/type-check):
  - Lists: [e1, e2, ...]
  - Strings: "abc" (iterates over characters as 1-length strings)
  - Tuples: (e1, e2, ...) (see tuple literals below)
- Scoping: the loop variable is bound in an inner scope created for the loop body. It is not visible outside the loop.
- The loop variable is considered immutable by the type checker; each iteration rebinds it.
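The iteration and scoping rules above can be illustrated with a small standalone sketch. It is not taken from the codebase: `Value`, `Env`, `iter_items`, and `run_for` are hypothetical names, and the sketch only models the described behavior (lists and tuples yield their elements, strings yield 1-length strings, the loop variable lives in an inner scope). The type checker's immutability rule is not modeled.

```rust
use std::collections::HashMap;

// Hypothetical runtime values; the real interpreter's types may differ.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
    List(Vec<Value>),
    Tuple(Vec<Value>),
}

/// Expands an iterable value into the items a `for` loop would visit.
fn iter_items(value: &Value) -> Result<Vec<Value>, String> {
    match value {
        Value::List(items) | Value::Tuple(items) => Ok(items.clone()),
        // Strings iterate over characters, each as a 1-length string.
        Value::Str(s) => Ok(s.chars().map(|c| Value::Str(c.to_string())).collect()),
        other => Err(format!("not iterable: {:?}", other)),
    }
}

/// A stack of scopes; the innermost scope is the last element.
struct Env {
    scopes: Vec<HashMap<String, Value>>,
}

impl Env {
    fn new() -> Self {
        Env { scopes: vec![HashMap::new()] }
    }

    /// Looks a name up from the innermost scope outwards.
    fn lookup(&self, name: &str) -> Option<&Value> {
        self.scopes.iter().rev().find_map(|scope| scope.get(name))
    }
}

/// Runs `body` once per item, with `var` bound in a fresh inner scope
/// that is discarded when the loop ends.
fn run_for<F>(env: &mut Env, var: &str, iterable: &Value, mut body: F) -> Result<(), String>
where
    F: FnMut(&Env),
{
    let items = iter_items(iterable)?;
    env.scopes.push(HashMap::new());
    for item in items {
        // Each iteration rebinds the loop variable in the inner scope.
        if let Some(scope) = env.scopes.last_mut() {
            scope.insert(var.to_string(), item);
        }
        body(&*env);
    }
    env.scopes.pop();
    Ok(())
}
```

Calling `run_for` with `Value::Str("abc".to_string())` visits three 1-length strings, and a lookup of the loop variable after the call finds nothing because the inner scope has been popped.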
The expression parser handles different types of expressions:
- Arithmetic expressions
- Boolean expressions
- Function calls
- Variables
- Literals (numbers, strings, booleans)
- ADT constructors and pattern matching
The parser supports tuple literals and distinguishes them from parenthesized groupings:
- Empty tuple: () -> Expression::Tuple([])
- Single-element tuple: (x,) -> Expression::Tuple([x])
- Multi-element tuple: (x, y, z) -> Expression::Tuple([x, y, z])
- Grouping (no comma): (expr) -> parsed as expr (not a tuple)
Tuples may be nested, e.g., ((1, 2), (3, 4)).
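The tuple-versus-grouping rule can be sketched with a small hand-written parser. This is a simplified illustration rather than the project's nom-based implementation: it uses only the standard library, supports integer literals only, and the `Expression` enum, `PResult` alias, and function names here are stand-ins invented for the example. Accepting a trailing comma in multi-element tuples is also an assumption.

```rust
// Simplified stand-in for the real AST; only what the example needs.
#[derive(Debug, PartialEq)]
enum Expression {
    Int(i64),
    Tuple(Vec<Expression>),
}

// (remaining input, parsed value), mirroring the order used by nom's IResult.
type PResult<'a, T> = Result<(&'a str, T), String>;

fn parse_expr(input: &str) -> PResult<'_, Expression> {
    let input = input.trim_start();
    match input.strip_prefix('(') {
        Some(rest) => parse_paren_body(rest),
        None => parse_int(input),
    }
}

fn parse_int(input: &str) -> PResult<'_, Expression> {
    let end = input.find(|c: char| !c.is_ascii_digit()).unwrap_or(input.len());
    if end == 0 {
        return Err(format!("expected a number at {:?}", input));
    }
    let n = input[..end].parse::<i64>().map_err(|e| e.to_string())?;
    Ok((&input[end..], Expression::Int(n)))
}

/// Parses everything after an opening `(` up to the matching `)`.
fn parse_paren_body(input: &str) -> PResult<'_, Expression> {
    let mut rest = input.trim_start();
    // `()` is the empty tuple.
    if let Some(after) = rest.strip_prefix(')') {
        return Ok((after, Expression::Tuple(vec![])));
    }
    let mut items = Vec::new();
    let mut saw_comma = false;
    loop {
        let (after_item, item) = parse_expr(rest)?;
        items.push(item);
        rest = after_item.trim_start();
        if let Some(after_comma) = rest.strip_prefix(',') {
            saw_comma = true;
            rest = after_comma.trim_start();
            // A trailing comma closes the tuple, e.g. `(x,)`.
            if let Some(after_close) = rest.strip_prefix(')') {
                return Ok((after_close, Expression::Tuple(items)));
            }
        } else if let Some(after_close) = rest.strip_prefix(')') {
            if saw_comma {
                return Ok((after_close, Expression::Tuple(items)));
            }
            // No comma was seen: plain grouping, so return the inner expression.
            let inner = items.pop().expect("exactly one item was parsed");
            return Ok((after_close, inner));
        } else {
            return Err(format!("expected ',' or ')' at {:?}", rest));
        }
    }
}
```

The deciding factor is whether a comma was seen before the closing parenthesis, which is why `(1)` yields the inner expression while `(1,)` yields a one-element tuple.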
The parser supports a rich type system, including:
- Basic types (Int, Real, Boolean, String, Unit, Any)
- Complex types (List, Tuple, Maybe)
- ADT declarations
- Function types
The parser extensively uses the nom parser combinator library. Here are the key combinators used:
- tag: Matches exact string patterns
- char: Matches single characters
- digit1: Matches one or more digits
- alpha1: Matches one or more alphabetic characters
- space0/space1: Matches zero or more/one or more whitespace characters (spaces and tabs)
- tuple: Combines multiple parsers in sequence
- preceded: Matches a prefix followed by a value, keeping only the value
- terminated: Matches a value followed by a suffix, keeping only the value
- delimited: Matches a value between two delimiters
- alt: Tries multiple parsers in order
- map: Transforms the output of a parser
- opt: Makes a parser optional
- many0/many1: Matches zero or more/one or more occurrences
- separated_list0: Matches items separated by a delimiter
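To show how such combinators compose, here is a minimal standard-library-only sketch that re-implements simplified versions of a few of them. These are not nom's actual definitions: the names mirror nom's, but the signatures, the `String` error type, the two-parser `alt2`, and the `boolean` parser are assumptions made for the example.

```rust
// (remaining input, parsed value), the same ordering nom's IResult uses.
type IResult<'a, T> = Result<(&'a str, T), String>;

/// Simplified `tag`: matches an exact string at the start of the input.
fn tag<'a>(expected: &'static str) -> impl Fn(&'a str) -> IResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(format!("expected {:?}", expected)),
    }
}

/// Simplified `space0`: consumes zero or more spaces and tabs.
fn space0(input: &str) -> IResult<'_, &str> {
    let end = input.find(|c: char| c != ' ' && c != '\t').unwrap_or(input.len());
    Ok((&input[end..], &input[..end]))
}

/// Simplified `preceded`: runs `prefix`, discards its output, then runs `value`.
fn preceded<'a, A, B>(
    prefix: impl Fn(&'a str) -> IResult<'a, A>,
    value: impl Fn(&'a str) -> IResult<'a, B>,
) -> impl Fn(&'a str) -> IResult<'a, B> {
    move |input: &'a str| {
        let (rest, _) = prefix(input)?;
        value(rest)
    }
}

/// Simplified two-way `alt`: tries `first`, then `second` on the same input.
fn alt2<'a, T>(
    first: impl Fn(&'a str) -> IResult<'a, T>,
    second: impl Fn(&'a str) -> IResult<'a, T>,
) -> impl Fn(&'a str) -> IResult<'a, T> {
    move |input: &'a str| first(input).or_else(|_| second(input))
}

/// Simplified `map`: transforms the output of a successful parser.
fn map<'a, A, B>(
    parser: impl Fn(&'a str) -> IResult<'a, A>,
    f: impl Fn(A) -> B,
) -> impl Fn(&'a str) -> IResult<'a, B> {
    move |input: &'a str| {
        let (rest, a) = parser(input)?;
        Ok((rest, f(a)))
    }
}

/// Hypothetical composed parser: optional whitespace, then `true` or `false`.
fn boolean(input: &str) -> IResult<'_, bool> {
    map(
        preceded(space0, alt2(tag("true"), tag("false"))),
        |word: &str| word == "true",
    )(input)
}
```

Each combinator takes parsers and returns a new parser, so `boolean` reads as a description of the grammar it accepts. Like `tag` in nom, the sketch matches only a prefix and leaves the rest of the input for the next parser.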
Here's an example of how the parser handles a simple assignment statement:
x = 42
This is parsed using the following combinators:
fn parse_assignment_statement(input: &str) -> IResult<&str, Statement> {
    map(
        tuple((
            preceded(multispace0, identifier),
            preceded(multispace0, tag("=")),
            preceded(multispace0, parse_expression),
        )),
        |(var, _, expr)| Statement::Assignment(var.to_string(), Box::new(expr)),
    )(input)
}
The parser produces an Abstract Syntax Tree (AST) with the following main types:
pub enum Statement {
    VarDeclaration(Name),
    ValDeclaration(Name),
    Assignment(Name, Box<Expression>),
    IfThenElse(Box<Expression>, Box<Statement>, Option<Box<Statement>>),
    While(Box<Expression>, Box<Statement>),
    For(Name, Box<Expression>, Box<Statement>),
    Block(Vec<Statement>),
    Assert(Box<Expression>, Box<Expression>),
    FuncDef(Function),
    Return(Box<Expression>),
    ADTDeclaration(Name, Vec<ValueConstructor>),
    // ... other variants
}
pub enum Type {
    TInteger,
    TReal,
    TBool,
    TString,
    TList(Box<Type>),
    TTuple(Vec<Type>),
    TMaybe(Box<Type>),
    TResult(Box<Type>, Box<Type>),
    TFunction(Box<Option<Type>>, Vec<Type>),
    // ... other variants
}
The parser implements error handling through the nom error system:
pub enum ParseError {
    IndentationError(usize),
    UnexpectedToken(String),
    InvalidExpression(String),
}
The parser includes a comprehensive test suite in tests/parser_tests.rs that verifies:
- Simple assignments
- Complex expressions
- Control flow structures
- Type annotations
- Complete programs
- Error handling
- Whitespace handling
Documentation Generation Note
This documentation was automatically generated by Claude (Anthropic), an AI assistant, through analysis of the codebase. While the content accurately reflects the implementation, it should be reviewed and maintained by the development team. Last generated: June 2025.