Implementing a Simple Expression Parser with syn
This challenge focuses on building a robust parser for a simple arithmetic expression language using the syn crate in Rust. syn is a powerful library for parsing Rust code itself, and understanding how to leverage its combinator-style parsing is crucial for many metaprogramming and code generation tasks. You will learn to define grammar rules and handle common parsing scenarios.
Problem Description
Your task is to implement a parser for a simplified arithmetic expression language. The language supports integers, addition (+), subtraction (-), multiplication (*), division (/), and parentheses for grouping.
Key Requirements:
- Tokenization (Implicit): While
synhandles much of the low-level tokenization, you'll be working withsyn's token streams, effectively dealing with parsed Rust tokens. - Grammar Definition: Define the grammar rules for your expression language using
syn's parsing combinators (e.g.,syn::parse_macro_input!,syn::Token, customParseimplementations). - Expression Structure: The parser should correctly represent the structure of expressions, respecting operator precedence and associativity.
- Arithmetic operators (
+,-,*,/) have standard precedence (multiplication/division before addition/subtraction). - Operators are left-associative.
- Parentheses should override precedence.
- Arithmetic operators (
- Error Handling: The parser should gracefully handle malformed input and return informative error messages.
synprovides mechanisms for this. - Output: The parser should produce a structured representation of the parsed expression. For this challenge, a simple
enumrepresenting the expression tree is sufficient.
Expected Behavior:
The parser should take a string representing an expression and return either a successful parse of an expression tree or a syn::Error.
Edge Cases to Consider:
- Empty input.
- Expressions with only numbers.
- Expressions with only operators.
- Mismatched parentheses.
- Invalid characters.
- Consecutive operators (e.g.,
1 + + 2).
Examples
Example 1:
Input: "1 + 2 * 3"
Output:
Ok(Add(
Box::new(Literal(1)),
Box::new(Multiply(
Box::new(Literal(2)),
Box::new(Literal(3))
))
))
Explanation: The parser correctly identifies the multiplication having higher precedence than addition and parses 2 * 3 first, then adds 1 to the result.
Example 2:
Input: "(1 + 2) * 3"
Output:
Ok(Multiply(
Box::new(Add(
Box::new(Literal(1)),
Box::new(Literal(2))
)),
Box::new(Literal(3))
))
Explanation: The parentheses force the addition 1 + 2 to be evaluated before the multiplication by 3.
Example 3:
Input: "5 - (2 + 1)"
Output:
Ok(Subtract(
Box::new(Literal(5)),
Box::new(Add(
Box::new(Literal(2)),
Box::new(Literal(1))
))
))
Explanation: Similar to Example 2, parentheses dictate the order of operations.
Example 4:
Input: "1 + )"
Output:
Err(Error {
kind: UnexpectedToken,
message: "expected expression, found `)`",
span: Span { start: 5, end: 6 } // Example span, actual will vary
})
Explanation: The parser encounters an unexpected closing parenthesis where an expression or number was expected.
Constraints
- Supported numbers are unsigned 64-bit integers (
u64). - The input will be a string.
- The parser should aim to be efficient enough for typical macro expansion scenarios. The primary goal is correctness and clear structure.
Notes
- You will likely need to define an AST (Abstract Syntax Tree) to represent the parsed expressions. An
enumis a good starting point. - Consider using
syn'sParsetrait to define how to parse your expression components. - Operator precedence and associativity can be handled by structuring your parsing logic. For example, you might parse terms first, then factors, and then expressions, chaining them together.
- Refer to the
syndocumentation for examples on parsing tokens, handling delimiters (like parentheses), and defining custom parse logic. - The
syn::parse_macro_input!macro is a convenient way to start parsing within a procedural macro, but you can also usesyn::parse_fileorsyn::parse_strfor standalone parsing. For this challenge, you can simulate input usingsyn::parse_stror a custom function that acceptsTokenStream.