Hone logo
Hone
Problems

Implementing a Simple Expression Parser with syn

This challenge focuses on building a robust parser for a simple arithmetic expression language using the syn crate in Rust. syn is a powerful library for parsing Rust code itself, and understanding how to leverage its combinator-style parsing is crucial for many metaprogramming and code generation tasks. You will learn to define grammar rules and handle common parsing scenarios.

Problem Description

Your task is to implement a parser for a simplified arithmetic expression language. The language supports integers, addition (+), subtraction (-), multiplication (*), division (/), and parentheses for grouping.

Key Requirements:

  1. Tokenization (Implicit): While syn handles much of the low-level tokenization, you'll be working with syn's token streams, effectively dealing with parsed Rust tokens.
  2. Grammar Definition: Define the grammar rules for your expression language using syn's parsing combinators (e.g., syn::parse_macro_input!, syn::Token, custom Parse implementations).
  3. Expression Structure: The parser should correctly represent the structure of expressions, respecting operator precedence and associativity.
    • Arithmetic operators (+, -, *, /) have standard precedence (multiplication/division before addition/subtraction).
    • Operators are left-associative.
    • Parentheses should override precedence.
  4. Error Handling: The parser should gracefully handle malformed input and return informative error messages. syn provides mechanisms for this.
  5. Output: The parser should produce a structured representation of the parsed expression. For this challenge, a simple enum representing the expression tree is sufficient.

Expected Behavior:

The parser should take a string representing an expression and return either a successful parse of an expression tree or a syn::Error.

Edge Cases to Consider:

  • Empty input.
  • Expressions with only numbers.
  • Expressions with only operators.
  • Mismatched parentheses.
  • Invalid characters.
  • Consecutive operators (e.g., 1 + + 2).

Examples

Example 1:

Input: "1 + 2 * 3"

Output:

Ok(Add(
    Box::new(Literal(1)),
    Box::new(Multiply(
        Box::new(Literal(2)),
        Box::new(Literal(3))
    ))
))

Explanation: The parser correctly identifies the multiplication having higher precedence than addition and parses 2 * 3 first, then adds 1 to the result.

Example 2:

Input: "(1 + 2) * 3"

Output:

Ok(Multiply(
    Box::new(Add(
        Box::new(Literal(1)),
        Box::new(Literal(2))
    )),
    Box::new(Literal(3))
))

Explanation: The parentheses force the addition 1 + 2 to be evaluated before the multiplication by 3.

Example 3:

Input: "5 - (2 + 1)"

Output:

Ok(Subtract(
    Box::new(Literal(5)),
    Box::new(Add(
        Box::new(Literal(2)),
        Box::new(Literal(1))
    ))
))

Explanation: Similar to Example 2, parentheses dictate the order of operations.

Example 4:

Input: "1 + )"

Output:

Err(Error {
    kind: UnexpectedToken,
    message: "expected expression, found `)`",
    span: Span { start: 5, end: 6 } // Example span, actual will vary
})

Explanation: The parser encounters an unexpected closing parenthesis where an expression or number was expected.

Constraints

  • Supported numbers are unsigned 64-bit integers (u64).
  • The input will be a string.
  • The parser should aim to be efficient enough for typical macro expansion scenarios. The primary goal is correctness and clear structure.

Notes

  • You will likely need to define an AST (Abstract Syntax Tree) to represent the parsed expressions. An enum is a good starting point.
  • Consider using syn's Parse trait to define how to parse your expression components.
  • Operator precedence and associativity can be handled by structuring your parsing logic. For example, you might parse terms first, then factors, and then expressions, chaining them together.
  • Refer to the syn documentation for examples on parsing tokens, handling delimiters (like parentheses), and defining custom parse logic.
  • The syn::parse_macro_input! macro is a convenient way to start parsing within a procedural macro, but you can also use syn::parse_file or syn::parse_str for standalone parsing. For this challenge, you can simulate input using syn::parse_str or a custom function that accepts TokenStream.
Loading editor...
rust