Incremental Parsing of a Simple Expression Language
Incremental parsing is a technique where you parse a stream of input incrementally, updating the parse tree as new input arrives. This is particularly useful for editors and IDEs where you want to provide real-time feedback as the user types. This challenge asks you to implement an incremental parser for a very simple expression language.
Problem Description
You are tasked with building an incremental parser for a language consisting of addition and subtraction of integers. The language's grammar is as follows:
expression : term ((PLUS | MINUS) term)*
term : integer
integer : [0-9]+
PLUS : '+'
MINUS : '-'
Your parser should accept a string of characters as input and maintain a parse tree representing the expression. As new characters are added to the input string, the parser should incrementally update the parse tree, providing a valid expression at each step. The parser should return the current, partially parsed expression as a nested list representing the Abstract Syntax Tree (AST). The AST should be structured as follows:
- Numbers are represented as integers.
- Addition is represented as
['+', number] - Subtraction is represented as
['-', number]
The parser should handle incomplete expressions gracefully, returning a valid AST representing the portion of the expression that has been parsed so far.
Examples
Example 1:
Input: "1"
Output: [1]
Explanation: The input is a single integer, so the AST is simply the integer itself.
Example 2:
Input: "1+2"
Output: [1, '+', 2]
Explanation: The input is a complete expression. The AST represents the addition of 1 and 2.
Example 3:
Input: "1+"
Output: [1, '+']
Explanation: The input is an incomplete expression. The AST represents the integer 1 followed by the addition operator.
Example 4:
Input: "1+2-3"
Output: [1, '+', 2, '-', 3]
Explanation: The input is a complete expression with multiple operations.
Example 5:
Input: "1+2-"
Output: [1, '+', 2, '-']
Explanation: The input is an incomplete expression.
Example 6:
Input: ""
Output: []
Explanation: Empty input results in an empty AST.
Constraints
- The input string will only contain digits (0-9), the '+' character, and the '-' character.
- The input string will not contain any leading or trailing whitespace.
- The parser should be able to handle input strings of up to 100 characters.
- The parser should be reasonably efficient; avoid unnecessary re-parsing of the entire input string with each incremental update.
Notes
- Consider using a state machine or a similar approach to track the parsing state.
- You don't need to evaluate the expression; only build the AST.
- Think about how to handle errors gracefully. For this challenge, simply returning a partial AST is sufficient for error cases.
- Focus on the incremental aspect – the ability to update the parse tree efficiently as new input arrives. A naive approach that re-parses the entire string on each update will not be considered satisfactory.
- The order of operations is not relevant for this challenge; the AST should reflect the order in which the operators appear in the input string.