Hone logo
Hone
Problems

Incremental Parsing of a Simple Expression Language

Incremental parsing is a technique where you parse a stream of input incrementally, updating the parse tree as new input arrives. This is particularly useful for editors and IDEs where you want to provide real-time feedback as the user types. This challenge asks you to implement an incremental parser for a very simple expression language.

Problem Description

You are tasked with building an incremental parser for a language consisting of addition and subtraction of integers. The language's grammar is as follows:

expression : term ((PLUS | MINUS) term)*
term       : integer
integer    : [0-9]+
PLUS       : '+'
MINUS      : '-'

Your parser should accept a string of characters as input and maintain a parse tree representing the expression. As new characters are added to the input string, the parser should incrementally update the parse tree, providing a valid expression at each step. The parser should return the current, partially parsed expression as a nested list representing the Abstract Syntax Tree (AST). The AST should be structured as follows:

  • Numbers are represented as integers.
  • Addition is represented as ['+', number]
  • Subtraction is represented as ['-', number]

The parser should handle incomplete expressions gracefully, returning a valid AST representing the portion of the expression that has been parsed so far.

Examples

Example 1:

Input: "1"
Output: [1]
Explanation: The input is a single integer, so the AST is simply the integer itself.

Example 2:

Input: "1+2"
Output: [1, '+', 2]
Explanation: The input is a complete expression. The AST represents the addition of 1 and 2.

Example 3:

Input: "1+"
Output: [1, '+']
Explanation: The input is an incomplete expression. The AST represents the integer 1 followed by the addition operator.

Example 4:

Input: "1+2-3"
Output: [1, '+', 2, '-', 3]
Explanation: The input is a complete expression with multiple operations.

Example 5:

Input: "1+2-"
Output: [1, '+', 2, '-']
Explanation: The input is an incomplete expression.

Example 6:

Input: ""
Output: []
Explanation: Empty input results in an empty AST.

Constraints

  • The input string will only contain digits (0-9), the '+' character, and the '-' character.
  • The input string will not contain any leading or trailing whitespace.
  • The parser should be able to handle input strings of up to 100 characters.
  • The parser should be reasonably efficient; avoid unnecessary re-parsing of the entire input string with each incremental update.

Notes

  • Consider using a state machine or a similar approach to track the parsing state.
  • You don't need to evaluate the expression; only build the AST.
  • Think about how to handle errors gracefully. For this challenge, simply returning a partial AST is sufficient for error cases.
  • Focus on the incremental aspect – the ability to update the parse tree efficiently as new input arrives. A naive approach that re-parses the entire string on each update will not be considered satisfactory.
  • The order of operations is not relevant for this challenge; the AST should reflect the order in which the operators appear in the input string.
Loading editor...
python