Hone logo
Hone
Problems

Simple Expression Parser Generator in JavaScript

This challenge asks you to build a basic parser generator in JavaScript. Parser generators are powerful tools that take a grammar definition and automatically produce a parser, which is a program that analyzes a string of text (in this case, mathematical expressions) and determines its structure according to the grammar. This is a fundamental concept in compiler design and language processing.

Problem Description

You are to create a JavaScript class called ParserGenerator that can generate a parser for a simple expression grammar. The grammar will define how mathematical expressions are structured. The grammar will be provided as a string, and the generator should produce a parser function that can then be used to parse expressions.

The grammar will use a Backus-Naur Form (BNF)-like notation. The grammar will consist of production rules separated by newline characters. Each production rule will have the form NonTerminal -> Sequence of Symbols, where NonTerminal is a capital letter representing a non-terminal symbol, and Sequence of Symbols is a sequence of terminal symbols (numbers, operators) or non-terminal symbols, separated by spaces.

The supported grammar will include:

  • Expr -> Term {("+" | "-") Term}
  • Term -> Factor {("*" | "/") Factor}
  • Factor -> Number | "(" Expr ")"
  • Number -> [0-9]+

The parser should return a JavaScript object representing the parsed expression. The object should have the following structure:

  • type: A string representing the type of the expression (e.g., "Number", "Expr", "Term", "Factor").
  • value: For Number type, this should be the numerical value of the number. For other types, this might be an array of child nodes or other relevant information.

The parser should handle operator precedence correctly (multiplication and division before addition and subtraction). Parentheses should be respected.

Expected Behavior:

The ParserGenerator class should have a method generate(grammarString) that takes the grammar string as input and returns a parse function. The parse function should take an expression string as input and return the parsed expression object. If the input string is not a valid expression according to the grammar, the parse function should throw an error.

Examples

Example 1:

Input: `Expr -> Term {("+" | "-") Term}\nTerm -> Factor {("*" | "/") Factor}\nFactor -> Number | "(" Expr ")}\nNumber -> [0-9]+`
Expression: "1 + 2 * 3"
Output: { type: 'Expr', value: [ { type: 'Term', value: [ { type: 'Factor', value: 1 }, { type: 'Factor', value: 2 } ] }, { type: 'Term', value: [ { type: 'Factor', value: 3 }, { type: 'Factor', value: 3 } ] } ] }
Explanation: The parser correctly parses the expression, respecting operator precedence.  The output represents the expression tree.

Example 2:

Input: `Expr -> Term {("+" | "-") Term}\nTerm -> Factor {("*" | "/") Factor}\nFactor -> Number | "(" Expr ")}\nNumber -> [0-9]+`
Expression: "(1 + 2) * 3"
Output: { type: 'Expr', value: [ { type: 'Term', value: [ { type: 'Factor', value: { type: 'Expr', value: [ { type: 'Term', value: [ { type: 'Factor', value: 1 }, { type: 'Factor', value: 2 } ] } ] } } , { type: 'Factor', value: 3 } ] } ] }
Explanation: The parser correctly handles parentheses, ensuring the expression inside the parentheses is evaluated first.

Example 3: (Edge Case)

Input: `Expr -> Term {("+" | "-") Term}\nTerm -> Factor {("*" | "/") Factor}\nFactor -> Number | "(" Expr ")}\nNumber -> [0-9]+`
Expression: "1"
Output: { type: 'Factor', value: 1 }
Explanation: A single number is correctly parsed as a Factor.

Constraints

  • The grammar string will always be well-formed and adhere to the specified BNF-like format.
  • The expression string will contain only digits, operators (+, -, *, /), parentheses, and spaces.
  • The parser should handle integer numbers only. No floating-point numbers are required.
  • The parser should not handle variables or functions.
  • The generated parser should be reasonably efficient for expressions of moderate complexity (up to 20 tokens). Performance is not the primary focus, but excessively slow parsing should be avoided.

Notes

  • You will need to implement a recursive descent parser.
  • Consider using regular expressions to tokenize the input expression string.
  • The generate method should create and return a closure that captures the grammar.
  • Error handling is important. Throw an error if the input expression is invalid according to the grammar.
  • The structure of the output object is a suggestion; you can modify it as long as it accurately represents the parsed expression. The key is to demonstrate understanding of the parsing process.
  • Focus on correctness and clarity of code. Good variable names and comments are appreciated.
  • The grammar is fixed for this challenge. You are not expected to implement a grammar parser (a parser that parses the grammar itself). You are given the grammar as a string.
Loading editor...
javascript