Custom Optimization Pass for a Simple Expression Evaluator
This challenge asks you to implement a custom optimization pass for a simplified expression evaluator written in Rust. The evaluator takes a string representing a simple arithmetic expression (addition and multiplication only) and returns its result. Your task is to create an optimization pass that simplifies the expression by performing constant folding – replacing sub-expressions where both operands are constants with their calculated result. This is a fundamental concept in compiler optimization and a good exercise in understanding data flow analysis and transformation.
Problem Description
You are provided with a basic expression evaluator that parses and evaluates simple arithmetic expressions consisting of integers, addition (+), and multiplication (*). Your goal is to implement a function optimize_expression that takes the expression string as input and returns a simplified expression string after applying constant folding.
What needs to be achieved:
- Parse the input expression string into a tree-like structure (you can use a simple string representation for this, no need for a full AST).
- Identify sub-expressions where both operands are constants (integers).
- Evaluate these constant sub-expressions.
- Replace the original sub-expression with the calculated result.
- Return the optimized expression string.
Key Requirements:
- The optimization pass should only perform constant folding.
- The order of operations (PEMDAS/BODMAS) should be respected. Multiplication should be performed before addition.
- The output expression should be a valid expression string.
- The optimization should be performed in-place (or create a new string efficiently).
Expected Behavior:
The optimize_expression function should take a string as input and return a string. The returned string should be the original string with constant sub-expressions replaced by their calculated values. If no constant folding is possible, the function should return the original string unchanged.
Edge Cases to Consider:
- Empty input string.
- Expressions with only a single number.
- Expressions with multiple levels of nested operations.
- Expressions with leading or trailing whitespace.
- Invalid expressions (e.g., "2 + * 3"). While the evaluator itself handles this, your optimization pass should not crash on invalid input. It can simply return the original string.
Examples
Example 1:
Input: "2 + 3 * 4"
Output: "2 + 12"
Explanation: The sub-expression "3 * 4" is a constant sub-expression. It is evaluated to 12, and the original expression becomes "2 + 12".
Example 2:
Input: "5 + 2 * 3 + 1"
Output: "5 + 6 + 1"
Explanation: The sub-expression "2 * 3" is a constant sub-expression. It is evaluated to 6, and the original expression becomes "5 + 6 + 1".
Example 3:
Input: "10 * 2 + 5"
Output: "20 + 5"
Explanation: The sub-expression "10 * 2" is a constant sub-expression. It is evaluated to 20, and the original expression becomes "20 + 5".
Example 4:
Input: "2 + 3"
Output: "5"
Explanation: The entire expression is a constant sub-expression. It is evaluated to 5.
Example 5:
Input: "7"
Output: "7"
Explanation: The expression is a single number, no optimization is possible.
Constraints
- The input expression string will contain only digits, '+', '*', and whitespace.
- All numbers in the expression will be non-negative integers.
- The length of the input expression string will be less than 256 characters.
- The optimization pass should complete within 100ms for any valid input.
Notes
- You can use a simple string manipulation approach to identify and replace constant sub-expressions. A full AST is not required.
- Consider using regular expressions to help with parsing and identifying constant sub-expressions.
- Pay close attention to operator precedence (multiplication before addition).
- Error handling is not required; assume the input is a valid expression (or at least won't cause your program to crash).
- Focus on the core logic of constant folding. Efficiency is important, but correctness is paramount.
- Think about how to handle nested expressions and ensure that the optimization is applied correctly at each level.