Hone logo
Hone
Problems

Implementing a Custom Dead Code Elimination Pass in Rust for a Toy Compiler

This challenge requires you to implement a custom optimization pass for a simplified Rust-based compiler. Specifically, you will create a "dead code elimination" pass that removes unused variables and statements from an Abstract Syntax Tree (AST). This is a fundamental optimization that improves code efficiency by removing computations that have no effect.

Problem Description

Your task is to write a Rust function that takes an Abstract Syntax Tree (AST) representing a simplified programming language and performs dead code elimination. The AST will be represented using a set of Rust structs and enums that you will define. The optimization pass should identify and remove:

  1. Unused Variable Declarations: If a variable is declared but never read, its declaration and any associated initial assignment should be removed.
  2. Statements with No Side Effects: If a statement consists solely of an expression that evaluates to a value but has no observable side effects (like printing, modifying a variable, etc.), it should be removed.

You will need to define the AST structure that your compiler will operate on. For simplicity, assume the language supports basic expressions (literals, variable access, arithmetic operations), variable declarations, assignments, and function calls (though for this problem, function calls will be treated as having potential side effects and won't be removed unless they are part of an unused statement).

Key Requirements:

  • Define appropriate Rust enums and structs to represent the AST.
  • Implement a function dead_code_elimination that takes a mutable reference to the AST and modifies it in place.
  • The pass should be idempotent (running it multiple times should yield the same result as running it once).
  • Handle basic data types (integers, booleans) and potentially simple string literals.

Expected Behavior:

The function should traverse the AST, identify dead code according to the rules above, and prune the AST by removing those elements.

Edge Cases to Consider:

  • Variables declared and immediately reassigned before being read.
  • Expressions that appear to have no side effects but are part of a larger statement that does have side effects (e.g., an unused expression as the condition of an if statement with a side-effecting then block). For this problem, we'll simplify and assume that any standalone expression statement is removable if the expression itself has no side effects.
  • Functions that are called but their return value is ignored. As mentioned, assume function calls have potential side effects.
  • Scopes: For this challenge, we will assume a single global scope to simplify scope management.

Examples

Example 1: Unused Variable

// Input AST (conceptual representation)
let mut program = vec![
    Statement::Declaration {
        name: "x".to_string(),
        initializer: Some(Expression::Literal(Literal::Integer(10))),
    },
    Statement::Assignment {
        name: "y".to_string(),
        value: Expression::Literal(Literal::Integer(20)),
    },
];

// After dead_code_elimination:
// 'x' is declared but never used, so its declaration and initializer are removed.
// 'y' is assigned a value but never read, but its assignment is not "dead" in this context
// as assignments themselves are considered statements with potential side effects (in a broader compiler).
// For this specific challenge, we focus on unused *declarations*.
let mut expected_program = vec![
    Statement::Assignment {
        name: "y".to_string(),
        value: Expression::Literal(Literal::Integer(20)),
    },
];

Explanation: The variable x was declared and initialized but never used. Therefore, the Statement::Declaration for x is removed. The assignment to y remains because assignments are considered statements with potential side effects, even if the assigned variable isn't read later in this simplified model.

Example 2: Statement with No Side Effects

// Input AST (conceptual representation)
let mut program = vec![
    Statement::Declaration {
        name: "a".to_string(),
        initializer: Some(Expression::Literal(Literal::Boolean(true))),
    },
    Statement::Expression {
        expr: Expression::Literal(Literal::Integer(42)),
    },
    Statement::Expression {
        expr: Expression::BinaryOp {
            op: BinaryOperator::Add,
            left: Box::new(Expression::Variable("a".to_string())),
            right: Box::new(Expression::Literal(Literal::Integer(1))),
        },
    },
];

// After dead_code_elimination:
// The expression '42' has no side effects and is a standalone statement, so it's removed.
// The expression 'a + 1' has no side effects and is a standalone statement, so it's removed.
// The declaration of 'a' remains because it *could* have been used in a side-effecting operation
// or a statement that wasn't removed. (This highlights a limitation of simple AST analysis without
// full control flow). For this specific challenge, we focus on standalone expressions.
let mut expected_program = vec![
    Statement::Declaration {
        name: "a".to_string(),
        initializer: Some(Expression::Literal(Literal::Boolean(true))),
    },
];

Explanation: The Statement::Expression containing Literal::Integer(42) is removed because it's a standalone expression with no side effects. Similarly, the Statement::Expression containing a + 1 is also removed. The declaration of a remains, as this pass doesn't track read usage of variables within expressions that are themselves removed. A more advanced pass would.

Example 3: Combined Scenario

// Input AST (conceptual representation)
let mut program = vec![
    Statement::Declaration {
        name: "unused_var".to_string(),
        initializer: Some(Expression::Literal(Literal::Integer(99))),
    },
    Statement::Expression {
        expr: Expression::Literal(Literal::String("Hello".to_string())),
    },
    Statement::Declaration {
        name: "useful_var".to_string(),
        initializer: Some(Expression::Literal(Literal::Integer(1))),
    },
    Statement::Assignment {
        name: "useful_var".to_string(),
        value: Expression::BinaryOp {
            op: BinaryOperator::Add,
            left: Box::new(Expression::Variable("useful_var".to_string())),
            right: Box::new(Expression::Literal(Literal::Integer(1))),
        },
    },
];

// After dead_code_elimination:
// 'unused_var' is declared but never read. Its declaration is removed.
// The string literal statement has no side effects and is removed.
// 'useful_var' is declared and its assignment is performed, so it remains.
let mut expected_program = vec![
    Statement::Declaration {
        name: "useful_var".to_string(),
        initializer: Some(Expression::Literal(Literal::Integer(1))),
    },
    Statement::Assignment {
        name: "useful_var".to_string(),
        value: Expression::BinaryOp {
            op: BinaryOperator::Add,
            left: Box::new(Expression::Variable("useful_var".to_string())),
            right: Box::new(Expression::Literal(Literal::Integer(1))),
        },
    },
];

Explanation: unused_var is removed because it's declared but not used. The string literal statement is removed as it has no side effects. The declaration and subsequent assignment of useful_var are preserved because assignments are treated as statements with potential side effects in this simplified model.

Constraints

  • The AST will be a Vec<Statement>.
  • The AST will not contain control flow structures like if, while, or for loops, or function definitions/calls for simplicity in this challenge.
  • Variable names are simple strings.
  • Literals will include integers, booleans, and strings.
  • Expressions will include literals, variable access, and binary operations.
  • Performance is not a primary concern, but the solution should be reasonably efficient for typical input sizes (e.g., avoid O(N^2) operations within the AST traversal if possible).

Notes

  • You'll need to define the Statement, Expression, Literal, and BinaryOperator types. Consider how to represent these in Rust using enums and structs.
  • To determine if a variable is "used," you'll need to scan the AST for all references to that variable. This might involve a separate pass or careful tracking during the primary traversal.
  • For identifying statements with no side effects, you'll need a helper function that can analyze an Expression and determine if it's purely computational or if it could alter program state. For this challenge, assume that only Literal expressions and arithmetic BinaryOp expressions that don't involve variable reads are considered to have no side effects.
  • Think about how to manage the AST modifications. Removing elements from a Vec while iterating can be tricky; consider strategies like filtering or building a new Vec.
Loading editor...
rust