Python Code Refactoring Assistant

This challenge asks you to build a foundational Python tool that can automatically refactor Python code by identifying and replacing specific code patterns. Developing such tools is crucial for maintaining code quality, improving readability, and ensuring consistency across large codebases.

Problem Description

You need to create a Python script that acts as a simple code refactoring assistant. This script will take a Python source code string as input and perform a series of predefined refactoring operations. For this challenge, we will focus on two common refactoring tasks:

  1. Replace print() calls with logging.debug(): Identify every call to print() and replace it with the equivalent logging.debug() call. This promotes structured logging over simple printing.
  2. Replace x == True with x is True (and similarly for False): Identify comparisons like x == True, x == False, True == x, and False == x and replace them with x is True, x is False, True is x, and False is x respectively. The identity check is explicit about intent: it matches only the bool singletons True and False, whereas x == True also holds for other values that compare equal to True, such as 1.

The refactoring should be applied sequentially.
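The difference between the two operators in task 2 is easiest to see with a value that is truthy but not a bool. The following is a small illustration, not part of the required solution; the helper name compare_both is made up for this example.

```python
def compare_both(value):
    """Return (value == True, value is True) for side-by-side comparison."""
    return (value == True, value is True)  # noqa: E712 (== True is deliberate here)


# Print both results for a few representative values.
for value in (True, 1, 1.0, "yes", 0, False):
    print(repr(value), compare_both(value))
```

Only the True singleton satisfies both checks. Values such as 1 and 1.0 compare equal to True without being identical to it, so the rewrite keeps behavior only when the operand is a real bool.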

Key Requirements:

  • The script should accept a string containing valid Python code.
  • It should return a string containing the refactored Python code.
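  • The refactoring should preserve the original functionality of the code (for the boolean rewrite, this holds when the compared value is an actual bool, as in the examples).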
  • The refactoring should handle different contexts for print() and boolean comparisons.

Expected Behavior:

  • print("Hello") should become logging.debug("Hello").
  • my_var = 5; print(my_var) should become my_var = 5; logging.debug(my_var).
  • if x == True: should become if x is True:.
  • if False == my_flag: should become if False is my_flag:.
  • The refactoring should not alter comments or string literals that happen to contain patterns similar to the ones being replaced. For example, a comment # This is a print statement should remain as is.

Edge Cases to Consider:

  • print might be a variable name. The tool should only refactor actual print() function calls.
  • Boolean comparisons might occur within complex expressions or nested structures.
  • The logging module might not be imported. For this challenge, assume import logging will be added implicitly or is already present.
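The first edge case can be checked directly on the syntax tree. The sketch below, with the hypothetical helper print_call_lines and a made-up sample, reports only the lines where print is the callee of a call, and skips places where print is a parameter, a plain variable read, or a method on another object. It assumes Python 3.9+ for the list[int] annotation.

```python
import ast


def print_call_lines(source: str) -> list[int]:
    """Return the line numbers of calls whose callee is the bare name print."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )


sample = (
    "def log(print):\n"      # line 1: print used as a parameter name
    "    return print\n"     # line 2: print read as a variable
    "obj.print('x')\n"       # line 3: method call, callee is an ast.Attribute
    "print('hello')\n"       # line 4: call to the bare name print
)

print(print_call_lines(sample))
```

This check is purely syntactic: it cannot tell whether the name print was rebound earlier in the program, which the constraints rule out for this challenge.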

Examples

Example 1: print() Refactoring

Input:

import sys

def greet(name):
    print(f"Hello, {name}!")
    return f"Greetings, {name}"

message = greet("World")
print("Done.")

Output:

import sys

def greet(name):
    logging.debug(f"Hello, {name}!")
    return f"Greetings, {name}"

message = greet("World")
logging.debug("Done.")

Explanation: All print() calls have been replaced with logging.debug().

Example 2: Boolean Comparison Refactoring

Input:

def check_status(is_active, is_processed):
    if is_active == True:
        print("Active and...")
    if is_processed == False:
        print("Not processed.")
    if True == is_active:
        print("Is active again.")
    if False == is_processed:
        print("Is not processed again.")
    return "Checked"

result = check_status(True, False)

Output:

def check_status(is_active, is_processed):
    if is_active is True:
        logging.debug("Active and...")
    if is_processed is False:
        logging.debug("Not processed.")
    if True is is_active:
        logging.debug("Is active again.")
    if False is is_processed:
        logging.debug("Is not processed again.")
    return "Checked"

result = check_status(True, False)

Explanation: == True, == False, True ==, and False == comparisons have been replaced with is True, is False, True is, and False is respectively. print calls were also refactored.

Example 3: Combined and Edge Case

Input:

# A comment about printing
def process_data(data):
    if data is None:
        print("Input is None") # Inline comment
    else:
        result = len(data)
        if result == 0:
            print("Empty data")
        else:
            print(f"Data length: {result}")

is_valid = False
if is_valid == False: # Check if invalid
    print("Invalid state")

print("End of processing.")

Output:

# A comment about printing
def process_data(data):
    if data is None:
        logging.debug("Input is None") # Inline comment
    else:
        result = len(data)
        if result == 0:
            logging.debug("Empty data")
        else:
            logging.debug(f"Data length: {result}")

is_valid = False
if is_valid is False: # Check if invalid
    logging.debug("Invalid state")

logging.debug("End of processing.")

Explanation: All print() calls and the is_valid == False comparison have been refactored. The comparison result == 0 is left alone because 0 is not a boolean literal, and both the standalone and inline comments remain unchanged.
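One possible way to reproduce outputs like the one above, comments included, is to use the AST only to locate the edits and then patch the original text. The sketch below is an assumption-laden illustration rather than a reference solution: the names refactor and _is_bool_literal are invented here, it requires Python 3.8+ for end_lineno and end_col_offset, it assumes "\n" or "\r\n" line endings, and it assumes no comment containing "==" sits between the two operands of a comparison. Positions of expressions nested inside f-strings may not be reliable on Python versions before 3.12.

```python
import ast


def _is_bool_literal(node):
    """True for the literals True and False (ints such as 1 are excluded)."""
    return isinstance(node, ast.Constant) and isinstance(node.value, bool)


def refactor(source: str) -> str:
    # col_offset values are UTF-8 byte offsets, so all edits are done on bytes.
    data = source.encode("utf-8")
    line_starts = [0]  # byte offset at which each line begins
    for i, byte in enumerate(data):
        if byte == 0x0A:
            line_starts.append(i + 1)

    def pos(lineno, col):
        return line_starts[lineno - 1] + col

    edits = []  # (start, end, replacement) in byte offsets
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id == "print"):
            func = node.func  # replace only the callee name, keep the arguments
            edits.append((pos(func.lineno, func.col_offset),
                          pos(func.end_lineno, func.end_col_offset),
                          b"logging.debug"))
        elif isinstance(node, ast.Compare):
            operands = [node.left, *node.comparators]
            for op, lhs, rhs in zip(node.ops, operands, operands[1:]):
                if not isinstance(op, ast.Eq):
                    continue
                if not (_is_bool_literal(lhs) or _is_bool_literal(rhs)):
                    continue
                # The == token lies between the end of lhs and the start of rhs.
                at = data.find(b"==", pos(lhs.end_lineno, lhs.end_col_offset),
                               pos(rhs.lineno, rhs.col_offset))
                if at == -1:
                    continue
                repl = b"is"
                if not data[at - 1:at].isspace():
                    repl = b" " + repl  # avoid gluing "is" onto the left operand
                if not data[at + 2:at + 3].isspace():
                    repl = repl + b" "  # and onto the right operand
                edits.append((at, at + 2, repl))

    # Apply edits from the end so earlier offsets stay valid.
    for start, end, repl in sorted(edits, reverse=True):
        data = data[:start] + repl + data[end:]
    return data.decode("utf-8")
```

The two kinds of edit never overlap, so collecting them in one walk gives the same result as running the print pass first and the boolean pass second.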

Constraints

  • The input Python code will be syntactically valid.
  • The input code will not contain complex metaprogramming that would fundamentally alter the interpretation of print or boolean comparisons.
  • Each refactoring operation should be applied completely to every statement where it occurs; a statement should never be left partially rewritten.
  • You are expected to use Python's Abstract Syntax Trees (AST) module for robust parsing and manipulation of the code. Regular expressions alone are not sufficient for this task due to the complexity and potential for false positives.
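The false-positive problem with regular expressions can be seen on a three-line input. The snippet below is only an illustration; the sample text and variable names are made up. A simple pattern matches print( inside a comment and inside a string literal, while the AST contains a single call.

```python
import ast
import re

source = (
    '# call print("x") for debugging\n'   # comment: must stay untouched
    'msg = "use print(value) here"\n'     # string literal: must stay untouched
    'print(msg)\n'                        # the only real call
)

pattern = r"\bprint\("
regex_hits = len(re.findall(pattern, source))
naive = re.sub(pattern, "logging.debug(", source)  # rewrites all three lines

ast_hits = sum(
    isinstance(node, ast.Call)
    and isinstance(node.func, ast.Name)
    and node.func.id == "print"
    for node in ast.walk(ast.parse(source))
)

print(regex_hits, ast_hits)
print(naive)
```

The parser discards comments and stores the string as a single constant node, so neither can be mistaken for a call.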

Notes

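  • The ast module in Python is your best friend for this challenge. It allows you to parse Python code into a tree structure, analyze it, and then unparse it back into code with ast.unparse (Python 3.9+). Be aware that the tree does not record comments or the original formatting, so ast.unparse output drops them. To keep comments as the examples require, use each node's position attributes (lineno, col_offset, end_lineno, end_col_offset) to edit the original text instead.
  • You'll likely need to implement an ast.NodeTransformer (to rewrite nodes in the tree) or an ast.NodeVisitor (to collect the locations of matches without changing the tree) to traverse the AST.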
  • Pay close attention to how print is called. It's a function call, so you'll need to identify ast.Call nodes where the function being called is an ast.Name with id='print'.
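  • For boolean comparisons, you'll need to identify ast.Compare nodes. The first operand is in node.left, the remaining operands are in node.comparators, and the operators are in node.ops. You'll be looking for ast.Eq operators where one of the adjacent operands is an ast.Constant whose value is True or False; the other operand may be an ast.Name (a boolean variable) or any other expression.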
  • Remember to handle the order of operations if you decide to refactor print and boolean comparisons in separate passes. For this challenge, refactor print first, then boolean comparisons.
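The notes above can be sketched with two ast.NodeTransformer subclasses run one after the other, print first and boolean comparisons second. This is a minimal sketch under stated assumptions, not a complete answer: the class and function names are invented for illustration, it needs Python 3.9+ for ast.unparse, and because ast.unparse regenerates the code from the tree, comments and original formatting are lost.

```python
import ast


def is_bool_literal(node):
    """True for the literals True and False only."""
    return isinstance(node, ast.Constant) and isinstance(node.value, bool)


class PrintToLogging(ast.NodeTransformer):
    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls in the arguments first
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            node.func = ast.Attribute(
                value=ast.Name(id="logging", ctx=ast.Load()),
                attr="debug",
                ctx=ast.Load(),
            )
        return node


class EqBoolToIs(ast.NodeTransformer):
    def visit_Compare(self, node):
        self.generic_visit(node)
        operands = [node.left, *node.comparators]
        # Swap Eq for Is wherever either neighbouring operand is True/False.
        node.ops = [
            ast.Is()
            if isinstance(op, ast.Eq)
            and (is_bool_literal(lhs) or is_bool_literal(rhs))
            else op
            for op, lhs, rhs in zip(node.ops, operands, operands[1:])
        ]
        return node


def refactor_with_unparse(source: str) -> str:
    tree = ast.parse(source)
    tree = PrintToLogging().visit(tree)   # pass 1: print() calls
    tree = EqBoolToIs().visit(tree)       # pass 2: boolean comparisons
    ast.fix_missing_locations(tree)       # give the new nodes positions
    return ast.unparse(tree)


print(refactor_with_unparse("if x == True:  # note\n    print('hi')\n"))
```

Since the inline comment in this input does not survive the round trip, this variant alone would not reproduce the expected output of Example 3.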