Python Code Refactoring Assistant
This challenge asks you to build a foundational Python tool that can automatically refactor Python code by identifying and replacing specific code patterns. Developing such tools is crucial for maintaining code quality, improving readability, and ensuring consistency across large codebases.
Problem Description
You need to create a Python script that acts as a simple code refactoring assistant. This script will take a Python source code string as input and perform a series of predefined refactoring operations. For this challenge, we will focus on two common refactoring tasks:
- Replace print() calls with logging.debug(): Identify all print() calls and replace them with equivalent logging.debug() calls. This promotes structured logging over simple printing.
- Replace x == True with x is True (and similarly for False): Identify comparisons like x == True, x == False, True == x, and False == x and replace them with the more idiomatic and explicit x is True, x is False, True is x, and False is x respectively. This improves clarity and avoids potential pitfalls with truthiness.
The refactoring should be applied sequentially.
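Sequential application can be sketched as a small driver that parses the source once and runs each pass in order. The names apply_passes and RenameName below are hypothetical; RenameName is only a stand-in pass used to show that the order of passes affects the result, not one of the two refactorings required here.

```python
import ast
from typing import Iterable


def apply_passes(source: str, passes: Iterable[ast.NodeTransformer]) -> str:
    """Parse once, run each transformer in order, then unparse (Python 3.9+)."""
    tree = ast.parse(source)
    for transformer in passes:
        tree = transformer.visit(tree)
        ast.fix_missing_locations(tree)
    return ast.unparse(tree)


class RenameName(ast.NodeTransformer):
    """Hypothetical stand-in pass: renames one identifier to another."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node


# The second pass sees the output of the first, so order matters.
forward = apply_passes("a = 1", [RenameName("a", "b"), RenameName("b", "c")])
backward = apply_passes("a = 1", [RenameName("b", "c"), RenameName("a", "b")])
print(forward)   # c = 1
print(backward)  # b = 1
```

In a real solution the list would hold the print() pass followed by the boolean-comparison pass.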
Key Requirements:
- The script should accept a string containing valid Python code.
- It should return a string containing the refactored Python code.
- The refactoring should preserve the original functionality of the code.
- The refactoring should handle different contexts for print() and boolean comparisons.
Expected Behavior:
- print("Hello") should become logging.debug("Hello").
- my_var = 5; print(my_var) should become my_var = 5; logging.debug(my_var).
- if x == True: should become if x is True:.
- if False == my_flag: should become if False is my_flag:.
- The refactoring should not alter comments or string literals that happen to contain patterns similar to the ones being replaced. For example, a comment # This is a print statement should remain as is.
Edge Cases to Consider:
- print might be a variable name. The tool should only refactor actual print() function calls.
- Boolean comparisons might occur within complex expressions or nested structures.
- The logging module might not be imported. For this challenge, assume import logging will be added implicitly or is already present.
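The challenge lets you assume that import logging is present. If you prefer to add it yourself, one possible approach is sketched below. The helper name ensure_logging_import is hypothetical, and the sketch only looks for a plain module-level import logging (aliased or nested imports are not considered).

```python
import ast


def ensure_logging_import(tree: ast.Module) -> ast.Module:
    """Insert `import logging` at module level unless it is already there."""
    for stmt in tree.body:
        if isinstance(stmt, ast.Import) and any(
            alias.name == "logging" and alias.asname is None
            for alias in stmt.names
        ):
            return tree  # already imported, nothing to do

    # A module docstring and any __future__ imports must stay first.
    index = 0
    for stmt in tree.body:
        is_docstring = (
            index == 0
            and isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and isinstance(stmt.value.value, str)
        )
        is_future = isinstance(stmt, ast.ImportFrom) and stmt.module == "__future__"
        if not (is_docstring or is_future):
            break
        index += 1

    new_import = ast.Import(names=[ast.alias(name="logging", asname=None)])
    tree.body.insert(index, new_import)
    return ast.fix_missing_locations(tree)


print(ast.unparse(ensure_logging_import(ast.parse("import sys\nx = 1"))))
```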
Examples
Example 1: print() Refactoring
Input:
import sys
def greet(name):
    print(f"Hello, {name}!")
    return f"Greetings, {name}"
message = greet("World")
print("Done.")
Output:
import sys
def greet(name):
    logging.debug(f"Hello, {name}!")
    return f"Greetings, {name}"
message = greet("World")
logging.debug("Done.")
Explanation: All print() calls have been replaced with logging.debug().
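A minimal sketch of this pass, assuming Python 3.9+ for ast.unparse. The names PrintToLogging and refactor_prints are illustrative, not part of the challenge. Note that ast.unparse regenerates the code from the tree, so formatting such as quote style may differ from the listing above. Arguments are passed through unchanged, so print-specific keywords such as sep or end, or multiple positional arguments, would need extra handling before logging.debug accepts them sensibly.

```python
import ast


class PrintToLogging(ast.NodeTransformer):
    """Rewrite print(...) calls as logging.debug(...)."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # also rewrite calls nested in the arguments
        # Only a call whose callee is the bare name `print` qualifies;
        # obj.print(...) is an ast.Attribute and is left alone.
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            node.func = ast.copy_location(
                ast.Attribute(
                    value=ast.Name(id="logging", ctx=ast.Load()),
                    attr="debug",
                    ctx=ast.Load(),
                ),
                node.func,
            )
        return node


def refactor_prints(source: str) -> str:
    tree = PrintToLogging().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(refactor_prints("my_var = 5; print(my_var)"))
```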
Example 2: Boolean Comparison Refactoring
Input:
def check_status(is_active, is_processed):
    if is_active == True:
        print("Active and...")
    if is_processed == False:
        print("Not processed.")
    if True == is_active:
        print("Is active again.")
    if False == is_processed:
        print("Is not processed again.")
    return "Checked"
result = check_status(True, False)
Output:
def check_status(is_active, is_processed):
    if is_active is True:
        logging.debug("Active and...")
    if is_processed is False:
        logging.debug("Not processed.")
    if True is is_active:
        logging.debug("Is active again.")
    if False is is_processed:
        logging.debug("Is not processed again.")
    return "Checked"
result = check_status(True, False)
Explanation: == True, == False, True ==, and False == comparisons have been replaced with is True, is False, True is, and False is respectively. print calls were also refactored.
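The comparison pass can be sketched in the same style. EqBoolToIs and refactor_bool_comparisons are illustrative names. The sketch treats an == as a candidate only when one side is a literal True or False; comparisons against other constants, such as result == 0, are left alone.

```python
import ast


def _is_bool_constant(node: ast.AST) -> bool:
    # isinstance(..., bool) excludes 0 and 1, which are ints, not bools
    return isinstance(node, ast.Constant) and isinstance(node.value, bool)


class EqBoolToIs(ast.NodeTransformer):
    """Rewrite `x == True` / `False == x` style comparisons to use `is`."""

    def visit_Compare(self, node: ast.Compare) -> ast.AST:
        self.generic_visit(node)
        # A Compare node stores `a == b == c` as left=a, ops=[Eq, Eq],
        # comparators=[b, c]; pair each operator with its two neighbours.
        operands = [node.left, *node.comparators]
        new_ops = []
        for i, op in enumerate(node.ops):
            left, right = operands[i], operands[i + 1]
            touches_bool = _is_bool_constant(left) or _is_bool_constant(right)
            new_ops.append(ast.Is() if isinstance(op, ast.Eq) and touches_bool else op)
        node.ops = new_ops
        return node


def refactor_bool_comparisons(source: str) -> str:
    tree = EqBoolToIs().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(refactor_bool_comparisons("flag = x == True"))
```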
Example 3: Combined and Edge Case
Input:
# A comment about printing
def process_data(data):
    if data is None:
        print("Input is None") # Inline comment
    else:
        result = len(data)
        if result == 0:
            print("Empty data")
        else:
            print(f"Data length: {result}")
is_valid = False
if is_valid == False: # Check if invalid
    print("Invalid state")
print("End of processing.")
Output:
# A comment about printing
def process_data(data):
    if data is None:
        logging.debug("Input is None") # Inline comment
    else:
        result = len(data)
        if result == 0:
            logging.debug("Empty data")
        else:
            logging.debug(f"Data length: {result}")
is_valid = False
if is_valid is False: # Check if invalid
    logging.debug("Invalid state")
logging.debug("End of processing.")
Explanation: Both print statements and the is_valid == False comparison have been refactored correctly, while comments and inline comments remain unchanged.
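Keeping comments intact needs some care, because the AST does not store comments and ast.unparse() therefore cannot reproduce them. One possible workaround, sketched here for the print() rule only, is to use the AST just to locate the nodes and then edit the original text at those positions. The function name replace_print_in_place is hypothetical, and this is one option among several rather than the expected solution.

```python
import ast


def replace_print_in_place(source: str) -> str:
    """Swap print for logging.debug by editing the original source text."""
    lines = source.split("\n")
    spans = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            callee = node.func
            spans.append((callee.lineno, callee.col_offset, callee.end_col_offset))

    # Edit from the end of the file backwards so earlier offsets stay valid.
    for lineno, start, end in sorted(spans, reverse=True):
        # col_offset values are UTF-8 byte offsets, so slice the encoded line.
        raw = lines[lineno - 1].encode("utf-8")
        lines[lineno - 1] = (raw[:start] + b"logging.debug" + raw[end:]).decode("utf-8")
    return "\n".join(lines)


sample = 'if data is None:\n    print("Input is None") # Inline comment\n'
print(replace_print_in_place(sample))
```

Because only the callee name is replaced, comments, quote style, and spacing elsewhere in the file are left exactly as written.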
Constraints
- The input Python code will be syntactically valid.
- The input code will not contain complex metaprogramming that would fundamentally alter the interpretation of print or boolean comparisons.
- The refactoring operations should be atomic for each line/statement where they occur.
- You are expected to use Python's Abstract Syntax Trees (AST) module for robust parsing and manipulation of the code. Regular expressions alone are not sufficient for this task due to the complexity and potential for false positives.
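To illustrate the false-positive problem, the short comparison below counts matches for a naive regular expression and for an AST walk over the same made-up snippet. The regex also matches inside the comment and the string literal, while the AST walk finds only the real call.

```python
import ast
import re

source = '''\
# print("commented out")
msg = "call print(x) later"
print(msg)
'''

# Naive textual match: also hits the comment and the string literal.
regex_hits = len(re.findall(r"\bprint\(", source))

# AST match: only call expressions whose callee is the bare name print.
ast_hits = sum(
    isinstance(node, ast.Call)
    and isinstance(node.func, ast.Name)
    and node.func.id == "print"
    for node in ast.walk(ast.parse(source))
)

print(regex_hits, ast_hits)  # 3 1
```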
Notes
- The ast module in Python is your best friend for this challenge. It allows you to parse Python code into a tree structure, analyze it, and then unparse it back into code (ast.unparse, available from Python 3.9).
- You'll likely need to implement an ast.NodeTransformer (to modify nodes) or an ast.NodeVisitor (to inspect them without changing them) to traverse the AST.
- Pay close attention to how print is called. It's a function call, so you'll need to identify ast.Call nodes where the function being called is an ast.Name with id='print'.
- For boolean comparisons, you'll need to identify ast.Compare nodes. You'll be looking for operators like ast.Eq and operands that are ast.Constant (for True/False) or ast.Name (for boolean variables).
- Remember to handle the order of operations if you decide to refactor print and boolean comparisons in separate passes. For this challenge, refactor print first, then boolean comparisons.
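Before writing a transformer, it can help to inspect the node shapes described in these notes. The following sketch parses two one-line snippets and checks the attributes a transformer would test; the snippets themselves are arbitrary examples.

```python
import ast

call = ast.parse('print("hi")').body[0].value
compare = ast.parse("x == True").body[0].value

# A print call: ast.Call whose func is ast.Name(id='print')
assert isinstance(call, ast.Call)
assert isinstance(call.func, ast.Name) and call.func.id == "print"

# x == True: one left operand, then parallel lists of operators and comparators
assert isinstance(compare, ast.Compare)
assert isinstance(compare.left, ast.Name) and compare.left.id == "x"
assert isinstance(compare.ops[0], ast.Eq)
assert isinstance(compare.comparators[0], ast.Constant)
assert compare.comparators[0].value is True

# True == x puts the constant in .left instead
flipped = ast.parse("True == x").body[0].value
assert isinstance(flipped.left, ast.Constant) and flipped.left.value is True

print(ast.dump(compare))  # exact text varies slightly between Python versions
```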