Hone logo
Hone
Problems

Simple Python Static Analyzer

Static analysis is a crucial part of software development, allowing you to identify potential errors and style issues before runtime. This challenge asks you to build a simplified static analyzer for Python code, focusing on identifying unused variables and basic style violations (line length exceeding a limit). This tool will help developers catch common mistakes and improve code readability.

Problem Description

You are tasked with creating a Python static analyzer that examines Python code represented as a string and reports any unused variables or lines exceeding a specified maximum length. The analyzer should not execute the code; it should only analyze the source code text.

What needs to be achieved:

  1. Unused Variable Detection: Identify variables that are declared but never used within the provided code.
  2. Line Length Check: Check each line of code for exceeding a maximum allowed length.
  3. Report Generation: Generate a report detailing the identified issues, including the line number and a brief description of the problem.

Key Requirements:

  • The analyzer should handle basic Python syntax (variable assignments, function definitions, simple expressions). It does not need to handle complex constructs like classes, loops, or imports.
  • The analyzer should be able to identify unused variables even if they are declared within a function.
  • The analyzer should report line numbers accurately.
  • The analyzer should be able to handle multi-line strings correctly (i.e., not flag lines within a multi-line string as exceeding the line length limit).

Expected Behavior:

The analyzer should take a string containing Python code as input and return a list of strings, where each string represents a detected issue. The report should be formatted as follows:

  • "Unused variable: <variable_name> on line <line_number>"
  • "Line too long: Line <line_number> exceeds <max_length> characters"

Edge Cases to Consider:

  • Variables declared but not used within a function.
  • Variables declared globally but not used within the provided code snippet.
  • Comments and docstrings should not be considered as code for unused variable detection.
  • Multi-line strings (using triple quotes) should not trigger line length violations.
  • Empty input string.
  • Code with no issues.

Examples

Example 1:

Input: """
def my_function():
    x = 10
    y = 20
    print(y)
"""
Output: ['Unused variable: x on line 4']
Explanation: The variable 'x' is declared but never used within the function.

Example 2:

Input: """
def another_function():
    a = 1
    b = 2
    c = 3
    print(a + b)
    print(c)
"""
Output: []
Explanation: All variables are used.

Example 3:

Input: """
def long_line_function():
    very_long_variable_name = 12345678901234567890  # This line is too long
    print(very_long_variable_name)
"""
Output: ['Line too long: Line 3 exceeds 80 characters']
Explanation: The line containing the variable assignment exceeds the maximum line length.

Example 4:

Input: """
def multi_line_string_function():
    long_string = """This is a
    very long string
    that spans multiple lines."""
    print(long_string)
"""
Output: []
Explanation: The lines within the multi-line string are not considered for line length checking.

Constraints

  • Maximum Line Length: 80 characters.
  • Input: A string containing valid Python code.
  • Performance: The analyzer should be reasonably efficient for code snippets up to 500 lines. While optimization is not the primary focus, avoid excessively inefficient algorithms.
  • Variable Scope: The analyzer should only consider variables within the provided code snippet. It should not attempt to infer variable usage from external scopes.

Notes

  • You can use Python's ast module for parsing the code, but it's not strictly required. A simple line-by-line analysis is also acceptable for this simplified problem.
  • Focus on identifying declared but unused variables. Don't worry about detecting variables that are used but not assigned a value.
  • Consider using a set to efficiently track declared variables.
  • For line length checking, split the input string into lines and check the length of each line.
  • Remember to handle edge cases carefully to ensure the analyzer behaves correctly in all scenarios.
  • The goal is to create a functional, albeit simplified, static analyzer. Don't over-engineer the solution.
Loading editor...
python