Python Code Analyzer: Function Complexity and Documentation Check
This challenge focuses on building a basic Python code analyzer. You will write a script that examines Python source files to identify potential areas for improvement in terms of function complexity and documentation coverage. This is a fundamental step in understanding and maintaining larger codebases.
Problem Description
Your task is to create a Python script that takes the path to a Python file as input and performs two types of analysis:
- Function Complexity: Calculate and report the cyclomatic complexity of each function defined in the script. Cyclomatic complexity measures the number of linearly independent paths through a function's source code. Higher complexity often indicates a function that is harder to test and understand.
- Docstring Coverage: Determine if each function has a docstring. Docstrings are crucial for explaining what a function does, its parameters, and what it returns.
Key Requirements:
- The script should accept a single command-line argument: the path to a Python file.
- It should parse the Python code to identify function definitions.
- For each function, it should calculate its cyclomatic complexity.
- For each function, it should check for the presence of a docstring.
- The output should be a structured report listing each function, its cyclomatic complexity, and whether it has a docstring.
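One way the parsing requirement could be approached is sketched below. It assumes only top-level functions and methods defined directly inside classes are reported. The names `collect_functions`, `main`, and `analyzer.py` are hypothetical and chosen for illustration; they are not part of the challenge.

```python
import ast
import sys

FUNCTION_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def collect_functions(source):
    """Return (qualified_name, node) pairs for top-level functions and methods."""
    tree = ast.parse(source)
    found = []
    for node in tree.body:
        if isinstance(node, FUNCTION_TYPES):
            found.append((node.name, node))
        elif isinstance(node, ast.ClassDef):
            # Look one level down only: methods defined directly in the class.
            for item in node.body:
                if isinstance(item, FUNCTION_TYPES):
                    found.append((f"{node.name}.{item.name}", item))
    return found


def main(argv):
    """Hypothetical entry point: expects exactly one path argument."""
    if len(argv) != 2:
        print("usage: analyzer.py <path-to-python-file>", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8") as handle:
        source = handle.read()
    for name, _node in collect_functions(source):
        print(name)
    return 0
```

A script built this way would typically end with `sys.exit(main(sys.argv))`. Because only `tree.body` and class bodies are scanned, functions nested inside other functions are skipped by construction.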
Expected Behavior:
The script should print a report to standard output. The report should be easy to read and clearly associate each function with its analysis results.
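A minimal sketch of a report formatter follows. It assumes the analysis results arrive as `(name, complexity, has_docstring)` tuples. The function name `format_report` and the separator width are illustrative choices, not requirements.

```python
SEPARATOR_WIDTH = 49  # assumed width; any consistent rule line would do


def format_report(filename, results):
    """Build the report text from (name, complexity, has_docstring) tuples."""
    lines = [
        f"Function Analysis Report for: {filename}",
        "-" * SEPARATOR_WIDTH,
    ]
    if not results:
        lines.append("No functions found in this file.")
    for name, complexity, has_docstring in results:
        lines.append(f"Function: {name}")
        lines.append(f"Cyclomatic Complexity: {complexity}")
        lines.append(f"Docstring Present: {'Yes' if has_docstring else 'No'}")
    return "\n".join(lines)
```

Returning a string rather than printing directly keeps the formatter easy to test; the caller can pass the result to `print`.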
Edge Cases to Consider:
- Empty Python files.
- Files with no function definitions.
- Functions with no parameters.
- Nested functions (for this challenge, focus on top-level functions or functions directly within classes).
- Functions with docstrings spanning multiple lines.
- Functions that are decorators (you can choose to either analyze them or skip them).
Examples
Example 1:
Input File (sample_code.py):
def greet(name):
    """Greets the user by name."""
    if name:
        print(f"Hello, {name}!")
    else:
        print("Hello there!")

def calculate_sum(a, b):
    return a + b

def complex_logic(x):
    if x > 10:
        if x % 2 == 0:
            return "Large even number"
        else:
            return "Large odd number"
    elif x < 0:
        return "Negative number"
    else:
        return "Small or medium number"
Output:
Function Analysis Report for: sample_code.py
-------------------------------------------------
Function: greet
Cyclomatic Complexity: 2
Docstring Present: Yes
Function: calculate_sum
Cyclomatic Complexity: 1
Docstring Present: No
Function: complex_logic
Cyclomatic Complexity: 4
Docstring Present: No
Explanation:
- greet has two paths (the if branch when name is truthy, and the else branch). Its docstring is present.
- calculate_sum has one simple path. It lacks a docstring.
- complex_logic has four distinct paths due to the nested if/elif/else structure (three decision points plus one). It lacks a docstring.
Example 2:
Input File (empty_file.py):
# This is an empty file
Output:
Function Analysis Report for: empty_file.py
-------------------------------------------------
No functions found in this file.
Explanation: The file contains no function definitions, so the report indicates this.
Example 3: (Edge case: function whose body is only a pass statement)
Input File (no_body.py):
def do_nothing():
    pass
Output:
Function Analysis Report for: no_body.py
-------------------------------------------------
Function: do_nothing
Cyclomatic Complexity: 1
Docstring Present: No
Explanation: A function whose body is a single pass statement has no decision points, so it has one path and a complexity of 1. It lacks a docstring.
Constraints
- The input Python file will be a valid .py file.
- The script should handle files up to 10 MB in size.
- The analysis for a single file should complete within 5 seconds on a typical modern machine.
- You are expected to use Python's built-in ast module for parsing the Python code.
- Avoid external libraries that directly calculate cyclomatic complexity (e.g., radon). You should implement the logic or use basic ast node analysis to determine complexity.
Notes
- The ast module in Python is your primary tool for this task. It allows you to represent your Python code as an Abstract Syntax Tree, which can be traversed and analyzed.
- Cyclomatic complexity can be calculated by counting the number of decision points (if, while, for, elif, and, or, etc.) in a function's code and adding 1.
- When checking for docstrings, remember that they are the first statement in a function's body, provided it's a string literal.
- Consider how you will handle classes and methods within classes. For simplicity, you can either analyze methods as separate functions or include them in the report with their class name.
- Success looks like a script that accurately parses Python files, identifies functions, calculates their cyclomatic complexity, checks for docstrings, and presents this information in a clear, well-formatted report.