Robust File Processing with Error Recovery
Imagine you are building a system that processes data from multiple files. Some of these files may be malformed, incomplete, or inaccessible. Your task is to write a robust file processing function that handles these errors gracefully, logs them, and continues processing as many files as possible, demonstrating effective error recovery.
Problem Description
You need to implement a Python function process_files(file_paths) that takes a list of file paths as input. For each file path in the list, the function should attempt to:
- Open the file: Read its content.
- Process the content: For simplicity, let's assume "processing" means counting the number of lines in the file.
- Handle potential errors: Specifically, you should anticipate and handle FileNotFoundError (if a file doesn't exist) and IOError (for general input/output issues like permission errors).
- Log errors: When an error occurs, the function should log the file path that caused the error and the type of error encountered to a predefined error log file named error.log.
- Return results: The function should return a dictionary where keys are the file paths that were successfully processed, and values are the number of lines in each file. Files that caused errors should not be included in the returned dictionary.
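The first two steps (open, then count lines) can be sketched as a small helper. This is only an illustration: the name count_lines and the UTF-8 encoding are assumptions, not part of the problem statement.

```python
def count_lines(path):
    """Return the number of lines in the text file at `path`."""
    # Iterating over the file object yields one line at a time, so the
    # whole file never has to be held in memory at once.
    # encoding="utf-8" is an assumption; the problem only says "plain text".
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)
```

A final line without a trailing newline still counts as a line, and an empty file counts as 0.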
Key Requirements:
- The function must iterate through all provided file paths.
- It must use try-except blocks to catch specific exceptions: FileNotFoundError and IOError.
- When an error is caught, it should be appended to the error.log file in a clear, readable format (e.g., "[ERROR_TYPE] - File: [file_path] - Message: [error_message]").
- The function should continue processing subsequent files even if an error occurs with a previous one.
- The function should return a dictionary containing only the successfully processed files and their line counts.
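One possible shape for a function that meets these requirements is sketched below. It is not a reference solution. The log_path parameter and the _log_error helper are hypothetical additions (the problem fixes the log name as error.log), and reading the log format as "brackets around the error type only" is one interpretation of the example format.

```python
def _log_error(log_path, file_path, error):
    """Append one formatted entry to the log (hypothetical helper)."""
    # Append mode creates the file on first use and never truncates it.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(
            f"[{type(error).__name__}] - File: {file_path} - Message: {error}\n"
        )


def process_files(file_paths, log_path="error.log"):
    """Count lines in each readable file; log failures and keep going.

    `log_path` is an assumption added so the log location can be
    overridden in tests; by default it is error.log in the current
    working directory, as the problem requires.
    """
    results = {}
    for path in file_paths:
        try:
            with open(path, "r", encoding="utf-8") as handle:
                # Assigned only after the whole file was read successfully.
                results[path] = sum(1 for _ in handle)
        except (FileNotFoundError, IOError) as exc:
            # FileNotFoundError is already a subclass of IOError/OSError in
            # Python 3; it is listed explicitly to mirror the requirements.
            _log_error(log_path, path, exc)
    return results
```

Because the try block sits inside the loop, a failure on one path is logged and the loop moves on to the next path instead of aborting.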
Expected Behavior:
- If all files are processed successfully, the returned dictionary will contain all file paths and their line counts. The error.log file gains no new entries and need not be created if it does not already exist.
- If some files fail to process, the returned dictionary will only contain the successful ones. The error.log file will contain an entry for each failed file.
Edge Cases to Consider:
- An empty list of file_paths.
- A file path pointing to a directory instead of a file.
- A file that exists but has no read permissions.
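The directory and permission cases raise more specific exceptions than IOError, so it helps to check that an except IOError clause still catches them. The helper name caught_by_ioerror below is hypothetical. The check relies on the fact that in Python 3.3 and later IOError is an alias of OSError.

```python
def caught_by_ioerror(exc_type):
    """Report whether `except IOError` would catch this exception type."""
    return issubclass(exc_type, IOError)


# IOError is kept in Python 3 only as another name for OSError.
print(IOError is OSError)

# Opening a directory usually raises IsADirectoryError on POSIX systems
# (on Windows it typically shows up as PermissionError instead); an
# unreadable file raises PermissionError. All are OSError subclasses.
for exc_type in (FileNotFoundError, PermissionError, IsADirectoryError):
    print(exc_type.__name__, caught_by_ioerror(exc_type))
```

One consequence: if FileNotFoundError and IOError are handled in separate except clauses, the FileNotFoundError clause has to come first, otherwise the more general clause catches it.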
Examples
Example 1:
Input: ["data1.txt", "data2.txt", "nonexistent.txt"]
Assume:
data1.txt contains:
Line 1
Line 2
Line 3
data2.txt contains:
First line
Second line
Output: {"data1.txt": 3, "data2.txt": 2}
Explanation: data1.txt and data2.txt are successfully read and their line counts are recorded. nonexistent.txt causes a FileNotFoundError, which is logged to error.log. Since it failed, it's not included in the output dictionary.
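To try Example 1 locally, the input files can be created in a scratch directory. The sketch below is one way to do that; make_example_files is a hypothetical helper name, and the trailing newline after each last line is an assumption.

```python
import os


def make_example_files(directory):
    """Create the Example 1 files in `directory` and return the input list."""
    contents = {
        "data1.txt": "Line 1\nLine 2\nLine 3\n",
        "data2.txt": "First line\nSecond line\n",
    }
    paths = []
    for name, text in contents.items():
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        paths.append(path)
    # Deliberately never created, so opening it raises FileNotFoundError.
    paths.append(os.path.join(directory, "nonexistent.txt"))
    return paths
```

Passing the returned list to process_files should give counts of 3 and 2 for the first two paths. Note that the dictionary keys will then be the full paths inside the scratch directory rather than the bare file names shown in the example.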
Example 2:
Input: ["config.ini", "report.txt"]
Assume:
config.ini contains:
[Settings]
mode=auto
report.txt is a file for which the program does not have read permissions.
Output: {"config.ini": 2}
Explanation: config.ini is successfully processed. report.txt causes an IOError (a permission error; in Python 3 this is raised as PermissionError, a subclass of OSError, for which IOError is an alias), which is logged to error.log. report.txt is not included in the output.
Example 3:
Input: []
Output: {}
Explanation: An empty input list results in an empty output dictionary and no log file entries.
Constraints
- The input file_paths will be a list of strings.
- File content will be plain text, with lines separated by newline characters (\n).
- The error.log file should be created in the current working directory. If it already exists, new error messages should be appended.
- Performance is not a critical concern for this challenge, but the solution should be reasonably efficient.
Notes
- Consider using with open(...) for safe file handling.
- Remember to properly open the error.log file in append mode ('a').
- The exact error message from the exception can be helpful for logging.
- You can simulate IOError for permission issues by creating a file and then changing its permissions to be unreadable by the script (though for testing, simply assuming it happens is sufficient).
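Changing real file permissions behaves differently across platforms (and has no effect when the script runs as root), so one alternative for tests is to make open() raise the error directly. The sketch below uses unittest.mock.patch from the standard library for that. The helper try_open is hypothetical and exists only to show that an except IOError clause catches the simulated error.

```python
from unittest import mock


def try_open(path):
    """Return the name of the exception raised by open(), or None."""
    try:
        with open(path, "r", encoding="utf-8"):
            return None
    except IOError as exc:
        return type(exc).__name__


# While the patch is active, every call to open() raises PermissionError,
# regardless of whether report.txt exists on disk.
with mock.patch("builtins.open",
                side_effect=PermissionError(13, "Permission denied")):
    simulated = try_open("report.txt")

print(simulated)  # PermissionError
```

Keep the patched region small: a logging helper that also calls open() would fail with the same simulated error if it ran inside the with block.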