Data Archiving System in Python

Data archiving is a crucial process for managing large datasets, ensuring compliance, and freeing up valuable storage space. This challenge asks you to implement a simplified data archiving system in Python that moves older files from a source directory to an archive directory based on a specified age threshold. This is a common task in data management and backup systems.

Problem Description

You are tasked with creating a Python script that archives files older than a given number of days from a source directory to an archive directory. The script should:

Take three arguments:
- source_directory: The path to the directory containing the files to be archived.
- archive_directory: The path to the directory where archived files will be moved.
- age_in_days: The age (in days) beyond which files should be archived.
Identify files older than age_in_days: The script should iterate through the files in the source_directory, read each file's last modification time, and treat a file as old when the time elapsed since that modification exceeds age_in_days days.
Move files to the archive directory: For each file identified as older than the specified age, the script should move it to the archive_directory. The original file name should be preserved in the archive directory.
Handle directory creation: If the archive_directory does not exist, the script should create it.
Error Handling: The script should gracefully handle potential errors, such as:
- The source_directory not existing.
- Permissions issues when reading from the source_directory or writing to the archive_directory.
- Files that cannot be moved (e.g., due to permissions or being in use). These files should be logged to the console (printed to standard output) with a message indicating the failure and the reason.
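
The requirements above can be sketched as a single function. This is a minimal sketch under stated assumptions, not a reference solution: the function name archive_old_files, the wording of the failure messages (other than the ones shown in the examples), and the returned list of moved names are all hypothetical. It also assumes that only the top level of source_directory is scanned, and it does not address name collisions in the archive, which the problem statement leaves open.

```python
import os
import shutil
import time

SECONDS_PER_DAY = 86400


def archive_old_files(source_directory, archive_directory, age_in_days):
    """Move regular files older than age_in_days; return the names moved."""
    moved = []
    try:
        names = os.listdir(source_directory)
    except FileNotFoundError:
        # Message wording is an assumption; the problem only asks for graceful handling.
        print(f"Error: {source_directory} does not exist.  Skipping archiving.")
        return moved
    except PermissionError:
        print(f"Error: Permission denied when accessing {source_directory}.  Skipping archiving.")
        return moved

    if not os.path.isdir(archive_directory):
        try:
            os.makedirs(archive_directory)
        except OSError as exc:
            print(f"Error: could not create {archive_directory}: {exc}")
            return moved
        print(f"{archive_directory} created")

    # Files modified before this timestamp count as old.
    cutoff = time.time() - age_in_days * SECONDS_PER_DAY
    for name in names:
        path = os.path.join(source_directory, name)
        try:
            if not os.path.isfile(path) or os.path.getmtime(path) >= cutoff:
                continue  # skip directories and files that are not old enough
            print(f"Moving {name} to {archive_directory}")
            shutil.move(path, os.path.join(archive_directory, name))
            moved.append(name)
        except OSError as exc:
            # One failed file is reported and does not stop the rest.
            print(f"Failed to move {name}: {exc}")
    return moved
```

Catching OSError around each individual file keeps a single locked or unreadable file from aborting the whole run, which matches the requirement that such files be reported with a reason.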

Examples

Example 1:

Input: source_directory="/tmp/source", archive_directory="/tmp/archive", age_in_days=7

Assume /tmp/source contains two files: file1.txt (modified 10 days ago) and file2.txt (modified 3 days ago). Assume /tmp/archive does not exist.

Output:
/tmp/archive created
Moving file1.txt to /tmp/archive

Explanation: The script creates /tmp/archive. It identifies file1.txt as older than 7 days and moves it to /tmp/archive. file2.txt is not older than 7 days, so it is not moved.
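
To try a scenario like Example 1 locally, the files need modification times in the past. The sketch below shows one way to set that up with os.utime. The helper names make_backdated_file and is_older_than are hypothetical, and a temporary directory stands in for /tmp/source so that nothing on the real file system is touched.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def make_backdated_file(directory, name, days_old):
    """Create an empty file whose modification time is days_old days in the past."""
    path = os.path.join(directory, name)
    with open(path, "w"):
        pass
    stamp = time.time() - days_old * SECONDS_PER_DAY
    os.utime(path, (stamp, stamp))  # (access time, modification time)
    return path


def is_older_than(path, age_in_days):
    """Return True if the file was last modified more than age_in_days days ago."""
    return time.time() - os.path.getmtime(path) > age_in_days * SECONDS_PER_DAY


# A temporary directory stands in for /tmp/source (assumption for the demo).
with tempfile.TemporaryDirectory() as source:
    old = make_backdated_file(source, "file1.txt", 10)
    recent = make_backdated_file(source, "file2.txt", 3)
    for path in (old, recent):
        print(os.path.basename(path), "older than 7 days:", is_older_than(path, 7))
```

With a 7-day threshold, only file1.txt qualifies, which is the file the example expects to be moved.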

Example 2:

Input: source_directory="/tmp/source", archive_directory="/tmp/archive", age_in_days=30

Assume /tmp/source contains file1.txt (modified 45 days ago), file2.txt (modified 15 days ago), and /tmp/archive already exists.

Output:
Moving file1.txt to /tmp/archive

Explanation: The script identifies file1.txt as older than 30 days and moves it to /tmp/archive. file2.txt is not older than 30 days, so it is not moved.

Example 3 (Edge Case - Permission Error):

Input: source_directory="/root/source", archive_directory="/tmp/archive", age_in_days=1

Assume /root/source contains file1.txt (modified 2 days ago) and the script is run by a user without read permissions for /root/source.

Output:
Error: Permission denied when accessing /root/source.  Skipping archiving.

Explanation: The script detects a permission error when trying to access /root/source and prints an error message to standard output, skipping the archiving process.
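
An unreadable directory is awkward to set up on demand, so one way to exercise this case is to simulate the failure. The sketch below is an illustration only: check_source_access is a hypothetical helper, the message for a missing directory is an assumed wording, and unittest.mock is used purely to make os.listdir raise PermissionError for the demonstration.

```python
import os
from unittest import mock


def check_source_access(source_directory):
    """Return None if the directory can be listed, otherwise an error message."""
    try:
        os.listdir(source_directory)
    except PermissionError:
        return f"Error: Permission denied when accessing {source_directory}.  Skipping archiving."
    except FileNotFoundError:
        # Wording assumed; the problem statement gives no example for this case.
        return f"Error: {source_directory} does not exist.  Skipping archiving."
    return None


# Simulate Example 3 without needing a real unreadable directory.
with mock.patch("os.listdir", side_effect=PermissionError):
    print(check_source_access("/root/source"))
```

Checking access before creating the archive directory means a failed run leaves no side effects behind.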

Constraints

source_directory and archive_directory must be strings representing valid file system paths.
age_in_days must be a non-negative integer.
The script should handle files of any size.
The script should be reasonably efficient for directories containing a large number of files (e.g., thousands). Avoid unnecessary iterations or complex data structures.
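The script must move files rather than copy them or delete them outright: an archived file leaves the source_directory only by being relocated to the archive_directory, and files that do not meet the age threshold stay untouched.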
The script should not archive directories, only files.
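
For the efficiency and files-only constraints, os.scandir is one option worth considering, since each entry it returns can usually report whether it is a file without a separate system call. The sketch below is one possible approach, not the required one. The generator name iter_old_files is hypothetical, and skipping symbolic links is an assumption that the problem statement does not address.

```python
import os
import time

SECONDS_PER_DAY = 86400


def iter_old_files(source_directory, age_in_days):
    """Yield paths of regular files last modified more than age_in_days days ago."""
    cutoff = time.time() - age_in_days * SECONDS_PER_DAY
    with os.scandir(source_directory) as entries:
        for entry in entries:
            # Assumption: symbolic links are skipped rather than followed.
            if not entry.is_file(follow_symlinks=False):
                continue  # directories and other non-file entries are ignored
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                yield entry.path
```

If the caller moves files while consuming the generator, the directory changes during the scan. Collecting the results first, for example with list(iter_old_files(source, 7)), avoids depending on how the platform handles that.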

Notes

You can use the os, time, and shutil modules in Python.
Consider using os.path.getmtime() to get the last modification time of a file.
Use shutil.move() to move files.
Think about how to handle potential exceptions (e.g., FileNotFoundError, PermissionError).
The script should stay robust in unexpected situations: a failure on one file should be reported to the console and should not prevent the remaining files from being processed.
Assume the input arguments are provided correctly (no need to validate the input arguments themselves, just handle errors that arise from them).
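
The problem statement does not say how the three arguments reach the script. One possible reading is that they are positional command-line arguments, which argparse can collect as sketched below. The function name parse_args and the help texts are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the three arguments named in the problem statement."""
    parser = argparse.ArgumentParser(
        description="Archive files older than a given number of days."
    )
    parser.add_argument("source_directory", help="directory containing the files to archive")
    parser.add_argument("archive_directory", help="directory that receives archived files")
    parser.add_argument("age_in_days", type=int, help="files older than this many days are moved")
    return parser.parse_args(argv)


# Passing an explicit list mirrors what sys.argv[1:] would contain on the command line.
args = parse_args(["/tmp/source", "/tmp/archive", "7"])
print(args.source_directory, args.archive_directory, args.age_in_days)
```

Using type=int converts age_in_days once at the boundary, so the rest of the script can treat it as a number.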