Hone logo
Hone
Problems

Automated Database Migration Tool in Python

Database migrations are essential for managing changes to your database schema over time, especially in evolving applications. This challenge asks you to build a simplified command-line tool in Python that automatically generates SQL migration scripts based on a defined schema evolution. This tool will help developers track and apply database changes in a controlled and repeatable manner.

Problem Description

You are tasked with creating a Python script that generates SQL migration scripts for a simple database. The script should take a starting schema (represented as a dictionary) and an ending schema (also a dictionary) as input. It will then analyze the differences between these schemas and generate a series of SQL statements to transform the database from the starting schema to the ending schema.

The schemas are dictionaries where keys represent table names and values are lists of column names. For simplicity, we'll only consider adding and dropping columns. We will not handle column type changes, renaming columns, or adding constraints.

The generated SQL script should contain separate statements for adding and dropping columns. Each statement should be a valid SQL ALTER TABLE command. The script should be formatted for readability.

Key Requirements:

  • Schema Representation: Schemas are represented as dictionaries: {table_name: [column_name1, column_name2, ...], ...}.
  • Migration Generation: The script must generate SQL statements to add and drop columns to bring the database from the starting schema to the ending schema.
  • SQL Formatting: The generated SQL statements should be properly formatted for readability (e.g., using indentation).
  • Error Handling: The script should handle cases where a table exists in one schema but not the other.
  • Output: The script should output the generated SQL migration script to the console.

Expected Behavior:

The script should take two schema dictionaries as input, analyze the differences, and output a formatted SQL script containing ALTER TABLE statements for adding and dropping columns. The order of operations should be:

  1. Add any columns present in the ending schema but not in the starting schema.
  2. Drop any columns present in the starting schema but not in the ending schema.

Edge Cases to Consider:

  • Empty schemas (both starting and ending schemas are empty).
  • Tables present in only one schema.
  • Identical schemas (no changes needed).
  • Tables with no columns.

Examples

Example 1:

Input:
starting_schema = {"users": ["id", "name"]}
ending_schema = {"users": ["id", "name", "email"]}
Output:
ALTER TABLE users ADD COLUMN email;

Explanation: The email column is present in the ending schema but not in the starting schema. Therefore, a single ALTER TABLE statement is generated to add the email column.

Example 2:

Input:
starting_schema = {"users": ["id", "name", "email"]}
ending_schema = {"users": ["id", "name"]}
Output:
ALTER TABLE users DROP COLUMN email;

Explanation: The email column is present in the starting schema but not in the ending schema. Therefore, a single ALTER TABLE statement is generated to drop the email column.

Example 3:

Input:
starting_schema = {"users": ["id", "name"]}
ending_schema = {"users": ["id", "name"], "products": ["id", "name"]}
Output:
ALTER TABLE products ADD COLUMN id;
ALTER TABLE products ADD COLUMN name;

Explanation: The products table is only present in the ending schema. Both id and name columns are added to the products table.

Constraints

  • The schemas will only contain strings as column names.
  • Table names will also be strings.
  • The script should be able to handle schemas with up to 10 tables.
  • The generated SQL script should be reasonably formatted for readability (indentation is sufficient).
  • The script should not attempt to connect to a database or execute the generated SQL. It only generates the script.

Notes

  • Consider using Python's dictionary operations to efficiently compare the schemas.
  • Think about how to handle tables that exist in only one of the schemas.
  • Focus on generating the correct SQL statements; error handling for invalid SQL syntax is not required.
  • The goal is to create a clear and concise script that accurately reflects the schema differences.
  • You can assume that the input schemas are valid dictionaries.
Loading editor...
python