comdab – Compare Database Schemas

comdab allows you to compare in depth two database schemas to find all differences (missing columns, different nullabilities or defaults, slight changes in function or triggers definitions…) and turn these reports into model migrations.

comdab migration generation supercharges migration tools like Alembic to create migrations that, when applied to an existing database, produce the exact same schema than a new database created from scratch.

Indeed, while migration tools can auto-detect model changes and write automatically the migrations to apply to pre-existing databases, it does not cover the whole schema complexity, and does not prevent human errors (like modifying the model without re-generating migrations, or manually editing migrations in a slightly wrong way…)

This may cause dangerous and hard to spot bugs, especially if your unit tests and CI run on a fresh database created from the Python-written model, and not on pre-existing databases with the new migration applied.

By running comdab, you can ensure the two are the nearly-exact same.

comdab is based on the wonderful SQLAlchemy library to connect to the database, and for most of schema introspection.

Warning

comdab is still in development, only tested with PostgreSQL 14 — 18 to date.

All feedback and contributions are welcome!

Requirements

Installation

Usage

Comparing database schemas

comdab main reporting function is compare_databases(), which needs already established SQLAlchemy connections to the two databases to compare:

from comdab import compare_databases
from sqlalchemy import create_engine

engine_1 = create_engine("postgresql://user:pass@host/foo")
engine_2 = create_engine("postgresql://user:pass@host/bar")

with engine_1.connect() as left_conn, engine_2.connect() as right_conn:
    reports = compare_databases(left_conn, right_conn)

if reports:
    print("❌ Database schemas are different:", reports)
else:
    print("✅ Database schemas are the same!")

These connections will, of course, only be used for read-only database introspection.

To compare an existing database with a Python-defined schema, the latter has to be created first in a fresh database:

from comdab import compare_databases
from sqlalchemy import create_engine

from my_app import BaseModel

engine_1 = create_engine("postgresql://user:password@host/foo")
engine_2 = create_engine("postgresql://user:password@host/tmp_db_just_created")

with engine_1.connect() as left_connection, engine_2.connect() as right_connection:
   BaseModel.metadata.create_all(bind=right_connection)
   reports = compare_databases(left_connection, right_connection)

Generating migrations

To generate migrations, a user-defined MigrationGeneratorPort / PartialMigrationGeneratorPort subclass is additionally needed:

from comdab import generate_migrations, PartialMigrationGeneratorPort
from comdab.models import ComdabTable
from sqlalchemy import create_engine

class MyMigrationGenerator(PartialMigrationGeneratorPort):
    def __init__(self) -> None:
        self.sql_text = ""

    def create_table(self, *, table: ComdabTable) -> None:
        self.sql_text += f"CREATE TABLE {table.name} [...];\n"

    def drop_table(self, *, table: ComdabTable) -> None:
        self.sql_text += f"DROP TABLE {table.name};\n"

    ...

generator = MyMigrationGenerator()
engine_1 = create_engine("postgresql://user:pass@host/source")
engine_2 = create_engine("postgresql://user:pass@host/target")

with engine_1.connect() as source_conn, engine_2.connect() as target_conn:
   generate_migrations(source_conn, target_conn, generator)

print(generator.sql_text)

An Alembic-based migration generator implementation might be included in a future comdab release.

Supported database features

comdab heavily relies on SQLAlchemy schema reflection capabilities to build its internal schema representation, but extends it (using custom queries) to retrieve objects not natively handled by SQLALchemy.

Top-level schema fields

Object

Support

Notes

Tables

Yes

Views

Partial

PostgreSQL only (temporary + materialized)

Sequences

Partial

PostgreSQL only

Functions

Partial

PostgreSQL only (compares reconstructed definitions)

Custom types

Partial

All databases: enums that are used (in column.type)
PostgreSQL only: unused enums

Columns

Yes

Constraints

Partial

All databases: PK, FK, unique & check constraints
PostgreSQL only: exclude constraints

Indexes

Yes

Triggers

Partial

PostgreSQL only (compares reconstructed definitions)

Column type

Partial

PostgreSQL only: HSTORE, ranges, multiranges

Owners

No

No notion of users/roles/permissions…

Comments

No

Each object can hold non-standard / dialect-specific data in a extra dictionary, that will be compared too (unless specifically ignored, see below).

Ignore rules

comdab comes with a powerful system to allow some specific differences between the two schemas. generate_migrations(), build_comdab_schema() and compare_databases() functions take a rules argument, where you can specifies paths to be ignored or reported as warnings.

These are a mapping of model path (build from comdab.ROOT) to directives (either "ignore", warning or error), where a path may override a more general exclusion:

rules={
   # Do not compare functions
   ROOT.functions: "ignore",
   # Report differences as warnings for all tables starting with "_"
   ROOT.tables["_.*"]: "warning",
   # Except for triggers, that are really importants
   ROOT.tables["_.*"].triggers: "error",
   # Allow all left tables to have extra columns
   ROOT.tables[...].columns.left_only: "ignore",  # [...] means [".*"]
   # But warn the other way around
   ROOT.tables[...].columns.right_only: "warning",
}

Rules can use regular expressions, but two rules cannot match a same object: if you want to say “every tables but foo”, use (?!foo$).*.