pglifecycle: Your PostgreSQL schema as version-controlled data

01: The problem · Migrations hide state

Ordered migration files tell you how you got here. They never tell you where you are.

A folder of 0001_…sql, 0002_…sql is a transaction log, not a schema. To know the current shape of a table you replay history in your head, or you dump the database and read raw SQL. The declared state is never written down.

Migrations: the log

State is implied, never stated

The shape of users lives scattered across dozens of append-only SQL files. Reviewers approve deltas they can't fully picture. Drift goes unnoticed.

0001_create_users.sql

0014_add_email_citext.sql

0039_drop_legacy_name.sql

current state of users = ???

pglifecycle: the state

State is the source of truth

pglifecycle inverts the model: it versions the declared state as structured data. One object, one file. The current shape of users is exactly what's on disk.

tables/public.users.yaml

tables/public.orders.yaml

roles/PUBLIC.yaml

current state of users = read the file

↳ Migrations still run the deploy. pglifecycle owns the declaration: the canonical, diffable picture of what your database should look like, kept honest by a round-trip test gate.

02: The commands

Four commands. That's the whole surface area.

pglifecycle is built around four verbs, and they compose into everything else: scaffold a new project, read a live database into files, compile files into a restorable archive, and apply the declared state to a live database. The first three ship today; the fourth, deploy, is planned.

pull

Point it at any reachable database. pglifecycle reads the catalog through the same paths pg_dump uses, and writes one file per object into your schema directory.

Tables, views, functions, sequences, types, roles, and grants.
SQL bodies parsed with tree-sitter and auto-formatted on the way out.
Output validated against the JSON-Schema contract before it lands.
--update: Merge fresh introspection into an existing project, preserving comments and ordering.

pglifecycle pull

$ pglifecycle pull -h db.internal -U postgres -d app schema/ --exclude-schema pgq

pglifecycle v2.0.0 Creating postgres@db.internal → schema/

Created schema/ from postgres@db.internal with 3036 objects:

    37  schemas          278  sequences         80  views
    13  extensions      1734  tables            11  materialized views
     3  domains          523  functions        191  users
    62  types            104  roles

build

Compile the whole declaration into a pg_restore-compatible archive. Object ordering isn't ours; it comes straight from pg_dump's own topological sort, so the output restores cleanly every time.

Dependency order resolved by libpgdump, not heuristics.
Deterministic: same YAML in, byte-identical archive out.
Emits a pg_dump custom-format (-Fc) archive; restore it with pg_restore.

pglifecycle build

$ pglifecycle build schema/ build.dump  # files → archive
 
  topological sort … 3036 objects
  wrote build.dump
  ✓ restore-order verified
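
The determinism claim is easy to check for yourself: build twice and compare the bytes. A minimal sketch, assuming the `pglifecycle build <dir> <archive>` form used in the round-trip example; `same_output_twice` is a hypothetical helper name, not part of pglifecycle.

```shell
#!/bin/sh
# Hypothetical helper: run a build command twice and compare the two outputs byte for byte.
# Usage: same_output_twice CMD [ARGS...]  (the output path is appended as the last argument)
same_output_twice() {
  dir=$(mktemp -d) || return 2
  "$@" "$dir/first" && "$@" "$dir/second" && cmp -s "$dir/first" "$dir/second"
  rc=$?
  rm -rf "$dir"
  return "$rc"
}

# Assumed invocation, based on the build syntax shown on this page:
#   same_output_twice pglifecycle build schema/ && echo "deterministic"
```

The helper returns 0 only when both builds succeed and the outputs are identical, so it can sit directly in a CI step.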

create

Scaffold a new empty project directory, the starting point before you run pull on an existing database or build a schema from scratch by hand.

Creates the full directory layout and a project.yaml config.
Sets encoding, superuser, and standard-conforming strings.
Adds .gitkeep files so empty dirs survive your first commit.

pglifecycle create

$ pglifecycle create my-project/  # scaffold → disk
 
  created my-project/
  ✓ project.yaml · tables/ · views/ · roles/
  ✓ functions/ · schemata/ · sequences/
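
For orientation, the layout above is roughly what you would get by hand with `mkdir` and `touch`. This is a sketch of the idea only, assuming the directory names shown in the output; it is not what create runs internally, and it leaves out project.yaml, whose fields are not reproduced here.

```shell
#!/bin/sh
# Hand-rolled approximation of the scaffold; directory names come from the output above.
project=${1:-my-project}
for kind in tables views roles functions schemata sequences; do
  mkdir -p "$project/$kind"
  # Git does not track empty directories, so a placeholder file keeps each one in the first commit.
  touch "$project/$kind/.gitkeep"
done
```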

deploy

Planned: diff the declared state against a live database and apply the delta, with no migration files and no replay surprises. Until it ships, deploys still run through your migration tool.

Computes the diff between the declared schema and the live catalog.
Applies only what changed, with no full drop-and-recreate.
The missing piece that closes the loop: declare, review, deploy.

pglifecycle deploy

$ pglifecycle deploy schema/ --dbname app_production  # diff → apply
 
  not yet available; coming in a future release
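
Until deploy is available, one way to at least see the delta is to pull the live database into a scratch directory and compare it with the committed declaration. A sketch under assumptions: it reuses the `pull --dbname` form from the round-trip example, and it assumes a fresh pull of an unchanged database reproduces the committed files exactly, which this page does not state. `report_drift` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print a unified diff between the declared and the live schema trees.
# Returns 0 when the trees match and 1 when they differ (diff's own exit codes).
report_drift() {
  declared=$1
  live=$2
  diff -ru "$declared" "$live"
}

# Assumed usage, with the pull syntax shown elsewhere on this page:
#   scratch=$(mktemp -d)
#   pglifecycle pull --dbname app_production "$scratch/live"
#   report_drift schema/ "$scratch/live" || echo "live database has drifted"
```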

03: The proof · The killer demo

The round-trip is a test, not a slogan.

Dump a database. Pull it into YAML. Build it back. Restore it. Dump again. The two dumps are identical, and that comparison can run as an automated gate on every change.

pg_dump (live database)
→ pglifecycle pull → schema/*.yaml
→ pglifecycle build → build.dump
→ pg_restore → fresh database
→ pg_dump → re-dump

diff original.dump roundtrip.dump → 0 differences

The dump that comes out the far end matches the one that went in, object for object.

# .github/workflows/schema.yml
roundtrip-test ✓ passed

round-trip · CI gate

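$ pg_dump -Fc app_production -f original.dump
$ pglifecycle pull --dbname app_production schema/
$ pglifecycle build schema/ build.dump
$ createdb app_roundtrip
$ pg_restore -d app_roundtrip build.dump
$ pg_dump -Fc app_roundtrip -f roundtrip.dump
 
  # compare the archives' tables of contents, minus the header comments and
  # the per-database dump IDs and OIDs, which differ between any two dumps
$ pg_restore -l original.dump | sed -e '/^;/d' -e 's/^[0-9]*; [0-9]* [0-9]* //' > a.txt
$ pg_restore -l roundtrip.dump | sed -e '/^;/d' -e 's/^[0-9]*; [0-9]* [0-9]* //' > b.txt
$ diff a.txt b.txt && echo "round-trip OK"
  round-trip OK
 
  # if the dumps ever drift, this check catches it, loudly, in CI.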

04: No manifest · It already knows the graph

No ordering file. No manifest. It knows how your schema fits together.

Other tools make you maintain a hand-ordered list: a manifest that says "run roles, then schemas, then tables, then views." pglifecycle never asks. It reads the dependencies out of your objects and resolves them the same way pg_dump does.

dependency graph resolved automatically · 0 manifests

↳ built in pg_dump's topological order; restores clean every time

The manifest you don't write

There's no order.txt, no numbered prefixes, no "depends_on" key for you to keep in sync. Add a foreign key and the edge appears in the graph on its own.

Dependencies come from the objects

A view references a table; a grant references a role; a foreign key references another table. pglifecycle reads those relationships directly, using the same catalog facts Postgres itself relies on.

Ordered by pg_dump, not by us

When it's time to build, the topological sort is borrowed straight from pg_dump via libpgdump. So "what depends on what" is decided by the same engine that has gotten it right for decades.
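
To make the idea concrete: a topological sort takes "A must exist before B" pairs and produces an order that respects all of them. The sketch below runs the POSIX `tsort` utility over a few hand-written pairs drawn from the example project. It only illustrates the concept; pg_dump's real sort covers many more object types and special cases, and the edges here are assumptions written for this example, not libpgdump output.

```shell
#!/bin/sh
# Each line is "prerequisite dependent": the left object must be created before the right one.
# These edges are illustrative and hand-written for this example.
edges='role:app table:public.users
schema:public table:public.users
schema:public table:public.orders
table:public.users table:public.orders
table:public.users view:public.active_users'

# tsort prints one valid creation order; several valid orders may exist.
order=$(printf '%s\n' "$edges" | tsort)
printf '%s\n' "$order"
```

Adding a new pair, such as a foreign key from one table to another, changes the output order without anyone editing a manifest, which is the property this section describes.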

05: Why it's different

Built on Postgres' own machinery, not around it.

Plenty of tools generate SQL. pglifecycle is opinionated about exactly two things: schema is data, and the source of truth is whatever pg_dump agrees with.

Schema as data

Your declared state is structured YAML, not opaque SQL blobs. Query it, diff it, lint it, template it: it's just data.

A JSON-Schema contract

Every object is validated against a published JSON-Schema before it's accepted. Typos and invalid shapes fail fast, not at deploy time.

Native pg_dump fidelity

Builds round-trip through pg_dump / pg_restore, including pg_dump's own topological sort. We don't reinvent dependency ordering.

One file per object

Each table, view, function, and role is its own file. Git history per object: blameable lines, reviewable pull requests, clean merges.

First-class roles & grants

Roles, memberships, and per-object ACLs (including PUBLIC) are versioned objects, not an afterthought bolted on at the end.

Auto-formatted SQL bodies

Function, view, and trigger bodies are normalized with libpgfmt, so diffs reflect a logic change, not someone's whitespace.

A single Rust binary

One static binary. No interpreter, no virtualenv, no dependency tree to resolve in CI. Download it, drop it on the runner, and it just runs on Linux and macOS.

06: See the real thing

This is what your schema looks like on disk.

Not a marketing mock-up. The actual users table, the PUBLIC role's ACL, and a real project tree, exactly as pglifecycle writes and reads them.

schema/tables/public.users.yaml

# schema/tables/public.users.yaml
schema: public
name: users
owner: app
columns:
  - name: id
    type: bigint
    nullable: false
    identity: always
  - name: email
    type: citext
    nullable: false
  - name: full_name
    type: text
  - name: created_at
    type: timestamptz
    nullable: false
    default: now()
primary_key:
  name: users_pkey
  columns: [id]
indexes:
  - name: users_email_key
    columns: [email]
    unique: true
grants:
  - role: app_readwrite
    privileges: [SELECT, INSERT, UPDATE, DELETE]
  - role: app_readonly
    privileges: [SELECT]

One object, fully declared

Columns, identity, primary key, indexes, and grants all live in a single file. There's no migration to read alongside it. This is the table.

pglifecycle pull wrote this file. pglifecycle build turns it back into the exact CREATE TABLE + GRANT statements pg_dump would emit.

Validated against the JSON-Schema contract on every read and write.
Grants travel with the table, with no separate permissions script.
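
Because the declaration is plain text on disk, ordinary tools can answer questions about it without a database connection. A minimal sketch, assuming the `schema/tables/` layout and the `- role:` key shown in the file above; `tables_granting_to` is a hypothetical helper, and a line-oriented grep is only a rough stand-in for a real YAML parser.

```shell
#!/bin/sh
# Hypothetical helper: list table files that contain a grant entry for the given role.
# Matches lines of the form "  - role: NAME", as in the grants block above.
tables_granting_to() {
  role=$1
  dir=${2:-schema/tables}
  grep -l -E "^[[:space:]]*-[[:space:]]*role:[[:space:]]*${role}[[:space:]]*$" "$dir"/*.yaml
}

# Example: tables_granting_to app_readonly
```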

schema/roles/PUBLIC.yaml

# schema/roles/PUBLIC.yaml
# The PUBLIC pseudo-role, locked down by default.
name: PUBLIC
revoke:
  - on: SCHEMA public
    privileges: [ALL]
  - on: DATABASE app
    privileges: [ALL]
grants:
  - on: DATABASE app
    privileges: [CONNECT]
default_acls:
  - in_schema: public
    on: TABLES
    privileges: []   # no implicit grants to PUBLIC

PUBLIC is a real, reviewable object

In most tools the implicit PUBLIC grants are invisible until they become a security finding. Here they're a versioned file you can review in a pull request.

Revokes and grants are both explicit, so the resulting ACL is exactly what pg_dump prints, and exactly what restores.

Default privileges modeled, so future objects inherit the right ACL.
Changes to access control show up in git blame like any other change.

schema/: project tree

schema/
├── project.yaml          # project config + contract version
├── schemata/
│   └── public.yaml
├── roles/
│   ├── PUBLIC.yaml
│   ├── app_readwrite.yaml
│   └── app_readonly.yaml
├── tables/
│   ├── public.users.yaml
│   ├── public.orders.yaml
│   └── public.sessions.yaml
├── views/
│   └── public.active_users.yaml
├── functions/
│   └── public.set_updated_at.yaml
└── sequences/
    └── public.orders_id_seq.yaml

A directory you'd actually commit

Objects are grouped by kind, named schema.object.yaml. It reads like a filesystem because it is one. No database required to browse your schema.

project.yaml pins the contract version, so a checkout from two years ago still validates against the schema it was written for.

Tiny, focused diffs: a column change touches exactly one file.
CODEOWNERS by directory: roles reviewed by security, tables by the team.
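
On GitHub, one way to wire up review-by-directory is a CODEOWNERS file keyed on the schema folders. The team handles below are placeholders (assumptions), and the paths assume the project lives in `schema/` at the repository root.

```shell
#!/bin/sh
# Write a CODEOWNERS file that routes reviews by object kind.
# @example-org/security and @example-org/app-team are placeholder team handles.
mkdir -p .github
cat > .github/CODEOWNERS <<'EOF'
/schema/roles/   @example-org/security
/schema/tables/  @example-org/app-team
EOF
```

With this in place, a pull request that touches only a role file requests review from the security team, and a column change requests review from the table owners.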

07: Under the hood · The crates

Three Rust crates do the heavy lifting.

pglifecycle is the CLI on top. The interesting parts are the libraries underneath, each one published and reusable on its own.

tree-sitter-postgres

An incremental parser for the PostgreSQL SQL dialect. Function, view, and check bodies become real syntax trees, so they can be validated and reformatted rather than string-matched.

libpgdump

Reads and writes pg_dump's archive format and reproduces its topological object ordering: the reason a built schema restores cleanly without hand-rolled dependency logic.

libpgfmt

A canonical formatter for PostgreSQL SQL. Normalizes embedded bodies to one stable style, so version control shows what changed in meaning, not in spacing.

Rewritten in Rust · v2

v2 is a ground-up Rust rewrite: a single static binary, faster introspection, and a strict contract. Same workflow, far fewer moving parts.

Footnote: v1 was written in Python and has run AWeber's production schema for 7+ years. v2 keeps the verbs identical so existing schema directories carry over.

Open source · BSD 3-Clause

pglifecycle is released under the BSD 3-Clause license: permissive, business-friendly, and close in spirit to the PostgreSQL License that the PostgreSQL project itself uses.

Battle-tested at AWeber on a real, large production schema for over seven years before the rewrite.

Pull your schema into version control today.

One binary, four commands, and a round-trip test you can trust. Point it at a database and read the diff.

install · Homebrew

$ brew tap gmr/postgres
$ brew install pglifecycle
$ pglifecycle --version
  pglifecycle 2.0.0-alpha.0 · contract v2 · BSD-3

Homebrew 6.0+ may require trusting the tap first: brew trust --formula gmr/postgres/pglifecycle

install · Cargo

$ cargo install pglifecycle
$ pglifecycle --version
  pglifecycle 2.0.0-alpha.0 · contract v2 · BSD-3

Pre-built binaries for Linux and macOS (x86_64 and aarch64) are attached to each release. No Rust toolchain required.
