Tactical Dependency Tree Analysis for PostgreSQL Extension Upgrades

Automating PostgreSQL extension upgrades in production requires deterministic dependency resolution, explicit failure boundaries, and CI/CD-native validation gates. When an extension declares requires in its .control file or relies on shared libraries, implicit upgrade paths frequently break during minor version bumps or platform migrations. Dependency tree analysis transforms this risk into a predictable, auditable pipeline stage.

Resolution Pipeline at a Glance

Dependency resolution runs as a deterministic, fail-fast pipeline stage before any DDL executes.

flowchart TD
    Start(["Target extensions"]) --> SA["Static analysis<br/>parse .control + catalogs"]
    SA --> DAG["Build dependency DAG"]
    DAG --> Cyc{"Cycle or<br/>missing dependency?"}
    Cyc -- yes --> Fail[/Exit non-zero/]
    Cyc -- no --> Topo["Topological sort"]
    Topo --> RV["Runtime validation<br/>on staging replica"]
    RV --> Exec["CREATE / ALTER EXTENSION<br/>in dependency order"]

CI/CD Integration & Resolution Strategy

Extension lifecycle management begins with mapping declared dependencies against the target PostgreSQL major/minor version. Understanding how PostgreSQL Extension Architecture & Lifecycle Fundamentals govern control file parsing, library loading, and catalog registration is mandatory before scripting automated upgrades. In a production pipeline, dependency resolution must run as a pre-flight gate before any ALTER EXTENSION UPDATE or CREATE EXTENSION executes.

The resolution phase operates in two distinct modes:

  1. Static Analysis: Parse .control files, pg_available_extensions, and internal package manifests to construct a directed acyclic graph (DAG). Accurate Extension Registry Mapping ensures that version constraints, optional dependencies, and transitive requirements are captured before deployment.
  2. Runtime Validation: Connect to a staging replica, verify shared library compatibility ($libdir paths), and confirm role permissions before promoting to production. This step must account for Security Boundaries & Permissions to prevent privilege escalation, catalog corruption, or unintended schema modifications.

Production-Ready Resolution Engine

The following Python script implements a dry-run capable dependency tree analyzer designed for CI/CD integration. It queries the system catalog, resolves transitive dependencies using Kahn’s algorithm, detects missing or circular references, and exits with explicit status codes for pipeline orchestration. The implementation distinguishes between available packages and currently installed extensions, aligning with the conceptual separation detailed in Understanding pg_available_extensions vs Installed Extensions.

#!/usr/bin/env python3
"""
PostgreSQL Extension Dependency Tree Analyzer
Designed for CI/CD pre-flight validation and idempotent dry-run execution.
Requires: psycopg2-binary (pip install psycopg2-binary)
"""

import argparse
import sys
import json
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple, Optional

try:
    import psycopg2
    import psycopg2.extras
    from psycopg2 import sql
except ImportError:
    print("ERROR: psycopg2 is required. Install via: pip install psycopg2-binary", file=sys.stderr)
    sys.exit(2)

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Resolve PostgreSQL extension dependency trees")
    parser.add_argument("--db-uri", required=True, help="PostgreSQL connection URI (postgresql://...)")
    parser.add_argument("--target-extensions", nargs="+", required=True, help="Extensions to analyze/upgrade")
    parser.add_argument("--dry-run", action="store_true", help="Validate DAG without executing DDL")
    parser.add_argument("--strict-mode", action="store_true", help="Fail on missing optional dependencies")
    return parser.parse_args()

def fetch_catalog_data(conn) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """
    Returns:
      available: {ext_name: {"default_version": str, "requires": List[str]}}
      installed: {ext_name: {"version": str, "schema": str}}
    """
    with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
        # Prerequisite extensions live in pg_available_extension_versions.requires
        # (a name[] array), NOT in the free-text `comment` column.
        cur.execute("""
            SELECT e.name, e.default_version, v.requires
            FROM pg_available_extensions e
            JOIN pg_available_extension_versions v
              ON v.name = e.name AND v.version = e.default_version;
        """)
        available = {
            row["name"]: {
                "default_version": row["default_version"],
                "requires": list(row["requires"]) if row["requires"] else []
            }
            for row in cur.fetchall()
        }

        cur.execute("SELECT extname, extversion, extnamespace::regnamespace::text AS schema FROM pg_extension;")
        installed = {
            row["extname"]: {"version": row["extversion"], "schema": row["schema"]}
            for row in cur.fetchall()
        }
    return available, installed

def build_dependency_graph(
    targets: List[str],
    available: Dict[str, dict],
    installed: Dict[str, dict],
    strict: bool
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Constructs a DAG of required extensions and returns missing dependencies."""
    graph = defaultdict(list)
    missing = []
    visited = set()
    queue = deque(targets)

    while queue:
        ext = queue.popleft()
        if ext in visited:
            continue
        visited.add(ext)

        if ext not in available:
            missing.append(ext)
            continue

        graph.setdefault(ext, [])  # register node so standalone targets aren't dropped
        deps = available[ext]["requires"]
        for dep in deps:
            graph[dep].append(ext)
            if dep not in visited:
                queue.append(dep)

    if strict and missing:
        print(f"STRICT MODE FAILURE: Missing dependencies: {', '.join(missing)}", file=sys.stderr)
        sys.exit(1)

    return graph, missing

def topological_sort(graph: Dict[str, List[str]], targets: Set[str]) -> List[str]:
    """Kahn's algorithm for deterministic dependency resolution."""
    in_degree = defaultdict(int)
    for node, neighbors in graph.items():
        if node not in in_degree:
            in_degree[node] = 0
        for neighbor in neighbors:
            in_degree[neighbor] += 1

    queue = deque([n for n in in_degree if in_degree[n] == 0])
    sorted_order = []

    while queue:
        node = queue.popleft()
        sorted_order.append(node)
        for neighbor in graph.get(node, []):
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(sorted_order) != len(in_degree):
        print("ERROR: Circular dependency detected in extension graph.", file=sys.stderr)
        sys.exit(1)

    return [n for n in sorted_order if n in targets or n in graph]

def execute_upgrade_sequence(conn, order: List[str], available: Dict, installed: Dict, dry_run: bool) -> int:
    """Applies CREATE or ALTER EXTENSION UPDATE in dependency-safe order."""
    executed = []
    for ext in order:
        if ext not in available:
            continue

        target_ver = available[ext]["default_version"]
        is_installed = ext in installed
        current_ver = installed[ext]["version"] if is_installed else None

        if is_installed and current_ver == target_ver:
            continue  # Idempotency guarantee: skip if already at target

        if dry_run:
            action = "CREATE EXTENSION" if not is_installed else "ALTER EXTENSION UPDATE"
            print(f"[DRY-RUN] {action} {ext} TO {target_ver}")
        else:
            with conn.cursor() as cur:
                if not is_installed:
                    cur.execute(sql.SQL(
                        "CREATE EXTENSION IF NOT EXISTS {} VERSION {}"
                    ).format(sql.Identifier(ext), sql.Literal(target_ver)))
                else:
                    cur.execute(sql.SQL(
                        "ALTER EXTENSION {} UPDATE TO {}"
                    ).format(sql.Identifier(ext), sql.Literal(target_ver)))
            conn.commit()
            print(f"[EXECUTED] {ext} -> {target_ver}")
        executed.append(ext)

    return 0

def main() -> int:
    args = parse_args()

    try:
        conn = psycopg2.connect(args.db_uri)
        conn.autocommit = False
    except Exception as e:
        print(f"CONNECTION ERROR: {e}", file=sys.stderr)
        return 2

    try:
        available, installed = fetch_catalog_data(conn)
        graph, missing = build_dependency_graph(args.target_extensions, available, installed, args.strict_mode)

        if missing and not args.strict_mode:
            print(f"WARNING: Optional dependencies missing: {', '.join(missing)}")

        # Filter graph to only include targets and their required ancestors
        target_set = set(args.target_extensions)
        resolution_order = topological_sort(graph, target_set)

        print(json.dumps({"status": "resolved", "order": resolution_order, "missing": missing}, indent=2))

        return execute_upgrade_sequence(conn, resolution_order, available, installed, args.dry_run)
    except Exception as e:
        print(f"RESOLUTION ERROR: {e}", file=sys.stderr)
        return 1
    finally:
        conn.close()

if __name__ == "__main__":
    sys.exit(main())

Pipeline Orchestration & Idempotency Guarantees

The script enforces idempotency by comparing pg_extension.extversion against pg_available_extensions.default_version before issuing DDL. When integrated into CI/CD workflows (GitHub Actions, GitLab CI, or Jenkins), the tool should run as an isolated stage with read-only credentials for staging validation and elevated privileges for production promotion.

Pipeline configuration should enforce:

  • Exit Code Handling: 0 indicates successful validation or dry-run completion. 1 signals dependency resolution failure (circular graphs, missing required packages). 2 denotes connection or execution errors.
  • Artifact Generation: Pipe the JSON output to a downstream deployment step that applies ALTER EXTENSION commands only when the DAG resolves cleanly.
  • Version Pinning: By default the tool targets each extension’s default_version; extend the resolver with a --target-version argument to pin an explicit version and prevent unexpected major bumps during automated runs.

For authoritative guidance on extension loading mechanics and catalog behavior, consult the official PostgreSQL Documentation: Extension Loading. When implementing custom database drivers or connection wrappers, adhere to the PEP 249 – Python Database API Specification v2.0 to ensure transactional safety and consistent error propagation across environments.

By treating dependency resolution as a deterministic, auditable pipeline stage rather than an implicit runtime operation, engineering teams eliminate upgrade drift, enforce strict version boundaries, and maintain catalog integrity across distributed PostgreSQL clusters.