Tactical Dependency Tree Analysis for PostgreSQL Extension Upgrades
Automating PostgreSQL extension upgrades in production requires deterministic dependency resolution, explicit failure boundaries, and CI/CD-native validation gates. When an extension declares requires in its .control file or relies on shared libraries, implicit upgrade paths frequently break during minor version bumps or platform migrations. Dependency tree analysis transforms this risk into a predictable, auditable pipeline stage.
Resolution Pipeline at a Glance
Dependency resolution runs as a deterministic, fail-fast pipeline stage before any DDL executes.
flowchart TD
Start(["Target extensions"]) --> SA["Static analysis<br/>parse .control + catalogs"]
SA --> DAG["Build dependency DAG"]
DAG --> Cyc{"Cycle or<br/>missing dependency?"}
Cyc -- yes --> Fail[/Exit non-zero/]
Cyc -- no --> Topo["Topological sort"]
Topo --> RV["Runtime validation<br/>on staging replica"]
RV --> Exec["CREATE / ALTER EXTENSION<br/>in dependency order"]
CI/CD Integration & Resolution Strategy
Extension lifecycle management begins with mapping declared dependencies against the target PostgreSQL major/minor version. Understanding how PostgreSQL Extension Architecture & Lifecycle Fundamentals govern control file parsing, library loading, and catalog registration is mandatory before scripting automated upgrades. In a production pipeline, dependency resolution must run as a pre-flight gate before any ALTER EXTENSION UPDATE or CREATE EXTENSION executes.
The resolution phase operates in two distinct modes:
- Static Analysis: Parse
.controlfiles,pg_available_extensions, and internal package manifests to construct a directed acyclic graph (DAG). Accurate Extension Registry Mapping ensures that version constraints, optional dependencies, and transitive requirements are captured before deployment. - Runtime Validation: Connect to a staging replica, verify shared library compatibility (
$libdirpaths), and confirm role permissions before promoting to production. This step must account for Security Boundaries & Permissions to prevent privilege escalation, catalog corruption, or unintended schema modifications.
Production-Ready Resolution Engine
The following Python script implements a dry-run capable dependency tree analyzer designed for CI/CD integration. It queries the system catalog, resolves transitive dependencies using Kahn’s algorithm, detects missing or circular references, and exits with explicit status codes for pipeline orchestration. The implementation distinguishes between available packages and currently installed extensions, aligning with the conceptual separation detailed in Understanding pg_available_extensions vs Installed Extensions.
#!/usr/bin/env python3
"""
PostgreSQL Extension Dependency Tree Analyzer
Designed for CI/CD pre-flight validation and idempotent dry-run execution.
Requires: psycopg2-binary (pip install psycopg2-binary)
"""
import argparse
import sys
import json
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple, Optional
try:
import psycopg2
import psycopg2.extras
from psycopg2 import sql
except ImportError:
print("ERROR: psycopg2 is required. Install via: pip install psycopg2-binary", file=sys.stderr)
sys.exit(2)
def parse_args() -> argparse.Namespace:
parser = argparse.ArgumentParser(description="Resolve PostgreSQL extension dependency trees")
parser.add_argument("--db-uri", required=True, help="PostgreSQL connection URI (postgresql://...)")
parser.add_argument("--target-extensions", nargs="+", required=True, help="Extensions to analyze/upgrade")
parser.add_argument("--dry-run", action="store_true", help="Validate DAG without executing DDL")
parser.add_argument("--strict-mode", action="store_true", help="Fail on missing optional dependencies")
return parser.parse_args()
def fetch_catalog_data(conn) -> Tuple[Dict[str, dict], Dict[str, dict]]:
"""
Returns:
available: {ext_name: {"default_version": str, "requires": List[str]}}
installed: {ext_name: {"version": str, "schema": str}}
"""
with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
# Prerequisite extensions live in pg_available_extension_versions.requires
# (a name[] array), NOT in the free-text `comment` column.
cur.execute("""
SELECT e.name, e.default_version, v.requires
FROM pg_available_extensions e
JOIN pg_available_extension_versions v
ON v.name = e.name AND v.version = e.default_version;
""")
available = {
row["name"]: {
"default_version": row["default_version"],
"requires": list(row["requires"]) if row["requires"] else []
}
for row in cur.fetchall()
}
cur.execute("SELECT extname, extversion, extnamespace::regnamespace::text AS schema FROM pg_extension;")
installed = {
row["extname"]: {"version": row["extversion"], "schema": row["schema"]}
for row in cur.fetchall()
}
return available, installed
def build_dependency_graph(
targets: List[str],
available: Dict[str, dict],
installed: Dict[str, dict],
strict: bool
) -> Tuple[Dict[str, List[str]], List[str]]:
"""Constructs a DAG of required extensions and returns missing dependencies."""
graph = defaultdict(list)
missing = []
visited = set()
queue = deque(targets)
while queue:
ext = queue.popleft()
if ext in visited:
continue
visited.add(ext)
if ext not in available:
missing.append(ext)
continue
graph.setdefault(ext, []) # register node so standalone targets aren't dropped
deps = available[ext]["requires"]
for dep in deps:
graph[dep].append(ext)
if dep not in visited:
queue.append(dep)
if strict and missing:
print(f"STRICT MODE FAILURE: Missing dependencies: {', '.join(missing)}", file=sys.stderr)
sys.exit(1)
return graph, missing
def topological_sort(graph: Dict[str, List[str]], targets: Set[str]) -> List[str]:
"""Kahn's algorithm for deterministic dependency resolution."""
in_degree = defaultdict(int)
for node, neighbors in graph.items():
if node not in in_degree:
in_degree[node] = 0
for neighbor in neighbors:
in_degree[neighbor] += 1
queue = deque([n for n in in_degree if in_degree[n] == 0])
sorted_order = []
while queue:
node = queue.popleft()
sorted_order.append(node)
for neighbor in graph.get(node, []):
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
if len(sorted_order) != len(in_degree):
print("ERROR: Circular dependency detected in extension graph.", file=sys.stderr)
sys.exit(1)
return [n for n in sorted_order if n in targets or n in graph]
def execute_upgrade_sequence(conn, order: List[str], available: Dict, installed: Dict, dry_run: bool) -> int:
"""Applies CREATE or ALTER EXTENSION UPDATE in dependency-safe order."""
executed = []
for ext in order:
if ext not in available:
continue
target_ver = available[ext]["default_version"]
is_installed = ext in installed
current_ver = installed[ext]["version"] if is_installed else None
if is_installed and current_ver == target_ver:
continue # Idempotency guarantee: skip if already at target
if dry_run:
action = "CREATE EXTENSION" if not is_installed else "ALTER EXTENSION UPDATE"
print(f"[DRY-RUN] {action} {ext} TO {target_ver}")
else:
with conn.cursor() as cur:
if not is_installed:
cur.execute(sql.SQL(
"CREATE EXTENSION IF NOT EXISTS {} VERSION {}"
).format(sql.Identifier(ext), sql.Literal(target_ver)))
else:
cur.execute(sql.SQL(
"ALTER EXTENSION {} UPDATE TO {}"
).format(sql.Identifier(ext), sql.Literal(target_ver)))
conn.commit()
print(f"[EXECUTED] {ext} -> {target_ver}")
executed.append(ext)
return 0
def main() -> int:
args = parse_args()
try:
conn = psycopg2.connect(args.db_uri)
conn.autocommit = False
except Exception as e:
print(f"CONNECTION ERROR: {e}", file=sys.stderr)
return 2
try:
available, installed = fetch_catalog_data(conn)
graph, missing = build_dependency_graph(args.target_extensions, available, installed, args.strict_mode)
if missing and not args.strict_mode:
print(f"WARNING: Optional dependencies missing: {', '.join(missing)}")
# Filter graph to only include targets and their required ancestors
target_set = set(args.target_extensions)
resolution_order = topological_sort(graph, target_set)
print(json.dumps({"status": "resolved", "order": resolution_order, "missing": missing}, indent=2))
return execute_upgrade_sequence(conn, resolution_order, available, installed, args.dry_run)
except Exception as e:
print(f"RESOLUTION ERROR: {e}", file=sys.stderr)
return 1
finally:
conn.close()
if __name__ == "__main__":
sys.exit(main())
Pipeline Orchestration & Idempotency Guarantees
The script enforces idempotency by comparing pg_extension.extversion against pg_available_extensions.default_version before issuing DDL. When integrated into CI/CD workflows (GitHub Actions, GitLab CI, or Jenkins), the tool should run as an isolated stage with read-only credentials for staging validation and elevated privileges for production promotion.
Pipeline configuration should enforce:
- Exit Code Handling:
0indicates successful validation or dry-run completion.1signals dependency resolution failure (circular graphs, missing required packages).2denotes connection or execution errors. - Artifact Generation: Pipe the JSON output to a downstream deployment step that applies
ALTER EXTENSIONcommands only when the DAG resolves cleanly. - Version Pinning: By default the tool targets each extension’s
default_version; extend the resolver with a--target-versionargument to pin an explicit version and prevent unexpected major bumps during automated runs.
For authoritative guidance on extension loading mechanics and catalog behavior, consult the official PostgreSQL Documentation: Extension Loading. When implementing custom database drivers or connection wrappers, adhere to the PEP 249 – Python Database API Specification v2.0 to ensure transactional safety and consistent error propagation across environments.
By treating dependency resolution as a deterministic, auditable pipeline stage rather than an implicit runtime operation, engineering teams eliminate upgrade drift, enforce strict version boundaries, and maintain catalog integrity across distributed PostgreSQL clusters.