Triggers vs Logical Replication for CDC in pg_trickle

Triggers vs Logical Replication for CDC in pg_trickle

Status: Evaluation Report (updated with implementation status)
Date: 2026-02-24
Context: ADR-001/ADR-002 in PLAN_ADRS.md · PLAN_USER_TRIGGERS_EXPLICIT_DML.md

Executive Summary

pg_trickle uses row-level AFTER triggers to capture changes on source tables. This report evaluates the trigger-based approach against logical replication (WAL-based CDC) across five dimensions: correctness, performance, operations, and two end-user features — user-defined triggers on stream tables and logical replication subscriptions from stream tables.

Conclusion: Triggers remain the correct choice for the current scope given operational simplicity and zero-config deployment. The hybrid approach — trigger bootstrap for creation with automatic WAL transition for steady-state — is now implemented (pg_trickle.cdc_mode GUC, src/wal_decoder.rs). User- defined triggers on stream tables are also implemented (pg_trickle.user_triggers GUC, DISABLE TRIGGER USER during refresh). These were previously recommendations (§6.2, §6.6); both are now shipped.

However, the atomicity constraint — the original reason for choosing triggers — is primarily a creation-time inconvenience, not a steady-state limitation. Once a stream table exists, logical replication has three significant runtime advantages:

No write-side overhead — With triggers, every INSERT/UPDATE/DELETE on a tracked source table does extra work before the application’s transaction can commit: it runs a PL/pgSQL function, writes a row into a buffer table, and updates an index. This slows down the application. With logical replication, PostgreSQL already writes every change to its internal transaction log (WAL) regardless — the CDC layer simply reads that log after the fact, so the application’s writes are not slowed down at all.
TRUNCATE capture — When someone runs TRUNCATE on a source table, row-level triggers do not fire (TRUNCATE replaces the entire file rather than deleting rows one-by-one). This leaves stream tables silently stale until a manual refresh. Logical replication captures TRUNCATE natively from the WAL, so pg_trickle would know immediately that all rows were removed.
Change ordering from the transaction log — With triggers, each trigger independently calls pg_current_wal_lsn() to timestamp its change. With logical replication, the ordering comes directly from the WAL — the authoritative, global record of all database changes — which means change ordering is guaranteed to match commit order, even across concurrent transactions.

The two end-user features (user triggers and logical replication FROM stream tables) are both achievable without changing the CDC mechanism. A hybrid approach (triggers for creation, logical replication for steady-state) deserves serious consideration. See §3 for the full analysis.

1. Background

Current Architecture

CDC triggers on each tracked source table write typed per-column rows into per-table buffer tables (pgtrickle_changes.changes_<oid>). Each buffer row captures:

Column	Purpose
`change_id`	BIGSERIAL ordering within a source
`lsn`	`pg_current_wal_lsn()` at trigger time
`action`	`'I'` / `'U'` / `'D'`
`pk_hash`	Content hash of PK columns (optional)
`new_<col>`	Per-column NEW values (INSERT/UPDATE)
`old_<col>`	Per-column OLD values (UPDATE/DELETE)

A covering B-tree index (lsn, pk_hash, change_id) INCLUDE (action) supports the differential refresh’s LSN-range scan.

The Atomicity Constraint

create_stream_table() performs DDL (CREATE TABLE) and DML (catalog inserts) before setting up CDC. pg_create_logical_replication_slot() cannot execute inside a transaction that has already performed writes. This makes single-transaction atomic creation impossible with logical replication — the decisive factor in the original ADR.

2. Comparison Matrix

2.1 Correctness & Transactional Safety

Aspect	Triggers	Logical Replication
Atomic creation	✅ Same transaction as DDL+catalog	❌ Slot creation requires separate transaction
Change visibility	✅ Immediate (same transaction)	⚠️ Asynchronous (after COMMIT + WAL decode)
TRUNCATE capture	❌ Row-level triggers not fired	✅ WAL emits TRUNCATE since PG 11
Transaction ordering	✅ Change buffer rows ordered by LSN	✅ WAL stream preserves commit order
Crash recovery	✅ Buffer tables are WAL-logged; no orphan state	⚠️ Slot survives crash but may need re-sync
Schema change handling	✅ DDL event hooks rebuild trigger in-place	⚠️ Requires slot re-creation or output plugin awareness

Key insight: The TRUNCATE gap is the most significant correctness limitation of the trigger approach. A statement-level AFTER TRUNCATE trigger that marks downstream STs for automatic FULL refresh would close this gap without changing the CDC architecture (see §6 Recommendation 3).

2.2 Performance

Metric	Triggers	Logical Replication
Per-row write overhead	~2–4 μs (narrow INSERT) to ~5–15 μs (wide UPDATE)	~0 (WAL writes happen regardless)
Expected throughput reduction	1.5–5× on tracked source tables	None on source tables
Write amplification	2× (source WAL + buffer table WAL + index)	1× (only source WAL)
Change buffer storage	Heap table + index per source	WAL segments (shared, recycled)
Sequence contention	BIGSERIAL per buffer (lightweight)	N/A
Throughput ceiling	~5,000 writes/sec (estimated)	WAL throughput (much higher)
Decoding CPU cost	N/A	Non-trivial; output plugin runs in WAL sender
Zero-change refresh	~3 ms (EXISTS check on empty buffer)	~3 ms (no pending WAL changes)

Key insight: Trigger overhead is synchronous — every committing transaction pays the cost. For applications with moderate write rates (<5,000 writes/sec) this is acceptable. For high-throughput OLTP workloads, logical replication’s zero write-side overhead is a significant advantage.

2.3 Operational Complexity

Aspect	Triggers	Logical Replication
PostgreSQL configuration	None required	`wal_level = logical` + restart
Managed PG compatibility	✅ Works everywhere	⚠️ Some providers restrict `wal_level`
WAL retention risk	None (buffer tables are independent)	Slots prevent WAL cleanup; disk exhaustion risk
Slot management	N/A	Create, monitor, drop; orphan detection
`max_replication_slots`	N/A	Must be sized for number of tracked sources
`REPLICA IDENTITY` config	N/A	Required on all tracked source tables
Monitoring	Buffer table row counts	Slot lag, WAL retention, decode rate
Extension dependencies	None	Output plugin (`pgoutput`, `wal2json`, or custom)
Upgrade path	`CREATE OR REPLACE FUNCTION`	Slot protocol version compatibility

Key insight: Triggers are operationally simpler by a wide margin. Logical replication introduces a class of failure modes (stuck slots, WAL bloat, replica identity misconfiguration) that require dedicated monitoring and operational runbooks.

2.4 Feature: User Triggers on Stream Tables

This addresses end-user triggers on the output stream tables, not CDC triggers on source tables.

Aspect	Current (Trigger CDC)	With Logical Replication CDC
Feasibility	✅ Achievable via `session_replication_role`	✅ Same mechanism applies
Refresh suppression	`SET LOCAL session_replication_role = 'replica'`	Same
Post-refresh notification	`NOTIFY pg_trickle_refresh` with metadata	Same
MERGE firing pattern	DELETE+INSERT (not UPDATE); must be suppressed	Same — refresh mechanism is independent of CDC

Key insight: User trigger support on stream tables is orthogonal to the CDC mechanism and is now implemented. The solution uses ALTER TABLE ... DISABLE TRIGGER USER / ENABLE TRIGGER USER around FULL refresh (avoiding the session_replication_role conflict with logical replication publishing). In DIFFERENTIAL mode, explicit per-row DML (INSERT/UPDATE/DELETE) is used instead of MERGE so that user-defined AFTER triggers fire correctly. The implementation is controlled by the pg_trickle.user_triggers GUC (auto/ on/off). See PLAN_USER_TRIGGERS_EXPLICIT_DML.md for the full design.

Note: Sections 2.1–2.5 compare creation-time and operational aspects. For a focused steady-state comparison (what matters once the ST exists), see §3.

2.5 Feature: Logical Replication FROM Stream Tables

This addresses end-users subscribing to stream table changes via PostgreSQL’s built-in logical replication.

Aspect	Status	Notes
Basic publishing	✅ Works today	STs are regular heap tables; `CREATE PUBLICATION` works
`__pgt_row_id` column	⚠️ Replicated by default	Use column list in PUBLICATION to exclude, or document as usable PK
Differential refresh	✅ DELETE+INSERT via MERGE are replicated	Subscriber sees individual DELETEs and INSERTs, not UPDATEs
Full refresh	✅ TRUNCATE + INSERT replicated	Subscriber needs `replica_identity` set; receives TRUNCATE + mass INSERT
`REPLICA IDENTITY`	Needs configuration	`__pgt_row_id` could serve as unique index for identity

The `session_replication_role` Conflict

If the refresh engine sets session_replication_role = 'replica' to suppress user triggers (Phase 1 of the user-trigger plan), this may also suppress publication of the DML to logical replication subscribers. When a session is in replica mode, PostgreSQL treats it as a replication subscriber — DML performed in that session may not be forwarded to downstream subscribers (depending on the publication’s publish_via_partition_root and the subscriber’s origin setting).

This is a potential conflict between the two features. Options:

Option	User Triggers Suppressed?	Replication Published?	Drawback
`session_replication_role = 'replica'`	✅ Yes	❌ May not be published	Breaks logical replication from STs
`ALTER TABLE ... DISABLE TRIGGER USER`	✅ Yes	✅ Yes	Requires `ACCESS EXCLUSIVE` lock
`pg_trickle.suppress_user_triggers` GUC → `DISABLE TRIGGER USER` only when needed	✅ Configurable	✅ Yes	Lock overhead; crash safety concern (ENABLE on recovery)
`tgisinternal` flag manipulation	✅ Yes	✅ Yes	Non-portable; catalog-level hack

Recommended resolution: Use ALTER TABLE ... DISABLE TRIGGER USER within a SAVEPOINT, restoring on error. The ACCESS EXCLUSIVE lock is brief (only held for the catalog update, not the entire refresh). If the user has enabled both user triggers AND logical replication on a stream table, this is the only approach that supports both simultaneously. If neither feature is in use, skip the overhead entirely.

3. Separating Creation-Time from Steady-State

The original ADR chose triggers because pg_create_logical_replication_slot() cannot execute inside a transaction that has already performed writes. This report initially treated that constraint as “decisive.” But it deserves scrutiny: the atomicity constraint only affects the create_stream_table() call — a one-time event. Once a stream table exists, CDC runs for hours, days, or months. The steady-state characteristics are what actually matter for performance, correctness, and user experience.

3.1 The Atomicity Constraint Is a Solvable Engineering Problem

The constraint is real but workable. Three approaches exist, all with well-understood trade-offs:

Approach	How It Works	Downside
Two-phase creation	Phase 1: DDL + catalog in one transaction. Phase 2: slot creation in a separate transaction. Rollback Phase 1 artifacts on Phase 2 failure.	Brief window where catalog entry exists without CDC. Cleanup on failure adds ~50 lines of code.
Background worker handoff	Main transaction creates DDL + catalog + temporary trigger. Background worker creates slot asynchronously, then drops trigger.	Race window: changes between COMMIT and slot creation are captured by the temporary trigger, so no data is lost. Adds complexity (~100 lines).
Trigger bootstrap → slot transition	Create with triggers (current approach). After first successful refresh, migrate to logical replication in the background.	Trigger overhead during bootstrap period (minutes). Most natural hybrid approach.

None of these are architecturally difficult. The two-phase approach is straightforward — if slot creation fails, drop the storage table and catalog entry. The temporary-trigger approach eliminates even the theoretical data-loss window. These are engineering inconveniences, not fundamental blockers.

3.2 Steady-State: Triggers vs Logical Replication (Honest Comparison)

Once the stream table exists and CDC is running, here is how the two approaches compare on their actual runtime merits.

In plain terms: With triggers, every time the application writes a row to a tracked source table, the database does extra work right then and there — calling a function, writing to a buffer table, updating an index — all before the application’s transaction can finish. This is like a toll booth on a highway: every car (write) must stop and pay (trigger overhead) before continuing.

With logical replication, the database already writes every change to its internal transaction log (the WAL) as part of normal operation. CDC simply reads that log after the fact, in a separate background process. The application’s writes pass through without stopping — there is no toll booth. The cost of reading the log is paid by the database server, but it happens asynchronously and never slows down the application.

Where Logical Replication Wins (Steady-State)

Dimension	Trigger Impact	Logical Replication Advantage
Write-path latency	Every INSERT/UPDATE/DELETE on a tracked source pays ~2–15 μs synchronous overhead (PL/pgSQL dispatch, buffer INSERT, index update). This is inside the committing transaction’s critical path.	Zero additional write-path cost. WAL writes happen regardless; decoding is asynchronous. Source table DML performance is completely unaffected.
Write amplification	Each source row change produces: (1) source table WAL, (2) buffer table heap write, (3) buffer table WAL, (4) buffer index update, (5) index WAL. ~2–3× total write amplification.	1× — only the source table’s normal WAL. No additional heap writes, no secondary indexes.
TRUNCATE capture	Cannot capture. Row-level triggers don’t fire. Requires a separate statement-level AFTER TRUNCATE workaround (§4) that only marks for reinit — the actual row deletions are invisible to differential mode.	Native. WAL emits TRUNCATE events since PG 11. The decoder receives a clean signal that all rows were removed.
Throughput ceiling	Estimated ~5,000 writes/sec on tracked sources before trigger overhead dominates. PL/pgSQL function dispatch is the bottleneck.	Bounded by WAL throughput — typically 50,000–200,000+ writes/sec depending on hardware and `wal_buffers`.
Connection-pool pressure	Trigger executes in the application’s connection. Long-running trigger INSERTs can increase connection hold time under load.	Decoding runs in a dedicated WAL sender process. Application connections are unaffected.
Vacuum pressure	Buffer tables accumulate dead tuples between cleanups. Each refresh cycle creates bloat that autovacuum must reclaim.	No buffer tables to vacuum. WAL segments are recycled by the WAL management subsystem.
Transaction ID consumption	Each trigger INSERT consumes sub-transaction resources within the outer transaction. High-volume batch operations can cause excessive subtransaction overhead.	No additional transaction work.

Where Triggers Win (Steady-State)

Dimension	Trigger Advantage	Logical Replication Impact
Operational simplicity	No external state to manage. Buffer tables are regular heap tables — queryable, monitorable, backed up normally. Drop the trigger and it’s gone.	Replication slots are persistent server-side state. A stuck or crashed consumer prevents WAL recycling, potentially filling the disk. Requires monitoring, max_slot_wal_keep_size guards, and orphan-slot cleanup.
Zero configuration	Works with any `wal_level` (`minimal`, `replica`, `logical`). No restart required. No `REPLICA IDENTITY` configuration.	Requires `wal_level = logical` (server restart), `max_replication_slots` sizing, and `REPLICA IDENTITY` on every tracked source table. Many managed PostgreSQL providers default to `wal_level = replica`.
Schema evolution	DDL event hooks rebuild the trigger function via `CREATE OR REPLACE FUNCTION`. New columns are added to the buffer table with `ADD COLUMN IF NOT EXISTS`. Simple, same-transaction, no coordination.	Schema changes on tracked tables require careful handling. The output plugin must be aware of column additions/removals. Slot may need to be recreated. `ALTER TABLE` during active decoding can cause protocol errors.
Debugging & visibility	Change buffers are queryable tables: `SELECT * FROM pgtrickle_changes.changes_12345 ORDER BY change_id DESC LIMIT 10`. Immediate visibility into what was captured.	WAL is binary and opaque. Inspecting captured changes requires `pg_logical_slot_peek_changes()` which advances or peeks the slot — disruptive in production.
Crash recovery	Buffer tables are WAL-logged and survive crashes. No special recovery needed — the refresh engine picks up from the last frontier LSN.	Slots survive crashes, but the decoding position may be ahead of what pg_trickle has consumed. Requires careful bookkeeping to avoid replaying or losing changes.
Multi-source coordination	Each source has an independent buffer table. The refresh engine reads from multiple buffers with independent LSN ranges. No coordination between sources.	Multiple sources could share a single slot (decoding all tables) or use per-source slots. Shared slots require demultiplexing; per-source slots multiply the slot management burden.
Isolation	Trigger failure (e.g., buffer table full) raises an error in the application transaction — visible and immediate.	Decoding failure is asynchronous. The application commits successfully, but changes may never reach the buffer. Silent data loss is possible unless monitored.

Neutral (Roughly Equivalent)

Dimension	Notes
Refresh-path performance	Both approaches populate the same buffer table schema. The MERGE/DVM pipeline is identical regardless of how buffers were filled.
Zero-change detection	Triggers: `EXISTS` check on empty buffer (~3 ms). Logical replication: check slot position vs current WAL LSN (~3 ms). Equivalent.
Memory footprint	Triggers: PL/pgSQL function cache per backend. Logical replication: WAL sender process + decoding context. Both are modest.

3.3 When Does Logical Replication Become the Better Choice?

The crossover point depends on workload characteristics:

Scenario	Better Choice	Why
< 1,000 writes/sec on tracked sources	Triggers	Overhead is negligible; operational simplicity dominates
1,000–5,000 writes/sec	Either / Triggers still acceptable	Trigger overhead is measurable but unlikely to be the bottleneck
> 5,000 writes/sec	Logical Replication	Write-path overhead starts to matter; 2–3× write amplification compounds
ETL patterns (TRUNCATE + bulk INSERT)	Logical Replication	Native TRUNCATE capture; no stale-data gap
Wide tables (20+ columns)	Logical Replication	Trigger overhead scales with column count (~5–15 μs); WAL overhead does not
Managed PostgreSQL with `wal_level` restrictions	Triggers	No choice — logical replication may not be available
Many tracked sources (50+)	Logical Replication	Fewer moving parts than 50 triggers + 50 buffer tables + 50 indexes
Need logical replication FROM stream tables	Triggers (with caveats)	see §2.5 — `session_replication_role` conflict with `DISABLE TRIGGER USER` as workaround

3.4 Reassessing the Decision

With the atomicity constraint properly scoped as a creation-time concern, the decision to use triggers rests on three remaining pillars:

Operational simplicity — no wal_level change, no slot management, no REPLICA IDENTITY configuration. This is genuinely valuable for an early-stage extension that needs frictionless adoption.
Debugging visibility — queryable buffer tables are a major developer experience advantage. Being able to SELECT * FROM changes_<oid> during debugging is invaluable.
Zero-config deployment — works on any PostgreSQL 18 instance without server restarts or configuration changes. Critical for managed PostgreSQL environments.

However, these advantages are primarily about developer and operator experience, not about the fundamental capability of the system. A mature pg_trickle deployment that needs high write throughput, TRUNCATE support, or minimal source-table impact would be better served by logical replication in steady-state.

The honest assessment: Triggers are the right choice today for pragmatic reasons (simplicity, early-stage adoption, managed PG compatibility). But the report should not overstate the atomicity constraint as a fundamental blocker — it is a solvable problem. If pg_trickle grows to serve high-throughput production workloads, the migration to logical replication for steady-state CDC should be treated as a planned evolution, not a theoretical future.

4. TRUNCATE: The Gap and How to Close It

This limitation is one of the strongest arguments for logical replication in steady-state — see §3.2 for the comparison.

The TRUNCATE limitation is the most commonly cited drawback of trigger-based CDC. PostgreSQL does not fire row-level triggers for TRUNCATE because TRUNCATE operates at the file level (O(1)) — there are no individual rows to enumerate.

Current Behavior

User runs TRUNCATE source_table
CDC trigger does not fire — change buffer remains empty
Scheduler sees zero changes → NO_DATA → stream table is stale
Stream table shows data from rows that no longer exist

Proposed Fix: Statement-Level AFTER TRUNCATE Trigger

PostgreSQL supports statement-level AFTER TRUNCATE triggers. While they provide no OLD row data, they can mark downstream stream tables for reinitialization:

CREATE TRIGGER pg_trickle_truncate_<oid>
  AFTER TRUNCATE ON <source_table>
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.on_source_truncated('<source_oid>');

The trigger function would: 1. Look up all stream tables that depend on this source 2. Mark them needs_reinit = true in the catalog 3. Cascade transitively to downstream STs

This closes the TRUNCATE gap without changing the CDC architecture. The next scheduler cycle would trigger a FULL refresh automatically.

Effort estimate: ~2–4 hours (trigger creation in cdc.rs, PL/pgSQL or Rust function for on_source_truncated, cascade logic reuse from hooks.rs).

5. Migration Path: Trigger → Logical Replication (Now Implemented)

Status: Phase A (Hybrid Creation) is now implemented in src/wal_decoder.rs. The pg_trickle.cdc_mode GUC controls the behavior (trigger/auto/wal).

As discussed in §3, the atomicity constraint is a creation-time problem with known solutions. The buffer table schema and downstream IVM pipeline are decoupled from the capture mechanism, so migration is isolated to the CDC layer. This should be treated as a planned evolution for high-throughput deployments, not a theoretical future:

Phase A: Hybrid Creation

create_stream_table() continues using triggers for atomic creation
After first successful full refresh, a background worker creates a replication slot and transitions to WAL-based capture
Trigger is dropped; buffer table continues to be populated from WAL decode

Phase B: Steady-State WAL Capture

Background worker runs a logical decoding consumer per tracked source
WAL changes are decoded and written to the same buffer table schema
Downstream pipeline (DVM, MERGE, frontier) is unchanged
TRUNCATE events are captured natively from WAL

Prerequisites

wal_level = logical (must be documented as optional upgrade path)
REPLICA IDENTITY on tracked sources (auto-configured or user-managed)
Custom output plugin or pgoutput + column mapping
Slot health monitoring (WAL retention alerts, orphan cleanup)

Effort estimate: 3–5 weeks for a production-quality implementation.

6. Recommendations

Recommendation 1: Keep Trigger-Based CDC (For Now)

Operational simplicity and zero-config deployment are strong advantages for an early-stage extension. The performance ceiling (~5,000 writes/sec) is adequate for current target use cases. The atomicity constraint, while solvable (see §3.1), adds creation-time complexity that is not yet justified.

However: This decision should be revisited when any of these triggers are hit: (a) users report write-path latency from CDC triggers, (b) TRUNCATE-based ETL patterns become a common pain point, © pg_trickle targets environments where wal_level = logical is already the norm. The steady-state advantages of logical replication (§3.2) are substantial and should not be dismissed.

Recommendation 2: ✅ IMPLEMENTED — User Trigger Suppression

User-defined triggers on stream tables are now fully supported. The implementation uses ALTER TABLE ... DISABLE TRIGGER USER / ENABLE TRIGGER USER around FULL refresh, and explicit per-row DML (INSERT/UPDATE/DELETE) instead of MERGE during DIFFERENTIAL refresh so user AFTER triggers fire correctly. Controlled by pg_trickle.user_triggers GUC (auto/on/off). The session_replication_role approach from the original plan was rejected to avoid conflict with logical replication publishing (see §2.5).

Recommendation 3: Add TRUNCATE Capture Trigger

Add a statement-level AFTER TRUNCATE trigger on each tracked source table that marks downstream STs for reinitialization. This closes the most significant usability gap without changing the CDC architecture.

Recommendation 4: Document Logical Replication FROM Stream Tables

Add documentation and examples for CREATE PUBLICATION on stream tables, including: - Column filtering to exclude __pgt_row_id - REPLICA IDENTITY configuration using __pgt_row_id as unique index - Behavior during FULL vs DIFFERENTIAL refresh - Interaction with user trigger suppression

Recommendation 5: Benchmark Trigger Overhead

Execute the benchmark plan in PLAN_TRIGGERS_OVERHEAD.md to establish data-driven thresholds for the logical replication migration crossover point. The results should feed directly into the §3.3 crossover analysis.

Recommendation 6: ✅ IMPLEMENTED — Hybrid CDC Approach

The “trigger bootstrap → slot transition” pattern is now implemented in src/wal_decoder.rs (1152 lines). The implementation includes:

Automatic transition: After stream table creation with triggers, a background worker creates a logical replication slot and transitions to WAL-based capture.
GUC control: pg_trickle.cdc_mode (trigger/auto/wal) and pg_trickle.wal_transition_timeout control the behavior.
Transition orchestration: Create slot → wait for catch-up → drop trigger. Automatic fallback to triggers if slot creation fails.
Catalog extension: pgt_dependencies gains cdc_mode, slot_name, decoder_confirmed_lsn, transition_started_at columns.
Health monitoring: pgtrickle.check_cdc_health() function and NOTIFY pg_trickle_cdc_transition notifications.

7. Decision Log

#	Decision	Rationale
D1	Keep triggers for CDC on source tables — for now	Zero-config, operational simplicity, adequate for current scale
D2	Atomicity constraint is solvable, not fundamental	Two-phase creation and hybrid bootstrap are proven patterns (§3.1)
D3	Logical replication is superior in steady-state	Zero write overhead, TRUNCATE capture, higher throughput ceiling (§3.2)
D4	User triggers on STs are orthogonal to CDC choice	`session_replication_role` / `DISABLE TRIGGER USER` works with either approach
D5	Logical replication FROM STs works today	Regular heap tables; needs documentation, not code
D6	TRUNCATE gap is closable with statement-level trigger	Low effort, high impact — but logical replication handles it natively
D7	Hybrid approach is the optimal long-term target	Trigger bootstrap for creation + logical replication for steady-state
D8	User trigger suppression uses `DISABLE TRIGGER USER`	Avoids `session_replication_role` conflict with logical replication publishing (§2.5)
D9	Hybrid CDC implemented with auto-transition	`pg_trickle.cdc_mode = 'auto'` triggers → WAL transition after creation
D10	Explicit DML for DIFFERENTIAL refresh with user triggers	INSERT/UPDATE/DELETE instead of MERGE so AFTER triggers fire correctly

PGXN

PostgreSQL Extension Network

Contents

Triggers vs Logical Replication for CDC in pg_trickle

Executive Summary

1. Background

Current Architecture

The Atomicity Constraint

2. Comparison Matrix

2.1 Correctness & Transactional Safety

2.2 Performance

2.3 Operational Complexity

2.4 Feature: User Triggers on Stream Tables

2.5 Feature: Logical Replication FROM Stream Tables

The `session_replication_role` Conflict

3. Separating Creation-Time from Steady-State

3.1 The Atomicity Constraint Is a Solvable Engineering Problem

3.2 Steady-State: Triggers vs Logical Replication (Honest Comparison)

Where Logical Replication Wins (Steady-State)

Where Triggers Win (Steady-State)

Neutral (Roughly Equivalent)

3.3 When Does Logical Replication Become the Better Choice?

3.4 Reassessing the Decision

4. TRUNCATE: The Gap and How to Close It

Current Behavior

Proposed Fix: Statement-Level AFTER TRUNCATE Trigger

5. Migration Path: Trigger → Logical Replication (Now Implemented)

Phase A: Hybrid Creation

Phase B: Steady-State WAL Capture

Prerequisites

6. Recommendations

Recommendation 1: Keep Trigger-Based CDC (For Now)

Recommendation 2: ✅ IMPLEMENTED — User Trigger Suppression

Recommendation 3: Add TRUNCATE Capture Trigger

Recommendation 4: Document Logical Replication FROM Stream Tables

Recommendation 5: Benchmark Trigger Overhead

Recommendation 6: ✅ IMPLEMENTED — Hybrid CDC Approach

7. Decision Log

PGXN

PostgreSQL Extension Network

Contents

Triggers vs Logical Replication for CDC in pg_trickle

Executive Summary

1. Background

Current Architecture

The Atomicity Constraint

2. Comparison Matrix

2.1 Correctness & Transactional Safety

2.2 Performance

2.3 Operational Complexity

2.4 Feature: User Triggers on Stream Tables

2.5 Feature: Logical Replication FROM Stream Tables

The session_replication_role Conflict

3. Separating Creation-Time from Steady-State

3.1 The Atomicity Constraint Is a Solvable Engineering Problem

3.2 Steady-State: Triggers vs Logical Replication (Honest Comparison)

Where Logical Replication Wins (Steady-State)

Where Triggers Win (Steady-State)

Neutral (Roughly Equivalent)

3.3 When Does Logical Replication Become the Better Choice?

3.4 Reassessing the Decision

4. TRUNCATE: The Gap and How to Close It

Current Behavior

Proposed Fix: Statement-Level AFTER TRUNCATE Trigger

5. Migration Path: Trigger → Logical Replication (Now Implemented)

Phase A: Hybrid Creation

Phase B: Steady-State WAL Capture

Prerequisites

6. Recommendations

Recommendation 1: Keep Trigger-Based CDC (For Now)

Recommendation 2: ✅ IMPLEMENTED — User Trigger Suppression

Recommendation 3: Add TRUNCATE Capture Trigger

Recommendation 4: Document Logical Replication FROM Stream Tables

Recommendation 5: Benchmark Trigger Overhead

Recommendation 6: ✅ IMPLEMENTED — Hybrid CDC Approach

7. Decision Log

The `session_replication_role` Conflict