Plain-language companion: v0.26.0.md

v0.26.0 — Test & Concurrency Hardening

Status: Shipped. Sourced from PLAN_OVERALL_ASSESSMENT_2.md §4, §6, §9.

Release Theme

This release closes the test coverage and concurrency gaps identified in the v0.23.0 assessment. The concurrency matrix (ALTER + REFRESH, DROP + REFRESH, parallel-worker duplicate pick, concurrent canary promotion) is fully tested. The ARCH-1B refactor completes the src/refresh/mod.rs sub-module migration. New fuzz targets cover the cron parser, GUC string→enum coercion, and the CDC trigger payload. The predictive cost model gets an accuracy harness, and SLA tier assignment gets a damping mechanism to prevent oscillation. Error handling is tightened: typed error variants replace bare pgrx::error! calls in the diagnostics and publication paths.

Concurrency Test Matrix

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| CONC-1 | Simultaneous ALTER + REFRESH test. E2E test: one connection runs alter_stream_table(query => ...) while another is mid-refresh. Assert no deadlock, catalog stays consistent, refresh either completes or is cleanly aborted. | 2d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| CONC-2 | Simultaneous DROP + REFRESH test. E2E test: drop_stream_table() while refresh is in progress. Assert clean abort, no orphaned change buffers, no dangling catalog rows. | 2d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| CONC-3 | Parallel-worker duplicate-pick test. Deterministic E2E: pre-register a slow refresh under one worker, ask the dispatcher for a second task, assert it picks a different ST. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| CONC-4 | Concurrent canary promotion race test. Two concurrent refreshes trigger buffer promotion simultaneously; assert exactly one succeeds and metadata is consistent. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
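
The duplicate-pick guarantee that CONC-3 checks can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `Dispatcher`, `pick`, and `release` are hypothetical names, and the real dispatcher presumably tracks claims in shared memory or the catalog rather than a `Mutex<HashSet>`. The point it shows is that checking and claiming must be a single step under one lock, so two workers can never both see the same ST as free.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical in-memory stand-in for the dispatcher's claim registry.
pub struct Dispatcher {
    /// Stream-table ids in dispatch order (assumed ordering, for illustration).
    queue: Vec<u32>,
    /// Ids currently claimed by some worker.
    in_flight: Mutex<HashSet<u32>>,
}

impl Dispatcher {
    pub fn new(queue: Vec<u32>) -> Self {
        Self { queue, in_flight: Mutex::new(HashSet::new()) }
    }

    /// Claim the first queued id that no worker holds; `None` if all are busy.
    pub fn pick(&self) -> Option<u32> {
        let mut in_flight = self.in_flight.lock().unwrap();
        // `HashSet::insert` returns false when the id is already present,
        // so the check and the claim happen together under the lock.
        self.queue.iter().copied().find(|id| in_flight.insert(*id))
    }

    /// Release a claim once the refresh finishes or aborts.
    pub fn release(&self, id: u32) {
        self.in_flight.lock().unwrap().remove(&id);
    }
}
```

A deterministic test in the spirit of CONC-3 would call `pick()` once to simulate the slow refresh, call it again, and assert that the second id differs from the first.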

Predictive Model & SLA Stability

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| SLA-1 | Predictive cost model accuracy harness. New tests/e2e_predictive_cost_tests.rs with sawtooth, bursty, and single-spike workloads. Assert: (a) model recovers within N samples after outlier, (b) preemption to FULL only fires when actually faster. | 3d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| SLA-2 | SLA tier oscillation damping. Implement hysteresis: require 3 consecutive breaches before downgrading tier, 3 consecutive successes before upgrading. Property test asserting ≤ 2 transitions per simulated hour. | 2d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| SLA-3 | SLA tier oscillation property test. Proptest with randomised latency distributions around the SLA boundary. Assert tier stability. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
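
The hysteresis rule in SLA-2 can be sketched as a small state machine. The tier names (`Hot`, `Warm`, `Cold`), the `TierDamper` type, and the choice to reset both counters after a transition are assumptions made for illustration; only the threshold of 3 consecutive breaches or successes comes from the item above.

```rust
/// Hypothetical tier names; the actual tier set is not listed in this plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Hot,
    Warm,
    Cold,
}

/// Consecutive outcomes required before a tier change (from SLA-2).
const STREAK: u32 = 3;

#[derive(Debug)]
pub struct TierDamper {
    tier: Tier,
    breaches: u32,
    successes: u32,
}

impl TierDamper {
    pub fn new(tier: Tier) -> Self {
        Self { tier, breaches: 0, successes: 0 }
    }

    pub fn tier(&self) -> Tier {
        self.tier
    }

    /// Record one refresh outcome; returns true if the tier changed.
    pub fn observe(&mut self, met_sla: bool) -> bool {
        // An outcome of one kind breaks the streak of the other kind.
        if met_sla {
            self.successes += 1;
            self.breaches = 0;
        } else {
            self.breaches += 1;
            self.successes = 0;
        }

        let next = if self.breaches >= STREAK {
            // Downgrade one step; Cold is the floor.
            match self.tier {
                Tier::Hot => Tier::Warm,
                _ => Tier::Cold,
            }
        } else if self.successes >= STREAK {
            // Upgrade one step; Hot is the ceiling.
            match self.tier {
                Tier::Cold => Tier::Warm,
                _ => Tier::Hot,
            }
        } else {
            return false;
        };

        // Assumed design choice: a completed streak starts a fresh count.
        self.breaches = 0;
        self.successes = 0;
        let changed = next != self.tier;
        self.tier = next;
        changed
    }
}
```

With this rule, a latency series that alternates around the SLA boundary never builds a streak of 3, so the tier stays put. Whether that translates into ≤ 2 transitions per hour depends on the refresh rate, which this sketch does not model.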

Fuzz & Scale Testing

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| FUZZ-1 | Cron parser fuzz target. fuzz/fuzz_targets/cron_fuzz.rs: pathological input strings for parse_cron_expr(). Guards against DoS. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| FUZZ-2 | GUC string→enum fuzz target. Fuzz GUC coercion paths for refresh_mode, cdc_mode, change_buffer_durability, diff_output_format. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| FUZZ-3 | CDC trigger payload fuzz target. Fuzz the trigger payload deserialization path in src/cdc.rs with malformed row data. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| SCALE-1 | Partition-count scale test. #[ignore]-by-default E2E test creating 1,000 partitions on a source table; assert trigger-install + first refresh completes within 60 s. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
| SCALE-2 | Multi-DB worker starvation test. E2E: two databases, one floods the worker pool; assert the other’s hot-tier ST still refreshes within SLA. | 2d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
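
The property that a string→enum fuzz target such as FUZZ-2 exercises is that arbitrary input yields `Ok` or `Err` and never a panic. The sketch below uses only the standard library. The `RefreshMode` variants shown are an assumed subset, and `fuzz_one` is a hypothetical entry point; in a cargo-fuzz target its body would typically sit inside a `libfuzzer_sys::fuzz_target!` closure.

```rust
use std::str::FromStr;

/// Assumed subset of refresh_mode values, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RefreshMode {
    Full,
    Differential,
}

impl FromStr for RefreshMode {
    type Err = String;

    /// Case-insensitive coercion that rejects unknown input with an error
    /// value instead of panicking.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let t = s.trim();
        if t.eq_ignore_ascii_case("full") {
            Ok(RefreshMode::Full)
        } else if t.eq_ignore_ascii_case("differential") {
            Ok(RefreshMode::Differential)
        } else {
            Err(format!("invalid refresh_mode: {:?}", t))
        }
    }
}

/// Hypothetical fuzz entry point: feed arbitrary bytes to the parser.
/// Non-UTF-8 input is skipped; the parse result is deliberately ignored
/// because the only property under test is "does not panic".
pub fn fuzz_one(data: &[u8]) {
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = s.parse::<RefreshMode>();
    }
}
```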

Architecture: ARCH-1B Refresh Sub-Module Migration

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| ARCH-1B-1 | Migrate refresh orchestration to src/refresh/orchestrator.rs. Move scheduling integration, adaptive mode selection, and reinitialize logic out of mod.rs. | 3d | PLAN_OVERALL_ASSESSMENT_2.md §4 |
| ARCH-1B-2 | Migrate delta SQL generation to src/refresh/codegen.rs. Move template building, DVM codegen, and SQL string construction. | 3d | PLAN_OVERALL_ASSESSMENT_2.md §4 |
| ARCH-1B-3 | Migrate MERGE execution to src/refresh/merge.rs. Move differential, full, and topk MERGE executors. | 2d | PLAN_OVERALL_ASSESSMENT_2.md §4 |
| ARCH-1B-4 | Migrate PH-D1 logic to src/refresh/phd1.rs. Move phantom cleanup strategy; co-locates with EC01-2 cross-cycle cleanup from v0.24.0. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §4 |
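
The target shape of mod.rs after the migration (re-exports plus shared types) can be sketched in one file with inline modules standing in for the files under src/refresh/. Every item name here (`Mode`, `select_mode`, `describe`) and the mode-selection rule are hypothetical; only the module names follow the table above.

```rust
pub mod refresh {
    /// Shared type that stays in mod.rs (hypothetical).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Mode {
        Full,
        Differential,
    }

    // Stands in for src/refresh/orchestrator.rs.
    mod orchestrator {
        use super::Mode;

        /// Hypothetical adaptive-mode rule: use a full refresh once the
        /// number of changed rows reaches the size of the table.
        pub fn select_mode(changed_rows: u64, total_rows: u64) -> Mode {
            if changed_rows >= total_rows {
                Mode::Full
            } else {
                Mode::Differential
            }
        }
    }

    // Stands in for src/refresh/codegen.rs.
    mod codegen {
        use super::Mode;

        /// Hypothetical placeholder for delta SQL template construction.
        pub fn describe(mode: Mode) -> String {
            format!("-- SQL template for {:?} refresh", mode)
        }
    }

    // mod.rs itself keeps only the re-exports, so callers continue to
    // write `refresh::select_mode` regardless of which file owns the code.
    pub use codegen::describe;
    pub use orchestrator::select_mode;
}
```

Keeping the sub-modules private and re-exporting from the parent means existing call sites need no changes when code moves between files.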

Error Handling Tightening

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| ERR-1 | Typed DiagnosticError variant. Add to src/error.rs; replace bare pgrx::error! in src/api/diagnostics.rs and src/monitor.rs. | 1d | PLAN_OVERALL_ASSESSMENT_2.md §9 |
| ERR-2 | Typed PublicationError variant. Add to src/error.rs; replace bare pgrx::error! in src/api/publication.rs. | 0.5d | PLAN_OVERALL_ASSESSMENT_2.md §9 |
| ERR-3 | Scheduler timestamp errors with HINT. Add HINT (“check system clock”) to 3 bare pgrx::error! calls in src/scheduler.rs for TimestampWithTimeZone construction failures. | 0.5d | PLAN_OVERALL_ASSESSMENT_2.md §9 |
| ERR-4 | Crash-recovery test for downstream publication. Kill postmaster with active stream_table_to_publication() subscriber; restart; verify subscriber catches up with zero data loss. | 2d | PLAN_OVERALL_ASSESSMENT_2.md §6 |
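
The idea behind ERR-1 through ERR-3 is that call sites build a typed value, and the message and HINT are derived from it in one place. Below is a std-only sketch that does not call pgrx. The enum name `ExtError`, the `TimestampError` variant, and the `render` helper are assumptions; the `DiagnosticError` and `PublicationError` variant names and the hint text come from the table above.

```rust
use std::fmt;

/// Hypothetical error enum; the real one lives in src/error.rs.
#[derive(Debug, PartialEq, Eq)]
pub enum ExtError {
    DiagnosticError(String),
    PublicationError(String),
    /// Assumed variant for TimestampWithTimeZone construction failures.
    TimestampError(String),
}

impl ExtError {
    /// Optional HINT text attached to the error (ERR-3).
    pub fn hint(&self) -> Option<&'static str> {
        match self {
            ExtError::TimestampError(_) => Some("check system clock"),
            _ => None,
        }
    }
}

impl fmt::Display for ExtError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExtError::DiagnosticError(m) => write!(f, "diagnostic error: {}", m),
            ExtError::PublicationError(m) => write!(f, "publication error: {}", m),
            ExtError::TimestampError(m) => {
                write!(f, "timestamp construction failed: {}", m)
            }
        }
    }
}

impl std::error::Error for ExtError {}

/// Format the error roughly as a PostgreSQL client would display it.
/// In the extension this step would go through PostgreSQL's error reporting.
pub fn render(err: &ExtError) -> String {
    match err.hint() {
        Some(h) => format!("ERROR: {}\nHINT: {}", err, h),
        None => format!("ERROR: {}", err),
    }
}
```

Because the hint is derived from the variant, all three scheduler call sites would carry the same HINT without repeating the string.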

Implementation Phases

| Phase | Description | Duration |
|-------|-------------|----------|
| Phase 1 | Concurrency tests: ALTER+REFRESH, DROP+REFRESH, worker duplicate, canary race | Days 1–6 |
| Phase 2 | Predictive model harness + SLA damping + property tests | Days 6–12 |
| Phase 3 | Fuzz targets + partition scale test + multi-DB starvation test | Days 12–18 |
| Phase 4 | ARCH-1B: orchestrator, codegen, merge, phd1 sub-module migration | Days 18–27 |
| Phase 5 | Error handling: typed variants, HINT context, publication crash test | Days 27–31 |
| Phase 6 | Integration testing, documentation, upgrade script | Days 31–36 |

v0.26.0 total: ~7–8 weeks (~36 person-days solo)

Exit criteria:

- [x] CONC-1: ALTER + REFRESH concurrent test passes without deadlock or corruption
- [x] CONC-2: DROP + REFRESH concurrent test passes; no orphaned artifacts
- [x] CONC-3: Parallel workers never pick the same ST for simultaneous refresh
- [x] CONC-4: Concurrent canary promotion produces consistent metadata
- [x] SLA-1: Predictive model accuracy harness: sawtooth, burst, spike workloads all pass
- [x] SLA-2: SLA tier oscillation damping: ≤ 2 transitions/hour under boundary workload
- [x] SLA-3: SLA tier proptest passes 10,000 iterations
- [x] FUZZ-1: Cron parser fuzz target runs 10M iterations without panic
- [x] FUZZ-2: GUC coercion fuzz target runs 10M iterations without panic
- [x] FUZZ-3: CDC trigger payload fuzz target runs 10M iterations without panic
- [x] SCALE-1: 1,000-partition source: trigger install + first refresh < 60 s
- [x] SCALE-2: Worker starvation test: hot-tier ST refreshes within SLA despite flooded pool
- [x] ARCH-1B-1: src/refresh/orchestrator.rs contains all scheduling/adaptive logic
- [x] ARCH-1B-2: src/refresh/codegen.rs contains all delta SQL template construction
- [x] ARCH-1B-3: src/refresh/merge.rs contains all MERGE executors
- [x] ARCH-1B-4: src/refresh/phd1.rs contains all phantom cleanup logic
- [x] ARCH-1B: src/refresh/mod.rs reduced to < 500 LOC (re-exports + shared types)
- [x] ERR-1: Zero bare pgrx::error! calls in src/api/diagnostics.rs and src/monitor.rs
- [x] ERR-2: Zero bare pgrx::error! calls in src/api/publication.rs
- [x] ERR-3: Scheduler timestamp errors include HINT
- [x] ERR-4: Publication crash-recovery E2E: subscriber catches up after postmaster restart
- [x] Extension upgrade path tested (0.25.0 → 0.26.0)
- [x] just check-version-sync passes