v0.7.0 — Circular DAG Execution, Watermarks, and Observability

Full technical details: v0.7.0.md-full.md

Status: ✅ Released | Scope: Large (~5–6 weeks)

Complete support for circular dependency graphs in stream table refresh scheduling, watermark-based gating for ETL pipelines, major performance improvements, and full Prometheus and Grafana observability.


What problem does this solve?

v0.6.0 laid the groundwork for circular dependencies; v0.7.0 completes the implementation. Independently, ETL pipelines needed a more sophisticated “ready to refresh” signal based on data completeness rather than just elapsed time. And as deployments grew, the need for real-time operational visibility into what pg_trickle was doing became clear.


Circular DAG Execution

Some advanced analytics require stream tables that reference each other — for example, an iterative algorithm where the output of one refresh of a stream table feeds the next iteration. The scheduler can now execute these circular graphs safely by detecting each strongly connected component (a cycle) and applying a fixed-point algorithm: refresh the cycle repeatedly until the results stabilise.

In plain terms: pg_trickle can now maintain stream tables with feedback loops. The scheduler knows when the cycle has converged and stops refreshing, preventing infinite loops.
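
The fixed-point loop described above can be sketched as follows. This is an illustrative sketch, not pg_trickle's actual code: `refresh_cycle`, `refresh_once`, and `max_rounds` are hypothetical names, and SCC detection is assumed to have already identified the tables in the cycle.

```python
def refresh_cycle(tables, refresh_once, max_rounds=100):
    """Refresh every table in a strongly connected component repeatedly
    until one full round produces no row changes (a fixed point).

    refresh_once(table) -> bool: True if the refresh changed any rows.
    Returns True on convergence, False if max_rounds was exhausted
    (a safety bound that prevents infinite loops on non-converging cycles).
    """
    for _ in range(max_rounds):
        changed = False
        for t in tables:
            changed |= refresh_once(t)
        if not changed:
            return True   # a full quiet round: the cycle has converged
    return False          # gave up: cycle did not stabilise in time
```

The round bound is the key safety property: even if a cycle never stabilises, the scheduler stops after a fixed number of iterations instead of refreshing forever.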


Watermark Gating for ETL

The pause/resume approach from v0.5.0 works well when an ETL pipeline has a clear “done” signal. But many pipelines produce data continuously, and the stream table should only refresh once data up to a certain point is available.

Watermarks solve this: you can define a watermark expression on a source table (for example, “refresh this stream table only when data up to this timestamp has arrived”), and pg_trickle will hold off on refresh until the watermark condition is met.

In plain terms: if your data arrives continuously from an external system and you want your stream table to only update when data is “complete” up to a certain time boundary, watermarks give you that control without manual pause/resume.
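
A minimal sketch of the gating decision, under the assumption that the watermark is a timestamp boundary as in the example above. The function names (`watermark_met`, `maybe_refresh`) and the `allowed_lateness` parameter are illustrative, not pg_trickle's API.

```python
from datetime import timedelta

def watermark_met(max_event_time, boundary, allowed_lateness=timedelta(0)):
    """True once source data has arrived up to `boundary`
    (optionally tolerating some lateness). `max_event_time` is the
    highest event timestamp observed on the source table so far."""
    return (max_event_time is not None
            and max_event_time >= boundary - allowed_lateness)

def maybe_refresh(source_max_ts, boundary, do_refresh):
    """Run the refresh only when the watermark condition holds;
    otherwise hold off (the scheduler will re-evaluate later)."""
    if watermark_met(source_max_ts, boundary):
        do_refresh()
        return True
    return False  # gated: data is not yet complete up to the boundary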


Performance Improvements

Several hot-path optimisations were made:

  • Row hashing in the differential engine was optimised to reduce allocation pressure
  • The scheduler’s dependency graph traversal was reduced from O(n²) to O(n)
  • Change buffer reads at refresh time are batched more efficiently
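
For the traversal improvement, a linear-time pass over the dependency graph can be sketched with Kahn's algorithm, which visits each node and edge once. This is illustrative of the kind of change described, not pg_trickle's actual implementation; `topo_order` and the `deps` shape are hypothetical.

```python
from collections import deque

def topo_order(deps):
    """deps: {table: [tables it depends on]} -> refresh order, sources
    first. Runs in O(nodes + edges), versus a naive repeated scan that
    is quadratic in the number of tables."""
    indeg = {n: 0 for n in deps}
    children = {n: [] for n in deps}
    for n, parents in deps.items():
        for p in parents:
            indeg[n] += 1
            children[p].append(n)
    queue = deque(n for n, d in indeg.items() if d == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for c in children[n]:
            indeg[c] -= 1
            if indeg[c] == 0:
                queue.append(c)
    return order  # tables inside a cycle never reach in-degree 0 and
                  # are omitted here (they go down the SCC/fixed-point path)
```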

Prometheus and Grafana Observability

pg_trickle now ships a complete observability stack:

  • Prometheus metrics exported from the background worker — refresh latency, error counts, queue depth, change buffer sizes
  • Grafana dashboard included in the repository — pre-built panels for the key operational metrics
  • Docker Compose configuration in monitoring/ for running the full stack locally
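
The exported metrics follow the standard Prometheus text exposition format. The fragment below is purely illustrative — the metric names and values are hypothetical, not the extension's actual metric set:

```
# HELP pg_trickle_refresh_latency_seconds Stream table refresh latency
# TYPE pg_trickle_refresh_latency_seconds histogram
pg_trickle_refresh_latency_seconds_count 128
pg_trickle_refresh_latency_seconds_sum 42.7
# HELP pg_trickle_refresh_errors_total Failed refresh attempts
# TYPE pg_trickle_refresh_errors_total counter
pg_trickle_refresh_errors_total 3
# HELP pg_trickle_queue_depth Stream tables currently awaiting refresh
# TYPE pg_trickle_queue_depth gauge
pg_trickle_queue_depth 5
```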

In plain terms: you can now see, in real time, how fast your stream tables are refreshing, how large the change backlog is, and whether any errors are occurring — without writing any custom monitoring code.


Infrastructure

  • CNPG (CloudNativePG) documentation and example configuration added
  • PGXN submission prepared
  • Docker images published to a registry

Scope

v0.7.0 is a large release delivering circular DAG execution (a unique capability among IVM systems), watermark-based scheduling, significant performance improvements, and production-grade observability. It represents pg_trickle’s first serious bid for large-scale production deployments.