Contributing to ProvSQL

Thank you for your interest in contributing to ProvSQL!

Reporting bugs and requesting features

Please use the GitHub issue tracker. Two issue templates are provided and auto-fill when you open a new issue: a bug-report form, which asks for the information we typically need (PostgreSQL version, ProvSQL version, a minimal SQL reproducer, and optional verbose-mode output), and a feature-request form. For security vulnerabilities, please use the private security advisory flow instead of a public issue; see SECURITY.md.

Developer guide

The developer guide is the authoritative reference for working on ProvSQL’s internals. It covers the PostgreSQL extension concepts ProvSQL relies on, the architecture and component map, the query rewriting pipeline, memory management, the where-provenance and data-modification subsystems, aggregation, semiring and probability evaluation, coding conventions, testing, debugging, and the build system. Read it before submitting a non-trivial pull request.

Development setup

Prerequisites: PostgreSQL ≥ 10 with development headers, a C++17 compiler, uuid-ossp, and the Boost libraries (libboost-dev, libboost-serialization-dev). See the installation guide for details.
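
As a quick sanity check of the PostgreSQL prerequisite, the sketch below extracts the major version from a version string of the kind printed by pg_config --version and compares it with the minimum of 10 stated above. The helper names pg_major and pg_is_supported are hypothetical and not part of ProvSQL; the sketch assumes the string starts with "PostgreSQL <major>".

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ProvSQL build system.

# Print the major version found in a string such as "PostgreSQL 16.2".
# Assumes the string starts with "PostgreSQL <major>"; otherwise the
# input is printed unchanged.
pg_major() {
  printf '%s\n' "$1" | sed -E 's/^PostgreSQL ([0-9]+).*$/\1/'
}

# Succeed if the major version is at least 10 (the minimum stated above).
# A string that cannot be parsed makes the numeric test fail.
pg_is_supported() {
  major=$(pg_major "$1")
  [ "$major" -ge 10 ] 2>/dev/null
}
```

With PostgreSQL development packages installed, it could be called as pg_is_supported "$(pg_config --version)". It does not check the compiler, uuid-ossp, or Boost.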

make            # build the extension
make install    # install into the PostgreSQL extensions directory

ProvSQL requires shared_preload_libraries = 'provsql' in postgresql.conf (and a server restart) because it installs a planner hook.
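
A missing preload entry is easy to overlook, so here is a minimal sketch of a check against a configuration file. The helper name has_provsql_preload is hypothetical. The sketch assumes the setting is written with an equals sign, and it looks only at the file it is given: included files and settings made through ALTER SYSTEM are not considered. It uses the last uncommented assignment because PostgreSQL ignores all but the last entry for a parameter.

```shell
#!/bin/sh
# Hypothetical helper, not part of ProvSQL.
# Succeeds if the last uncommented shared_preload_libraries assignment in
# the given configuration file mentions provsql.
has_provsql_preload() {
  grep -E '^[[:space:]]*shared_preload_libraries[[:space:]]*=' "$1" \
    | tail -n 1 \
    | sed 's/#.*$//' \
    | grep -q 'provsql'
}
```

On a running server, psql -c 'SHOW shared_preload_libraries;' reports the value actually in effect, which is the more reliable check after a restart.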

For a debug build:

make DEBUG=1

The build system chapter of the developer guide documents the Makefile structure, the generated SQL files, and the CI workflows in detail.

Running the tests

Tests are integration tests against a live PostgreSQL instance:

make install    # as a user with write access to the PostgreSQL directories
service postgresql restart
make test

The testing chapter explains how to write a new test, the schedule files, the alternative-output skip pattern for tests that depend on optional external tools, and how to read regression diffs.

Code organization

The codebase mixes C and C++:

  • PostgreSQL interface code: C (src/*.c)
  • Data structures and algorithms: C++ (src/*.cpp)
  • Generic template algorithms: header-only (src/*.hpp)

The SQL API is defined in sql/provsql.common.sql (and sql/provsql.14.sql for PostgreSQL ≥ 14). Both sql/provsql.sql and sql/provsql--<version>.sql are generated by the Makefile; do not edit them directly. Hand-written extension upgrade scripts live in sql/upgrades/provsql--<from>--<to>.sql; the build system chapter of the developer guide explains when and how to write them, and documents the on-disk mmap ABI that upgrades rely on.
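
To avoid editing a generated file by accident, a contributor could filter file names before committing. The following is only a sketch: the helper names is_generated_sql and flag_generated are hypothetical, and the patterns are derived solely from the file names mentioned above, so they may need adjusting to the actual repository layout.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ProvSQL.

# Succeed if the path names one of the Makefile-generated SQL files.
is_generated_sql() {
  case "$1" in
    sql/provsql.sql)    return 0 ;;  # generated
    sql/upgrades/*)     return 1 ;;  # hand-written upgrade scripts
    sql/provsql--*.sql) return 0 ;;  # generated, versioned install script
    *)                  return 1 ;;
  esac
}

# Read file names on stdin, print the generated ones, and fail if any
# was found.
flag_generated() {
  status=0
  while IFS= read -r path; do
    if is_generated_sql "$path"; then
      echo "generated file, do not edit: $path"
      status=1
    fi
  done
  return "$status"
}
```

One possible use is git diff --cached --name-only | flag_generated, which lists staged files and fails if a generated one is among them.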

The architecture chapter lists every source file in src/ with a one-line description, grouped by subsystem.

Coding conventions

ProvSQL has a small set of project-specific conventions for naming, error reporting, memory management, the C/C++ boundary, and Doxygen comments. They are collected in the coding conventions chapter. The most load-bearing one: every new function, type, class, and SQL function should carry a JavaDoc-style Doxygen comment, and new cross-references from the developer guide should be added to the appropriate map in doc/source/conf.py so the coherence checker can validate them.

Submitting a pull request

  1. Fork the repository and create a branch from master.
  2. Make your changes. If adding a new feature, add a regression test in test/sql/ with expected output in test/expected/.
  3. Ensure make test passes.
  4. If your change touches the documentation, run make docs and confirm the coherence checker (check-doc-links.py) reports OK.
  5. Open a pull request against master with a clear description of what the change does and why.
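
Step 2 pairs each regression test in test/sql/ with an expected output in test/expected/. The sketch below lists tests whose expected output is missing. The helper name missing_expected is hypothetical, and the .out extension is an assumption based on the usual pg_regress convention; the testing chapter remains the reference for the actual layout.

```shell
#!/bin/sh
# Hypothetical helper, not part of ProvSQL.
# Given the repository root, print the name of every test in test/sql/
# that has no matching file in test/expected/.
# Assumes expected outputs are named <test>.out (pg_regress convention).
missing_expected() {
  root="$1"
  for sql in "$root"/test/sql/*.sql; do
    [ -e "$sql" ] || continue          # the glob matched nothing
    name=$(basename "$sql" .sql)
    [ -f "$root/test/expected/$name.out" ] || echo "$name"
  done
}
```

Run from the top of a checkout as missing_expected . before opening the pull request; empty output means every test has an expected file.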

CI runs on Linux (PostgreSQL 10–18), macOS, and WSL. Failures on any of these block merging.

License

By contributing, you agree that your contributions will be licensed under the MIT License.