Provenance Export

ProvSQL can export provenance circuits in standard or visual formats for use in external tools or for inspection.

Symbolic Representation

sr_formula returns a symbolic representation of the provenance as a human-readable formula. Each leaf token is replaced by its mapped value, and the circuit operations are rendered using and :

SELECT name, sr_formula(provenance(), 'witness_mapping')
FROM suspects;

This is the simplest way to inspect provenance interactively. See semirings for the full description of semiring evaluation functions.

PROV-XML Export

PROV-XML is a W3C standard for representing provenance information. ProvSQL can serialise provenance circuits to this format using to_provxml:

SELECT provsql.to_provxml(provenance())
FROM mytable;

The function returns an XML document conforming to the PROV-XML schema, representing the provenance circuit rooted at the given token.

Example output (abbreviated):

<prov:document
    xmlns:prov="http://www.w3.org/ns/prov#"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <prov:entity prov:id="..."/>
  <prov:wasDerivedFrom>
    <prov:generatedEntity prov:id="..."/>
    <prov:usedEntity prov:id="..."/>
  </prov:wasDerivedFrom>
</prov:document>

Circuit Visualisation

view_circuit renders a provenance circuit as an ASCII box-art diagram. Internally it writes the circuit in GraphViz DOT format to a temporary file and runs graph-easy --as=boxart on it, returning the result as a text value. The graph-easy executable must be installed and accessible in the PostgreSQL server's PATH.

SELECT provsql.view_circuit(provenance(), 'my_mapping')
FROM mytable
LIMIT 1;

To visualise the circuits for multiple rows simultaneously:

SELECT city, view_circuit(provenance(), 'my_mapping')
FROM (SELECT DISTINCT city FROM personnel) t;

Verbosity

The provsql.verbose_level GUC variable controls diagnostic output. Setting it to 20 is useful when debugging circuit export: intermediate DOT and circuit files are kept on disk instead of being deleted, and the d-DNNF method and gate count are reported. For example:

SET provsql.verbose_level = 20;

The default is 0 (silent). See configuration for the full description of all thresholds.

Circuit Structure

Both export formats represent the same underlying DAG. Gate types that appear in exports:

  • input – input (variable) gate; corresponds to a base tuple
  • plus – semiring addition; generated by UNION and disjunctive conditions
  • times – semiring multiplication; generated by joins and conjunctive conditions
  • monus – m-semiring monus; generated by EXCEPT
  • zero, one – semiring additive and multiplicative identity elements
  • project, eq – where-provenance gates (column projection and equijoin)
  • agg – aggregation gate (for aggregate provenance)
  • semimod – semimodule scalar multiplication (for aggregate provenance)
  • value – scalar value (for aggregate provenance)
  • mulinput – multivalued input (for Boolean provenance)
  • delta – δ-semiring operator DBLP:conf/pods/AmsterdamerDT11
  • update – update operation gate