Install & basics#
Installation#
Install from PyPI (wheels are available for Python 3.7+):
pip install pyversus
The only runtime dependency is DuckDB.
Inputs#
compare() accepts DuckDB relations (tables or views) or pandas/polars
DataFrames. If you provide relations created on a non-default connection,
pass that connection into compare() via con= so helper queries run in
the same session.
>>> import duckdb
>>> from versus import compare
>>> rel_a = duckdb.sql("SELECT 1 AS id, 10 AS value")
>>> rel_b = duckdb.sql("SELECT 1 AS id, 12 AS value")
>>> comparison = compare(rel_a, rel_b, by=["id"])
>>> comparison.summary()
┌────────────────┬─────────┐
│ difference │ found │
│ varchar │ boolean │
├────────────────┼─────────┤
│ value_diffs │ true │
│ unmatched_cols │ false │
│ unmatched_rows │ false │
│ type_diffs │ false │
└────────────────┴─────────┘
Materialization modes#
When you call compare(), Pyversus defines summary tables for the
printed output (tables, by, intersection, unmatched_cols,
unmatched_rows). These are relation-like wrappers that materialize
themselves on print. The input tables are never materialized by Pyversus
in any mode; they stay as DuckDB relations and are queried lazily.
In full materialization, Pyversus also builds a diff table: a single
relation with the by keys plus one boolean flag per value column
indicating a difference. The table only includes rows with at least one
difference. Those precomputed flags let row-level helpers fetch the
differing rows quickly via joins. Other modes skip the diff table and detect
differences inline.
materialize="all": store the summary tables and the diff table as temp tables. This is fastest if you will call row-level helpers multiple times.materialize="summary": store only the summary tables. Row-level helpers run inline predicates and return lazy relations.materialize="none": do not store anything up front. Printing the comparison materializes the summary tables.
Row-level helper outputs are always returned as DuckDB relations and are never materialized automatically; materialize them explicitly if needed.