
Compare two data frames. Using key columns common to both tables, identify which rows appear in both and highlight the values that differ, column by column.

Usage

tblcompare(
  .data_a,
  .data_b,
  by,
  allow_bothNA = TRUE,
  ncol_by_out = 3,
  coerce = TRUE
)

value_diffs(comparison, col)

# S3 method for tbcmp_compare
value_diffs(comparison, col)

all_value_diffs(comparison)

# S3 method for tbcmp_compare
all_value_diffs(comparison)

Arguments

.data_a

A data frame or data table

.data_b

A data frame or data table

by

tidy-select. Selection of columns to use when matching rows between .data_a and .data_b. Both data frames must be unique on by.

allow_bothNA

Logical. If TRUE, a value that is missing in both data frames is considered equal.

ncol_by_out

Number of by-columns to include in col_diffs and unmatched_rows output

coerce

Logical. If FALSE, only columns with the same class are compared.

comparison

An object of class "tbcmp_compare" (the output of a tablecompare::tblcompare() call)

col

tidy-select. A single column
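As a minimal sketch of how the arguments above fit together, the call below compares two small tables on a single key column. The data frames df_a and df_b and the column names id, x, y and z are invented for illustration and are not part of the package.

```r
library(tablecompare)

# Illustrative inputs (hypothetical data): both tables are unique on `id`
df_a <- data.frame(
  id = 1:4,
  x  = c(10, 20, NA, 40),
  y  = c("a", "b", "c", "d")
)
df_b <- data.frame(
  id = c(1L, 2L, 3L, 5L),
  x  = c(10, 21, NA, 50),
  y  = c("a", "b", "z", "e"),
  z  = c(TRUE, FALSE, TRUE, FALSE)  # column present only in df_b
)

# `by` is a tidy-select expression naming the key column(s).
# allow_bothNA = TRUE treats a value missing in both tables as equal;
# coerce = TRUE also compares columns whose classes differ.
cmp <- tblcompare(
  df_a,
  df_b,
  by = id,
  allow_bothNA = TRUE,
  ncol_by_out = 1,
  coerce = TRUE
)

class(cmp)
```

Because the selection in by must identify rows uniquely in each table, checking for duplicated keys first (for example with anyDuplicated(df_a$id)) can help explain a failed call.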

Value

tblcompare()

A "tbcmp_compare"-class object, which is a list of data.tables having the following elements:

tables

A data.table with one row per input table showing the number of rows and columns in each.

by

A data.table with one row per by column showing the class of the column in each of the input tables.

summ

A data.table with one row per column common to .data_a and .data_b, and columns "n_diffs": the number of values which differ between the two tables, "class_a"/"class_b": the class of the column in each table, and "value_diffs": a (nested) data.table showing the rows in each input table where values are unequal, the values in each table, and one column for each of the first ncol_by_out by columns for the identified rows in the input tables.

unmatched_cols

A data.table with one row per column which is in one input table but not the other and columns "table": which table the column appears in, "column": the name of the column, and "class": the class of the column.

unmatched_rows

A data.table which, for each row present in one input table but not the other, contains the columns "table": which table the row appears in, "i": the row number of the input row, and one column for each of the first ncol_by_out by columns for each row.

value_diffs()

A data.table with one row for each element of col found to be unequal between the input tables (.data_a and .data_b from the original tblcompare() call). The output table has columns "i_a"/"i_b": the row number of the element in the input tables, "val_a"/"val_b": the value of col in the input tables, and one column for each of the first ncol_by_out by columns for the identified rows in the input tables.

all_value_diffs()

A data.table of the value_diffs() output for all columns having at least one value difference, combined row-wise into a single table. To facilitate this combination into a single table, the "val_a" and "val_b" columns are coerced to character.
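The return values described above might be inspected as in the following sketch. The input tables df_a and df_b and their columns id, x and y are hypothetical. The element and column names (summ, unmatched_rows, val_a, val_b) are taken from the descriptions on this page.

```r
library(tablecompare)

# Illustrative inputs (hypothetical data), unique on `id`
df_a <- data.frame(id = 1:4, x = c(10, 20, NA, 40), y = c("a", "b", "c", "d"))
df_b <- data.frame(id = c(1L, 2L, 3L, 5L), x = c(10, 21, NA, 50), y = c("a", "b", "z", "e"))

cmp <- tblcompare(df_a, df_b, by = id)

# Per-column summary: number of differing values and column classes
cmp$summ

# Rows whose key appears in only one of the two tables
cmp$unmatched_rows

# Differences for a single column; `col` is a tidy-select expression
x_diffs <- value_diffs(cmp, x)

# Differences for every column with at least one difference, stacked
# row-wise; val_a and val_b are coerced to character so they can share
# one table
all_diffs <- all_value_diffs(cmp)
```

Use value_diffs() when the original type of the compared values matters, and all_value_diffs() for a single overview across columns.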