Compare Two Data Frames in R
In this tutorial, we will learn how to compare two data frames using the compare() function.
There are several ways to compare two R data frames, such as the compare() function of the compare package or the sqldf() function of the sqldf package. In this article, we will use compare(). It is provided by the compare package, not by base R, so the package must be installed and loaded before the function can be called.
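Since compare() is not available in a fresh R session, a setup step is needed before any of the commands in this tutorial will run. The following is a minimal sketch, assuming the compare package can be installed from your configured package repository.

```r
# Install the compare package once if it is missing, then load it.
# Assumption: the package is available from your configured repository.
if (!requireNamespace("compare", quietly = TRUE)) {
  install.packages("compare")
}
library(compare)
```

After library(compare) has run, compare() can be called directly, as in the examples that follow.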
The syntax of compare() function is
compare(model, comparison,
        equal = TRUE,
        coerce = allowAll,
        shorten = allowAll,
        ignoreOrder = allowAll,
        ignoreNameCase = allowAll,
        ignoreNames = allowAll,
        ignoreAttrs = allowAll,
        round = FALSE,
        ignoreCase = allowAll,
        trim = allowAll,
        dropLevels = allowAll,
        ignoreLevelOrder = allowAll,
        ignoreDimOrder = allowAll,
        ignoreColOrder = allowAll,
        ignoreComponentOrder = allowAll,
        colsOnly = !allowAll,
        allowAll = FALSE)
where
model: The “correct” object.
comparison: The object to be compared with the model.
equal: Test for equality if the test for identity fails.
coerce: If the objects are not the same, allow coercion of comparison to the class of model.
shorten: If the length of one object is less than the other, shorten the longer object.
ignoreOrder: Ignore the order of values when comparing.
ignoreNameCase: Ignore the case of names when comparing.
ignoreNames: Ignore names attributes altogether.
ignoreAttrs: Ignore attributes altogether.
round: If the objects are not the same, allow numbers to be rounded.
ignoreCase: Ignore the case of string values.
trim: Ignore leading and trailing spaces in string values.
dropLevels: If factors are not the same, allow unused levels to be dropped.
ignoreLevelOrder: Ignore the order of factor levels.
ignoreDimOrder: Ignore the order of dimensions when comparing matrices, arrays, or tables.
ignoreColOrder: Ignore the order of columns when comparing data frames.
ignoreComponentOrder: Ignore the order of components when comparing lists.
colsOnly: Only transform columns (not rows) when comparing data frames.
allowAll: Allow almost any sort of transformation (see the Details section of the function's help page).
The list of arguments is long, but every argument other than model and comparison has a default value, so a basic comparison needs only the two data frames.
Basic Comparison between two Data Frames
In this case, we will go with the default values and just provide the original (model in argument list) data frame and the comparison data frame.
Consider two data frames, DF1 and DF2 shown below.
> DF1 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> DF2 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Tinu"))
> DF1
id name
1 1 John
2 2 Manu
3 3 Surya
4 4 Amith
> DF2
id name
1 1 John
2 2 Manu
3 3 Surya
4 4 Tinu
>
DF1 and DF2 differ only in the name value of the fourth row.
Now, use the compare() function with DF1 as model and DF2 as comparison.
> compare(DF1, DF2)
FALSE [TRUE, FALSE]
>
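The comparison returns FALSE, as expected. The values in brackets report the result for each column: the id columns match (TRUE), while the name columns do not (FALSE).
Now let us take two identical data frames and compare them.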
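The printed result can also be inspected programmatically. The sketch below assumes that the object returned by compare() exposes result and detailedResult components; running str(res) on your installed version will confirm this. The last step uses plain base R, not the compare package, to locate the differing row. stringsAsFactors = FALSE is set explicitly so that the name columns are character vectors on every R version.

```r
library(compare)

DF1 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Amith"),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Tinu"),
                  stringsAsFactors = FALSE)

res <- compare(DF1, DF2)

# Overall verdict as a plain logical value
# (assumed component name: result)
res$result

# Per-column verdicts (assumed component name: detailedResult)
res$detailedResult

# Base R: indices of the rows whose name values differ
diff_rows <- which(DF1$name != DF2$name)

# Show the differing rows from both data frames
DF1[diff_rows, ]
DF2[diff_rows, ]
```

Using res$result in an if() condition is handy in scripts, while the row lookup shows where the two data frames disagree.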
> DF1 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> DF2 = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> compare(DF1, DF2)
TRUE
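The optional arguments listed in the syntax relax the comparison. As a small sketch of one of them, the example below compares DF1 with a copy whose columns are swapped, first with the defaults and then with ignoreColOrder = TRUE. DF3 is a hypothetical data frame introduced only for this illustration, and the result component is assumed to hold the overall verdict.

```r
library(compare)

DF1 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("John", "Manu", "Surya", "Amith"),
                  stringsAsFactors = FALSE)

# Hypothetical data frame: same data as DF1, columns in swapped order
DF3 <- DF1[, c("name", "id")]

# Default comparison: column order matters
strict <- compare(DF1, DF3)

# Relaxed comparison: column order is ignored
relaxed <- compare(DF1, DF3, ignoreColOrder = TRUE)

strict$result
relaxed$result

# Printing the relaxed comparison also lists any transformation applied
relaxed
```

The default comparison should fail because the columns are matched by position, while the relaxed comparison should succeed once the columns are reordered to match the model.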
Conclusion
In this R Tutorial, we have learnt how to compare two Data Frames.