Combine Data Frames in R
In this tutorial, we will learn how to merge or combine two data frames in R programming.
Two R data frames can be combined with respect to columns or rows. We will look into both of these ways.
- To combine data frames based on one or more common columns, i.e., to add the columns of the second data frame to the first by matching rows on the common column(s), use the merge() function.
- To combine data frames by rows, i.e., to append the rows of the second data frame to those of the first, use the rbind() function.
R Combine Data Frames – Merge based on a common column(s)
The merge() function is used to merge data frames. The syntax of the merge() function is:
merge(x, y, by, by.x, by.y, sort = TRUE)
where
- x, y are data frames, or objects to be coerced to one.
- by, by.x, by.y are specifications of the common columns used for merging.
- sort is a logical value (TRUE or FALSE). The result is sorted on the by columns if TRUE, and not if FALSE.
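The by.x, by.y and sort parameters are easier to see in action. The following is a minimal sketch, assuming two hypothetical data frames whose key columns have different names (student_id and id); the data and variable names are made up for illustration.

```r
# Hypothetical data frames whose key columns have different names.
studentsDF <- data.frame(student_id = c(3, 1, 2),
                         name = c("Surya", "John", "Manu"))
marksDF <- data.frame(id = c(1, 2, 3),
                      marks = c(78, 88, 76))

# by.x names the key column in x, by.y names the key column in y.
# The key column in the result takes the name used in x.
sortedDF <- merge(studentsDF, marksDF, by.x = "student_id", by.y = "id")

# With sort = FALSE the rows are not sorted on the key column;
# their order is then unspecified.
unsortedDF <- merge(studentsDF, marksDF, by.x = "student_id", by.y = "id",
                    sort = FALSE)

print(sortedDF)
print(unsortedDF)
```

When both data frames use the same name for the key column, by alone is enough, as in the example that follows.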
Example 1 – Combine Data Frames in R using merge()
In this example, we take two data frames. The first data frame contains the id and name of students. The second data frame contains the id and marks of students.
> studentsDF = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> marksDF = data.frame(id=c(1,2,3,4), marks=c(78, 88, 76, 91))
> studentsDF
id name
1 1 John
2 2 Manu
3 3 Surya
4 4 Amith
> marksDF
id marks
1 1 78
2 2 88
3 3 76
4 4 91
You can combine these two data frames with respect to the common column id using the merge() function.
> studentMarksDF = merge(studentsDF, marksDF, by=c("id"))
> studentMarksDF
id name marks
1 1 John 78
2 2 Manu 88
3 3 Surya 76
4 4 Amith 91
>
merge() matches the rows of the two data frames on the id column and returns a new data frame that contains the columns of both.
This is useful when data about the same experiments comes from different sources: one source records some features, and another source records others. Using merge(), you can combine them into a single data frame that contains all the feature values for each experiment.
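The multiple-sources scenario can be sketched as follows. The data frames sourceA and sourceB and their values are hypothetical. The sketch also uses the all argument of merge(), which is not covered in this tutorial: by default only keys present in both data frames are kept, while all = TRUE keeps every key and fills the missing values with NA.

```r
# Hypothetical measurements of the same experiments from two sources.
sourceA <- data.frame(experiment = c(1, 2, 3),
                      temperature = c(20.5, 21.0, 19.8))
sourceB <- data.frame(experiment = c(2, 3, 4),
                      pressure = c(101.2, 100.9, 101.5))

# Default behaviour: only experiments present in both sources are kept.
commonDF <- merge(sourceA, sourceB, by = "experiment")

# all = TRUE keeps every experiment; features that a source
# did not record are filled with NA.
fullDF <- merge(sourceA, sourceB, by = "experiment", all = TRUE)

print(commonDF)
print(fullDF)
```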
R Combine Data Frames – Concatenate Rows of Data Frame to another Data Frame
The rbind() function is used to concatenate data frames by rows. The syntax of the rbind() function is:
rbind(x, ...)
where
- x is an R data frame.
- ... are the additional data frames (and any further arguments) passed to rbind().
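Because rbind() accepts any number of arguments, more than two data frames can be concatenated in one call. A minimal sketch, assuming three hypothetical batches of students:

```r
# Three hypothetical batches with the same columns.
batch1 <- data.frame(id = c(1, 2), name = c("John", "Manu"))
batch2 <- data.frame(id = c(3, 4), name = c("Surya", "Amith"))
batch3 <- data.frame(id = c(5), name = c("Nivin"))

# Pass all the data frames to a single rbind() call.
allBatches <- rbind(batch1, batch2, batch3)

# When the data frames are stored in a list, do.call() passes
# the list elements to rbind() as separate arguments.
batches <- list(batch1, batch2, batch3)
sameResult <- do.call(rbind, batches)

print(allBatches)
```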
Example 2 – Combine Data Frames in R using rbind()
In this example, we take two data frames. The first data frame contains the id and name of students. The second data frame also contains the id and name of students. Consider that these are two batches of students that we would like to concatenate.
> studentsDF = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> studentsDF
id name
1 1 John
2 2 Manu
3 3 Surya
4 4 Amith
> studentsSomeMoreDF = data.frame(id=c(5,6,7,8), name=c("Nivin", "Sruthy", "Kiku", "Mahesh"))
> studentsSomeMoreDF
id name
1 5 Nivin
2 6 Sruthy
3 7 Kiku
4 8 Mahesh
>
You can combine these two data frames with respect to rows using the rbind() function.
> allStudentsDF = rbind(studentsDF, studentsSomeMoreDF)
> allStudentsDF
id name
1 1 John
2 2 Manu
3 3 Surya
4 4 Amith
5 5 Nivin
6 6 Sruthy
7 7 Kiku
8 8 Mahesh
>
The rows of the second data frame are appended to those of the first data frame. The result is a new data frame with the same columns and more rows.
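For data frames, rbind() lines up columns by name, so both data frames need the same column names. The sketch below uses hypothetical data to show columns in a different order being matched by name, and an error being raised when the names differ.

```r
studentsDF <- data.frame(id = c(1, 2), name = c("John", "Manu"))

# Same columns in a different order: rbind() matches them by name,
# and the result keeps the column order of the first data frame.
reorderedDF <- data.frame(name = c("Surya", "Amith"), id = c(3, 4))
combinedDF <- rbind(studentsDF, reorderedDF)
print(combinedDF)

# Different column names: rbind() raises an error.
# tryCatch() stores the error message instead of stopping the script.
renamedDF <- data.frame(id = c(5), fullName = c("Nivin"))
mismatchError <- tryCatch(rbind(studentsDF, renamedDF),
                          error = function(e) conditionMessage(e))
print(mismatchError)
```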
Conclusion
In this R tutorial, we learned how to combine R data frames by columns using merge() and by rows using rbind().