Combine Data Frames in R

In this tutorial, we will learn how to merge or combine two data frames in R programming.

Two R data frames can be combined with respect to columns or rows. We will look into both of these ways.

  • To combine data frames based on one or more common columns, i.e., to add the columns of the second data frame to the first by matching rows on those columns, use the merge() function.
  • To combine data frames by rows, i.e., to append the rows of the second data frame to those of the first, use the rbind() function.

R Combine Data Frames – Merge based on a common column(s)

The merge() function is used to merge data frames. A simplified form of its syntax is:

merge(x, y, by, by.x, by.y, sort = TRUE)

where

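  • x, y are data frames, or objects to be coerced to data frames.
  • by, by.x, by.y are specifications of the common columns. Use by when the columns have the same name in both data frames; use by.x and by.y when the names differ.
  • sort is a logical value (TRUE or FALSE). The result is sorted on the by columns if TRUE and not if FALSE.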
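The by.x and by.y parameters described above can be illustrated with a minimal sketch. The data frames and the column name student_id are made up for this illustration; they are not part of the examples that follow.

```r
# Hypothetical data frames whose key columns have different names
studentsDF <- data.frame(student_id = c(3, 1, 2),
                         name = c("Surya", "John", "Manu"))
marksDF <- data.frame(id = c(1, 2, 3), marks = c(78, 88, 76))

# by.x names the key column in x, by.y names the key column in y.
# sort = TRUE (the default) orders the result by the key column.
mergedDF <- merge(studentsDF, marksDF, by.x = "student_id", by.y = "id")
print(mergedDF)
```

In the result, the key column keeps the name it has in the first data frame (student_id here).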

Example 1 – Combine Data Frames in R using merge()

In this example, we take two data frames. The first data frame contains id and name of students. The second data frame contains id and marks of students.

> studentsDF = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> marksDF = data.frame(id=c(1,2,3,4), marks=c(78, 88, 76, 91))
> studentsDF
  id  name
1  1  John
2  2  Manu
3  3 Surya
4  4 Amith
> marksDF
  id marks
1  1    78
2  2    88
3  3    76
4  4    91

You can combine these two data frames on the common column id using the merge() function.

> studentMarksDF = merge(studentsDF, marksDF, by=c("id"))
> studentMarksDF
  id  name marks
1  1  John    78
2  2  Manu    88
3  3 Surya    76
4  4 Amith    91
>

merge() matches the rows of the two data frames on the id column and returns a new data frame that contains the columns of both. By default, only rows whose id appears in both data frames are kept.

This is useful when data about the same experiments comes from different sources. One source may record certain features while another records other features. Using merge(), you can combine them into a single data frame that contains all the feature values for each experiment.
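As a rough sketch of that scenario, the following merges measurements from two sources. The data frames, column names, and values are invented for illustration. It also uses the all parameter of merge(), which is not covered in the syntax above: with all = TRUE, rows that have no match in the other data frame are kept and their missing values are filled with NA.

```r
# Hypothetical measurements of the same experiments from two sources
sourceA <- data.frame(experiment = c(1, 2, 3),
                      temperature = c(21.5, 22.0, 20.8))
sourceB <- data.frame(experiment = c(2, 3, 4),
                      pressure = c(101.2, 100.9, 101.5))

# Default behaviour: keep only experiments present in both sources
bothDF <- merge(sourceA, sourceB, by = "experiment")

# all = TRUE: keep every experiment and fill the gaps with NA
everythingDF <- merge(sourceA, sourceB, by = "experiment", all = TRUE)

print(bothDF)
print(everythingDF)
```

Here bothDF contains only experiments 2 and 3, while everythingDF contains experiments 1 to 4.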

R Combine Data Frames – Concatenate Rows of Data Frame to another Data Frame

The rbind() function is used to concatenate data frames by rows. Its syntax is:

rbind(x, ...)

where

  • x is a data frame
  • ... are one or more additional data frames whose rows are appended to x
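Because rbind() accepts any number of data frames, several batches can be concatenated in one call. A minimal sketch, using made-up batch data frames:

```r
# Three hypothetical batches with the same columns
batch1 <- data.frame(id = c(1, 2), name = c("John", "Manu"))
batch2 <- data.frame(id = c(3, 4), name = c("Surya", "Amith"))
batch3 <- data.frame(id = c(5, 6), name = c("Nivin", "Sruthy"))

# Rows are stacked in the order the data frames are passed
allBatchesDF <- rbind(batch1, batch2, batch3)
print(allBatchesDF)
```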

Example 2 – Combine Data Frames in R using rbind()

In this example, we take two data frames. The first data frame contains id and name of students. The second data frame also contains id and name of students. Consider that these are two batches of students and we would like to concatenate these.

> studentsDF = data.frame(id=c(1,2,3,4), name=c("John", "Manu", "Surya", "Amith"))
> studentsDF
  id  name
1  1  John
2  2  Manu
3  3 Surya
4  4 Amith
> studentsSomeMoreDF = data.frame(id=c(5,6,7,8), name=c("Nivin", "Sruthy", "Kiku", "Mahesh"))
> studentsSomeMoreDF
  id   name
1  5  Nivin
2  6 Sruthy
3  7   Kiku
4  8 Mahesh
>

You can combine these two data frames by rows using the rbind() function.

> allStudentsDF = rbind(studentsDF, studentsSomeMoreDF)
> allStudentsDF
  id   name
1  1   John
2  2   Manu
3  3  Surya
4  4  Amith
5  5  Nivin
6  6 Sruthy
7  7   Kiku
8  8 Mahesh
>

The rows of the second data frame are appended to those of the first. The result is a new data frame with more rows. For this to work, both data frames must have the same column names.
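One detail worth knowing is that rbind() on data frames matches columns by name rather than by position, so the column order may differ between the two data frames. A small sketch with hypothetical batches:

```r
# The second batch lists its columns in a different order
batchA <- data.frame(id = c(1, 2), name = c("John", "Manu"))
batchB <- data.frame(name = c("Nivin", "Sruthy"), id = c(5, 6))

# Columns are matched by name; the result uses the column order of batchA
combinedDF <- rbind(batchA, batchB)
print(combinedDF)
```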

Conclusion

In this R Tutorial, we have learned how to combine R Data Frames based on rows or columns.