Data Types in R
R Tutorial – We shall learn about R atomic data types, different R data types, their syntax and example R commands for R data types.
While writing a program, you may need to store your data in variables. This data can be of different types, such as Integer, String, or Array of Integers. Based on the data type, the interpreter decides how to store the values in memory in an optimized manner. Data types also help the programmer understand what kind of data is being handled or manipulated.
Unlike statically typed languages, R derives the data type of a variable implicitly from the R object assigned to the variable.
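As a minimal sketch of this implicit typing, the snippet below assigns three different objects to the same variable in turn and records the class each time. The helper names class_int, class_chr and class_lgl are only illustrative.

```r
# One variable name, three different objects assigned in turn.
# R reports the type of whatever object is currently assigned.
x <- 42L                 # integer literal (note the L suffix)
class_int <- class(x)    # "integer"

x <- "forty-two"         # character string
class_chr <- class(x)    # "character"

x <- 42 > 7              # result of a comparison
class_lgl <- class(x)    # "logical"

print(c(class_int, class_chr, class_lgl))
```

No type was declared anywhere; each assignment simply replaced the object bound to x.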
There are many types of R-objects. But all of them are built from R atomic data types. In R programming language there are six atomic data types.
Atomic Vectors of R Atomic Data Types
Following are the six types of vectors that could be built from R atomic data types.
Data Type | Example | Description |
---|---|---|
Logical | TRUE, FALSE | boolean values |
Numeric | 2, 45.9, 3782 | Real numbers (stored as double precision unless marked otherwise) |
Integer | 9L, 779L | Whole numbers, marked explicitly with the L suffix |
Complex | 8+9i | Real part + imaginary part |
Character | 'm', "hello" | Characters and strings |
Raw | 68 65 6c 6c 6f (the bytes of the string "hello") | Data stored as raw bytes |
Note: When the data type is Raw, the user has to know the format or protocol of the data in order to interpret it.
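To illustrate the note, here is a small sketch that reads the same raw bytes in two ways, once as text and once as plain numbers. The variable names are illustrative; charToRaw, rawToChar and as.integer are base R functions.

```r
# The same five bytes can be interpreted in more than one way.
bytes <- charToRaw("hello")
print(bytes)                      # shown in hexadecimal: 68 65 6c 6c 6f

as_text <- rawToChar(bytes)       # interpret the bytes as text: "hello"
as_numbers <- as.integer(bytes)   # interpret the bytes as numbers: 104 101 108 108 111

print(as_text)
print(as_numbers)
```

Which reading is meaningful depends entirely on what the bytes were meant to represent.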
Examples of Atomic Vectors
We shall run the following commands to assign data of different data types to a variable and print the class of the variable to verify its data type.
Logical
> x <- TRUE
> print(class(x))
[1] "logical"
Numeric
> x <- 67.54
> print(class(x))
[1] "numeric"
Integer
> x <- 63L
> print(class(x))
[1] "integer"
Complex
> x <- 6 + 4i
> print(class(x))
[1] "complex"
Character
> x <- "hello"
> print(class(x))
[1] "character"
Raw
> x <- charToRaw("hello")
> print(class(x))
[1] "raw"
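The difference between Numeric and Integer is easy to miss, so here is a short sketch that compares a number written with and without the L suffix. It uses typeof() in addition to class(); typeof() is a base R function that reports the internal storage type.

```r
a <- 5     # no suffix: stored as a double
b <- 5L    # L suffix: stored as an integer

class_a <- class(a)     # "numeric"
type_a  <- typeof(a)    # "double"
class_b <- class(b)     # "integer"
type_b  <- typeof(b)    # "integer"

same_value  <- (a == b)          # TRUE: the values compare equal
same_object <- identical(a, b)   # FALSE: the storage types differ

print(c(class_a, type_a, class_b, type_b))
print(c(same_value, same_object))
```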
Data Types of R – Objects
As already mentioned there are many types of R Objects. We shall look into some of the most commonly used data types. They are :
- Vectors
- Lists
- Matrices
- Arrays
- Factors
- Data Frames
We shall learn about these data types in detail.
R Vectors
In R programming language, a Vector is a fixed-length collection of values of a data type. The vector would get the data type of items in the collection.
Syntax – Define a Vector
variable <- c(comma separated atomic vectors belonging to a data type)
For example, c('apple','orange',"banana") is a vector holding values of data type Character, so it is a Character Vector. Integer Vectors and Complex Vectors are built the same way.
Following is an example of a Character Vector. We shall learn how to assign an R Character Vector to a variable, print the vector and verify the data type of vector.
> fruits = c('apple','orange',"banana")
> print(class(fruits))
[1] "character"
> print(fruits)
[1] "apple" "orange" "banana"
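Since a vector holds values of a single data type, it is worth seeing what happens when the items passed to c() are of mixed types. The sketch below is illustrative: R converts the items to one common type, and the variable names are made up for the example.

```r
fruits <- c('apple', 'orange', "banana")
n_fruits <- length(fruits)     # 3
second   <- fruits[2]          # "orange" (indexing starts at 1)

# Mixed inputs are converted to one common type.
nums  <- c(1L, 2.5, TRUE)      # becomes numeric: 1.0 2.5 1.0
mixed <- c(1L, "a", TRUE)      # becomes character: "1" "a" "TRUE"

print(class(nums))
print(class(mixed))
```

If the items must keep their own types, a List is the better fit.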
R Lists
In R programming language, a List is a collection of List Items (R Objects) that may belong to different data types. A List may contain another List as an item. A List Item may be a Matrix, an Array, a Factor, an R function or any other R Object.
Syntax to Define List
variable <- list(comma separated list items)
Following is an example of an R List. We shall learn how to assign a List containing a Number, a Character, a Function and another List to a variable and print the List.
> listX = list(51,"hello",tan,list(8L,"a"))
> print(listX)
[[1]]
[1] 51
[[2]]
[1] "hello"
[[3]]
function (x) .Primitive("tan")
[[4]]
[[4]][[1]]
[1] 8
[[4]][[2]]
[1] "a"
Please observe that the fourth and last item in the list is another list.
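The double-bracket labels in the printed output are also how items are read back. Here is a minimal sketch of accessing the items of the same list; the variable names on the left are illustrative.

```r
listX <- list(51, "hello", tan, list(8L, "a"))

first       <- listX[[1]]        # the number 51
inner       <- listX[[4]][[2]]   # "a", the second item of the nested list
tan_of_zero <- listX[[3]](0)     # calls the stored function: tan(0) is 0

# Single brackets return a (shorter) list instead of the item itself.
sub_list <- listX[1]

item_classes <- sapply(listX, class)
print(item_classes)              # one class per list item
```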
R Matrices
In R programming language, a Matrix is a 2-D set of data elements. A vector, the number of rows and the number of columns are used to create a Matrix.
Syntax – Define Matrix
variable <- matrix(vector, number of rows, number of columns, split by row or column)
split by row or column (the byrow argument) : if TRUE, the matrix is filled row by row; if FALSE (the default), it is filled column by column.
Following is an example to define a matrix :
1. Split by row.
> A = matrix(c(1,2,3,4,5,6,7,8),2,4,TRUE)
> print(A)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
2. Split by column.
> A <- matrix(c(1,2,3,4,5,6,7,8),2,4,FALSE)
> print(A)
     [,1] [,2] [,3] [,4]
[1,]    1    3    5    7
[2,]    2    4    6    8
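The positional arguments above can also be written with their names (nrow, ncol, byrow), which many readers find easier to follow. The sketch below builds both matrices that way and reads single elements back; it uses 1:8 as a shorthand for the same eight values, stored as integers.

```r
# Filled row by row.
A <- matrix(1:8, nrow = 2, ncol = 4, byrow = TRUE)
# Filled column by column (byrow defaults to FALSE).
B <- matrix(1:8, nrow = 2, ncol = 4)

dims <- dim(A)        # 2 4
a23  <- A[2, 3]       # row 2, column 3 of A: 7
b23  <- B[2, 3]       # row 2, column 3 of B: 6
first_row_B <- B[1, ] # 1 3 5 7

print(dims)
print(c(a23, b23))
```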
R Arrays
In R programming language, Arrays are N-Dimensional data sets.
Syntax – Define an R Array
variable <- array(vector, dimension)
where vector contains the elements of the array and dimension is a vector describing the dimensionality of the array. If dimension is c(2,5,4,8), the array is 4-Dimensional with dimensions 2x5x4x8.
Following is an example of a 3-D array.
> A = array(c(1,2,3,4,5,6,7,8,9,10,11,12),c(2,3,2))
> print(A)
, , 1
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
, , 2
     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12
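Elements of an array are addressed with one index per dimension. As a small sketch on the same 2x3x2 array (using 1:12 as shorthand for the twelve values), the snippet below reads two elements and extracts the second 2x3 slice.

```r
A <- array(1:12, dim = c(2, 3, 2))

dims  <- dim(A)        # 2 3 2
e231  <- A[2, 3, 1]    # row 2, column 3, first slice: 6
e122  <- A[1, 2, 2]    # row 1, column 2, second slice: 9
slice <- A[, , 2]      # leaving an index empty keeps that whole dimension

print(dims)
print(c(e231, e122))
print(dim(slice))      # the slice is a 2 x 3 matrix
```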
R Factors
In R programming language, a Factor is a vector along with the distinct values of vector as levels. Factors are useful during statistical modelling.
Levels are stored as R Characters.
Syntax – Define an R Factor
variable <- factor(vector)
Following is an example to define an R Factor :
> factorX = factor(c(1,4,7,2,6,7,1,6,4))
> print(factorX)
[1] 1 4 7 2 6 7 1 6 4
Levels: 1 2 4 6 7
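To see what a factor stores, the following sketch inspects the same factor with base R functions. It shows that the levels are characters, that each value is held as a position in the level list, and how often each level occurs. The variable names are illustrative.

```r
factorX <- factor(c(1, 4, 7, 2, 6, 7, 1, 6, 4))

lv     <- levels(factorX)        # "1" "2" "4" "6" "7" (characters)
codes  <- as.integer(factorX)    # position of each value in lv: 1 3 5 2 4 5 1 4 3
counts <- table(factorX)         # number of occurrences per level

# To get the original numbers back, convert via character first.
original <- as.numeric(as.character(factorX))

print(lv)
print(codes)
print(counts)
```

Note that as.integer() on a factor returns the level positions, not the original values, which is why the conversion back goes through as.character().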
R Data Frames
In R programming language, a Data Frame is a set of equal length vectors. The vectors could be of different data types.
Syntax – Define an R Data Frame
variable <- data.frame(vectorA, vectorB, vectorC, .., vectorN)
Following is an example to define an R Data Frame :
> dataX = data.frame(values = c(21,42,113), RGB = c('red','blue','green'))
> print(dataX)
  values   RGB
1     21   red
2     42  blue
3    113 green
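Here is a short sketch of how the columns of this data frame can be inspected and filtered. It passes stringsAsFactors = FALSE explicitly, which is an addition to the original example: R versions before 4.0 convert character columns to factors by default, so setting it makes the result the same across versions.

```r
dataX <- data.frame(values = c(21, 42, 113),
                    RGB = c('red', 'blue', 'green'),
                    stringsAsFactors = FALSE)

n_rows <- nrow(dataX)                      # 3
n_cols <- ncol(dataX)                      # 2
col_classes <- sapply(dataX, class)        # one class per column

second_value <- dataX$values[2]            # 42
big <- dataX[dataX$values > 30, "RGB"]     # RGB of rows with values above 30

print(col_classes)
print(big)
```

Each column keeps its own data type, while all columns share the same length.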
Conclusion
In this R Tutorial, we have learnt about different R atomic data types and different data types of R-Objects used most commonly in R programming language.