Linear Regression

Linear Regression is a statistical technique that establishes a linear relationship between two variables.

This linear relationship is a straight line in the X-Y coordinate system, defined by the following equation.

y = mx + c

where m is the slope or coefficient of x, and c is the y-intercept or just intercept.

In R, the lm() function is used to derive the linear relationship between two variables from given data points for X and Y.

Syntax

The syntax to call the lm() function with the data points as x and y vectors is

lm(y~x)

The lm() function returns a fitted model object (of class "lm"), referred to here as the relation object.

The relation object contains the coefficient and intercept values, which define a straight line.
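
As a minimal sketch of how these two values can be read back from the relation object, the snippet below fits a line to hypothetical points that lie exactly on y = 2x + 1 and extracts the intercept and slope with coef(). The data and the variable names line_coefs, c_value and m_value are assumptions made for this illustration only.

```r
# Hypothetical data: five points that lie exactly on y = 2x + 1
x <- c(1, 2, 3, 4, 5)
y <- 2 * x + 1

relation <- lm(y ~ x)

# coef() returns a named numeric vector with entries "(Intercept)" and "x"
line_coefs <- coef(relation)
c_value <- line_coefs[["(Intercept)"]]  # intercept c
m_value <- line_coefs[["x"]]            # slope m

print(c_value)
print(m_value)
```

Because the points sit exactly on the line, the extracted values should match c = 1 and m = 2 up to floating-point rounding.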

Using the relation object returned by lm(), we can predict the value of y for a given value of x. The new x value must be passed as a data frame whose column name matches the predictor name used in the formula (x here). The syntax of the predict() function is

predict(relation, x_value)

Examples

Linear Regression – Find Relation

In the following example, we take two vectors: x and y, and find the linear relation between these two variables.

example.r

x <- c(14.2, 17, 13, 18, 12, 13, 17, 15.3, 14.8, 13)
y <- c(4, 6, 3.5, 7, 3, 4, 6, 5, 4.4, 3)

relation <- lm(y~x)
print(relation)

Output

tutorialkart$ Rscript example.r
Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x  
    -5.0179       0.6523 

Therefore, the linear (straight line) relation between x and y is

y = 0.6523 x - 5.0179
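
lm() fits the line by ordinary least squares. For a single predictor, the least-squares slope and intercept also have a simple closed form, so the coefficients above can be cross-checked by hand. The following is only a sketch of that check; the names slope, intercept and fitted_coefs are introduced here for illustration.

```r
x <- c(14.2, 17, 13, 18, 12, 13, 17, 15.3, 14.8, 13)
y <- c(4, 6, 3.5, 7, 3, 4, 6, 5, 4.4, 3)

# Closed-form least-squares estimates for one predictor:
#   slope     = cov(x, y) / var(x)
#   intercept = mean(y) - slope * mean(x)
slope <- cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)

# Coefficients as estimated by lm(), in the order (Intercept), x
relation <- lm(y ~ x)
fitted_coefs <- coef(relation)

# Both lines should print the same intercept and slope
print(round(c(intercept, slope), 4))
print(round(unname(fitted_coefs), 4))
```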

Prediction

Now, we shall predict the value of y, using this relation, for a value of x.

example.r

x <- c(14.2, 17, 13, 18, 12, 13, 17, 15.3, 14.8, 13)
y <- c(4, 6, 3.5, 7, 3, 4, 6, 5, 4.4, 3)

relation <- lm(y~x)

x_new <- data.frame(x = 125)
y_predicted <- predict(relation, x_new)

print(y_predicted)

Output

      1 
76.5158 
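
The prediction is simply the fitted line evaluated at the new x. As a rough sketch, the snippet below predicts for several x values at once and compares the result with y = mx + c computed by hand from the coefficients. The extra x values 13 and 15 and the names m, c_value and y_manual are assumptions added for this illustration.

```r
x <- c(14.2, 17, 13, 18, 12, 13, 17, 15.3, 14.8, 13)
y <- c(4, 6, 3.5, 7, 3, 4, 6, 5, 4.4, 3)

relation <- lm(y ~ x)

# newdata must be a data frame whose column name matches the predictor "x"
x_new <- data.frame(x = c(13, 15, 125))
y_predicted <- predict(relation, x_new)

# The same predictions computed by hand from y = m * x + c
m <- coef(relation)[["x"]]
c_value <- coef(relation)[["(Intercept)"]]
y_manual <- m * x_new$x + c_value

print(y_predicted)
print(y_manual)
```

Note that x = 125 lies far outside the range of the observed x values (12 to 18), so that prediction is an extrapolation of the fitted line and may be less reliable than predictions inside the observed range.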

Plot Data Points and Straight Line

We can plot the data points together with the straight line found by linear regression.

example.r

x <- c(14.2, 17, 13, 18, 12, 13, 17, 15.3, 14.8, 13)
y <- c(4, 6, 3.5, 7, 3, 4, 6, 5, 4.4, 3)

relation <- lm(y~x)

# save plot to the specified file
png(file = "output.png")

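# plot the data points (x on the horizontal axis, y on the vertical axis)
plot(x, y, col = "blue", main = "Cost vs Temp",
     cex = 1.3, pch = 16, xlab = "Temp in F", ylab = "Cost in Dollars")

# draw the fitted straight line on top of the points
abline(relation)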

# Save the file.
dev.off()

Output

Figure: R - Linear Regression (scatter plot of the data points with the fitted straight line)

Conclusion

In this R Tutorial, we learned how to use linear regression to find a linear relationship between two variables using the lm() function, predict y from the generated relation and an input x, and plot the data points together with the fitted straight line.