Quick start with R: Box plots (Part 13)

In Part 13, let’s see how to create boxplots in R with the boxplot() command, which is easy to use. We start with a simple example: set up a vector of numbers and then plot it.
Boxplots can be created for individual variables or for variables by group. The syntax is boxplot(x, data=), where x is a formula and data denotes the data frame providing the data. An example of a formula is: y ~ group, where you create a separate boxplot for each value of group.
Use varwidth=TRUE to make boxplot widths proportional to the square root of the sample sizes. Use horizontal=TRUE to draw the boxes horizontally instead of vertically.
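Here is a small sketch of those two arguments in action. It assumes the built-in mtcars data frame and groups miles per gallon by number of cylinders; the axis labels are just my choice for illustration.

```r
# How many cars fall into each cylinder group?
group_sizes <- table(mtcars$cyl)
print(group_sizes)

# varwidth = TRUE scales each box width by the square root of its group size.
# horizontal = TRUE draws the boxes sideways, so the values run along the x axis.
boxplot(mpg ~ cyl, data = mtcars,
        varwidth = TRUE, horizontal = TRUE,
        xlab = "Miles per Gallon", ylab = "Number of Cylinders")
```

The group with the most cars gets the widest box, which makes unequal group sizes visible at a glance.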
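The standard boxplot extends its whiskers to the minimum and maximum and so does not indicate outliers. The modified boxplot limits the whisker length and plots any points beyond the whiskers individually, which highlights outliers. The modified boxplot is the default in R.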
A <- c(3, 2, 5, 6, 4, 8, 1, 2, 3, 2, 4)
boxplot(A)
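To see how R flags outliers, here is a short sketch using boxplot.stats(), the helper that computes the numbers behind a boxplot. The vector B and its extra value 15 are made up purely for illustration; they are not part of the original example.

```r
A <- c(3, 2, 5, 6, 4, 8, 1, 2, 3, 2, 4)
B <- c(A, 15)   # hypothetical extra value, added only to force an outlier

# $out holds the points that lie beyond the whiskers.
out_A <- boxplot.stats(A)$out
out_B <- boxplot.stats(B)$out
print(out_A)
print(out_B)

# Plot both vectors side by side; any outlier is drawn as a separate point.
boxplot(A, B, names = c("A", "B"))
```

Comparing the two boxes shows the difference between a whisker that reaches the largest value and one that stops short of an outlying point.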


R has some built-in datasets. One of them is mtcars. Let’s look at it by printing the first five rows of this data frame.
head(mtcars, n=5)

Now create a boxplot of vehicle weight for each number of cylinders.
boxplot(wt~cyl, data=mtcars, main=toupper("Vehicle Weight"), font.main=3, cex.main=1.2, xlab="Number of Cylinders", ylab="Weight", font.lab=3, col="darkgreen")

Let’s create a notched boxplot of miles per gallon for each number of cylinders, with a different colour for each box. Boxplots help us to make a visual comparison across levels and to check for equality of medians.
boxplot(mpg~cyl, data=mtcars, main=toupper("Fuel Consumption"), font.main=3, cex.main=1.2, col=c("red","blue", "yellow"), xlab="Number of Cylinders", ylab="Miles per Gallon", font.lab=3, notch=TRUE, range = 0)

The notches do not overlap, which is strong evidence that the medians differ.
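If you would rather check the notches numerically than by eye, the sketch below pulls the notch limits out of boxplot.stats(), which returns them in its conf element (the median plus or minus 1.58 times the interquartile range divided by the square root of n). The names groups and notches are my own.

```r
# Split miles per gallon into one vector per cylinder group.
groups <- split(mtcars$mpg, mtcars$cyl)

# One column per group: lower and upper notch limits.
notches <- sapply(groups, function(x) boxplot.stats(x)$conf)
rownames(notches) <- c("lower", "upper")
print(round(notches, 2))

# TRUE if each group's notch sits entirely above the next group's notch.
no_overlap <- notches["lower", "4"] > notches["upper", "6"] &&
              notches["lower", "6"] > notches["upper", "8"]
print(no_overlap)
```

With groups this small, R may also warn that some notches extend outside the hinges; the plot is still drawn.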
NOTE: The range argument determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point that is no more than range times the interquartile range from the box; the default is 1.5. The argument range = 0 ensures that the whiskers extend to the data extremes, so no points are drawn as outliers. The argument horizontal=TRUE draws the boxes horizontally.
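As a rough illustration of the range argument, the sketch below compares the default whiskers with range = 0 for the eight-cylinder cars. It uses boxplot.stats(), which takes the same multiplier under the name coef; the variable names are my own.

```r
# Miles per gallon for the eight-cylinder cars only.
mpg8 <- mtcars$mpg[mtcars$cyl == 8]

default_stats <- boxplot.stats(mpg8)            # coef = 1.5, the default
full_stats    <- boxplot.stats(mpg8, coef = 0)  # whiskers reach the extremes

# $stats: lower whisker, lower hinge, median, upper hinge, upper whisker.
print(default_stats$stats)
print(default_stats$out)    # points that would be drawn individually
print(full_stats$stats)
print(full_stats$out)       # empty: nothing is treated as an outlier
```

So range = 0 gives a tidier plot, at the cost of hiding points that the default setting would have flagged.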
That wasn’t so hard! In Part 14 we will look at further plotting techniques in R.
See you later!
David

Annex: R codes used

[code lang="r"]
# Create a vector of numbers.
A <- c(3, 2, 5, 6, 4, 8, 1, 2, 3, 2, 4)
boxplot(A)

# Print the first five rows of the mtcars data frame.
head(mtcars, n=5)

# Create boxplots of vehicle weight for each number of cylinders.
boxplot(wt~cyl, data=mtcars, main=toupper("Vehicle Weight"), font.main=3, cex.main=1.2, xlab="Number of Cylinders", ylab="Weight", font.lab=3, col="darkgreen")

# Create notched boxplots of miles per gallon for each number of cylinders.
boxplot(mpg~cyl, data=mtcars, main=toupper("Fuel Consumption"), font.main=3, cex.main=1.2, col=c("red","blue", "yellow"), xlab="Number of Cylinders", ylab="Miles per Gallon", font.lab=3, notch=TRUE, range = 0)
[/code]