In Part 1 we installed R and used it to create a variable and summarise it using a few simple commands. Today let’s re-create that variable and also create a second variable, and see what we can do with them.

As before, we take `height` to be a variable that describes the heights (in cm) of ten people. Copy and paste the following code into the R command line to create this variable.

`height = c(186, 165, 149, 206, 143, 187, 191, 179, 162, 185)`

Now let’s take `weight` to be a variable that describes the weights (in kg) of the same ten people. Copy and paste the following code into the R command line to create the `weight` variable.

`weight = c(89, 56, 60, 116, 51, 75, 84, 78, 67, 85)`

Both variables are now stored in the R workspace. To view them, enter:

`height`

`weight`
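Each position in the two vectors refers to the same person, so it can be worth confirming that they line up before plotting. The following is a minimal sketch using base R only; the name `people` is an illustrative choice and not part of the original tutorial.

```r
height = c(186, 165, 149, 206, 143, 187, 191, 179, 162, 185)
weight = c(89, 56, 60, 116, 51, 75, 84, 78, 67, 85)

# Both vectors should hold one value per person
length(height)
length(weight)

# Optional: view the pairs side by side, one row per person
# (the name `people` is hypothetical, chosen for this example)
people = data.frame(height, weight)
people
```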

We can now create a simple plot of the two variables as follows:

`plot(weight, height)`

However, this is a rather simple plot and we can embellish it a little. Copy and paste the following code into the R command line:

`plot(weight, height, pch = 16, cex = 1.3, col = "red", main = "My first plot using R", xlab = "Weight (kg)", ylab = "Height (cm)")`

In the above code, the argument `pch = 16` draws solid dots, while `cex = 1.3` makes the dots 1.3 times bigger than the default (`cex = 1`). More about these arguments later.
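To get a feel for what these two arguments do, one option is to redraw the same data with different values. The sketch below is only an illustration; the particular symbol, size, and colour are arbitrary choices rather than anything prescribed by the tutorial.

```r
height = c(186, 165, 149, 206, 143, 187, 191, 179, 162, 185)
weight = c(89, 56, 60, 116, 51, 75, 84, 78, 67, 85)

# pch selects the plotting symbol: 1 is the default open circle,
# 16 a solid dot, 17 a solid triangle
# cex scales the symbol size relative to the default of 1
plot(weight, height, pch = 17, cex = 2, col = "blue",
     xlab = "Weight (kg)", ylab = "Height (cm)")
```

Entering `?points` at the command line lists the available symbol codes.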

Now let’s perform a linear regression on the two variables by entering the following at the command line:

`lm(height ~ weight)`

The output shows that the intercept is 102.7071 and the slope is 0.9539. In other words, the fitted line is height = 102.7071 + 0.9539 × weight.
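If you would rather not read the coefficients off the screen, one possible approach is to store the fitted model in an object and extract them with `coef()`. The sketch below also recomputes the slope and intercept by hand from the usual least-squares formulas as a cross-check. The names `fit`, `slope_by_hand` and `intercept_by_hand` are illustrative and do not appear in the original tutorial.

```r
height = c(186, 165, 149, 206, 143, 187, 191, 179, 162, 185)
weight = c(89, 56, 60, 116, 51, 75, 84, 78, 67, 85)

# Store the fitted model instead of only printing it
fit = lm(height ~ weight)

# Named vector with two entries: "(Intercept)" and "weight" (the slope)
coef(fit)

# Cross-check: least-squares slope is cov(x, y) / var(x),
# and the intercept is mean(y) - slope * mean(x)
slope_by_hand = cov(weight, height) / var(weight)
intercept_by_hand = mean(height) - slope_by_hand * mean(weight)
slope_by_hand
intercept_by_hand
```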

Finally, with the plot window still open, we can add a best-fit line to our plot by entering the following at the command line:

`abline(102.7071, 0.9539)`
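Typing the rounded coefficients by hand works, but `abline()` also accepts a fitted model object directly, which avoids copying numbers. A minimal sketch, assuming the model is stored under the illustrative name `fit`:

```r
height = c(186, 165, 149, 206, 143, 187, 191, 179, 162, 185)
weight = c(89, 56, 60, 116, 51, 75, 84, 78, 67, 85)

# Draw the scatterplot first; abline() adds to an existing plot
plot(weight, height, pch = 16, cex = 1.3, col = "red",
     xlab = "Weight (kg)", ylab = "Height (cm)")

# Fit the regression and pass the model object straight to abline()
fit = lm(height ~ weight)
abline(fit)
```

This draws the same line as `abline(102.7071, 0.9539)`, using the unrounded coefficients.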

None of this was so difficult! 🙂

In Part 3 we will look again at regression and create more sophisticated plots.

David

#### Annex: R code used

[code lang="r"]
# Creating the height variable
height = c(186, 165, 149, 206, 143, 187, 191, 179, 162, 185)

# Creating the weight variable
weight = c(89, 56, 60, 116, 51, 75, 84, 78, 67, 85)

# Show content of both variables
height
weight

# Create a graph (scatterplot) for two variables
plot(weight, height)

# Improved scatterplot for two variables
plot(weight, height, pch = 16, cex = 1.3, col = "red", main = "My first plot using R", xlab = "Weight (kg)", ylab = "Height (cm)")

# Estimating the simple linear regression
lm(height ~ weight)

# Adding regression line on the existing graph
abline(102.7071, 0.9539)
[/code]

Screenshots of the R console with all results: