Graphical Displays of Data

 

Histograms


The simplest way to display the shape of a distribution of data is a histogram: a count of how many observations fall within specified divisions ("bins") of the x-axis.

> hist(airquality$Temp)

 

R usually chooses a sensible number of classes (bins), but you can suggest a different number with the nclass (number of classes) or breaks argument.

> hist(airquality$Temp, breaks = 20)
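A single number passed to breaks is treated only as a suggestion, so the bins R actually draws can differ from the request. A minimal sketch for inspecting what was chosen, using plot = FALSE so that hist() returns the bin information without drawing anything (the variable name h is only illustrative):

```r
# Compute the histogram without drawing it
h <- hist(airquality$Temp, breaks = 20, plot = FALSE)

h$breaks          # bin boundaries R actually used
length(h$counts)  # number of bins actually produced
sum(h$counts)     # total count, one per non-missing observation
```

Comparing length(h$counts) with the number you asked for shows how closely R followed the suggestion.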

 

By passing breaks as a vector rather than a single number, you have full control over the interval divisions. By default, R plots the frequencies (counts) in the histogram. If you would rather plot on a relative scale, use the argument prob=T, which rescales the bars to densities so that their total area is 1.

> hist(airquality$Temp, prob=T)
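The two ideas above can be combined. The sketch below assumes bin edges every 5 degrees from 50 to 100 (an arbitrary choice for illustration; the edges must cover the full range of the data) and then checks that the density-scaled bars have total area 1. Here prob = TRUE is the spelled-out form of prob=T, and edges and h are illustrative names.

```r
# Explicit bin edges every 5 degrees; they must span all of the data
edges <- seq(50, 100, by = 5)

# prob = TRUE puts the bar heights on the density scale
h <- hist(airquality$Temp, breaks = edges, prob = TRUE)

# Area of each bar = height * width; the areas add up to 1
sum(h$density * diff(h$breaks))
```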

 

There are many options for sprucing this up. Here is code for a much nicer histogram, with a title, a density curve, and a rug of the individual observations:

> hist(airquality$Temp,prob=T,main="Temperature")

> lines(density(airquality$Temp),col="blue")

> rug(airquality$Temp,col="red")
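The same overlay needs one extra step for a column with missing values. Temp has none, but Ozone does, and density() stops with an error on missing values unless told to drop them. A sketch under that assumption (the names ozone and d are illustrative):

```r
ozone <- airquality$Ozone

# density() refuses missing values unless na.rm = TRUE
d <- density(ozone, na.rm = TRUE)

hist(ozone, prob = TRUE, main = "Ozone")
lines(d, col = "blue")                  # smoothed density estimate
rug(ozone[!is.na(ozone)], col = "red")  # one tick per non-missing observation
```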

 

If we want to overlay a fitted normal curve instead of a density estimate, we can replace density() with dnorm() and curve(), like so:

> m <- mean(airquality$Temp); std <- sd(airquality$Temp)

> hist(airquality$Temp,prob=T,main="Temperature")


> curve(dnorm(x, mean=m, sd=std), col="darkblue", lwd=2, add=TRUE)

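Note: Make sure you pass prob=T as an argument to your histogram! Without it the bars show counts, and the normal curve, which is on the density scale, will lie almost flat along the x-axis.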
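To see why the scale matters, the two sets of bar heights can be compared with the peak of the fitted normal curve. This is only a rough check, using plot = FALSE to get the numbers without drawing; h, m and std are illustrative names.

```r
# Bin information only, no plot
h <- hist(airquality$Temp, plot = FALSE)

m   <- mean(airquality$Temp)
std <- sd(airquality$Temp)

max(h$counts)                 # tallest bar on the count scale
max(h$density)                # tallest bar on the density scale
dnorm(m, mean = m, sd = std)  # peak of the fitted normal curve
```

The last two values are of similar size, while the first is far larger, which is why the curve only lines up with a density-scaled histogram.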

Typing help(hist) at the command line shows every argument you can pass to a standard histogram. There are many.

If you want two or more plots in the same window, set up a grid with the following command, replacing nrows and ncols with the number of rows and columns you want:

> par(mfrow=c(nrows, ncols))

With the airquality dataset, we can do

> par(mfrow=c(2,2))

> hist(airquality$Temp, prob=T)

> hist(airquality$Ozone, prob=T)

> hist(airquality$Wind, prob=T)

> hist(airquality$Solar.R, prob=T)
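The grid setting stays in effect for later plots until it is changed. One possible pattern, sketched here with the illustrative names op and v, is to save the old settings, draw the four histograms in a loop, and then restore the layout:

```r
# Switch to a 2 x 2 grid and remember the previous settings
op <- par(mfrow = c(2, 2))

# One density-scaled histogram per column, titled with the column name
for (v in c("Temp", "Ozone", "Wind", "Solar.R")) {
  hist(airquality[[v]], prob = TRUE, main = v, xlab = v)
}

# Restore the previous layout so later plots fill the whole window again
par(op)
```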

