Graphical Displays of Data
Histograms
The simplest way to display the shape of a distribution is a histogram: a count of how many observations fall within specified divisions ("bins") of the x-axis.
> hist(airquality$Temp)
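R usually chooses a sensible number of classes (bins), but you can suggest a different number with the breaks argument (or nclass, which is kept for compatibility). A single number is treated as a suggestion only, because R rounds the breakpoints to "pretty" values.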
> hist(airquality$Temp, breaks = 20)
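To see how R interprets that suggestion, one option is to compute the histogram without drawing it and inspect the result. This is only a small sketch; the names temp and h are chosen here for illustration.

```r
temp <- airquality$Temp

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(temp, breaks = 20, plot = FALSE)

h$breaks           # the breakpoints R actually chose
length(h$counts)   # the number of bins actually used, which may differ from 20
```

The breakpoints come out evenly spaced at round values, which is why the bin count need not match the number requested.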
By passing breaks as a vector rather than a single number, you get full control over the interval divisions. By default, R plots the frequencies (counts) in the histogram. If you would rather plot on the density scale, where the area of each bar is its relative frequency and the total area is 1, use the argument prob=TRUE.
> hist(airquality$Temp, prob=TRUE)
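As a rough illustration of both points, the sketch below supplies the bin edges explicitly and then checks that the density-scale bars have total area 1. The edges (50 to 100 in steps of 5) are an assumption chosen to cover the observed temperatures; any vector spanning the range of the data would do.

```r
temp <- airquality$Temp

# Explicit bin edges; they must cover the full range of the data
edges <- seq(50, 100, by = 5)
h <- hist(temp, breaks = edges, plot = FALSE)

h$counts                               # observations per bin
total_area <- sum(h$density * diff(h$breaks))
total_area                             # bar heights times bar widths

# Draw it on the density scale
hist(temp, breaks = edges, prob = TRUE)
```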
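There are a LOT of options to spruce this up. Here is code for a much nicer histogram, with a title, a kernel density estimate drawn as a blue line, and a red "rug" marking the individual observations along the x-axis:
> hist(airquality$Temp, prob=TRUE, main="Temperature")
> lines(density(airquality$Temp), col="blue")
> rug(airquality$Temp, col="red")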
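One practical wrinkle is that the density curve can rise above the tallest bar and get cut off. A possible workaround, sketched here with names (temp, d, h, top) chosen for illustration, is to compute both pieces first and set ylim from whichever is higher. The adjust argument of density() scales the bandwidth; the value 2 is just an example of a smoother curve.

```r
temp <- airquality$Temp

d <- density(temp)               # kernel density estimate (Gaussian kernel by default)
h <- hist(temp, plot = FALSE)    # compute the bins without drawing

# Leave room for whichever is taller: the bars or the curve
top <- max(h$density, d$y)

hist(temp, prob = TRUE, main = "Temperature", ylim = c(0, top))
lines(d, col = "blue")
lines(density(temp, adjust = 2), col = "blue", lty = 2)  # smoother, wider bandwidth
rug(temp, col = "red")
```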
If we want to fit a normal curve over the data, instead of the command density() we can use dnorm() and curve() like so:
> m <- mean(airquality$Temp); std <- sd(airquality$Temp)
> hist(airquality$Temp, prob=TRUE, main="Temperature")
> curve(dnorm(x, mean=m, sd=std), col="darkblue", lwd=2, add=TRUE)
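**Note: You need to make sure that you have prob=TRUE as an argument in your histogram! The normal curve is on the density scale, so on a histogram of counts it would be far too small to see.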
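If you do want to keep counts on the y-axis, one possible approach is to rescale the normal density instead: multiply it by the number of observations and the bin width. This is a sketch that assumes equal-width bins (which the default breaks give); the names n and binwidth are introduced here for illustration.

```r
temp <- airquality$Temp
m <- mean(temp)
std <- sd(temp)

h <- hist(temp, main = "Temperature")   # counts on the y-axis
n <- length(temp)
binwidth <- diff(h$breaks)[1]           # assumes all bins have the same width

# count = density * n * binwidth, so scale the normal curve the same way
curve(n * binwidth * dnorm(x, mean = m, sd = std),
      col = "darkblue", lwd = 2, add = TRUE)
```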
Typing help(hist) (or ?hist) at the command line shows all the arguments that hist() accepts. There are a lot of options.
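If you want two or more plots in the same window, you can use the command
> par(mfrow=c(nrows, ncols))
where nrows and ncols are the numbers of rows and columns in the grid of plots. The grid is filled row by row.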
With the airquality dataset, we can do
> par(mfrow=c(2,2))
> hist(airquality$Temp, prob=TRUE)
> hist(airquality$Ozone, prob=TRUE)
> hist(airquality$Wind, prob=TRUE)
> hist(airquality$Solar.R, prob=TRUE)
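The layout set by par(mfrow=...) stays in effect for later plots, so it is often convenient to save the old settings and restore them afterwards. The following is a sketch of that pattern, using a loop over column names to draw the same four histograms with titles. The names vars and op are chosen here for illustration.

```r
vars <- c("Temp", "Ozone", "Wind", "Solar.R")

# par() returns the previous settings, so they can be restored later
op <- par(mfrow = c(2, 2))

for (v in vars) {
  # hist() drops missing values, so Ozone and Solar.R need no special handling
  hist(airquality[[v]], prob = TRUE, main = v, xlab = v)
}

par(op)   # back to one plot per window
```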
Histograms in R (R Tutorial 2.3) MarinStatsLectures