Writing Simple Functions in R: Why and How
The R language allows the user to create objects of mode function. These are true R functions that are stored in a special internal form and may be used in further expressions and so on. In the process, the language gains enormously in power, convenience and elegance, and learning to write useful functions is one of the main ways to make your use of R comfortable and productive.
It should be emphasized that most of the functions supplied as part of the R system, such as mean(), var(), sd() and so on, are themselves written in R and thus do not differ materially from user-written functions.
A function is defined by an assignment of the form
> name <- function(arg_1, arg_2, ...) expression
The expression is an R expression (usually a grouped expression enclosed in braces) that uses the arguments, arg_i, to calculate some value. The value of the expression is the value returned by the function.
A call to the function then usually takes the form name(expr_1, expr_2, ...) and may occur anywhere a function call is legitimate.
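As a minimal sketch of the definition and call forms described above, the block below defines a small function and calls it in a few ways. The name square.diff and its arguments are hypothetical, chosen only for illustration.

```r
# Hypothetical example: square.diff is an illustrative name, not a built-in.
square.diff <- function(a, b) {
  d <- a - b   # intermediate value, local to the function
  d^2          # the value of the last expression is what the function returns
}

square.diff(5, 2)           # positional arguments
square.diff(b = 2, a = 5)   # named arguments, in any order
1 + square.diff(5, 2)       # a call can sit inside a larger expression
```

The first two calls give the same value, since naming the arguments only changes how they are matched, not what is computed.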
Simple Functions
As a first example, consider a function to calculate a one-sample t-statistic to test the null hypothesis that the population mean weight in the height and weight data set is x lb, where x is specified by the user. This is an artificial example, of course, since there are other, simpler ways of achieving the same end (we'll do this next class).
> onesam <- function(y1, x) {
    n1 <- length(y1)   ## sample size
    yb1 <- mean(y1)    ## mean of y1
    s1 <- var(y1)      ## variance of y1
    ## t-statistic = (mean - x)/SE
    tstat <- (yb1 - x)/sqrt(s1/n1)
    tstat
}
With this function defined, you could perform one-sample t-tests using a call such as
> t.statistic <- onesam(htwtdata$weight, 130); t.statistic
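To check that the function works, compare its result with R's built-in t.test() function. Subtracting 130 from the weights and testing against the default mean of 0 is equivalent to calling t.test(htwtdata$weight, mu = 130), so the t value reported should match t.statistic:
> t.test(htwtdata$weight - 130)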
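The same check can be sketched without the course data set. The weights below are made-up numbers used only for illustration (they are not htwtdata), and onesam is repeated in compact form so that the block runs on its own.

```r
# Hypothetical weights in lb, invented for illustration; not the course data.
w <- c(128, 135, 142, 119, 150, 131, 138, 125)

# Same calculation as onesam above, written compactly.
onesam <- function(y1, x) {
  n1 <- length(y1)
  (mean(y1) - x) / sqrt(var(y1) / n1)
}

by.hand <- onesam(w, 130)
shifted <- unname(t.test(w - 130)$statistic)       # tests mean(w - 130) = 0
with.mu <- unname(t.test(w, mu = 130)$statistic)   # tests mean(w) = 130

all.equal(by.hand, shifted)   # should be TRUE if the function is correct
all.equal(by.hand, with.mu)   # should be TRUE as well
```

The statistic component of the object returned by t.test() holds the t value; unname() drops its "t" label so that the comparison is on the numbers alone.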
A function can be called within a loop, or can be applied to elements of a vector or matrix at once, making R very powerful to use. We will continue looking at similar examples throughout the course.
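A short sketch of both ideas, calling a function in a loop and applying it over a vector or the columns of a matrix, might look like the following. The data, the null values and the matrix m are all hypothetical, and onesam is redefined so the block is self-contained.

```r
# Illustrative numbers only; not the course data.
w <- c(128, 135, 142, 119, 150, 131, 138, 125)
onesam <- function(y1, x) (mean(y1) - x) / sqrt(var(y1) / length(y1))

nulls <- c(120, 130, 140)   # several hypothesized means to try

# 1. Calling the function inside a loop
t.loop <- numeric(length(nulls))
for (i in seq_along(nulls)) {
  t.loop[i] <- onesam(w, nulls[i])
}

# 2. The same computation with sapply, one call per element of nulls
t.apply <- sapply(nulls, function(m0) onesam(w, m0))

# 3. Applying the function to each column of a matrix
m <- cbind(a = w, b = w + 5)
t.cols <- apply(m, 2, onesam, x = 130)   # x = 130 is passed on to onesam
```

The loop and sapply versions produce the same vector of t-statistics; sapply simply removes the bookkeeping of creating and filling t.loop.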
Another example:
min.max.range <- function(x) {
    minimum <- min(x)
    maximum <- max(x)
    r <- maximum - minimum
    print(minimum)
    print(maximum)
    print(r)
}
vec.1 <- c(10, 20, 50)
min.max.range(vec.1)
[1] 10
[1] 50
[1] 40
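Note that min.max.range prints its three results, but only the value of its last expression (the range) is returned, so the minimum and maximum cannot be reused in later calculations. One possible alternative, sketched here under the hypothetical name min.max.range2, is to return all three values in a named vector.

```r
# Hypothetical variant that returns its results instead of printing them.
min.max.range2 <- function(x) {
  c(minimum = min(x), maximum = max(x), range = max(x) - min(x))
}

vec.1 <- c(10, 20, 50)
res <- min.max.range2(vec.1)   # store the result for later use
res                            # all three values, labelled by name
res["range"]                   # a single component, extracted by name
```

Returning a value rather than printing it lets the caller decide what to do with the result, for example storing it or passing it to another function.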
Exercise: Write a function called summarystat that returns the mean, median, and standard deviation of a set of numbers.
Summary
- Review of matrix/dataframe operations
- if statements, ifelse(), loops
- Conditional indexing
- Merging & sorting dataframes (not on homework)
- Creating functions
Reading
- VS. Chapter 8.1, 8.2, 9 and 10
Assignment
- Homework 2 due, Homework 3 assigned. (With extra credit for those interested in simulation.)