Basic Statistical Analysis Using the R Statistical Package

Contents

Page 1
Basic Statistical Analysis Using the R Statistical Package
Introduction

Page 2
1.2 The assign operator and inputting a data vector into R

Page 3
1.3 Bringing data into R from an Excel file or a text file
1.3.1 Bringing data into R from an Excel file using the read.csv(file.choose()) command
1.3.2 (Optional) Bringing data into R from an Excel file using the read.csv() command
1.3.3 Accessing individual variables from an imported data set
The 'dataframename$variablename' convention
The attach( ) command
1.3.4 Viewing or editing a data frame using the R data editor
1.3.5 (Optional) Bringing data into R from a space-delimited text file
1.3.6 (Optional) Specifying the default folder for R

Page 4
1.4 Creating new variables in R
1.4.1 Calculating new variables
1.4.2 Creating categorical variables

Page 5
1.5 Saving an R dataframe as a .csv file
1.6 The help( ) and help.search( ) functions

Page 6
1.7 Finding means, medians and standard deviations
1.8 Finding frequencies and proportions for categorical variables

Page 7
1.9 Subgroup analyses: finding means and standard deviations for subgroups

Page 8
1.10 Handling missing data in R

Page 9
1.11 Graphing histograms and box plots

Page 10
1.12 Statistical tables in R
The standard normal (z) distribution
The t distribution
The chi-square distribution

Page 11
2.1 Confidence Intervals for a Single Group
2.1.1 Confidence interval for a mean
2.1.2 Confidence interval for a proportion
Confidence Intervals for Comparing Means
2.1.3 Confidence interval for a difference in means, independent samples
2.1.4 Confidence interval for a mean difference, paired samples
Confidence Intervals for Comparing Frequencies
2.1.5 Confidence interval for the difference in proportions, independent samples
2.1.6 Confidence interval for a risk ratio
2.1.6.1 Confidence interval for a RR from a per-subject data set
2.1.6.2 Inputting counts from a 2x2 table into R for calculation of a RR
2.1.7 Confidence interval for an odds ratio

Page 12
2.2 t-tests for means of measurement outcomes
2.2.1 The one-sample t-test for a mean
2.2.2 The independent samples t-test to compare two means
2.2.3 The paired samples t-test

Page 13
2.3 z-tests for proportions, categorical outcomes
2.3.1 One-sample z-test for a proportion
2.3.2 Two-sample z-test comparing two proportions

Page 14
2.4 One factor ANOVA comparing means across several groups

Page 15
2.5 Chi-square tests for categorical outcomes
2.5.1 The chi-square goodness-of-fit test for one sample
2.5.2 Contingency table analysis and the chi-square test of independence
2.5.2.1 The chi-square test of independence from per-subject data
2.5.2.2 The chi-square test of independence from tabled data
2.5.2.3 Fisher's exact test for small cell sizes
2.5.2.4 Relative risk and confidence interval for the RR
2.5.2.5 Odds ratios and 95% CI for the OR

Page 16
2.6 Nonparametric statistics for comparing medians of non-normal outcomes
2.6.1 Wilcoxon rank sum test for independent samples
2.6.2 Wilcoxon signed rank test for paired samples

Page 17
Section 3: Power and sample size calculations
3.1 Comparing means between groups
3.2 Comparing proportions between groups

Page 18
Section 4: Association between variables and multivariable methods to control for confounding
4.1 Simple correlation and regression
4.1.1 Scatterplots
4.1.2 Correlation
4.1.3 Simple regression analysis
4.1.4 Spearman's nonparametric correlation coefficient

Page 19
4.2 Multiple linear regression for a measurement outcome
4.2.1 Multiple regression analysis
4.2.2 Multiple regression with categorical predictors
4.2.3 Finding standardized regression coefficients in R

Page 20
4.3 Logistic regression for a Yes/No outcome

Page 21
Section 5: Survival analysis
5.1 Kaplan-Meier plots for one group
5.2 Kaplan-Meier plots and log-rank test for two groups
5.3 Cox's proportional hazards regression for survival data

Page 22
Section 6: p-value adjustment for multiple comparisons

Page 23
Section 7: User-defined functions in R

Content ©2016. All Rights Reserved.
Date last modified: August 2, 2016.
Wayne W. LaMorte, MD, PhD, MPH

Boston University School of Public Health