Basic Statistical Analysis Using the R Statistical Package
Contents
Page 1
Basic Statistical Analysis Using the R Statistical Package
Introduction
Page 2
1.2 The assignment operator and inputting a data vector into R
Page 3
1.3 Bringing data into R from an Excel file or a text file
1.3.1 Bringing data into R from an Excel file using the read.csv(file.choose()) command
1.3.2 (Optional) Bringing data into R from an Excel file using the read.csv() command
1.3.3 Accessing individual variables from an imported data set
The 'dataframename$variablename' convention
The attach( ) command
1.3.4 Viewing or editing a data frame using the R data editor
1.3.5 (Optional) Bringing data into R from a space-delimited text file
1.3.6 (Optional) Specifying the default folder for R
Page 4
1.4 Creating new variables in R
1.4.1 Calculating new variables
1.4.2 Creating categorical variables
Page 5
1.5 Saving an R dataframe as a .csv file
1.6 The help( ) and help.search( ) functions
Page 6
1.7 Finding means, medians and standard deviations
1.8 Finding frequencies and proportions for categorical variables
Page 7
1.9 Subgroup analyses: finding means and standard deviations for subgroups
Page 8
1.10 Handling missing data in R
Page 9
1.11 Graphing histograms and box plots
Page 10
1.12 Statistical tables in R
The standard normal (z) distribution
The t distribution
The chi-square distribution
Page 11
2.1 Confidence Intervals for a Single Group
2.1.1 Confidence interval for a mean
2.1.2 Confidence interval for a proportion
Confidence Intervals for Comparing Means
2.1.3 Confidence interval for a difference in means, independent samples
2.1.4 Confidence interval for a mean difference, paired samples
Confidence Intervals for Comparing Frequencies
2.1.5 Confidence interval for the difference in proportions, independent samples
2.1.6 Confidence interval for a risk ratio
2.1.6.1 Confidence interval for a RR from a per-subject data set
2.1.6.2 Inputting counts from a 2x2 table into R for calculation of a RR
2.1.7 Confidence interval for an odds ratio
Page 12
2.2 t-tests for means of measurement outcomes
2.2.1 The one-sample t-test for a mean
2.2.2 The independent samples t-test to compare two means
2.2.3 The paired samples t-test
Page 13
2.3 z-tests for proportions, categorical outcomes
2.3.1 One-sample z-test for a proportion
2.3.2 Two-sample z-test comparing two proportions
Page 14
2.4 One factor ANOVA comparing means across several groups
Page 15
2.5 Chi-square tests for categorical outcomes
2.5.1 The chi-square goodness-of-fit test for one sample
2.5.2 Contingency table analysis and the chi-square test of independence
2.5.2.1 The chi-square test of independence from per-subject data
2.5.2.2 The chi-square test of independence from tabled data
2.5.2.3 Fisher's exact test for small cell sizes
2.5.2.4 Relative risk and confidence interval for the RR
2.5.2.5 Odds ratios and 95% CI for the OR
Page 16
2.6 Nonparametric statistics for comparing medians of non-normal outcomes
2.6.1 Wilcoxon rank sum test for independent samples
2.6.2 Wilcoxon signed rank test for paired samples
Page 17
Section 3: Power and sample size calculations
3.1 Comparing means between groups
3.2 Comparing proportions between groups
Page 18
Section 4: Association between variables and multivariable methods to control for confounding
4.1 Simple correlation and regression
4.1.1 Scatterplots
4.1.2 Correlation
4.1.3 Simple regression analysis
4.1.4 Spearman's nonparametric correlation coefficient
Page 19
4.2 Multiple linear regression for a measurement outcome
4.2.1 Multiple regression analysis
4.2.2 Multiple regression with categorical predictors
4.2.3 Finding standardized regression coefficients in R
Page 20
4.3 Logistic regression for a Yes/No outcome
Page 21
Section 5: Survival Analysis
5.1 Kaplan-Meier plots for one group
5.2 Kaplan-Meier plots and log-rank test for two groups
5.3 Cox's proportional hazards regression for survival data
Page 22
Section 6: p-value adjustment for multiple comparisons
Page 23
Section 7: User-defined functions in R