# Introduction

The first step in solving public health problems and making evidence-based decisions is to collect accurate data and then to describe, summarize, and present the data so that they can be used to address those problems. Information consists of data elements, or data points, that represent the variables of interest. In public health, the units of measurement are most often individual people, but in a study of differences in medical practice across the US, the subjects, or units of measurement, might be hospitals. A population consists of all subjects of interest; a sample is a subset of that population. It is generally not possible to gather information on every member of a population. Instead, we select a sample, and generalizations about the population rest on the assumption that the sample is representative of the population from which it was drawn.
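The distinction between a population and a sample can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the population is simulated rather than real, the variable (systolic blood pressure), its mean and spread, and the sample size of 100 are all hypothetical, and simple random sampling is only one of several ways a sample might be drawn.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated systolic blood pressure (mm Hg)
# for 10,000 people. These values are generated, not real data.
population = [random.gauss(120, 15) for _ in range(10_000)]

# Simple random sample without replacement: every subject in the
# population has the same chance of being selected.
sample = random.sample(population, k=100)

# The sample mean serves as an estimate of the population mean.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population size: {len(population)}, sample size: {len(sample)}")
print(f"Population mean: {population_mean:.1f}")
print(f"Sample mean:     {sample_mean:.1f}")
```

In practice the population mean is unknown, which is why the sample has to stand in for it. The two values tend to be close here only because the sample was drawn at random; a sample drawn in a way that favors some subjects over others would not support the same generalization.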

# Learning Objectives

After completing this module, the student will be able to:

1. Distinguish among dichotomous, ordinal, categorical, and continuous variables.
2. Identify appropriate numerical and graphical summaries for each variable type.
3. Compute a mean, median, standard deviation, quartiles, and range for a continuous variable.
4. Construct a frequency distribution table for dichotomous, categorical, and ordinal variables.
5. Give an example of when the mean is a better measure of central tendency (location) than the median.
6. Interpret the standard deviation of a continuous variable.
7. Generate and interpret a box plot for a continuous variable.
8. Generate and interpret side-by-side box plots.
9. Differentiate between a histogram and a bar chart.