# Introduction

The first step in solving problems in public health and making evidence-based decisions is to collect accurate data and to describe, summarize, and present it in such a way that it can be used to address problems. Information consists of *data elements* or *data points*, which represent the variables of interest. When dealing with public health problems, the *units of measurement* are most often individual people, although if we were studying differences in medical practice across the US, the subjects, or units of measurement, might be hospitals. A *population* consists of all subjects of interest, in contrast to a *sample*, which is a subset of the population of interest. It is generally not possible to gather information on all members of a population of interest. Instead, we select a sample from the population of interest, and generalizations about the population are based on the assumption that the sample is representative of the population from which it was drawn.
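The distinction between a population and a sample can be illustrated with a minimal Python sketch. The data are simulated and purely hypothetical (the variable, its mean of 120, and the sizes 1,000 and 50 are assumptions chosen for illustration), and simple random sampling is only one of several ways a sample might be drawn.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population: systolic blood pressure (mm Hg) for 1,000 people.
population = [random.gauss(120, 15) for _ in range(1000)]

# A simple random sample of 50 subjects, drawn without replacement.
sample = random.sample(population, k=50)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean, which is why generalizing from a sample depends on the sample being representative.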

# Learning Objectives

*After completing this module, the student will be able to:*

- Distinguish among dichotomous, ordinal, categorical, and continuous variables.
- Identify appropriate numerical and graphical summaries for each variable type.
- Compute a mean, median, standard deviation, quartiles, and range for a continuous variable.
- Construct a frequency distribution table for dichotomous, categorical, and ordinal variables.
- Give an example of when the mean is a better measure of central tendency (location) than the median.
- Interpret the standard deviation of a continuous variable.
- Generate and interpret a box plot for a continuous variable.
- Generate and interpret side-by-side box plots.
- Differentiate between a histogram and a bar chart.
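
As a preview of the numerical summaries named in the objectives above, the following sketch computes them with Python's standard `statistics` module. The ages are hypothetical values invented for illustration. Note that several conventions exist for computing quartiles; the one used here (the library's default "exclusive" method) is an assumption and may not match the hand method taught in this module.

```python
import statistics

# Hypothetical sample: ages (years) of nine study participants.
ages = [23, 25, 31, 34, 38, 42, 47, 51, 60]

mean = statistics.mean(ages)
median = statistics.median(ages)
sd = statistics.stdev(ages)  # sample standard deviation (n - 1 denominator)

# Three cut points dividing the data into four groups: Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(ages, n=4)

data_range = max(ages) - min(ages)

print(f"mean={mean}, median={median}, sd={sd:.2f}")
print(f"Q1={q1}, Q3={q3}, range={data_range}")
```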