We will now dive into a handful of research designs in greater detail, exploring their strengths and weaknesses.
We start with a selection of experimental designs, which use randomization to allow comparison of the intervention group(s) with equivalent group(s) not exposed to the intervention.
Randomization is the key differentiator between experimental designs and quasi-experimental or observational designs.
But what does randomization provide?
Random assignment reduces the possibility that the exposed group(s) and the controls (those not receiving the intervention) differ systematically in ways that could affect the estimated program effect.
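A small simulation can make this concrete. The sketch below is illustrative only: the participant ages, the pool size, and the helper name `randomize` are assumptions made for the example, not part of any particular evaluation. It randomly assigns a made-up pool of participants to two groups and compares one baseline characteristic across them.

```python
import random
import statistics

def randomize(participants, seed=None):
    """Randomly split participants into two groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = participants[:]      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical baseline ages for 1,000 participants (made-up data).
data_rng = random.Random(0)
ages = [data_rng.gauss(45, 12) for _ in range(1000)]

intervention, control = randomize(ages, seed=1)
gap = statistics.mean(intervention) - statistics.mean(control)

print(f"Mean age, intervention: {statistics.mean(intervention):.1f}")
print(f"Mean age, control:      {statistics.mean(control):.1f}")
print(f"Difference:             {gap:.2f}")
```

With a pool this large, the difference in mean age should be small relative to the spread of ages, and the same logic applies to characteristics you did not measure. With small groups, chance imbalances are still possible.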
Pretest-posttest randomized control group
When it comes to internal validity, experimental designs are often viewed as the ideal, with the "most" ideal being the pretest-posttest randomized control group design. The structure of this design is outlined to the right:
- R indicates randomization occurred within that particular group.
- X indicates exposure. So in this case, only one group is the exposed group.
- O indicates observation points where data are collected. Here we see that both groups had data collected at the same time points: before and after the exposure period.
The absence of a dashed line between the exposed group and the control group indicates that this is a true control group, created through randomization.
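Because both groups are observed before and after the exposure period, one common way to summarize the program effect is to compare the average change in the exposed group with the average change in the control group. The sketch below assumes paired pretest and posttest scores for each participant; the scores and function names are hypothetical, and a real analysis would also include a significance test.

```python
import statistics

def mean_change(pre, post):
    """Average within-person change from pretest to posttest."""
    return statistics.mean([after - before for before, after in zip(pre, post)])

def difference_in_change(exposed_pre, exposed_post, control_pre, control_post):
    """Change in the exposed group minus change in the control group."""
    return mean_change(exposed_pre, exposed_post) - mean_change(control_pre, control_post)

# Made-up scores for four participants per group.
exposed_pre = [10, 12, 11, 13]
exposed_post = [15, 16, 14, 18]
control_pre = [11, 12, 10, 13]
control_post = [12, 13, 11, 13]

effect = difference_in_change(exposed_pre, exposed_post, control_pre, control_post)
print(f"Exposed change: {mean_change(exposed_pre, exposed_post)}")
print(f"Control change: {mean_change(control_pre, control_post)}")
print(f"Difference in change: {effect}")
```

Subtracting the control group's change removes whatever would have happened without the program, which is why this design handles threats such as history and maturation.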
When executed correctly, this design eliminates all threats to internal validity except attrition. However, such a design is rarely feasible (because of cost, time constraints, and similar limits) and could be unethical in public health evaluation practice.
Fortunately, there are other designs that can help to reduce internal and external threats to validity and still help answer evaluation questions, such as:
- Experimental Designs
- Observational Designs
- Quasi-experimental Designs
In general, the three types of design vary in how well they reduce threats to internal validity. How would you rank them in terms of their level of internal validity?
Another experimental design is the Posttest-Only Control Group Design. A few activities are included below to help familiarize you with this study design.
Using the R, X, O notation introduced for the prior design, write out how the Posttest-Only Control Group Design is displayed.
Note, however, that the Posttest-Only Control Group Design lacks a pretest. As such, you cannot explicitly verify that randomization was effective (that is, that the groups were equivalent at baseline).
Another drawback of the Posttest-Only Control Group Design is that you cannot measure the change within your groups. The absence of the pretest means you can only measure the difference in outcomes between the intervention and control groups.
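To illustrate what this design does let you measure, the sketch below compares posttest outcomes between the two groups. The scores are made up, the function name is hypothetical, and the choice of Welch's t statistic is an assumption for the example; your own analysis plan may call for a different test.

```python
import math
import statistics

def posttest_comparison(intervention_post, control_post):
    """Return the difference in posttest means and Welch's t statistic."""
    diff = statistics.mean(intervention_post) - statistics.mean(control_post)
    # Standard error of the difference, allowing unequal group variances.
    se = math.sqrt(
        statistics.variance(intervention_post) / len(intervention_post)
        + statistics.variance(control_post) / len(control_post)
    )
    return diff, diff / se

# Made-up posttest scores; there are no pretest scores in this design.
intervention_post = [7, 9, 8, 10, 6]
control_post = [5, 6, 7, 4, 8]

diff, t_stat = posttest_comparison(intervention_post, control_post)
print(f"Difference in posttest means: {diff}")
print(f"Welch t statistic: {t_stat:.2f}")
```

Note that nothing here describes how much either group changed; only the between-group difference at posttest is available.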
Observational Research Designs
Sometimes it is not possible to randomize and conduct a more resource-intensive experimental design. There are other options, each with its own threats to internal validity that the evaluator should consider, including the following selection of observational designs.
One-Group Posttest-Only Design
This design has no comparison group and utilizes only a posttest to see program effects. In the diagram below, X is the exposure and O1 represents the observation.
Major threats to internal validity with this design are history and maturation.
One-Group Pretest-Posttest Design
Again, this design does not include a comparison group. It also utilizes the pretest-posttest design, allowing you to measure the change in the intervention group from before to after the program.
Maturation and history are still major threats with this design, due to the lack of a comparison group. The addition of the pretest introduces some additional threats, specifically testing, instrumentation, and regression.
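The regression threat can be hard to picture, so here is a small simulation, assuming that "regression" refers to regression to the mean. Everything in it is hypothetical: the score distribution, the amount of measurement noise, and the enrollment cutoff of 70. No program is delivered in the simulation, yet participants enrolled for their extreme pretest scores still appear to change.

```python
import random
import statistics

rng = random.Random(42)

# Each simulated person has a stable "true" score; every measurement adds
# independent random noise. There is no intervention between the two tests.
true_scores = [rng.gauss(50, 10) for _ in range(10_000)]
pretest = [t + rng.gauss(0, 10) for t in true_scores]
posttest = [t + rng.gauss(0, 10) for t in true_scores]

# Enroll only people whose pretest score is extreme (assumed cutoff: 70).
selected = [i for i, score in enumerate(pretest) if score > 70]

pre_mean = statistics.mean([pretest[i] for i in selected])
post_mean = statistics.mean([posttest[i] for i in selected])

print(f"Enrolled participants: {len(selected)}")
print(f"Mean pretest score:  {pre_mean:.1f}")
print(f"Mean posttest score: {post_mean:.1f}")
```

The enrolled group's posttest mean falls back toward the population average even though nothing was done. In a one-group design, a shift like this could be mistaken for a program effect.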
Posttest-Only Comparison Group Design
This design includes a nonequivalent comparison group (indicated by the dashed line). The groups are not randomly assigned, but typically researchers will make efforts to minimize differences between the two groups as much as possible.
The inclusion of a comparison group, even one that is not randomized, helps to rule out some internal threats to validity including history and maturation.
However, the inclusion of a nonequivalent comparison group adds the threats of selection and differential attrition, as well as interactions with selection. Even if the researchers make every effort to make the intervention and comparison groups equivalent, the groups may still differ in baseline characteristics.
The absence of a pretest also eliminates some internal threats to validity. Based on what you have learned so far, what internal threats can be ruled out?
Scenario: Post Operative Counseling Program
Here is a situation for you to consider: There is a new post-operative counseling program for knee replacement patients at BU medical aimed at reducing readmissions. You received the data from a six-month follow-up survey. The intervention counseling was given to all patients coming through the hospital. There is no control group.
- Name and write out (with X, O and R as needed) the design.
- Name the two major threats to internal validity.
Quasi-experimental Study Designs
Quasi-experimental studies provide an alternative to experimental and observational designs. They fall somewhere in the middle of the internal validity spectrum, providing less internal validity than an experimental design but more than an observational study. Quasi-experimental studies can be a useful way to obtain some of the benefits of an experimental design when an experimental design is not feasible or ethical.
Single time series design
This design is similar to the one group pretest-posttest design, with the addition of multiple observations before and after the program.
These additional observation points help control for more threats to internal validity than the one-group pretest-posttest design does. The multiple observation points help control for maturation, testing, and regression, because these tend to diminish or level off over time. History threats are also partially controlled. Instrumentation and attrition could still be issues, depending on whether there are changes to the testing tool or differential patterns of loss to follow-up at the various time points, respectively.
Multiple time series design
A relative of the single time series design, the multiple time series design adds a comparison group. This group appears below the dashed line, which indicates that it is a nonequivalent comparison group, that is, one that was not randomly assigned.
The addition of this comparison group helps to control for the history threat that was only partially controlled in the single time series design. However, the inclusion of a comparison group introduces the possibility of a selection threat. The better matched the groups are, the less likely selection or attrition will be threats to internal validity.
Recurrent Institutional Cycle Design
When programs are repeated in cycles, such as ongoing programs where individuals and groups come through the program in a repetitive manner, it is possible to set up this type of design. An example would be a 1-month alcohol education program for college students that can take only 20 students at a time. The design combines several different designs that together can control for many threats to internal validity.
In your analyses, assuming you are seeking improvement in your scores, if O2 > O1, O3 > O2 and O3 > O4 this supports that the impacts are due to the program rather than other factors.
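The pattern check described above can be written as a short function. This is only a sketch: the function name is hypothetical, the four arguments are assumed to be mean scores at the observation points labeled O1 through O4 in the design diagram, and the numbers are invented. In practice you would test each comparison statistically rather than compare raw means.

```python
def supports_program_effect(o1, o2, o3, o4):
    """Return True when O2 > O1, O3 > O2 and O3 > O4 all hold.

    Assumes higher scores mean improvement, as in the text above.
    """
    return o2 > o1 and o3 > o2 and o3 > o4

# Invented mean scores at the four observation points.
print(supports_program_effect(60, 70, 80, 65))  # all three comparisons hold
print(supports_program_effect(60, 70, 68, 65))  # O3 > O2 fails
```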
While this design is useful for controlling internal validity threats, it does not control for maturation, regression, or attrition. Because a comparison group is present, selection is also a potential threat, as are interactions with selection.
Nonequivalent comparison group design
This design is similar to the one-group pretest-posttest design, except that it adds a comparison group that is not randomized.
Review the following list of threats and determine whether each one is a likely threat, may be a threat, or is not a threat given the nonequivalent comparison group design.
Types of threats: Selection, Attrition, Interaction of Selection and Maturation, Testing, Instrumentation, Regression, Interaction of Selection and History, Interaction of Selection and Instrumentation, History, and Maturation
Summary of Evaluation Designs
These designs are just a selection of the potential designs you could utilize for your evaluation. A good understanding of these and the various threats to internal and external validity that come along with each will help you to choose a design that fits your needs and your resources. No design is perfect, but knowing upfront what some of your challenges are may help you better prepare for your analyses at the end of your study.
Another consideration with your design is the statistical power you will need in order to detect an effect. If the number of participants in your group (or groups) is too small, the statistical test you are using may not have enough power to find an effect if one truly exists. This error, where the statistical test indicates the program has no effect when in fact there is an effect, is called a Type II error. As such, it is recommended that you calculate the necessary sample size for your design and the statistical tests you intend to use in your analysis, either on your own or in consultation with a statistician, to make sure your design is sufficiently powered.
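As a rough illustration of a sample size calculation, the sketch below uses the standard normal-approximation formula for a two-sided comparison of two independent group means, n per group = 2 × ((z for 1 − alpha/2 + z for power) × sigma / delta)². It assumes equal group sizes and a common standard deviation. The function name and the example values are hypothetical, and other designs and tests require other formulas.

```python
import math
from statistics import NormalDist

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a mean difference.

    delta: smallest difference in means you want to be able to detect
    sigma: assumed common standard deviation of the outcome
    alpha: two-sided Type I error rate
    power: 1 minus the Type II error rate
    """
    if delta == 0:
        raise ValueError("delta must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole participant

# Hypothetical planning values: detect a 5-point difference when SD is 10.
print(sample_size_per_group(delta=5, sigma=10))
# Halving the detectable difference roughly quadruples the required sample.
print(sample_size_per_group(delta=2.5, sigma=10))
```

Because this is an approximation, treat the result as a starting point and allow for expected attrition when setting recruitment targets.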
Below is an excerpt from a Health Bucks report regarding one of the methods that was used in the evaluation.
"For each year of the study period, we compared differences in mean daily EBT revenue among markets that participated in the Health Bucks program and those that did not."
Consider the Health Bucks example. Given the design type, review the following list of threats and determine whether each one is a likely threat, may be a threat, or is not a threat. Note that threats can be external or internal.
Types of threats: Selection, Attrition, Testing, Instrumentation, Regression, History, Maturation, Multiple Treatment Threats, Interaction of Selection and Maturation, Interaction of Selection and Treatment, Interaction of Selection and Instrumentation, Interaction of Selection and History, Interaction of Testing and Treatment, Interaction of Setting/History and Treatment