Appendix A
Comparison Group Evaluation Designs


Evaluation research has taken two, not indistinct, paths in addressing the complexities inherent in the research situation.  First, and most commonly known, evaluators use methods to further 'control' the research situation.  Control strategies begin with what researchers call the counterfactual (Hollister and Hill, 1995, p. 128).  A counterfactual is a question that asks what would have happened in the absence of the program initiative.  A possible counterfactual for outreach programs might ask "what would have happened to students if they didn't participate in the outreach program?"16  Once the counterfactual is established, the evaluator attempts to control the research setting to answer the question.  Controls are required to isolate the influence of program components and to remove other variables thought to influence the results.

Evaluators use two main types of controls.  In random assignment, "individuals or units that are potential candidates for the intervention are randomly assigned to be in the treatment group, which is subject to the intervention, or the control group, which is not subject to any special intervention" (Hollister and Hill, 134).  The advantage of random assignment is that, with even a moderate number of participants, the chances are statistically great that the two groups will have similar characteristics.  In research terms, this means that the evaluation does not suffer from selection bias.  When comparisons are made between the control and treatment groups, researchers and audiences can conclude with reasonable confidence that the difference is due to the program and not to other variables.
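Random assignment as described above can be sketched in a few lines of Python.  This is a minimal illustration, not a procedure taken from Hollister and Hill: the function name randomly_assign, the applicant pool, and the prior-GPA values are all hypothetical and exist only to show the mechanics.

```python
import random
from statistics import mean

def randomly_assign(candidates, seed=None):
    """Shuffle the candidates and split them into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = list(candidates)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical applicant pool; the GPA values are made-up illustration data.
pool_rng = random.Random(0)
students = [{"id": i, "gpa": round(pool_rng.uniform(1.5, 4.0), 2)} for i in range(200)]

treatment, control = randomly_assign(students, seed=1)

# With a moderate number of participants the group averages should be close.
print("treatment mean GPA:", round(mean(s["gpa"] for s in treatment), 2))
print("control mean GPA:  ", round(mean(s["gpa"] for s in control), 2))
```

Because chance alone decides who lands in each group, any characteristic of the students, observed or not, is expected to be balanced across the two groups as the number of participants grows.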

Any enthusiasm for random assignment must be weighed against the difficulty of designing research according to this standard.  First, there are always ethical issues in determining who gets the treatment and who is made a control.  A further problem is that many outreach programs are voluntary, with students selecting themselves into programs.  Random assignment in these cases is difficult unless there is greater demand for programs than supply.  Where demand does exceed supply, a "waiting list" approach can be used, with participants forming the treatment group and those waiting providing the control.  Alternatively, students can be randomly assigned to different programs with different programmatic emphases.  In this case, the two programs can be compared with a reasonable assumption that the observed outcomes can be attributed to the respective programs.17

More commonly used for controls are what Hollister and Hill call constructed comparison groups (135).  Constructed comparisons attempt to replicate the conditions of random assignment by carefully matching a treatment and a control group to isolate the impact of the program.  Three types of constructed comparison groups are used.  The first is the before-and-after design, in which the treatment group is compared to itself prior to program participation.  A second type of constructed group compares program participants and non-participants.  These groups are selected to be as similar as possible to each other, with the exception that one group has received the treatment and the other has not.  In school-based projects, the control group can be selected from the same school or from different schools within a district or state.  In the latter case, school, district, and state characteristics are also controlled for.  A third type of constructed group can be created by using survey data.  As above, the characteristics of the control group are carefully matched to the treatment group to avoid bias (Hollister and Hill, 135-138).
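The matching step for a participant versus non-participant comparison might look like the sketch below.  It uses greedy one-to-one nearest-neighbor matching, which is one common technique and is swapped in here for illustration; the source does not say which matching method evaluators used.  The function match_controls, the student records, and the choice of covariates (GPA and family income in thousands of dollars) are hypothetical.

```python
def match_controls(participants, non_participants, keys):
    """Greedy one-to-one nearest-neighbor matching without replacement.

    Distance is the squared difference summed over the covariates in `keys`.
    A real evaluation would normally standardize covariates first so that no
    single variable (here, income) dominates the distance.
    """
    available = list(non_participants)
    pairs = []
    for p in participants:
        if not available:
            break  # no unmatched non-participants left
        best = min(available, key=lambda c: sum((p[k] - c[k]) ** 2 for k in keys))
        available.remove(best)  # each non-participant is used at most once
        pairs.append((p, best))
    return pairs

# Made-up records for illustration only.
participants = [
    {"id": "P1", "gpa": 3.1, "family_income": 28},
    {"id": "P2", "gpa": 2.4, "family_income": 41},
]
non_participants = [
    {"id": "N1", "gpa": 2.5, "family_income": 40},
    {"id": "N2", "gpa": 3.8, "family_income": 75},
    {"id": "N3", "gpa": 3.0, "family_income": 30},
]

pairs = match_controls(participants, non_participants, keys=["gpa", "family_income"])
matched_ids = [(p["id"], c["id"]) for p, c in pairs]
print(matched_ids)
```

Matching of this kind can only balance the characteristics that were measured; anything left out of the covariate list remains uncontrolled.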

The three types of constructed comparison groups each suffer from some difficulties and biases.  The general consensus in the evaluation literature is to let the research situation dictate which of the three may be better suited to a particular evaluation.  Before-and-after designs, by necessity, assume that any observed change in individuals is a result of the program or intervention.  This may not be a tenable assumption in all cases.

Comparisons between participants and non-participants, or between groups constructed from surveys and databases, are only as good as the match made between the two groups.  Even if the groups are well matched, it is difficult to control for unobserved variables.  Within outreach programs, one such unobserved variable is student motivation.  Are participants more motivated to attend college than non-participants?  Might this affect how they score on the various performance indicators?18  The general principle is that without random assignment, the selection in and out of comparison groups may bias the study.  In most cases, this bias will overestimate the effects of the program relative to non-program factors.
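The overestimate produced by self-selection can be illustrated with a small simulation.  Everything in it is an assumption made for the sake of the example: the true program effect (TRUE_EFFECT), the weight given to motivation, and the rule that students with above-average motivation enroll.  It is a sketch of the general principle, not a model of any actual outreach program.

```python
import random
from statistics import mean

TRUE_EFFECT = 1.0  # assumed program effect on the outcome (illustrative value)

def estimate_effect(n, self_selected, seed):
    """Return mean(participant outcomes) - mean(non-participant outcomes)."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)          # unobserved by the evaluator
        if self_selected:
            participates = motivation > 0     # motivated students enroll themselves
        else:
            participates = rng.random() < 0.5 # random assignment
        # Outcome depends on motivation whether or not the student participates.
        outcome = 2.0 * motivation + rng.gauss(0, 1)
        if participates:
            treated.append(outcome + TRUE_EFFECT)
        else:
            untreated.append(outcome)
    return mean(treated) - mean(untreated)

naive_estimate = estimate_effect(5000, self_selected=True, seed=42)
randomized_estimate = estimate_effect(5000, self_selected=False, seed=42)

print("true effect:             ", TRUE_EFFECT)
print("self-selected comparison:", round(naive_estimate, 2))
print("randomized comparison:   ", round(randomized_estimate, 2))
```

Under these assumptions the self-selected comparison credits the program with the advantage that motivated students would have had anyway, while the randomized comparison stays close to the assumed effect.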
