V. Evaluating Outreach Programs
This chapter addresses the evaluation of outreach programs. We begin with a review of evaluation procedures and designs currently used to evaluate outreach programs. Next, we suggest some essential principles to follow in evaluating student-centered and school-centered programs. Finally, we present a multi-level research agenda for addressing key questions about the effectiveness of programs.
A Review of Evaluations of Outreach Programs
We have noted that current evaluation data are of limited value in making policy decisions about future outreach strategies. In this section we present an analysis of current evaluations to illustrate some of the difficulties as well as some promising practices that illustrate how the difficulties might be overcome. To assess existing evaluation practices for outreach programs, we reviewed evaluations of an array of programs in California and in other states. While these programs use a wide array of approaches to evaluate outreach activities, a number of common features emerge.
One feature these studies share is that long-term outcomes are rarely measured. For the most part, the indicators in these studies are short-term and intermediate outcomes (drop-out rates, course-taking patterns, test scores) rather than rates of college attendance and graduation. Many programs target students in the middle grades, some even earlier. It may be four to six years before a student enters college and another four to six years before she completes college. Most programs do not have the resources to track students for this extended period of time.
Following students over an extended period is possible: an evaluation of AVID, for example, followed a group of students for four years to examine their rates of college attendance (Mehan et al., 1994). Significant evaluation resources are needed, however, to conduct these types of evaluations systematically. None of the programs reviewed had attempted to develop indicators for college completion and career attainment.
Most of the evaluations reviewed were not explicit about how the various program components and performance indicators fit together. One evaluation of the College Readiness Program examined the recommendations made by teachers for 9th grade college preparation course enrollment, comparing students in the program with a comparable group of students who did not participate. The study found that the students in the program were more likely to be recommended for placement in college preparatory courses, with better results in math than in English. Why or how did these changes come about? Why were results better in math? These questions were not asked. While answers to these types of questions are not likely to be definitive, hypothetical answers rooted in a theory of what may have prompted the changes are attainable. Programs, however, are often under pressure to show "results" (frequently, numbers of students or schools participating). Process outcomes, even unexplained ones, may be a greater priority than an examination of the underlying theory. Incentives, therefore, need to be developed to encourage programs to examine their practices more reflectively in their evaluations.
Another common feature of evaluations of outreach programs is that virtually none attempt to make systematic comparisons among program components. One reason for this may be the basic constraint of having only a single program to evaluate. Comparison within a single program is possible, but not easy. It would require comparing students given different combinations of program "treatments" within the same broad outreach intervention. One reason this is not done may be cost; more pervasive may be the attachment programs feel to their mix of services and approaches. Getting programs to compare school-centered and student-centered interventions, or middle school and high school focused programs, may be difficult. Comments from local program staff in the evaluations, however, do suggest that such comparisons are desirable.
None of the evaluations examined used random assignment to treatment and comparison groups. As is the case with many social policy interventions, random assignment may not be feasible for outreach program evaluation. The majority of evaluations used some type of constructed comparison group design. These designs were carefully and rigorously implemented, but suffer, to varying degrees, from the basic limitation of constructed groups: the influence of unobserved variables.
One evaluation that illustrates some of the difficulties in constructing comparison groups is Florida's evaluation of its College Reach Out Program (CROP). CROP is a statewide program that attempts to increase the number of "economically and academically disadvantaged youth" completing post-secondary education (PSEC, 1994, p. 2). CROP tries to meet this objective by strengthening the motivation and academic preparation of participating students. The program is run by local consortia of schools and secondary institutions awarded contracts through a competitive grant process. In 1993 there were 25 local projects serving 4,799 students in middle and high school grades (PSEC, 1993, 1994).
CROP has been evaluated in each of the last two years. A "random sample" of students from grades 6-12 in Florida's public schools was drawn, and the CROP participants in it were compared with the rest of the sample on a number of short- and long-term indicators. Examples of the indicators used were promotion to higher grades, college-prep course-taking patterns, and rates of college attendance. The CROP students performed better on nearly all of the indicators than the group not participating in the program.
As a research design the CROP evaluation is rigorous, its findings persuasive. However, as with most evaluations, some difficulties remain. An important point to note is that "random sampling" is not the same thing as random assignment. For random assignment conditions to be met, individuals who are candidates for a program must be randomly placed in either treatment or control groups, groups that can then be compared without bias. The CROP evaluation instead draws a random sample of students from the larger universe of Florida's secondary school attendees, and then compares outreach program participants and non-participants within that sample. This design is a modified version of the constructed group studies discussed earlier. Whatever its persuasive properties, the study is not free of the potential bias involved in all constructed group studies.
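The distinction between random sampling and random assignment can be illustrated with a small simulation. The sketch below is not drawn from the CROP evaluation or any study reviewed here: the function name `simulate`, the unobserved trait "motivation", the baseline probabilities, and the assumed program effect of 0.10 are all invented for illustration. It assumes that more motivated students are both more likely to volunteer for a program and more likely to attend college regardless of the program.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=0.10, seed=0):
    """Compare two evaluation designs on synthetic students.

    All numbers are hypothetical; nothing here comes from real program data.
    Returns (naive_difference, randomized_difference).
    """
    rng = random.Random(seed)
    # Unobserved trait (assumed): motivation, uniform on [0, 1].
    motivation = [rng.random() for _ in range(n)]

    def attends_college(m, in_program):
        # Baseline chance rises with motivation; the program adds true_effect.
        p = 0.2 + 0.5 * m + (true_effect if in_program else 0.0)
        return 1 if rng.random() < p else 0

    # Design A: random sample of students, voluntary participation.
    # The chance of joining equals the student's motivation.
    joined = [rng.random() < m for m in motivation]
    out_a = [attends_college(m, j) for m, j in zip(motivation, joined)]
    naive = (mean(o for o, j in zip(out_a, joined) if j)
             - mean(o for o, j in zip(out_a, joined) if not j))

    # Design B: random assignment, a coin flip independent of motivation.
    assigned = [rng.random() < 0.5 for _ in range(n)]
    out_b = [attends_college(m, a) for m, a in zip(motivation, assigned)]
    randomized = (mean(o for o, a in zip(out_b, assigned) if a)
                  - mean(o for o, a in zip(out_b, assigned) if not a))

    return naive, randomized


if __name__ == "__main__":
    naive, randomized = simulate()
    print(f"participant vs. non-participant gap, random sample: {naive:.3f}")
    print(f"treatment vs. control gap, random assignment:      {randomized:.3f}")
```

Under these assumptions the randomized comparison lands close to the assumed effect, while the comparison within a random sample is inflated by the unobserved difference in motivation. The direction of the bias depends on who participates: if participants were instead less prepared than non-participants, the same comparison would understate the effect.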
Does this bias matter for the formulation of outreach program policy? The counterfactual that CROP and most outreach program evaluations operate under is something akin to: what would have happened to the students if they had not participated in the program? The question is usually answered by comparing patterns of college preparation activity and college attendance between two groups of students: outreach program participants and non-participants. In most of the studies reviewed, comparison groups were constructed based on socio-economic criteria. Additionally, a selection criterion for many California programs is that program participants already have the "ability and preparation" to do college work (CRP, 1994). Comparing these students with others from similar economic or racial backgrounds, in cases where student participation is voluntary, may lead to constructed groups of students that are not necessarily comparable. Outreach program participants may come in with significantly greater preparation and motivation for college.
In the CROP evaluation, the group of program participants is not matched with a comparable demographic group. CROP students, on average, are poorer, have lower levels of family education, and have weaker grades and test scores before entering the program. A larger percentage of the students are African-American or Latino than in the comparison group. CROP students are selected for the program by the local consortia. Among the criteria for admission are: first generation college student, a GPA of 2.5 or below for the previous school year, no college preparation courses on transcript, and strict income criteria and poverty-level guidelines (PSEC, 1994).
Taken as a whole, the features of the students in the CROP program allow for the counterfactual to be reasonably evaluated. That these students score better after program participation than a comparison group of students who, absent CROP, would be predicted to do better on the indicators is persuasive evidence of CROP's impact. In this case, any potential biases are at least partially overcome by a careful evaluation design.