Effects of Sample Size on Effect Size in Systematic Reviews in Education
This study uses data from reviews of evaluations of elementary and secondary mathematics programs to explore the effects of sample size on effect size in program evaluations. There is evidence that studies with small sample sizes tend to have much larger positive effect sizes. The issue of large effect sizes in small studies matters given how the What Works Clearinghouse (WWC) rates programs. A program receives the highest WWC rating, “positive effects,” only if it has at least one study in which students, classes, or schools were randomly assigned to treatments and in which positive effects on important outcomes were statistically significant. A second matched or randomized study with significant positive effects is also required. Small-sample studies that obtain negative or non-significant effects are unlikely to be published or even to be accessible to reviewers in the form of technical reports. Because it takes a larger effect size to produce statistical significance in a small study than in a large study, large studies with either small or large positive effects are likely to be accessible, while small studies will likely be available only if their effect sizes are large. Sample sizes in the studies considered for this article ranged from 30 to about 40,000. The researchers found that effect sizes were higher for studies with smaller sample sizes than for studies with larger sample sizes, and that bias due to small sample size was much greater than bias due to lack of random assignment. Variability of effect sizes decreased with increasing sample size, suggesting that as sample sizes increase, effect sizes become more reliable and less likely to be a result of school, teacher, or class effects.
Although small studies may greatly overstate mean program effects, the evidence found in this study does not justify ignoring their results. It does suggest that, under similar conditions, the findings of large studies should be considered more conclusive evidence of a program's effects than the findings of small studies.
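The point that a small study needs a larger effect size to reach statistical significance can be illustrated with a rough calculation. The sketch below is not taken from the article: it assumes a simple two-group comparison with equal group sizes, student-level assignment with no clustering, a two-sided test at the 0.05 level, and a normal approximation to the test statistic. The function name min_significant_d is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def min_significant_d(total_n: int, alpha: float = 0.05) -> float:
    """Approximate smallest standardized mean difference (d) that would be
    statistically significant in a two-group study with total_n participants.

    Assumptions (illustrative only, not from the article): equal group
    sizes, no clustering, two-sided test, normal approximation.
    """
    # Critical z value for a two-sided test at the given alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    n_per_group = total_n / 2
    # Standard error of d is approximately sqrt(1/n1 + 1/n2).
    se = sqrt(1 / n_per_group + 1 / n_per_group)
    return z_crit * se


if __name__ == "__main__":
    # Sample sizes spanning the range reported in the summary.
    for n in (30, 100, 1000, 40000):
        print(f"N = {n:>6}: d must exceed about {min_significant_d(n):.3f}")
```

Under these assumptions, a study with 30 participants needs an effect size of roughly 0.72 to reach significance, while a study with 40,000 participants needs only about 0.02. With an exact t-test or with assignment at the class or school level, the thresholds for small studies would be somewhat higher still.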
Summary by: Michael Muzheve
Submitted on 2008-07-17