We calculated fail-safe numbers indicating how many missing studies with an effect size of 0 would have to be published to reduce the overall effect sizes of 0.47 for examination performance and 1.95 for failure rate to preset levels that would be considered small or moderate (in this case, 0.20 and 1.1, respectively). The fail-safe numbers were high: 114 studies on examination performance and 438 studies on failure rate.
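To make the idea of a fail-safe number concrete, the sketch below uses Orwin's unweighted formula, N = k (d_mean - d_crit) / (d_crit - d_missing). This is an assumption: the text above does not state which fail-safe procedure was used, and a weighted method would give different values, so the sketch is not expected to reproduce the reported 114 and 438. The function name and the study count `k = 100` are hypothetical and chosen only for illustration.

```python
def orwin_fail_safe_n(k, mean_effect, criterion, missing_effect=0.0):
    """Orwin's fail-safe N (unweighted; assumed here, not taken from the paper).

    k              -- number of studies in the meta-analysis
    mean_effect    -- observed mean effect size across those k studies
    criterion      -- preset level the mean effect would have to drop to
    missing_effect -- assumed mean effect size of the unpublished studies
    """
    if not (missing_effect < criterion < mean_effect):
        raise ValueError("expected missing_effect < criterion < mean_effect")
    # Solve (k * mean_effect + N * missing_effect) / (k + N) = criterion for N.
    return k * (mean_effect - criterion) / (criterion - missing_effect)


if __name__ == "__main__":
    k = 100  # hypothetical number of studies, NOT a value from the paper
    n_missing = orwin_fail_safe_n(k, mean_effect=0.47, criterion=0.20)
    print(f"Null-result studies needed to pull 0.47 down to 0.20: {n_missing:.0f}")
```

An effect expressed as a ratio (such as the 1.95 for failure rate) would normally be log-transformed before applying a formula of this kind, so that a null effect corresponds to 0.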

This is the largest and most comprehensive metaanalysis of undergraduate STEM education published to date.

The results raise questions about the continued use of traditional lecturing as a control in research studies, and support active learning as the preferred, empirically validated teaching practice in regular classrooms.

Thus, the overall effect size for examination data appears robust to variation in the methodological rigor of published studies.

The data reported here indicate that active learning increases examination performance by just under half a SD and that lecturing increases failure rates by 55%.
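One way to read an effect of "just under half a SD" is as a percentile shift. The sketch below assumes examination scores are normally distributed and that active learning shifts every score up by the effect size without changing the spread. Both are simplifying assumptions for illustration, not claims made in the text above. The function name is hypothetical.

```python
from statistics import NormalDist


def shifted_percentile(effect_size_sd, start_percentile=50.0):
    """Percentile (in the original distribution) reached by a student who
    starts at start_percentile and improves by effect_size_sd standard
    deviations, assuming normally distributed scores."""
    std_normal = NormalDist()  # mean 0, SD 1
    z_start = std_normal.inv_cdf(start_percentile / 100.0)
    return 100.0 * std_normal.cdf(z_start + effect_size_sd)


if __name__ == "__main__":
    # 0.47 is the overall examination effect size quoted in the text.
    print(f"{shifted_percentile(0.47):.1f}")
```

Under these assumptions, a student at the median of a lectured class would score at roughly the 68th percentile of that class after a 0.47 SD gain.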

The heterogeneity analyses indicate that active learning is particularly beneficial in small classes and at increasing performance on concept inventories.

