Computation and Analysis of Effect Sizes

The effect size calculated is g: the difference between the means of the intervention group and the control group, or between the pretest and posttest

group means, divided by the pooled standard deviation. The sign of the difference

was positive when a treatment had a positive effect (thus, treatments that reduced

learning pathologies such as anxiety, surface approaches, and negative attitudes

were coded as positive effects). The gs were converted to ds by correcting them

for bias (as the gs overestimate the population effect size, particularly in small

samples; see Hedges & Olkin, 1985). To determine whether each set of ds shared

a common effect size (i.e., was consistent across the studies), we calculated a

homogeneity statistic Qw, which has an approximate chi-square distribution with

k - 1 degrees of freedom, where k is the number of effect sizes (Hedges & Olkin,

1985). Given the large number of effect sizes that are combined into the various

categories, and the sensitivity of the chi-square statistic to this number, it is not

surprising that nearly all homogeneity statistics are significant. As the most

critical comparisons are presented in interaction tables between at least two

variables, we are more confident that these means are sufficiently homogeneous

to use the means as reasonable estimates of the typical value.
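The computation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard fixed-effects formulas from Hedges and Olkin (1985) for a two-group design: the small-sample correction is approximated as 1 - 3/(4m - 1) with m = n_e + n_c - 2, and each d is weighted by the reciprocal of its estimated variance. The function names and the example numbers are hypothetical and are not taken from the studies analyzed here.

```python
import math


def pooled_sd(sd_e, n_e, sd_c, n_c):
    """Pooled standard deviation of the intervention (e) and control (c) groups."""
    return math.sqrt(((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_e + n_c - 2))


def effect_size_g(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Uncorrected effect size g: mean difference divided by the pooled SD."""
    return (mean_e - mean_c) / pooled_sd(sd_e, n_e, sd_c, n_c)


def corrected_d(g, n_e, n_c):
    """Convert g to d using the approximate small-sample bias correction."""
    m = n_e + n_c - 2
    return (1 - 3 / (4 * m - 1)) * g


def variance_d(d, n_e, n_c):
    """Large-sample estimate of the sampling variance of d."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))


def homogeneity_q(ds, variances):
    """Return (Q, df): weighted squared deviations from the weighted mean, and k - 1."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, ds))
    return q, len(ds) - 1


if __name__ == "__main__":
    # Hypothetical study: posttest means 105 vs. 100, SD 10 in both groups, n = 20 each.
    g = effect_size_g(105.0, 100.0, 10.0, 10.0, 20, 20)
    d = corrected_d(g, 20, 20)
    print("g =", g, "d =", d, "var(d) =", variance_d(d, 20, 20))

    # Hypothetical set of three corrected effect sizes with their variances.
    q, df = homogeneity_q([0.2, 0.5, 0.9], [0.04, 0.05, 0.03])
    print("Q =", q, "df =", df)
```

Q would then be compared with a chi-square critical value on k - 1 degrees of freedom; because each term is weighted by the reciprocal of a variance that shrinks with sample size, even modest differences among effect sizes can produce a significant Q when the set is large.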

We then used categorical models to determine the relation between the study


Hattie, Biggs, and Purdie

characteristics and the magnitude of the effect sizes, using the procedures outlined

by Hedges and Olkin (1985). These models provide a between-classes effect

(analogous to a main effect in an ANOVA design) and a test of homogeneity of

the effect sizes within each class. The between-classes effect is estimated by QB,

which has an approximate chi-square distribution with p - 1 degrees of freedom,

where p is the number of classes. The statistical significance of this between-classes

effect can be used to determine whether the average effect size differs over

classes. The tables reporting tests of categorical models also include the mean

weighted effect size for each class, calculated with each effect size weighted by

the reciprocal of its variance, and the 95% confidence interval of this mean. If this

confidence interval does not include zero, then the mean weighted effect size can

be considered significantly different from zero.
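The categorical model can be illustrated with a short Python sketch. It assumes the usual fixed-effects partition from Hedges and Olkin (1985), in which the total homogeneity statistic splits into a between-classes part and the sum of the within-class parts, and it uses the normal approximation (1.96 standard errors) for the 95% confidence interval. The function names, the class labels, and the data are hypothetical.

```python
import math


def weighted_mean(ds, variances):
    """Return the inverse-variance weighted mean and the total weight."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    mean = sum(w * d for w, d in zip(weights, ds)) / total_w
    return mean, total_w


def class_summary(ds, variances):
    """Weighted mean, 95% confidence interval, and within-class Q for one class."""
    mean, total_w = weighted_mean(ds, variances)
    half_width = 1.96 * math.sqrt(1.0 / total_w)  # variance of the mean is 1 / total weight
    q_within = sum((d - mean) ** 2 / v for d, v in zip(ds, variances))
    return {
        "mean": mean,
        "ci": (mean - half_width, mean + half_width),
        "q_within": q_within,
        "df": len(ds) - 1,
        "weight": total_w,
    }


def categorical_model(classes):
    """classes maps a label to a list of (d, variance) pairs.

    Returns the per-class summaries, the between-classes Q, and its df (p - 1).
    """
    summaries = {
        label: class_summary([d for d, _ in pairs], [v for _, v in pairs])
        for label, pairs in classes.items()
    }
    all_pairs = [pair for pairs in classes.values() for pair in pairs]
    grand_mean, _ = weighted_mean([d for d, _ in all_pairs], [v for _, v in all_pairs])
    q_between = sum(s["weight"] * (s["mean"] - grand_mean) ** 2 for s in summaries.values())
    return summaries, q_between, len(classes) - 1


if __name__ == "__main__":
    # Hypothetical effect sizes (d, variance) grouped by a study characteristic.
    example = {
        "class A": [(0.2, 0.04), (0.4, 0.05)],
        "class B": [(0.8, 0.02), (0.6, 0.03)],
    }
    summaries, q_between, df = categorical_model(example)
    for label, s in summaries.items():
        significant = s["ci"][0] > 0 or s["ci"][1] < 0  # interval excludes zero
        print(label, s["mean"], s["ci"], "differs from zero:", significant)
    print("Q between =", q_between, "df =", df)
```

In this sketch a class mean is flagged as different from zero exactly when its confidence interval excludes zero, which mirrors the decision rule stated in the text.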