Analysis of Covariance
Analysis of covariance (ANCOVA) is a general linear model that blends ANOVA and regression.
ANCOVA evaluates whether the means of a dependent variable (DV) are equal across levels of one or
more categorical independent variables (IV) and across one or more continuous variables. For example, the
categorical variable(s) might describe treatment and the continuous variable(s) might be covariates (CV) or
nuisance variables; or vice versa. Mathematically, ANCOVA decomposes the variance in the DV into
variance explained by the CV(s), variance explained by the categorical IV, and residual variance.
Intuitively, ANCOVA can be thought of as 'adjusting' the DV by the group means of the CV(s).[1]
The ANCOVA model assumes a linear relationship between the response (DV) and covariate (CV):

$$y_{ij} = \mu + \tau_i + B(x_{ij} - \bar{x}) + \epsilon_{ij}$$

In this equation, the DV, $y_{ij}$, is the jth observation under the ith categorical group; the CV, $x_{ij}$, is the jth
observation of the covariate under the ith group. Variables in the model that are derived from the observed
data are $\mu$ (the grand mean) and $\bar{x}$ (the global mean for covariate $x$). The variables to be fitted are $\tau_i$ (the
effect of the ith level of the categorical IV), $B$ (the slope of the line) and $\epsilon_{ij}$ (the associated unobserved
error term for the jth observation in the ith group).
Under this specification, the categorical treatment effects sum to zero $\left(\sum_{i} \tau_i = 0\right)$. The standard
assumptions of the linear regression model are also assumed to hold, as discussed below.[2]
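As an illustration of the model, the following is a minimal sketch of a least-squares fit for one categorical IV and one CV, assuming a balanced design (equal group sizes). The function name `fit_ancova`, the group labels, and the data are hypothetical and chosen only for the example. The slope is estimated from the pooled within-group sums of squares and cross-products, and each treatment effect is the group's deviation from the grand mean after removing the part predicted by the covariate.

```python
from statistics import mean

def fit_ancova(groups):
    """Fit a one-way, one-covariate ANCOVA to a balanced design.

    groups maps a group label to a list of (covariate, response) pairs.
    Returns (grand_mean, treatment_effects, slope).
    """
    all_x = [x for obs in groups.values() for x, _ in obs]
    all_y = [y for obs in groups.values() for _, y in obs]
    x_bar, y_bar = mean(all_x), mean(all_y)

    # Pooled within-group sum of squares (covariate) and cross-products.
    sxx = sxy = 0.0
    centroids = {}
    for label, obs in groups.items():
        gx = mean(x for x, _ in obs)
        gy = mean(y for _, y in obs)
        centroids[label] = (gx, gy)
        sxx += sum((x - gx) ** 2 for x, _ in obs)
        sxy += sum((x - gx) * (y - gy) for x, y in obs)

    slope = sxy / sxx
    # Treatment effect: group deviation from the grand mean,
    # minus the part explained by the group's covariate mean.
    effects = {
        label: (gy - y_bar) - slope * (gx - x_bar)
        for label, (gx, gy) in centroids.items()
    }
    return y_bar, effects, slope

# Hypothetical data: (covariate, response) pairs, four observations per group.
data = {
    "control": [(1, 3), (2, 5), (3, 6), (4, 8)],
    "drug_a": [(2, 6), (3, 8), (4, 9), (5, 12)],
    "drug_b": [(3, 10), (4, 11), (5, 14), (6, 15)],
}
grand_mean, effects, slope = fit_ancova(data)
print(grand_mean, slope, effects)
```

In a balanced design the estimated treatment effects returned by this sketch sum to zero, matching the constraint stated above. For unbalanced data a general least-squares routine would be needed instead.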
Uses
Increase power
ANCOVA can be used to increase statistical power (the probability a significant difference is found
between groups when one exists) by reducing the within-group error variance.[3] In order to understand
this, it is necessary to understand the test used to evaluate differences between groups, the F-test. The F-test
is computed by dividing the explained variance between groups (e.g., medical recovery differences) by the
unexplained variance within the groups. Thus,

$$F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$$

If this value is larger than a critical value, we conclude that there is a significant difference between groups.
Unexplained variance includes error variance (e.g., individual differences), as well as the influence of other
factors. Therefore, the influence of CVs is grouped in the denominator. When we control for the effect of
CVs on the DV, we remove it from the denominator, making F larger and thereby increasing our power to
find a significant effect if one exists.
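The effect on the F ratio can be illustrated with a small sketch that computes the group F statistic with and without the covariate. The helper names (`ss`, `sp`, `anova_f`, `ancova_f`) and the data are hypothetical; a single covariate is assumed, and the covariate means are deliberately equal across groups, as one would expect on average under random assignment.

```python
from statistics import mean

def ss(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def sp(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def split(groups):
    """Separate each group's (x, y) pairs into x lists and y lists."""
    return ([[x for x, _ in g] for g in groups],
            [[y for _, y in g] for g in groups])

def anova_f(groups):
    """F for the group effect, ignoring the covariate."""
    _, ys = split(groups)
    n, a = sum(len(g) for g in ys), len(ys)
    ss_within = sum(ss(g) for g in ys)
    ss_total = ss([y for g in ys for y in g])
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))

def ancova_f(groups):
    """F for the group effect after adjusting for one covariate."""
    xs, ys = split(groups)
    n, a = sum(len(g) for g in ys), len(ys)
    # Full model: separate intercepts, one common (pooled) slope.
    sxx_w = sum(ss(g) for g in xs)
    syy_w = sum(ss(g) for g in ys)
    sxy_w = sum(sp(gx, gy) for gx, gy in zip(xs, ys))
    sse_full = syy_w - sxy_w ** 2 / sxx_w
    # Reduced model: a single regression line, no group effect.
    all_x = [x for g in xs for x in g]
    all_y = [y for g in ys for y in g]
    sse_reduced = ss(all_y) - sp(all_x, all_y) ** 2 / ss(all_x)
    return ((sse_reduced - sse_full) / (a - 1)) / (sse_full / (n - a - 1))

# Hypothetical data: (covariate, response) pairs for two groups.
groups = [
    [(1, 3), (2, 5), (3, 6), (4, 8)],    # control
    [(1, 5), (2, 6), (3, 9), (4, 10)],   # treatment
]
f_anova = anova_f(groups)
f_ancova = ancova_f(groups)
print(round(f_anova, 2))   # 1.6
print(round(f_ancova, 2))  # 36.36
```

In this constructed example most of the within-group variance is explained by the covariate, so removing it shrinks the denominator and the F ratio rises sharply. Real data will usually show a smaller change.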
Assumptions
There are several key assumptions that underlie the use of ANCOVA and affect interpretation of the
results.[2] The standard linear regression assumptions hold; further we assume that the slope of the covariate
is equal across all treatment groups (homogeneity of regression slopes).
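Assumption 1: linearity of regression
The regression relationship between the dependent variable and concomitant variables must be linear.
Assumption 2: homogeneity of error variances
The error is a random variable with conditional zero mean and equal variances for different treatment
classes and observations.
Assumption 3: independence of error terms
The errors are uncorrelated. That is, the error covariance matrix is diagonal.
Assumption 4: normality of error terms
The residuals (error terms) should be normally distributed.
Assumption 5: homogeneity of regression slopes
The slopes of the regression lines should be equal across treatment groups, i.e., the regression lines
should be parallel among groups.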
Conducting an ANCOVA
Test multicollinearity
If a CV is highly related to another CV (at a correlation of 0.5 or more), then it will not adjust the DV over
and above the other CV. One or the other should be removed since they are statistically redundant.
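A simple way to screen for redundant covariates is to compute the pairwise Pearson correlations and flag any pair at or above the 0.5 cutoff mentioned above. The sketch below assumes each covariate is a list of equal length; the function names, covariate names, and values are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def redundant_pairs(covariates, threshold=0.5):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    names = list(covariates)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(covariates[a], covariates[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged

# Hypothetical covariates measured on five participants.
covariates = {
    "age": [20, 25, 30, 35, 40],
    "years_experience": [1, 5, 9, 16, 20],
    "baseline_score": [7, 3, 8, 2, 6],
}
flagged = redundant_pairs(covariates)
for a, b, r in flagged:
    print(f"{a} and {b} are highly correlated (r = {r:.2f})")
```

Here only the age and experience pair is flagged, so one of the two would be a candidate for removal.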
Test the homogeneity of variance assumption
Homogeneity of variance is tested by Levene's test of equality of error variances. This is most important
after adjustments have been made, but if the variances are homogeneous before adjustment they are likely
to be homogeneous afterwards.
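Levene's statistic is a one-way ANOVA F computed on the absolute deviations of each observation from its group center. The sketch below uses the group mean as the center (the original form of the test); the function name `levene_w` and the data are hypothetical. It returns only the statistic W, which is referred to an F distribution with (k - 1, N - k) degrees of freedom to obtain a p-value. Statistical packages provide this directly, for example SciPy's `scipy.stats.levene`, which centers on the median by default.

```python
from statistics import mean

def levene_w(samples):
    """Levene's W statistic using absolute deviations from group means."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    # Absolute deviation of every observation from its own group mean.
    z = []
    for s in samples:
        center = mean(s)
        z.append([abs(v - center) for v in s])
    z_means = [mean(g) for g in z]
    z_grand = mean([v for g in z for v in g])
    between = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (n - k) / (k - 1) * between / within

# Hypothetical residuals from two groups with very different spread.
narrow = [1, 2, 3, 4, 5]
wide = [1, 6, 11, 16, 21]
w_unequal = levene_w([narrow, wide])

# Two groups with identical spread but different location give W = 0.
w_equal = levene_w([narrow, [11, 12, 13, 14, 15]])
print(w_unequal, w_equal)
```

A large W relative to the F critical value suggests the equal-variance assumption is violated.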
Test the homogeneity of regression slopes assumption
To see if the CV significantly interacts with the categorical IV, run an ANCOVA model including both the
IV and the CV×IV interaction term. If the CV×IV interaction is significant, ANCOVA should not be
performed. Instead, Green & Salkind[5] suggest assessing group differences on the DV at particular levels
of the CV. Also consider using a moderated regression analysis, treating the CV and its interaction as
another IV. Alternatively, one could use mediation analyses to determine if the CV accounts for the IV's
effect on the DV.
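For a single covariate, the interaction test is equivalent to comparing a model with a separate slope per group against a model with one common slope. The sketch below computes that F statistic under those assumptions; the function name `slope_homogeneity_f` and both data sets are hypothetical, and each group is assumed to have at least three observations with some spread in the covariate.

```python
from statistics import mean

def ss(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def sp(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def slope_homogeneity_f(groups):
    """F statistic for the CV x IV interaction with one covariate.

    groups is a list of lists of (covariate, response) pairs.
    Returns (F, (df_numerator, df_denominator)).
    """
    a = len(groups)
    n = sum(len(g) for g in groups)
    sxx_w = syy_w = sxy_w = 0.0
    sse_separate = 0.0
    for g in groups:
        xs = [x for x, _ in g]
        ys = [y for _, y in g]
        sxx, syy, sxy = ss(xs), ss(ys), sp(xs, ys)
        # Residual SS when this group gets its own regression line.
        sse_separate += syy - sxy ** 2 / sxx
        sxx_w += sxx
        syy_w += syy
        sxy_w += sxy
    # Residual SS when all groups share one pooled slope.
    sse_common = syy_w - sxy_w ** 2 / sxx_w
    df_num, df_den = a - 1, n - 2 * a
    f = ((sse_common - sse_separate) / df_num) / (sse_separate / df_den)
    return f, (df_num, df_den)

# Hypothetical data: slopes of opposite sign in the two groups.
crossed = [
    [(1, 2), (2, 4), (3, 5), (4, 8)],
    [(1, 9), (2, 7), (3, 6), (4, 3)],
]
# Hypothetical data: roughly parallel regression lines.
parallel = [
    [(1, 3), (2, 5), (3, 6), (4, 8)],
    [(1, 5), (2, 6), (3, 9), (4, 10)],
]
f_crossed, df = slope_homogeneity_f(crossed)
f_parallel, _ = slope_homogeneity_f(parallel)
print(f_crossed, f_parallel, df)
```

The crossed data give a very large F, indicating heterogeneous slopes, while the parallel data give an F well below 1.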
Run ANCOVA analysis
If the CV×IV interaction is not significant, rerun the ANCOVA without the CV×IV interaction term. In this
analysis, use the adjusted means and the adjusted mean square error (MSerror). The adjusted means (also
referred to as least squares means, LS means, estimated marginal means, or EMM) are the group means
after controlling for the influence of the CV on the DV.
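For a single covariate, the adjusted mean of a group can be written as follows. The notation is introduced here for illustration: $\bar{y}_i$ and $\bar{x}_i$ are the observed DV and CV means of group i, $\bar{x}$ is the overall CV mean, and $\hat{B}$ is the estimated common slope.

```latex
\bar{y}_i^{\,\text{adj}} = \bar{y}_i - \hat{B}\,(\bar{x}_i - \bar{x})
```

As a worked example with hypothetical numbers, a group with $\bar{y}_i = 12.5$ and $\bar{x}_i = 4.5$, an overall covariate mean $\bar{x} = 3.5$, and a slope $\hat{B} = 1.5$ has an adjusted mean of $12.5 - 1.5 \times (4.5 - 3.5) = 11.0$. The group's raw mean is lowered because its members scored above average on the covariate.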
Follow-up analyses
Power considerations
While the inclusion of a covariate into an ANOVA generally increases statistical power by accounting for
some of the variance in the dependent variable and thus increasing the ratio of variance explained by the
independent variables, adding a covariate into ANOVA also reduces the degrees of freedom. Accordingly,
adding a covariate which accounts for very little variance in the dependent variable might actually reduce
power.
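The trade-off can be made concrete for a single covariate by looking at the error mean square alone. The notation is an assumption of this sketch: N is the total number of observations, a is the number of groups, $SS_{\text{error}}$ is the within-group sum of squares of the DV, and $r_w$ is the pooled within-group correlation between the CV and the DV.

```latex
MS_{\text{error}}^{\text{ANOVA}} = \frac{SS_{\text{error}}}{N - a},
\qquad
MS_{\text{error}}^{\text{ANCOVA}} = \frac{SS_{\text{error}}\,(1 - r_w^2)}{N - a - 1}

MS_{\text{error}}^{\text{ANCOVA}} < MS_{\text{error}}^{\text{ANOVA}}
\iff r_w^2 > \frac{1}{N - a}
```

Under these assumptions the covariate lowers the error mean square only when it explains more than a fraction 1/(N - a) of the within-group variance. This comparison ignores the small increase in the critical F value caused by the lost degree of freedom, so it should be read as a rough guide rather than an exact power calculation.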
See also
MANCOVA (Multivariate analysis of covariance)
References
1. Keppel, G. (1991). Design and analysis: A researcher's handbook (3rd ed.). Englewood
Cliffs: Prentice-Hall, Inc.
2. Montgomery, Douglas C. "Design and analysis of experiments" (8th Ed.). John Wiley &
Sons, 2012.
3. Tabachnick, B. G.; Fidell, L. S. (2007). Using Multivariate Statistics (5th ed.). Boston:
Pearson Education.
4. Miller, G. A.; Chapman, J. P. (2001). "Misunderstanding Analysis of Covariance". Journal of
Abnormal Psychology. 110 (1): 40–48. doi:10.1037/0021-843X.110.1.40
(https://doi.org/10.1037%2F0021-843X.110.1.40). PMID 11261398
(https://pubmed.ncbi.nlm.nih.gov/11261398).
5. Green, S. B., & Salkind, N. J. (2011). Using SPSS for Windows and Macintosh: Analyzing
and Understanding Data (6th ed.). Upper Saddle River, NJ: Prentice Hall.
6. Howell, D. C. (2009) Statistical methods for psychology (7th ed.). Belmont: Cengage
Wadsworth.
External links
Examples of all ANOVA and ANCOVA models with up to three treatment factors, including
randomized block, split plot, repeated measures, and Latin squares, and their analysis in R
(https://www.southampton.ac.uk/~cpd/anovas/datasets/index.htm) (University of
Southampton)
One-Way Analysis of Covariance for Independent Samples
(http://vassarstats.net/ancova2L.html)
What is analysis of covariance used for? (https://spss-tutor.com/ancova.php)
Use of covariates in randomized controlled trials by G.J.P. Van Breukelen and K.R.A. Van
Dijk (2007)
(http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=1296348)