Zach’s facts

Zach’s facts have been extracted from the book to remind you of the key concepts you and Zach have learned in each chapter.

Zach's Facts 16.1 Comparing more than two means

You can compare several means from different groups of entities using a linear model, but it often gets referred to as an ANOVA (analysis of variance).
It is a special case of the linear model in which an outcome variable is predicted from membership of three or more groups, and these groups are dummy-coded. The dummy coding represents groups as a series of variables in which groups are coded with 0s or 1s. You use k - 1 dummy variables, where k is the number of groups. Choose a baseline category and assign it 0 on every dummy variable. Other groups are coded with a 1 on one of the dummy variables and 0 on all others. For a particular dummy variable only one group should be coded with a 1, whereas all other groups receive a 0 code.
The b for each dummy variable represents the difference between the mean of the group coded 1 on that variable and the mean of the baseline category.
The F-statistic, like in any linear model, tests whether the model overall is a significant fit to the data. In this special case that means it tests the null hypothesis that the group means are all equal (i.e., that the differences between them are zero).
If the p-value for F is less than 0.05 then we assume that the differences between means are ‘significant’. We would need follow-up tests to determine which means specifically differ.
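The points above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own procedure: the scores and group names are made up, and the helper names (`dummy_code`, `group_bs`, `f_statistic`) are hypothetical. It relies on the fact that, with dummy coding, the least-squares intercept equals the baseline mean and each b equals a group mean minus the baseline mean.

```python
from statistics import mean

def dummy_code(groups, baseline):
    """Map each group to its k - 1 dummy codes; the baseline gets all 0s."""
    others = [g for g in groups if g != baseline]
    return {g: [1 if g == o else 0 for o in others] for g in groups}

def group_bs(data, baseline):
    """b0 is the baseline mean; every other b is a group mean minus b0."""
    b0 = mean(data[baseline])
    bs = {g: mean(xs) - b0 for g, xs in data.items() if g != baseline}
    return b0, bs

def f_statistic(data):
    """F = mean squares for the model / mean squares for the residuals."""
    scores = [x for xs in data.values() for x in xs]
    grand_mean = mean(scores)
    k, n = len(data), len(scores)
    ss_model = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in data.values())
    ss_resid = sum((x - mean(xs)) ** 2 for xs in data.values() for x in xs)
    return (ss_model / (k - 1)) / (ss_resid / (n - k))

# Made-up scores for three hypothetical groups (assumption, not from the book).
data = {
    "control": [2, 3, 4, 3],
    "low": [4, 5, 6, 5],
    "high": [7, 8, 9, 8],
}

codes = dummy_code(list(data), baseline="control")
b0, bs = group_bs(data, baseline="control")
f_value = f_statistic(data)
print(codes)
print(b0, bs, f_value)
```

Turning `f_value` into a p-value requires the F-distribution with k - 1 and N - k degrees of freedom, which is left to a statistics package here.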

Zach's Facts 16.2 Planned contrasts

When you have generated specific hypotheses before the experiment, use planned contrasts.
Each contrast compares two ‘chunks’ of variance. (A chunk can contain one or more groups.)
The first contrast will usually be experimental groups vs. control groups.
The next contrast will be to take one of the chunks that contained more than one group (if there were any) and divide it into two chunks.
You repeat this process: if there are any chunks in previous contrasts that contained more than one group that haven’t already been broken down into smaller chunks, then create a new contrast that breaks it down into smaller chunks.
Carry on creating contrasts until each group has appeared in a chunk on its own in one of your contrasts.
You should end up with one fewer contrast than the number of experimental conditions. If not, you’ve done it wrong.
In each contrast, assign each group a ‘weight’ equal to the number of groups in the opposite chunk of that contrast.
For a given contrast, randomly select one chunk, and for the groups in that chunk change their weights to be negative numbers.
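The weighting rules above can be sketched as a small Python function. This is an illustrative sketch with a hypothetical helper name (`contrast_weights`) and made-up group names. It also assumes one rule the points above do not state: a group that appears in neither chunk of a contrast gets a weight of 0.

```python
def contrast_weights(groups, chunk_a, chunk_b):
    """Weights for one contrast between two chunks of groups.

    Each group is weighted by the number of groups in the opposite chunk.
    The groups in chunk_b get the negative sign (the choice is arbitrary).
    Assumption: groups in neither chunk are weighted 0.
    """
    weights = {}
    for g in groups:
        if g in chunk_a:
            weights[g] = len(chunk_b)
        elif g in chunk_b:
            weights[g] = -len(chunk_a)
        else:
            weights[g] = 0
    return weights

# Three hypothetical conditions, so k - 1 = 2 contrasts are needed.
groups = ["control", "low", "high"]

# Contrast 1: experimental groups vs. the control group.
c1 = contrast_weights(groups, chunk_a=["low", "high"], chunk_b=["control"])
# Contrast 2: split the experimental chunk into its two groups.
c2 = contrast_weights(groups, chunk_a=["low"], chunk_b=["high"])

# Checks: each contrast's weights sum to zero, and the products of the
# weights across the two contrasts also sum to zero.
sum_c1 = sum(c1.values())
sum_c2 = sum(c2.values())
cross_products = sum(c1[g] * c2[g] for g in groups)
print(c1, c2, sum_c1, sum_c2, cross_products)
```

Contrasts built this way have weights that sum to zero within each contrast, which is a quick check that the chunks were set up correctly.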

Zach's Facts 16.3 Post hoc tests

When you have no specific hypotheses before the experiment, follow up the main model with post hoc tests.
If you want guaranteed control over the Type I error rate then use Bonferroni.
If sample sizes are slightly different use Gabriel’s test, but if sample sizes are very different use Hochberg’s GT2.
If variances are not equal use the Games–Howell procedure.
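As a rough sketch of how the Bonferroni correction controls the Type I error rate, the Python below divides the overall alpha by the number of comparisons. Equivalently, it multiplies each p-value by that number and caps the result at 1. The p-values are invented for illustration, and the function name `bonferroni` is hypothetical. Gabriel's test, Hochberg's GT2 and Games-Howell are not shown.

```python
def bonferroni(p_values, alpha=0.05):
    """Return {comparison: (adjusted_p, significant)} for a family of tests."""
    m = len(p_values)  # number of comparisons in the family
    return {
        pair: (min(1.0, p * m), p < alpha / m)
        for pair, p in p_values.items()
    }

# Invented p-values for the three pairwise comparisons of three groups.
raw_p = {
    ("control", "low"): 0.20,
    ("control", "high"): 0.004,
    ("low", "high"): 0.03,
}

results = bonferroni(raw_p)
for pair, (adjusted_p, significant) in results.items():
    print(pair, round(adjusted_p, 3), significant)
```

Note that the third comparison would pass an uncorrected 0.05 threshold but fails once the criterion becomes 0.05 / 3.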

Zach's Facts 16.4 One-way ANOVA

The general linear model can also be used to compare several means from different groups of entities.
When you have generated specific hypotheses before the experiment use planned comparisons, but if you don’t have specific hypotheses use post hoc tests. If the overall analysis is not significant you should not interpret these follow-up tests.
There are lots of different post hoc tests: the Bonferroni method is a staple choice, but if there is any doubt about homogeneity of variance use the Games–Howell procedure.
You can test for homogeneity of variance using Levene’s test, but often it is better simply to interpret Welch’s F or the Brown–Forsythe F, which correct for the degree of heterogeneity. If the F has a p-value less than 0.05 this is generally taken to mean that the group means are significantly different.
Planned contrasts and post hoc tests can help to determine specifically how the means differ. If the p-value for a contrast or post hoc test is below 0.05, this is generally taken to mean that the means being compared in that contrast are significantly different.
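To show what ‘correcting for heterogeneity’ involves, here is a minimal sketch of Welch’s F using the standard textbook formula. Each group is weighted by its sample size divided by its variance, and the residual degrees of freedom are adjusted downwards. The data are made up and the function name `welch_f` is hypothetical. Statistics software also reports the p-value, which this sketch omits.

```python
from statistics import mean, variance

def welch_f(data):
    """Welch's F and its degrees of freedom for a dict {group: [scores]}."""
    k = len(data)
    ns = [len(xs) for xs in data.values()]
    means = [mean(xs) for xs in data.values()]
    # Weight each group by n / s^2, so noisier groups count for less.
    ws = [n / variance(xs) for n, xs in zip(ns, data.values())]
    w_total = sum(ws)
    weighted_grand_mean = sum(w * m for w, m in zip(ws, means)) / w_total

    numerator = sum(
        w * (m - weighted_grand_mean) ** 2 for w, m in zip(ws, means)
    ) / (k - 1)
    lam = sum((1 - w / w_total) ** 2 / (n - 1) for w, n in zip(ws, ns))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    df_model = k - 1
    df_resid = (k ** 2 - 1) / (3 * lam)  # adjusted residual df
    return numerator / denominator, df_model, df_resid

# Made-up scores for three hypothetical groups (assumption, not from the book).
data = {
    "control": [2, 3, 4, 3],
    "low": [4, 5, 6, 5],
    "high": [7, 8, 9, 8],
}

f_value, df_model, df_resid = welch_f(data)
print(f_value, df_model, df_resid)
```

In this example the three groups happen to have identical variances, yet the residual degrees of freedom still drop from N - k = 9 to 6. That is the price Welch’s F pays for not assuming homogeneity.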

An Adventure in Statistics: The Reality Enigma