9.4 R-specific considerations

This page discusses some basic considerations about performing factorial ANOVAs in R.

9.4.1 The afex package

In the sections on one-way ANOVAs we largely stuck to using base R’s aov() function, or the anova_test() function from the rstatix package. While we can absolutely continue to use these for factorial ANOVAs, one problem is that factorial ANOVAs are inherently more complex models than their one-way progenitors. For instance, aov() only really works with balanced designs, which is where every cell (i.e. group x group combination of your IVs/predictors) has the same number of participants.

To use an example, a 2 x 2 ANOVA where each of the four possibilities (e.g. Level 1 Variable A + Level 1 Variable B…) has 20 participants is a balanced design. In contrast, if A1-B2 had 40 participants and the other possibilities had 20 participants, this would be an unbalanced design. Unbalanced designs introduce several complexities for ANOVAs.

The afex package is designed for factorial ANOVAs in particular. It provides a very nice and convenient way of running factorial ANOVAs in a way that’s not too hard, while still being flexible enough for more hardcore R users.

In particular, the function you will want to keep in mind is aov_ez().

9.4.2 Writing interactions in R

The aov_ez() function that was just mentioned doesn’t use the same kind of formula notation that we saw in the other test sections, and that’s to make it easier to run. However, these analyses absolutely can be recreated in formula notation, and you will see many instances of this.

Recall the basic layout for a formula in R:

lm(outcome ~ predictor, data = dataset)
aov(outcome ~ predictor, data = dataset)

To extend this to two variables with no interaction, you can simply add the second predictor as such:

aov(outcome ~ predictor_1 + predictor_2, data = dataset)

To code for an interaction term, however, there either needs to be an explicit term for the interaction (denoted using a colon between the two interaction variables):

aov(outcome ~ predictor_1 + predictor_2 + predictor_1:predictor_2, data = dataset)

OR as the shorthand for above, which uses the asterisk. Note that the asterisk will include both the interaction term and all of its main effects (a la principle of marginality):

aov(outcome ~ predictor_1 * predictor_2, data = dataset)

For what we do in this section, this is ok - but you may find in certain circumstances that you don’t want all of these terms together, so the first approach will let you specify what terms to include in the model.

9.4.3 Simple effects tests in R

Simple effects tests, weirdly, are difficult to do in Jamovi - the gamlj module in Jamovi will do simple effects tests for linear models (which includes ANOVAs), but it’s difficult to do so for anything other than a two-way between-groups ANOVA without some prerequisite knowledge on linear mixed models.

In R, simple effects are relatively easy to estimate through the use of the emmeans package. All we need to do is just extend our emmeans calls that we have been using for one-way ANOVAs to incorporate a second variable.

Imagine an ANOVA model named model_aov and two predictors named A and `B. Recall the basic syntax for generating comparisons for a one-way ANOVA:

emmeans(model_aov, ~ A)

To run this as a simple effects test, we simply need to tell emmeans to conduct these comparisons by another variable. This can be easily done by specifying the by argument in emmeans:

emmeans(model_aov, ~ A, by = "B")

This will estimate the EM means for every level of A, at each level of B. To run simple effects tests for variable two, the variable names simply need to be swapped.

Just like before, we can pipe from from emmeans() to pairs():

emmeans(model_aov, ~ A, by = "B") %>%
  pairs(infer = TRUE)

Final note on this matter: Rstatix does provide a function called emmeans_test() to perform a similar test. However, using the original emmeans package and its functions give you much greater flexibility, so it helps to learn how to use this as is.