9.2 Factorial designs
The ANOVAs we conducted in Week 9 were one-way ANOVAs: they are called one-way because only one IV (with multiple levels) is being tested. However, in many research designs we want to test the effect of two or more categorical variables at the same time (for example, experimental conditions, or variables that capture important categories such as participant sex).
When we want to test the effect of more than one IV, we move into what we call factorial ANOVAs. Factorial ANOVAs are used when we have two or more IVs, each with at least two levels. This is common in many research designs, where multiple categorical variables are either collected during the data collection phase or created as part of the analysis process.
We will talk about this more on the next page, but factorial designs are particularly useful for testing interactions: the combined effect of your IVs on the outcome.
When reporting results for factorial designs, you are expected to report how many levels each variable had. For instance, let’s say your two IVs are participant sex (male and female) and experimental group (groups A, B and C). If you were to test this factorial design, there are a few ways you could report it:
- A Sex (2) x Group (3) factorial ANOVA
- A 2 x 3 factorial ANOVA
- A Sex x Group (2 x 3) factorial ANOVA
The first option is preferable because it lays out the variables and their levels most clearly.
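To make the 2 x 3 structure concrete, here is a minimal sketch in Python (using pandas, with made-up data; the column names are just placeholders). Crossing the two levels of sex with the three levels of group gives 2 x 3 = 6 cells:

```python
import pandas as pd

# Hypothetical data: each row is one participant with their sex,
# experimental group, and an outcome score (all values made up).
df = pd.DataFrame({
    "sex":   ["male", "male", "male", "female", "female", "female",
              "male", "female", "male", "female", "male", "female"],
    "group": ["A", "B", "C", "A", "B", "C",
              "A", "B", "C", "A", "B", "C"],
    "score": [4.1, 5.0, 5.5, 4.6, 6.2, 6.9,
              3.8, 5.9, 5.2, 4.9, 5.4, 7.1],
})

# A 2 x 3 design has 6 cells; the crosstab shows how many
# participants fall into each Sex x Group combination.
print(pd.crosstab(df["sex"], df["group"]))
```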
9.2.1 What is an interaction?
Pretend you’ve been running a singing intervention in a school, where one group of kids has been singing daily and another group has not. You’re interested in whether the singing intervention has an effect on their wellbeing. By and large, it does: there is a clear difference between the kids who get singing sessions and the kids who don’t. However, you notice that how effective the intervention is depends on whether the kids are boys or girls. The girls appear to benefit the most, while the boys don’t benefit as much. In other words, the effectiveness of the intervention is contingent on the biological sex of the child.
This is an example of an interaction, where the effect of one IV on the DV depends on the level of another IV. The consequence of an interaction is that the two IVs influence the DV together in a non-additive manner. Interactions can be important for understanding how certain phenomena work.
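To see what "non-additive" means with numbers, here is a minimal sketch (made-up cell means, not real data): the intervention effect for girls is larger than for boys, and that difference of differences is what the interaction captures.

```python
import pandas as pd

# Hypothetical cell means for the singing example (made-up numbers).
means = pd.DataFrame(
    {"no_singing": [5.0, 5.0], "singing": [5.5, 7.0]},
    index=["boys", "girls"],
)

# The intervention effect (singing minus no singing) for each sex:
effect = means["singing"] - means["no_singing"]
print(effect)                              # boys: 0.5, girls: 2.0

# If there were no interaction, these two effects would be (roughly)
# equal; the gap between them reflects the interaction.
print(effect["girls"] - effect["boys"])    # 1.5
```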
Consider the two plots below, which show the relationship between two predictors (X and Group) and one outcome (on the y-axis).
- In the graph on the left, there is a clear difference between groups 1 and 2. There is also a clear difference between A, B and C on X; however, this difference is the same for both groups (the lines are parallel).
- In the graph on the right, there’s still a clear difference between groups 1 and 2. However, the size of that difference changes across the levels of X. For group 1, there is no change from A to C, but there is for group 2; in other words, the effect of X depends on the level of Group.
The easiest way of demonstrating an interaction is with an interaction plot, like the ones above. This kind of graph plots cell means as dots and joins the means for each group with lines. Interaction plots with error bars (e.g. +/- 1 standard error) provide the clearest way of graphing an interaction effect.
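One way to draw a plot like this is sketched below in Python, using pandas and seaborn with made-up data (the column names are placeholders, and the `errorbar` argument assumes seaborn 0.12 or later). `pointplot` draws each cell mean as a dot, joins the means within a group with lines, and adds +/- 1 standard error bars:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical long-format data: one row per observation, with the
# level of X (A/B/C), the group (1/2), and the outcome value.
df = pd.DataFrame({
    "X":       ["A", "B", "C"] * 8,
    "Group":   (["1"] * 3 + ["2"] * 3) * 4,
    "outcome": [3.0, 3.1, 2.9, 3.2, 4.1, 5.0,
                2.8, 3.0, 3.1, 3.4, 4.3, 5.2,
                3.1, 2.9, 3.0, 3.1, 4.0, 4.9,
                2.9, 3.2, 2.8, 3.3, 4.2, 5.1],
})

# Plot the mean of each X-by-Group cell as a dot, join the dots
# within each group with a line, and add +/- 1 SE error bars.
sns.pointplot(data=df, x="X", y="outcome", hue="Group", errorbar="se")
plt.ylabel("Outcome")
plt.show()
```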
9.2.2 Testing for interactions
We can test for interactions when we have at least two independent variables/predictors, using both ANOVAs and regressions. Most of this module will focus on the two-predictor case in an ANOVA context, as it is the easiest to conceptualise.
By default, if we have two predictors, A and B, and an outcome, Y, our model will have the following terms:
- A, or the main effect of A (i.e. of A only)
- B, the other main effect
- A x B, which is our interaction effect
Therefore, we end up with two types of effects to interpret: main effects and interaction effects. An interaction effect is what we call a higher-order term, in that it is a more complex term in our model. We test the significance of each term, giving us three sets of test statistics and p-values.
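As a minimal sketch of what this looks like in practice (Python with statsmodels, using made-up data; the names A, B and Y simply follow the notation above), the formula `Y ~ A * B` expands to the two main effects plus their interaction, and the ANOVA table reports a test statistic and p-value for each term:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data frame with two categorical predictors (A, B)
# and a continuous outcome (Y); all values are made up.
df = pd.DataFrame({
    "A": ["a1", "a1", "a2", "a2"] * 6,
    "B": ["b1", "b2"] * 12,
    "Y": [4.2, 5.1, 3.9, 6.8, 4.0, 5.3, 4.1, 6.5,
          4.4, 5.0, 3.8, 6.9, 4.3, 5.2, 4.0, 6.6,
          4.1, 4.9, 3.7, 7.0, 4.2, 5.4, 4.2, 6.7],
})

# "A * B" expands to A + B + A:B, i.e. both main effects plus the
# interaction term.
model = ols("Y ~ A * B", data=df).fit()

# The ANOVA table gives an F statistic and p-value for each of the
# three terms: the two main effects and the interaction.
print(sm.stats.anova_lm(model, typ=2))
```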