14.1 Parametric vs non-parametric tests
14.1.1 Introduction
Recall the aim of much of the statistics we do: to use statistics calculated from a sample to estimate the corresponding parameters of the population.
When we estimate these parameters using statistical tests, we make certain assumptions about the data in order for our tests to be valid. Many of those assumptions involve some form of normality - either the data/outcome itself needs to be normally distributed, or the residuals of the model do. The tests that we cover in this subject - t-tests and ANOVAs especially - are called parametric tests, because they make assumptions about the parameters of the underlying distribution. But what happens if those assumptions aren’t met?
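To make this concrete, here is a minimal sketch (in Python with scipy, using simulated placeholder data rather than any dataset from this subject) of checking the normality assumption with a Shapiro-Wilk test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
outcome = rng.exponential(scale=2, size=20)  # hypothetical, skewed (non-normal) data

# Shapiro-Wilk tests H0: the data come from a normal distribution
stat, p = stats.shapiro(outcome)
print(f"Shapiro-Wilk: W = {stat:.3f}, p = {p:.4f}")
# A small p-value is evidence against normality - one cue to consider
# a non-parametric test instead.
```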
14.1.2 Non-parametric tests
Non-parametric tests do not make assumptions about the underlying distribution of the data (and hence are sometimes called distribution-free tests). Instead, they are more general tests that make the following (broad) hypotheses:
- \(H_0\): The underlying distributions are equal
- \(H_1\): The underlying distributions are not equal
So when should they be used, and what are their pros and cons? In general, non-parametric tests should be considered when a) the assumptions for parametric tests are not met and b) you are working with small samples. As noted below, with large samples parametric tests are fairly robust to violations of their assumptions (unless the deviations are particularly severe).
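As an illustration, here is a sketch (Python with scipy, on simulated small samples, not data from this subject) of running a parametric test and its non-parametric equivalent side by side:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Two small, skewed samples (hypothetical) where t-test assumptions are shaky
group_a = rng.exponential(scale=1.0, size=12)
group_b = rng.exponential(scale=2.0, size=12)

# Parametric: independent-samples t-test (assumes normality)
t_res = stats.ttest_ind(group_a, group_b)
# Non-parametric equivalent: Mann-Whitney U test (distribution-free)
u_res = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:       t = {t_res.statistic:.3f}, p = {t_res.pvalue:.4f}")
print(f"Mann-Whitney: U = {u_res.statistic:.3f}, p = {u_res.pvalue:.4f}")
```

With skewed samples this small, the distribution-free Mann-Whitney U result is generally the safer of the two to report.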
Below are the non-parametric equivalents to the major tests that we cover, with their associated datasets.
Note that chi-square tests are already non-parametric, while non-parametric regression is just a headache.
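For quick reference, a sketch of the scipy.stats calls for the standard pairings of parametric tests and their non-parametric equivalents (the data here are simulated placeholders):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a, b, c = (rng.normal(loc, 1, size=10) for loc in (0.0, 0.5, 1.0))  # placeholder samples

# Independent-samples t-test  ->  Mann-Whitney U test
print(stats.mannwhitneyu(a, b, alternative="two-sided"))
# Paired-samples t-test       ->  Wilcoxon signed-rank test
print(stats.wilcoxon(a, b))
# One-way ANOVA               ->  Kruskal-Wallis H test
print(stats.kruskal(a, b, c))
```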