What Is a Nonparametric Test?

Matt Hubbard

Last Modified Date: February 07, 2024

A nonparametric test is a statistical hypothesis test that doesn’t assume the data follow a particular distribution, such as the normal distribution. For this reason, nonparametric tests are sometimes referred to as distribution-free. Compared with a standard parametric test, a nonparametric test is more robust, can be used on small samples for which normality cannot be checked, is less likely to be affected by outlying observations and can be applied with fewer assumptions. On the other hand, nonparametric tests can be less efficient than their parametric counterparts, especially if the population truly is normally distributed. Nonparametric testing is particularly effective for questions dealing with frequencies and proportions.

Standard hypothesis testing compares a sample from a test population to a sample from a control population to determine whether the two populations differ. If the difference between the sample statistics, usually the mean and/or variance, is large enough relative to the variation expected by chance, then the test population can be judged to be distinct from the control population. Such parametric testing requires that the data, or at least the sample means, follow a normal distribution.

The central limit theorem shows that the mean of a sufficiently large sample is approximately normally distributed, and a sample size of 30 or more is a common rule of thumb for "sufficiently large," so this requirement is generally assumed to be met. If the assumption isn’t justified, however, the results of the testing might not be valid. Nonparametric testing avoids this assumption.

Instead, nonparametric hypothesis testing commonly examines data either by categorizing it or by ordering it. If the sample and control populations are the same and if the data were gathered correctly, any differences between their categories or rankings are strictly the result of chance. The probability that differences at least as large as those observed could have occurred by random chance alone is called the P-value. If the P-value is less than a chosen significance level, usually either 5 percent or 1 percent, then the tester rejects the hypothesis that the sample and control populations are the same and concludes that they are different.

One common nonparametric test is the chi-square test, used to compare observed frequencies or proportions. When only one set of frequencies is examined, this is often called a goodness-of-fit test and is used to determine whether the observed frequencies are consistent with the frequencies that would be expected. For example, a goodness-of-fit test could be used to determine whether a roulette table had been rigged by comparing table results to the results that probability theory predicts. A closely related chi-square test could be used to determine whether a headache medicine was effective by comparing the proportion of people whose headache improved on the medicine to the proportion of people whose headache improved when they took a placebo. If two categorical factors are examined together, then the chi-square test can be used to test for association or independence between them. Political pollsters often look for association between social, economic or demographic factors and political beliefs, such as seeing whether a person’s education is related to whether he or she approves of how an elected official is performing.
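The roulette example can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the observed counts are hypothetical, the wheel is assumed to be an American one (18 red, 18 black and 2 green pockets out of 38), and the shortcut used for the P-value is valid only when there are three categories, which gives two degrees of freedom.

```python
from math import exp


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Assumption: an American wheel with 18 red, 18 black and 2 green pockets.
pocket_share = {"red": 18 / 38, "black": 18 / 38, "green": 2 / 38}

# Hypothetical counts from 380 spins (illustrative, not real data).
observed = {"red": 170, "black": 176, "green": 34}
spins = sum(observed.values())

colors = list(pocket_share)
expected_counts = [pocket_share[c] * spins for c in colors]
observed_counts = [observed[c] for c in colors]

stat = chi_square_statistic(observed_counts, expected_counts)

# With 3 categories there are 2 degrees of freedom, and for exactly
# 2 degrees of freedom the chi-square upper-tail probability is exp(-x / 2).
p_value = exp(-stat / 2)

print(f"chi-square = {stat:.3f}, P-value = {p_value:.4f}")
```

With these made-up counts the P-value falls below 1 percent, so a tester using either of the usual significance levels would reject the hypothesis that the wheel is fair. For other numbers of categories, a statistics library or a published chi-square table would be needed to turn the statistic into a probability.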

Another nonparametric test is the Wilcoxon rank sum test, which generally is used in the same situations as standard parametric hypothesis testing. Instead of examining the mean of each sample, however, the Wilcoxon test examines the rank of each value when the two samples are pooled and ordered from least to greatest. If the two samples come from the same population, each group should be scattered evenly through the ranking. If one group is clustered at the lower or upper end of the ranking, this indicates that the two groups are different.

For example, suppose that someone wanted to determine whether animated movies are longer or shorter than live-action movies. For a standard test, he or she would determine the average duration for a sample of animated movies and for a sample of live-action movies and compare the difference to the variance of the samples. For the Wilcoxon nonparametric test, the running times from both samples are pooled and put in order from least to greatest, and the ranks of the animated movie times are summed.
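The ranking step might look like the following sketch. The running times are invented for illustration, the helper name `rank_sum` is hypothetical, and the code assumes no two movies have the same running time (tied values would need averaged ranks).

```python
def rank_sum(group_a, group_b):
    """Sum of the ranks held by group_a after pooling and sorting both groups.

    Assumes there are no tied values.
    """
    pooled = sorted(group_a + group_b)
    # Rank 1 goes to the smallest value in the pooled list.
    rank_of = {value: rank for rank, value in enumerate(pooled, start=1)}
    return sum(rank_of[value] for value in group_a)


# Hypothetical running times in minutes (illustrative, not real data).
animated = [81, 88, 90, 92, 95, 102]
live_action = [97, 105, 110, 118, 124, 131]

animated_rank_sum = rank_sum(animated, live_action)
print(animated_rank_sum)  # 22: the animated movies hold ranks 1, 2, 3, 4, 5 and 7
```

The smallest rank sum six movies out of twelve can have is 21, so a value of 22 means the animated movies are clustered at the short end of the ranking.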

The person could calculate the probability that the rank sum would be that size or smaller by counting the possible arrangements of ranks whose rank sum is no larger than the observed one and dividing by the total number of possible arrangements, a calculation that is simple given enough brute-force computing power. With two small samples of six movies each, there are already 924 possible arrangements of rankings, a number that rapidly grows much bigger as movies are added. Alternatively, there are published tables that give probabilities corresponding to given rank sums for given sample sizes. These can be found in statistics texts or online.
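The brute-force count is small enough to run directly for two samples of six. The sketch below assumes a hypothetical observed rank sum of 22 and no tied values; the function name `exact_lower_tail_p` is made up for this illustration.

```python
from itertools import combinations
from math import comb


def exact_lower_tail_p(observed_rank_sum, n_a, n_b):
    """Probability that group A's rank sum is <= the observed value
    when every assignment of ranks to group A is equally likely.

    Returns (probability, total number of arrangements). Assumes no ties.
    """
    ranks = range(1, n_a + n_b + 1)
    # Enumerate every way group A could occupy n_a of the pooled ranks.
    as_small = sum(
        1 for subset in combinations(ranks, n_a)
        if sum(subset) <= observed_rank_sum
    )
    total = comb(n_a + n_b, n_a)
    return as_small / total, total


# Hypothetical observed rank sum for six animated movies out of twelve.
p_value, arrangements = exact_lower_tail_p(22, 6, 6)
print(arrangements)       # 924
print(round(p_value, 5))  # 0.00216, since only 2 of the 924 arrangements sum to 22 or less
```

This probability covers only one tail (a rank sum this small or smaller). Someone asking whether animated movies are either longer or shorter would also need to count arrangements at the opposite extreme.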

Nonparametric testing is a growing field. It can be applied in any field in which more conventional statistics have been used. Applications are especially common in the social sciences and medicine, particularly when a normal distribution cannot be assumed.