Introduction to Non-Parametric Statistics

Non-parametric statistics are something that I don't recall learning much about in any classes I took—just picked them up from my advisors and literature I read while I was working on my master's thesis—but they're something I think more people should be aware of. I've been wanting to collect my notes on non-parametric statistics for a while and finally got around to it (albeit taking two weeks to do so, instead of sticking to my regular publishing schedule).


Nonparametric statistics differ from better-known statistics in that they do not assume that data points come from a population fitting a normal distribution (or any specified distribution). Many of the techniques work by replacing measured values with their rank. They are useful in situations where the underlying distribution is unknown, or you have data that can be ranked more easily than a precise magnitude can be determined. These types of situations occur commonly in environmental data. I've used non-parametric statistics in previous posts on river ice and the Rio Grande, for example.
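To make the "replace values with their rank" idea concrete, here is a minimal sketch in R. The numbers are made up for illustration; the point is only that an extreme value ends up one rank above its neighbour, no matter how large it is.

```r
# Hypothetical measurements, including one extreme value
x <- c(2.1, 3.4, 2.8, 150.0, 3.0)

# rank() replaces each value with its position in the sorted order
r <- rank(x)
print(r)
```

The value 150.0 simply gets the highest rank, so a rank-based method treats it the same way it would treat 3.5.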

D.R. Helsel (2006) discusses an important application for nonparametric statistics—when you have small datasets that include non-detects (values less than the detection limit for an analytical instrument or procedure).

Maximum likelihood methods generally do not work well for small data sets (fewer than 30–50 detected values), in which one or two outliers throw off the estimation, or where there is insufficient evidence to know whether the assumed distribution fits the data well (Singh and Nocerino, 2002; Shumway et al., 2002). For these cases, nonparametric methods that do not assume a specific distribution and shape of data would be preferred. See Helsel (2005b) for a full list of nonparametric procedures for censored data, including variations on the familiar rank-sum test and Kendall’s τ correlation coefficient.

Values that are below a detection level (or over-range at the other end) can often still be ranked accurately, making non-parametric statistics a good choice in such a situation.
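As a rough sketch of how non-detects can still be ranked, the example below assumes a single detection limit and uses NA to mark non-detects. The data, the variable names and the placeholder value are all assumptions for illustration; the placeholder only needs to sit below the detection limit, since only its order matters and not its magnitude.

```r
# Hypothetical concentrations (mg/L); NA marks a non-detect
detection_limit <- 0.5
conc <- c(1.2, NA, 0.8, NA, 3.1, 0.6)

# Give every non-detect the same placeholder below the detection limit,
# so non-detects tie with one another and rank below all detected values
placeholder <- detection_limit / 2
conc_filled <- ifelse(is.na(conc), placeholder, conc)

ranks <- rank(conc_filled)  # tied values receive the average rank
print(ranks)
```

With more than one detection limit this simple approach is no longer enough, and the censored-data procedures Helsel describes would be the place to look.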

The Concise Encyclopedia of Statistics (Dodge, 2008) describes nonparametric statistics as follows:

[Nonparametric statistics] allow us to process data from small samples, on variables about which nothing is known concerning their distribution. ...

[They] do not rely on the estimation of parameters (such as the mean or the standard deviation) describing the distribution of the variable of interest in the population. ...

The most frequently used nonparametric tests are Anderson-Darling test, Chi-square test, Kendall's tau, Kolmogorov-Smirnov test, Kruskal-Wallis test, Wilcoxon rank sum test, Spearman's rank correlation coefficient, and Wilcoxon sign rank test.

Nonparametric tests have less power than the appropriate parametric tests, but are more robust when the assumptions underlying the parametric tests are not satisfied.

That final paragraph brings up one of the weaknesses of non-parametric statistics. That is, if the underlying statistical distribution (e.g. normal) is known, you're better off taking advantage of that knowledge by using an appropriate parametric test. I don't think Nate Silver would be a fan of non-parametric statistics because he prefers a Bayesian approach that incorporates as much prior knowledge and as many reasonable assumptions as possible.

Some Specific Non-Parametric Statistics

The following subsections describe a few specific nonparametric statistical tests, along with the commands for applying them in R.

Spearman Correlation Coefficient

In a paper all the way back in 1904, Charles Spearman defined a non-parametric alternative to the usual (Pearson) correlation coefficient (i.e. r).

It is calculated as (from Dodge, 2008):

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference in rank between corresponding entries in two sets of samples, and $n$ is the number of paired observations. Observations with the same value are given an average rank to share, but a correction should be made if there are many ties.
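The formula above can be checked by hand in R. This is a minimal sketch using made-up paired data with no tied values (so no tie correction is needed), compared against R's built-in cor() function.

```r
# Hypothetical paired observations with no tied values
x <- c(10, 20, 30, 40, 50)
y <- c(12, 25, 22, 48, 41)

n <- length(x)
d <- rank(x) - rank(y)  # difference in rank for each pair
rho_manual <- 1 - (6 * sum(d^2)) / (n * (n^2 - 1))

# Built-in calculation for comparison
rho_builtin <- cor(x, y, method = "spearman")

print(c(manual = rho_manual, builtin = rho_builtin))
```

Here the squared rank differences sum to 4, so both approaches give 0.8.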

In R, the Spearman correlation coefficient can be calculated between two variables, with a test to see if it is statistically significant:

cor.test(var1, var2, method="spearman")

Or it can be calculated between all the variables in a dataset at once:
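A sketch of one way to do this, assuming the data sit in a data frame of numeric columns (the data frame `df` and its contents here are hypothetical), is to pass the whole data frame to cor(). Note that this returns only the matrix of coefficients, without significance tests.

```r
# Hypothetical data frame with three numeric variables
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 1, 4, 3, 5),
  c = c(5, 4, 3, 2, 1)
)

# Matrix of pairwise Spearman coefficients
rho_matrix <- cor(df, method = "spearman")
print(rho_matrix)
```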



Kruskal-Wallis Test

The Kruskal-Wallis test is a non-parametric ANOVA analogue, used to determine if all groups of samples come from the same population.

In R, the Kruskal-Wallis test can be applied as follows, assuming one variable contains sample observations and another contains a code or label for the group they belong to:

ans <- kruskal.test(obs_var, grp_var)

If the null hypothesis is rejected, some groups can be concluded to differ—you next need to apply a post-hoc test (such as Mann-Whitney/Wilcoxon testing pairwise on promising groups) to determine which ones.
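As a sketch of that two-step workflow, the example below runs the Kruskal-Wallis test on made-up data and then follows up with pairwise rank sum tests using pairwise.wilcox.test() from base R. The data are hypothetical, and the Holm adjustment for multiple comparisons is my own choice here rather than anything the test requires.

```r
# Hypothetical observations from three groups
obs_var <- c(2.9, 3.0, 2.5, 2.6, 3.2,   # group A
             3.8, 2.7, 4.0, 2.4, 3.9,   # group B
             5.1, 4.8, 5.5, 4.6, 5.3)   # group C
grp_var <- factor(rep(c("A", "B", "C"), each = 5))

ans <- kruskal.test(obs_var, grp_var)
print(ans$p.value)

# Pairwise rank sum tests, with Holm adjustment for multiple comparisons
posthoc <- pairwise.wilcox.test(obs_var, grp_var, p.adjust.method = "holm")
print(posthoc$p.value)
```

The post-hoc result is a lower-triangular matrix of adjusted p-values, one for each pair of groups.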


Wilcoxon Signed-Rank Test

The Wilcoxon signed-rank test is an analogue for the paired t-test. It involves applying the sign of the difference between paired values to the rank of the absolute value of the difference.

In R, it is applied as follows:

wilcox.test(var1, var2, paired=TRUE)
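To show what "applying the sign to the rank" looks like, here is a small sketch with made-up paired data (no ties and no zero differences). It builds the signed ranks by hand and compares the sum of the positive ones with the V statistic that wilcox.test() reports.

```r
# Hypothetical paired measurements (e.g. before and after)
var1 <- c(12.1, 10.4, 11.8, 13.5, 9.9, 12.7)
var2 <- c(11.0, 10.9, 10.1, 11.2, 10.1, 10.3)

d <- var1 - var2
# Sign of each difference applied to the rank of its absolute value
signed_ranks <- sign(d) * rank(abs(d))

# R reports V, the sum of the positive signed ranks
v_manual <- sum(signed_ranks[signed_ranks > 0])
res <- wilcox.test(var1, var2, paired = TRUE)

print(c(manual = v_manual, builtin = unname(res$statistic)))
```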


Mann-Whitney U-Test

The Mann-Whitney U-test (also known as the Wilcoxon rank sum test, in case you wanted a bit of confusion) is a non-parametric analogue for the t-test (on independent/unpaired samples).

In R, it is applied as follows:

wilcox.test(var1, var2, alternative="two.sided")

Note that this is the same command as the previous test, only without the "paired" condition. It is shown here as a two-sided test (alternative hypothesis: ≠), but can also be specified with a one-sided (greater or less than) alternative hypothesis—see the link to the R manual for details.
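For comparison, this sketch runs the same test on made-up independent samples with a two-sided and a one-sided alternative. The data are hypothetical; `alternative = "greater"` tests whether values in var1 tend to be larger than those in var2.

```r
# Hypothetical independent samples (no tied values)
var1 <- c(4.2, 5.1, 6.3, 5.8, 7.0)
var2 <- c(3.1, 2.9, 4.0, 3.6, 4.5)

two_sided <- wilcox.test(var1, var2, alternative = "two.sided")
# One-sided alternative: var1 tends to be greater than var2
one_sided <- wilcox.test(var1, var2, alternative = "greater")

print(c(two_sided = two_sided$p.value, one_sided = one_sided$p.value))
```

When the difference runs in the hypothesized direction, as it does here, the one-sided p-value is smaller than the two-sided one.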

Further Reading

See these links for more information on the topic of this post: