Chapter 14. Statistical Description of Data

Codes for implementing algorithms mentioned in this chapter are available at my Github repository. Each script runs the respective algorithm with example data.

1. Moments (of a distribution)

Moments are statistics that describe the data’s tendencies to cluster around certain values, or the shape of the distribution. The $n$-th moment of a continuous function $f(x)$ about a value $c$ is:

\[\mu_n = \int_{-\infty}^{\infty} (x-c)^n f(x) \mathop{dx}\]

For a probability density function (pdf), the zeroth moment is obviously $1$. The first moment would be the expected value (or mean), i.e. $\mathrm{E}[X]$ (we usually assume $c = 0$ for the first moment). The second (central) moment would be the variance, i.e. $\mathrm{E}[(X-\mu)^2]$ (we usually assume $c = \mathrm{E}[X]$ for second or higher moments).

For discrete distributions, the first moment (mean) is: $\mu = \frac{1}{N} \sum_{i=1}^N x_i$. The second central moment (variance) is $\mathrm{Var}(X) = \frac{1}{N-1} \sum_{i=1}^N (x_i - \mu)^2$. We use $N-1$ instead of $N$ as denominator to account for the bias in sample variance in comparison to population variance, which is by a factor of $\frac{N-1}{N}$ (see Wikipedia). The square root of the variance is the standard deviation: $\sigma(x) = \sqrt{\mathrm{Var}(X)}$.

The third moment (skewness) is a measure of asymmetry, usually evaluated as $\mathrm{Skew}(X) = \frac{1}{N} \sum_{i=1}^N (\frac{x_i - \mu}{\sigma(x)})^3$. A positive value of skewness suggests that the distribution is right-skewed (or right-tailed), with a long tail extending in the positive $x$-axis direction.

Caution from the book: In real life it is good practice to believe in skewnesses only when they are several or many times as large as $\sqrt{\frac{15}{N}}$ (when $\mu$ is true mean) or $\sqrt{\frac{6}{N}}$ (when $\mu$ is sample mean).

The fourth moment (kurtosis) is a measure of how peaked (leptokurtic) or flat (platykurtic) of a distribution is, compared to the normal distribution. This is usually evaluated as $\mathrm{Kurt}(X) = [ \frac{1}{N} \sum_{i=1}^N (\frac{x_i - \mu}{\sigma(x)})^4 ] - 3$, where the last term ensures that a normal distribution has a kurtosis of $0$.

In practice, roundoff errors could be magnified when computing central moments, for first computing $\mu$ then subtracting it from every sample. To correct these errors, a corrected two-pass algorithm is usually used for large samples, where a second term summing up the roundoff errors is subtracted. For example, the sample variance could be calculated as:

\[\mathrm{Var}(X) = \frac{1}{N-1} \{ \sum_{i=1}^N (x_i - \mu)^2 - \frac{1}{N} [\sum_{i=1}^N (x_i - \mu)]^2 \}\]

Chapter 14. Statistical Description of Data

Welcome, delicious friend!

1. Moments (of a distribution)

2. Comparing Distributions

2.1. Student’s t-test (comparing means)

2.2. F-test (comparing variances)

2.3. Chi-Square Test (comparing histograms)

2.4. Kolmogorov-Smirnov Test (comparing CDFs)