EasyFit – Easily Fit Distributions to Your Data!

Kolmogorov-Smirnov Test

This test is used to decide if a sample comes from a hypothesized continuous distribution. It is based on the empirical cumulative distribution function (ECDF). Assume that we have a random sample x1, ..., xn from some distribution with CDF F(x). The empirical CDF is denoted by

$$F_n(x) = \frac{1}{n} \cdot [\text{number of observations} \le x]$$
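As a minimal sketch, the empirical CDF at x can be computed as the fraction of observations less than or equal to x. The function name `ecdf` and the sample values are illustrative assumptions, not part of EasyFit.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function f_n where f_n(x) = (number of observations <= x) / n."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def f_n(x):
        # bisect_right returns how many sorted observations are <= x
        return bisect_right(ordered, x) / n

    return f_n


sample = [2.1, 0.4, 3.3, 1.7]  # made-up data, for illustration only
f_n = ecdf(sample)
```

The resulting step function is 0 below the smallest observation, rises by 1/n at each observation, and reaches 1 at the largest one.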

Definition

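The Kolmogorov-Smirnov statistic (D) is based on the largest vertical difference between the theoretical and the empirical cumulative distribution function:

$$D = \sup_x \left| F_n(x) - F(x) \right|$$

where F_n(x) is the empirical CDF of the sample and F(x) is the theoretical CDF.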
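Because the empirical CDF is a step function, the largest vertical difference can only occur at a data point, just before or just after the step. The sketch below uses that fact to compute D from the sorted sample. The names `ks_statistic` and `normal_cdf` are hypothetical helpers chosen for this example, and the normal distribution is only one possible choice of theoretical CDF.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, used here as an example theoretical CDF."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Largest vertical distance between the empirical CDF and `cdf`."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # i / n is the ECDF just after the step at x, (i - 1) / n just before it
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


d_uniform = ks_statistic([0.25, 0.75], uniform_cdf)
```

Any callable that maps a value to a probability can be passed as `cdf`, so the same function works for every fully specified continuous distribution.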

Hypothesis Testing

The null and the alternative hypotheses are:

• H0: the data follow the specified distribution;
• HA: the data do not follow the specified distribution.

The hypothesis regarding the distributional form is rejected at the chosen significance level (α) if the test statistic, D, is greater than the critical value obtained from a table. Fixed values of α (0.01, 0.05, etc.) are generally used to evaluate the null hypothesis (H0) at various significance levels. A value of 0.05 is typically used for most applications; however, in some critical industries, a lower α value may be applied.
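The decision rule can be sketched in a few lines. Since no table is reproduced here, the example assumes the large-sample approximation of the critical value, sqrt(-ln(α/2) / 2) / sqrt(n), in place of exact tabulated values; for small samples the table values differ somewhat. The function names are hypothetical.

```python
import math


def ks_critical_value(n, alpha=0.05):
    """Approximate critical value of D for large n (asymptotic formula)."""
    if n <= 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need n > 0 and 0 < alpha < 1")
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)


def reject_h0(d, n, alpha=0.05):
    """Reject H0 when the observed statistic exceeds the critical value."""
    return d > ks_critical_value(n, alpha)


critical_100 = ks_critical_value(100, alpha=0.05)
```

Lowering α raises the critical value, so a larger difference between the empirical and theoretical CDFs is needed before the fit is rejected.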

The standard tables of critical values used for this test are only valid when testing whether a data set is from a completely specified distribution. If one or more distribution parameters are estimated, the results will be conservative: the actual significance level will be smaller than that given by the standard tables, and the probability that the fit will be rejected in error will be lower.
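The conservative behaviour with estimated parameters can be illustrated with a small Monte Carlo experiment. The sketch below draws normal samples, tests each one twice (once against the true parameters, once against the mean and standard deviation estimated from the same sample), and counts how often the standard critical value leads to rejection. The sample size, trial count, seed, normal distribution, and asymptotic critical value are all assumptions made for this example.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def rejection_rates(n=50, trials=2000, alpha=0.05, seed=1):
    """Fraction of true-normal samples rejected with known vs. fitted parameters."""
    rng = random.Random(seed)
    # Asymptotic critical value for a completely specified distribution
    critical = math.sqrt(-0.5 * math.log(alpha / 2.0) / n)
    rejected_known = 0
    rejected_fitted = 0
    for _ in range(trials):
        data = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Case 1: the hypothesized distribution is completely specified
        if ks_statistic(data, lambda x: normal_cdf(x, 0.0, 1.0)) > critical:
            rejected_known += 1
        # Case 2: mean and standard deviation are estimated from the same data
        m = sum(data) / n
        s = math.sqrt(sum((x - m) ** 2 for x in data) / (n - 1))
        if ks_statistic(data, lambda x: normal_cdf(x, m, s)) > critical:
            rejected_fitted += 1
    return rejected_known / trials, rejected_fitted / trials


rate_known, rate_fitted = rejection_rates()
```

In runs of this kind, `rate_known` should land near the nominal α, while `rate_fitted` should be far smaller, because fitting the parameters pulls the theoretical CDF toward the empirical one.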

P-Value

The P-value, in contrast to fixed α values, is calculated based on the test statistic, and denotes the threshold value of the significance level in the sense that the null hypothesis (H0) will be accepted for all values of α less than the P-value. For example, if P=0.025, the null hypothesis will be accepted at all significance levels less than P (i.e. 0.01 and 0.02), and rejected at higher levels, including 0.05 and 0.1.
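One common way to obtain a P-value from D is the asymptotic Kolmogorov distribution, P ≈ 2 Σ (-1)^(k-1) exp(-2 k² n D²). The sketch below assumes that large-sample formula; it is not necessarily the method EasyFit uses, and `ks_p_value` is a hypothetical name. The last lines replay the P=0.025 example from the text.

```python
import math


def ks_p_value(d, n, terms=100):
    """Approximate P-value of the KS statistic d for sample size n (large n)."""
    t = n * d * d
    if t < 0.04:
        # For very small n * d^2 the series converges slowly and P is essentially 1
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * t)
        for k in range(1, terms + 1)
    )
    # Clamp to [0, 1] to guard against rounding error
    return min(1.0, max(0.0, 2.0 * total))


# Decision at several fixed significance levels for the example P = 0.025
p_example = 0.025
decisions = {
    alpha: ("reject" if p_example < alpha else "accept")
    for alpha in (0.01, 0.02, 0.05, 0.1)
}
```

A larger D gives a smaller P-value, so the P-value summarizes in one number the outcome of the test at every fixed significance level.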

The P-value can be useful, in particular, when the null hypothesis is rejected at all predefined significance levels, and you need to know at which level it could be accepted.

EasyFit displays the P-values based on the Kolmogorov-Smirnov test statistics (D) calculated for each fitted distribution.