# Chi-square distribution

For any positive integer [itex]k[itex], the chi-square distribution with [itex]k[itex] degrees of freedom is the probability distribution of the random variable

[itex]X=Z_1^2 + \cdots + Z_k^2[itex]

where the [itex]Z_i[itex] are independent standard normal variables (zero expected value and unit variance). This distribution is usually written

[itex]X\sim\chi^2_k[itex]
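The definition can be checked by simulation. The following is a minimal sketch using only Python's standard library; the function names, the sample size and the seed are illustrative choices rather than part of the definition. It sums the squares of [itex]k[itex] independent standard normal draws and compares the sample mean and variance with [itex]k[itex] and [itex]2k[itex], the mean and variance of the distribution.

```python
import random
import statistics


def sample_chi_square(k, rng):
    """Draw one value of X = Z_1^2 + ... + Z_k^2 with Z_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate(k, n=50_000, seed=0):
    """Return (sample mean, sample variance) of n simulated draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [sample_chi_square(k, rng) for _ in range(n)]
    return statistics.fmean(draws), statistics.variance(draws)


mean, var = simulate(k=3)
print(f"k=3: sample mean {mean:.3f} (theory 3), sample variance {var:.3f} (theory 6)")
```

The printed values should be close to 3 and 6, but they vary with the seed and the sample size.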

The chi-square test can be used to test independence as well as goodness of fit.

An example of a test of independence is asking whether sex and political affiliation are associated. One gathers a sample, computes the counts expected under independence, and calculates the chi-square test statistic from the observed and expected counts. If the statistic exceeds the critical value for the chosen significance level, the null hypothesis of independence is rejected; otherwise one fails to reject it (the null hypothesis is never accepted).
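The procedure can be sketched in a few lines of Python. The counts below are made up for illustration, and the function name is hypothetical. The statistic is Pearson's sum of (observed − expected)² / expected over all cells, without any continuity correction, and 3.841 is the usual tabulated 5% critical value for 1 degree of freedom.

```python
def chi_square_independence(observed):
    """Return (Pearson statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in cell (i, j) if rows and columns are independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical sample: rows are the two sexes, columns are two parties
observed = [[30, 20],
            [20, 30]]
CRITICAL_5PCT_DF1 = 3.841  # tabulated 5% critical value, 1 degree of freedom

stat, dof = chi_square_independence(observed)
decision = "reject" if stat > CRITICAL_5PCT_DF1 else "fail to reject"
print(stat, dof, decision)
```

For these made-up counts every expected count is 25, so the statistic is 4.0 on 1 degree of freedom. That exceeds 3.841, so independence would be rejected at the 5% level.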

The chi-square probability density function is

[itex]f_k(x)= \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2}[itex]

for [itex]x > 0[itex], with [itex]f_k(x) = 0[itex] for [itex]x \le 0[itex]. Here [itex]\Gamma[itex] denotes the gamma function. The cumulative distribution function is:

[itex]F_k(x)=\frac{\gamma(k/2,x/2)}{\Gamma(k/2)}\,[itex]

where [itex]\gamma(s,z)[itex] is the lower incomplete gamma function.
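Where no statistical package is at hand, both functions can be evaluated directly. The sketch below uses only Python's `math` module. The function names are illustrative, and the regularized lower incomplete gamma function is computed with its standard power series. That is one of several possible numerical methods, and it converges slowly when [itex]x[itex] is much larger than [itex]k[itex].

```python
import math


def chi2_pdf(x, k):
    """Density f_k(x); zero for x <= 0."""
    if x <= 0:
        return 0.0
    half_k = k / 2
    # Work in logs to avoid overflow in Gamma(k/2) for large k
    log_f = (-half_k * math.log(2) - math.lgamma(half_k)
             + (half_k - 1) * math.log(x) - x / 2)
    return math.exp(log_f)


def regularized_lower_gamma(s, z, tol=1e-15, max_terms=10_000):
    """gamma(s, z) / Gamma(s), using
    gamma(s, z) = z^s e^(-z) * sum_{n>=0} z^n / (s (s+1) ... (s+n))."""
    if z <= 0:
        return 0.0
    term = 1.0 / s
    total = term
    for n in range(1, max_terms):
        term *= z / (s + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-z + s * math.log(z) - math.lgamma(s))


def chi2_cdf(x, k):
    """Cumulative distribution function F_k(x)."""
    return regularized_lower_gamma(k / 2, x / 2)


# With k = 2 the closed forms are exp(-x/2)/2 and 1 - exp(-x/2)
print(chi2_pdf(3.0, 2), 0.5 * math.exp(-1.5))
print(chi2_cdf(3.0, 2), 1 - math.exp(-1.5))
```

The two printed pairs should agree to floating-point accuracy, which is a quick sanity check on the implementation.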

Tables of this distribution, usually in its cumulative form, are widely available (see the External links below for online versions), and the function is included in many spreadsheets (for example OpenOffice.org Calc or Microsoft Excel) and in most statistical packages.

If [itex]p[itex] independent linear homogeneous constraints are imposed on the variables [itex]Z_1, \ldots, Z_k[itex], the distribution of [itex]X[itex] conditional on these constraints is [itex]\chi^2_{k-p}[itex], justifying the term "degrees of freedom". The characteristic function of the chi-square distribution is

[itex]\phi(t)=(1-2it)^{-k/2}\,[itex]
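The closed form can be compared with a Monte Carlo estimate of [itex]E[e^{itX}][itex]. This is a rough numerical sketch, not a proof; the function names, sample size and seed are arbitrary choices.

```python
import cmath
import random


def chi2_char_function(t, k):
    """Closed form phi(t) = (1 - 2it)^(-k/2), principal branch."""
    return (1 - 2j * t) ** (-k / 2)


def empirical_char_function(t, k, n=50_000, seed=0):
    """Monte Carlo estimate of E[exp(itX)], X a sum of k squared N(0,1) draws."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n):
        x = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        total += cmath.exp(1j * t * x)
    return total / n


print(chi2_char_function(0.3, 4))
print(empirical_char_function(0.3, 4))
```

The two complex numbers should agree to roughly two decimal places at this sample size.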

The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-square random variables, each divided by its degrees of freedom.


## The normal approximation

If [itex]X\sim\chi^2_k[itex], then as [itex]k[itex] tends to infinity, the distribution of [itex]X[itex] tends to normality. However, the tendency is slow (the skewness is [itex]\sqrt{8/k}[itex] and the excess kurtosis is [itex]12/k[itex]) and two transformations are commonly considered, each of which approaches normality faster than [itex]X[itex] itself:

Fisher showed that [itex]\sqrt{2X}[itex] is approximately normally distributed with mean [itex]\sqrt{2k-1}[itex] and unit variance.

Wilson and Hilferty showed in 1931 that [itex]\sqrt[3]{X/k}[itex] is approximately normally distributed with mean [itex]1-2/(9k)[itex] and variance [itex]2/(9k)[itex].
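The approximations can be compared numerically. The sketch below evaluates the direct normal approximation (mean [itex]k[itex], variance [itex]2k[itex]), Fisher's square-root transformation and the Wilson–Hilferty cube-root transformation [itex](X/k)^{1/3}[itex]. A power-series evaluation of the cumulative distribution function serves as the reference value. All function names are illustrative, and the series is just one convenient way to obtain the reference.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def chi2_cdf_series(x, k, max_terms=10_000):
    """Reference value: regularized lower incomplete gamma by power series."""
    s, z = k / 2, x / 2
    if z <= 0:
        return 0.0
    term = total = 1.0 / s
    for n in range(1, max_terms):
        term *= z / (s + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(-z + s * math.log(z) - math.lgamma(s))


def plain_normal(x, k):
    """Treat X itself as normal with mean k and variance 2k."""
    return normal_cdf((x - k) / math.sqrt(2 * k))


def fisher(x, k):
    """sqrt(2X) approximately normal with mean sqrt(2k - 1), unit variance."""
    return normal_cdf(math.sqrt(2 * x) - math.sqrt(2 * k - 1))


def wilson_hilferty(x, k):
    """(X/k)^(1/3) approximately normal, mean 1 - 2/(9k), variance 2/(9k)."""
    mean = 1 - 2 / (9 * k)
    sd = math.sqrt(2 / (9 * k))
    return normal_cdf(((x / k) ** (1 / 3) - mean) / sd)


k = 10
for x in (2.0, 5.0, 10.0, 15.0, 25.0):
    print(x, chi2_cdf_series(x, k), plain_normal(x, k),
          fisher(x, k), wilson_hilferty(x, k))
```

Each printed row lists the series value followed by the three approximations, so the size of each error can be read off for the chosen [itex]k[itex].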

The expected value of a random variable having chi-square distribution with [itex]k[itex] degrees of freedom is [itex]k[itex] and the variance is [itex]2k[itex]. The median is given approximately by

[itex]k-\frac{2}{3}+\frac{4}{27k}-\frac{8}{729k^2}[itex]
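The median formula is easy to evaluate. Expanding the cube shows that it equals [itex]k(1-2/(9k))^3[itex] exactly. For [itex]k=2[itex] the distribution is exponential with mean 2, so the exact median is [itex]2\ln 2[itex], which gives one point at which the approximation can be checked. The function names below are illustrative.

```python
import math


def median_approx(k):
    """Approximate median k - 2/3 + 4/(27k) - 8/(729 k^2)."""
    return k - 2 / 3 + 4 / (27 * k) - 8 / (729 * k ** 2)


def cubic_form(k):
    """The same quantity written as k (1 - 2/(9k))^3."""
    return k * (1 - 2 / (9 * k)) ** 3


exact_k2 = 2 * math.log(2)  # exact median for 2 degrees of freedom
print(median_approx(2), cubic_form(2), exact_k2)
```

For [itex]k=2[itex] the approximation differs from the exact value by less than 0.02.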

With 2 degrees of freedom, the chi-square distribution is an exponential distribution with mean 2.

The chi-square distribution is a special case of the gamma distribution, with shape parameter [itex]k/2[itex] and scale parameter 2.
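The gamma relationship gives a convenient way to draw chi-square samples, because Python's standard `random.gammavariate(alpha, beta)` takes a shape `alpha` and a scale `beta`. The sketch below uses shape [itex]k/2[itex] and scale 2. For [itex]k=2[itex] it compares the simulated tail probability with [itex]e^{-x/2}[itex], the survival function of an exponential distribution with mean 2. The function names, sample size and seed are illustrative.

```python
import math
import random


def sample_chi_square_gamma(k, rng):
    """Draw a chi-square value with k degrees of freedom as Gamma(k/2, scale 2)."""
    return rng.gammavariate(k / 2, 2.0)


def tail_fraction(k, x, n=50_000, seed=0):
    """Fraction of n simulated draws that exceed x."""
    rng = random.Random(seed)
    return sum(sample_chi_square_gamma(k, rng) > x for _ in range(n)) / n


# k = 2: compare with the exponential survival function exp(-x/2)
print(tail_fraction(2, 3.0), math.exp(-1.5))
```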

The information entropy is given by:

[itex]H = -\int_0^\infty f_k(x)\ln f_k(x)\,dx = \frac{k}{2} + \ln\left(2\Gamma\left(\frac{k}{2}\right)\right) + \left(1 - \frac{k}{2}\right)\psi\left(\frac{k}{2}\right)[itex]

where [itex]\psi(x)[itex] is the Digamma function.
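The closed form can be checked against a direct numerical integration of [itex]-f\ln f[itex]. Python's standard library has no digamma function, so the sketch below implements one with the recurrence [itex]\psi(x)=\psi(x+1)-1/x[itex] followed by the usual asymptotic series. This helper, the midpoint rule and the integration limits are all illustrative choices.

```python
import math


def digamma(x):
    """Digamma for x > 0: shift x upward, then use the asymptotic series."""
    result = 0.0
    while x < 6:
        result -= 1 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1
    inv2 = 1 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 1 / (2 * x) - series


def chi2_entropy(k):
    """Closed form k/2 + ln(2 Gamma(k/2)) + (1 - k/2) psi(k/2)."""
    half_k = k / 2
    return (half_k + math.log(2) + math.lgamma(half_k)
            + (1 - half_k) * digamma(half_k))


def chi2_entropy_numeric(k, upper=80.0, steps=100_000):
    """Midpoint-rule estimate of -integral of f ln f over (0, upper)."""
    half_k = k / 2
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        log_f = (-half_k * math.log(2) - math.lgamma(half_k)
                 + (half_k - 1) * math.log(x) - x / 2)
        total -= math.exp(log_f) * log_f
    return total * h


print(chi2_entropy(4), chi2_entropy_numeric(4))
```

For [itex]k=2[itex] the closed form reduces to [itex]1+\ln 2[itex], the entropy of an exponential distribution with mean 2.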

## Related distributions

• If [itex]X \sim \chi_2^2[itex] (2 degrees of freedom), then [itex]X \sim \mathrm{Exponential}(\lambda = 1/2)[itex], an exponential distribution with rate [itex]\lambda = 1/2[itex] (mean 2).
• If [itex]X_1, \ldots, X_k \sim N(0,1)[itex] are independent standard normal variables, then [itex]Y = \sum_{m=1}^k X_m^2 \sim \chi_k^2[itex].
• If [itex]X_1 \sim \chi_{\nu_1}^2[itex] and [itex]X_2 \sim \chi_{\nu_2}^2[itex] are independent, then [itex]Y = (X_1 / \nu_1)/(X_2 / \nu_2) \sim \mathrm{F}(\nu_1, \nu_2)[itex], an F-distribution.
• If [itex]X_m \sim \chi_{\nu_m}^2[itex] for [itex]m = 1, \ldots, N[itex] are independent, then [itex]Y = \sum_{m=1}^N X_m \sim \chi_{\bar{\nu}}^2[itex] with [itex]\bar{\nu} = \sum_{m=1}^N \nu_m[itex].
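Two of these relations can be illustrated by simulation. The sketch below draws chi-square values through the gamma distribution (shape [itex]\nu/2[itex], scale 2) and then checks two things. The sum of independent chi-square variables with 2 and 3 degrees of freedom should have mean 5 and variance 10. The ratio defining the F-distribution should have mean [itex]\nu_2/(\nu_2-2)[itex], the known mean of [itex]\mathrm{F}(\nu_1,\nu_2)[itex] for [itex]\nu_2>2[itex]. Function names, sample sizes and seeds are arbitrary.

```python
import random
import statistics


def sample_chi_square(nu, rng):
    """Chi-square draw with nu degrees of freedom via Gamma(nu/2, scale 2)."""
    return rng.gammavariate(nu / 2, 2.0)


def simulate_sum(nus, n=50_000, seed=0):
    """Sample mean and variance of a sum of independent chi-square draws."""
    rng = random.Random(seed)
    sums = [sum(sample_chi_square(nu, rng) for nu in nus) for _ in range(n)]
    return statistics.fmean(sums), statistics.variance(sums)


def simulate_f_mean(nu1, nu2, n=50_000, seed=0):
    """Sample mean of (X1/nu1) / (X2/nu2) for independent chi-square X1, X2."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        x1 = sample_chi_square(nu1, rng)
        x2 = sample_chi_square(nu2, rng)
        ratios.append((x1 / nu1) / (x2 / nu2))
    return statistics.fmean(ratios)


print(simulate_sum([2, 3]))             # theory: mean 5, variance 10
print(simulate_f_mean(5, 20), 20 / 18)  # theory: nu2 / (nu2 - 2)
```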
