Probability 5 -- Selected Distributions
This article collects several useful probability distributions, such as the Beta and Gamma families. Before introducing the distributions themselves, we need a few special functions.
Gamma Function: \[ \Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}\, dt, \quad \Re(x) > 0 \] Beta Function: \[ \text{B}(\alpha, \beta) = \int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\, dx, \quad \Re(\alpha), \Re(\beta) > 0 \] Error Function: \[ \text{erf}\, z = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt. \] Similarly, a complementary error function can be defined: \[ \text{erfc}\, z = 1 - \text{erf}\, z. \] Corollary: \[ \begin{aligned} \text{B}(\alpha, \beta) &= \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}, \\ \Gamma(n) &= (n-1)! \quad \text{for positive integers } n. \end{aligned} \]
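If SciPy is available, these identities are easy to sanity-check numerically; a minimal sketch (test values are arbitrary):

```python
# A minimal numerical sanity check of the identities above,
# using SciPy's special-function implementations.
from scipy.special import beta, erf, erfc, gamma

a, b = 2.5, 4.0  # arbitrary test values
print(beta(a, b), gamma(a) * gamma(b) / gamma(a + b))  # should match
print(gamma(5), 24)          # Gamma(n) = (n - 1)! for integer n
print(erf(1.0) + erfc(1.0))  # should print 1.0
```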
One-Dimensional Distributions
Beta Distribution
The Beta distribution is a generalization of the uniform distribution: it defines a probability density on a bounded interval. \[ f(x \mid \alpha, \beta) = \frac{1}{\text{B}(\alpha, \beta)} x^{\alpha-1}(1-x)^{\beta-1}, \quad x \in [0, 1] \] We denote this as \(X \sim \text{Beta}(\alpha, \beta)\); in particular, \(\text{Beta}(1, 1)\) is the uniform distribution on \([0, 1]\).
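A minimal sketch comparing the closed-form density against scipy.stats.beta (parameter values are arbitrary):

```python
# Evaluate the Beta density by hand and via scipy.stats.beta;
# alpha, beta_, and x are arbitrary test values.
from scipy.special import beta as beta_fn
from scipy.stats import beta

alpha, beta_, x = 2.0, 5.0, 0.3
manual = x**(alpha - 1) * (1 - x)**(beta_ - 1) / beta_fn(alpha, beta_)
print(manual, beta.pdf(x, alpha, beta_))  # should agree
```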
Gamma Distribution
The Gamma distribution is a generalization of the exponential distribution, with shape parameter \(k\) and scale parameter \(\theta\); the case \(k = 1\) recovers the exponential distribution. \[ f(x \mid k, \theta) = \frac{1}{\Gamma(k)\theta^k}x^{k-1}e^{-x/\theta}, \quad x \in (0, \infty) \] We denote this as \(X \sim \Gamma(k, \theta).\)
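A quick check of the \(k = 1\) reduction, assuming SciPy's shape/scale parametrization matches the one above:

```python
# scipy.stats.gamma uses shape a = k and scale = theta, matching the
# parametrization above; with k = 1 it reduces to an exponential.
from scipy.stats import expon, gamma

k, theta, x = 1.0, 2.0, 1.5
print(gamma.pdf(x, a=k, scale=theta), expon.pdf(x, scale=theta))  # equal
```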
Inverse Gamma Distribution
The inverse Gamma distribution is the distribution of the reciprocal of a Gamma-distributed random variable: if \(X \sim \Gamma(\alpha, 1/\beta)\), then \(1/X\) has the density \[ f(x \mid \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\left(\frac{1}{x}\right)^{\alpha+1}\exp(-\beta/x), \quad x \in (0, \infty). \]
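A Monte Carlo sketch of this reciprocal relationship, assuming NumPy and SciPy (sample size and parameters are arbitrary):

```python
# Sample X ~ Gamma(alpha, scale = 1/beta) and check that 1/X behaves
# like InvGamma(alpha, scale = beta); parameters are arbitrary.
import numpy as np
from scipy.stats import gamma, invgamma

alpha, beta = 3.0, 2.0
rng = np.random.default_rng(0)
x = gamma.rvs(a=alpha, scale=1.0 / beta, size=100_000, random_state=rng)
print((1.0 / x).mean(), invgamma.mean(a=alpha, scale=beta))  # ~ beta/(alpha-1)
```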
Normal Distribution
The normal distribution has probability density function \[ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad x \in (-\infty, \infty). \] We denote this as \(X \sim \mathcal{N}(\mu, \sigma^2).\)
The CDF of the standard normal distribution is defined as \[ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2}\, dt = \frac{1}{2}\left[1 + \text{erf}\left(\frac{x}{\sqrt{2}}\right) \right]. \]
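The erf identity can be checked directly:

```python
# Check the erf expression for the standard normal CDF.
import numpy as np
from scipy.special import erf
from scipy.stats import norm

x = 1.3
print(norm.cdf(x), 0.5 * (1 + erf(x / np.sqrt(2))))  # should match
```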
Multi-Dimensional Distributions
Dirichlet Distribution
The Dirichlet distribution is a multivariate generalization of the Beta distribution. Let \(\mathbf{x} = (x_1, x_2, \dots, x_K)^T\) and \(\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_K)^T\) with \[ \sum_{i=1}^K x_i = 1, \quad x_i \in [0, 1]. \] The probability density function is defined as \[ f(\mathbf{x} \mid \boldsymbol{\alpha}) = \frac{1}{\text{B}(\boldsymbol{\alpha})} \prod_{i=1}^K x_i^{\alpha_i-1}, \] where the multivariate Beta function \(\text{B}(\boldsymbol{\alpha})\) is defined as \[ \text{B}(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^K \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^K\alpha_i\right)}. \] We denote this as \(\mathbf{x} \sim \text{Dir}(\boldsymbol{\alpha}).\)
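A sketch evaluating this density with scipy.stats.dirichlet (the test point must lie on the probability simplex):

```python
# Evaluate the Dirichlet density by hand and via scipy.stats.dirichlet;
# x must lie on the simplex (entries in [0, 1] summing to 1).
import numpy as np
from scipy.special import gamma
from scipy.stats import dirichlet

alpha = np.array([2.0, 3.0, 4.0])
x = np.array([0.2, 0.3, 0.5])
B = gamma(alpha).prod() / gamma(alpha.sum())  # multivariate Beta function
print(np.prod(x**(alpha - 1)) / B, dirichlet.pdf(x, alpha))  # should agree
```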
Wishart Distribution
The Wishart distribution is a multivariate generalization of the Gamma distribution. Let \(\mathbf{X}\) be a \(p \times p\) symmetric positive definite matrix of random variables and \(\mathbf{V}\) a \(p \times p\) positive definite scale matrix. The probability density function is defined as \[ f_\mathbf{X}(\mathbf{X}) = \frac{1}{2^{np/2}|\mathbf{V}|^{n/2}\Gamma_p(n/2)}|\mathbf{X}|^{(n-p-1)/2}e^{-\frac{1}{2} \operatorname{tr}(\mathbf{V}^{-1}\mathbf{X})}, \] where the multivariate Gamma function \(\Gamma_p(n/2)\) is defined as \[ \Gamma_p(n/2) = \pi^{p(p-1)/4}\prod_{j=1}^p \Gamma\left(\frac{n}{2} - \frac{j-1}{2}\right). \] We denote this as \(\mathbf{X} \sim W_p(\mathbf{V}, n)\), where \(n \ge p\) is the number of degrees of freedom.
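A sketch evaluating the density with scipy.stats.wishart, which parametrizes by df = n and scale = V as above; the matrices below are arbitrary positive definite test values:

```python
# Evaluate the Wishart density with scipy.stats.wishart; here
# p = 2 and n = 5 >= p, with arbitrary positive definite matrices.
import numpy as np
from scipy.stats import wishart

V = np.array([[2.0, 0.5],
              [0.5, 1.0]])
X = np.array([[3.0, 0.2],
              [0.2, 2.0]])
print(wishart.pdf(X, df=5, scale=V))
```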
Inverse Wishart Distribution
Similar to the inverse Gamma distribution, we can define the inverse Wishart distribution: if \(\mathbf{X} \sim W_p(\mathbf{V}, n)\), then \(\mathbf{X}^{-1}\) follows an inverse Wishart distribution \(W_p^{-1}(\mathbf{V}^{-1}, n)\). Due to its complexity, we omit the density formula here.
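Although the formula is omitted here, SciPy implements the density as scipy.stats.invwishart; a Monte Carlo sketch of the inverse relationship (sizes and parameters are arbitrary):

```python
# If X ~ W_p(inv(V), n), then inv(X) ~ W_p^{-1}(V, n), whose mean is
# V / (n - p - 1); a Monte Carlo check with arbitrary parameters.
import numpy as np
from scipy.stats import wishart

p, n = 2, 7
V = np.eye(p)
rng = np.random.default_rng(0)
x = wishart.rvs(df=n, scale=np.linalg.inv(V), size=50_000, random_state=rng)
print(np.linalg.inv(x).mean(axis=0))  # empirical mean of inv(X)
print(V / (n - p - 1))                # theoretical inverse-Wishart mean
```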
Useful Distributions in Statistics
Chi-Square Distribution
The Chi-Square distribution is the distribution of a sum of squares of independent standard normal variables. Let \(X = \sum_{i=1}^k X_i^2\) where the \(X_i \sim \mathcal{N}(0, 1)\) are independent. The distribution of \(X\) is called the Chi-Square distribution with \(k\) degrees of freedom, denoted \(X \sim \chi^2(k)\). The probability density function is \[ f(x \mid k) = \frac{1}{\Gamma(k/2)2^{k/2}} x^{k/2-1}e^{-x/2}, \quad x > 0, \] which is the same as \(\Gamma(\frac{k}{2}, 2).\)
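The Gamma identification is easy to verify numerically:

```python
# chi2(k) should coincide with Gamma(k/2, theta = 2).
from scipy.stats import chi2, gamma

k, x = 4, 3.7
print(chi2.pdf(x, k), gamma.pdf(x, a=k / 2, scale=2))  # should match
```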
\(\text{F}\)-Distribution
Let \(S_1, S_2\) be independent Chi-Square random variables with \(d_1\) and \(d_2\) degrees of freedom, and let \(X = (S_1/d_1)/(S_2/d_2).\) The distribution of \(X\) is called the \(\text{F}\)-distribution, denoted \(X \sim \text{F}(d_1, d_2).\) The probability density function is \[ f(x \mid d_1, d_2) = \frac{1}{\text{B}(d_1/2, d_2/2)}\left(\frac{d_1}{d_2}\right)^{d_1/2}x^{d_1/2-1}\left(1+\frac{d_1}{d_2}x\right)^{-(d_1+d_2)/2}, \quad x > 0. \]
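A Monte Carlo sketch of this construction (parameters and sample size are arbitrary):

```python
# Simulate (S1/d1) / (S2/d2) from independent chi-square draws and
# compare a sample quantile with scipy.stats.f.
import numpy as np
from scipy.stats import chi2, f

d1, d2 = 5, 10
rng = np.random.default_rng(0)
s1 = chi2.rvs(d1, size=100_000, random_state=rng)
s2 = chi2.rvs(d2, size=100_000, random_state=rng)
x = (s1 / d1) / (s2 / d2)
print(np.quantile(x, 0.95), f.ppf(0.95, d1, d2))  # should be close
```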
t-Distribution
Let \(\nu\) be the number of degrees of freedom. The \(t\)-distribution arises as the distribution of \(Z/\sqrt{S/\nu}\), where \(Z \sim \mathcal{N}(0, 1)\) and \(S \sim \chi^2(\nu)\) are independent; its probability density function is \[ f(t) = \frac{\Gamma(\frac{\nu + 1}{2})}{\sqrt{\nu \pi} \Gamma(\frac{\nu}{2})}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2} = \frac{1}{\sqrt{\nu}\text{B}(1/2, \nu/2)}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}. \]
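A simulation sketch of this construction (the value of \(\nu\) and the sample size are arbitrary):

```python
# Simulate Z / sqrt(S / nu) and compare with scipy.stats.t.
import numpy as np
from scipy.stats import chi2, norm, t

nu = 5
rng = np.random.default_rng(0)
z = norm.rvs(size=100_000, random_state=rng)
s = chi2.rvs(nu, size=100_000, random_state=rng)
print(np.quantile(z / np.sqrt(s / nu), 0.9), t.ppf(0.9, nu))  # close
```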
Distributions and Stochastic Processes
Arcsine Distribution
The arcsine distribution is the probability distribution whose cumulative distribution function involves the arcsine and square root functions. The probability density function is defined as \[ f(x \mid a, b) = \frac{1}{\pi \sqrt{(x-a)(b-x)}}, \quad x \in (a, b). \] The CDF is \[ F(x) = \frac{2}{\pi}\arcsin\left(\sqrt{\frac{x-a}{b-a}}\right). \] The arcsine distribution is closely related to stochastic processes; Lévy's arcsine laws below state the relation.
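A sketch checking the density and CDF against scipy.stats.arcsine, whose standard form lives on \((0, 1)\) and is shifted with loc = a and scale = b - a:

```python
# Check the arcsine density and CDF against scipy.stats.arcsine.
import numpy as np
from scipy.stats import arcsine

a, b, x = 1.0, 3.0, 1.5
pdf = 1.0 / (np.pi * np.sqrt((x - a) * (b - x)))
cdf = (2.0 / np.pi) * np.arcsin(np.sqrt((x - a) / (b - a)))
print(pdf, arcsine.pdf(x, loc=a, scale=b - a))  # should agree
print(cdf, arcsine.cdf(x, loc=a, scale=b - a))
```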
First Arcsine Law
The proportion of time that the one-dimensional Wiener process is positive follows an arcsine distribution: \[ T_+ = |\{t \in [0,1] : W_t > 0\}| \] is arcsine distributed, where \(|\cdot|\) denotes Lebesgue measure.
Second Arcsine Law
The last time the Wiener process changes sign follows an arcsine distribution: \[ L = \sup \{t \in [0, 1]: W_t = 0\} \] is arcsine distributed.
Third Arcsine Law
The time at which the Wiener process achieves its maximum on \([0, 1]\) is arcsine distributed: writing \(M = \sup \{W_s : s \in [0, 1]\}\) for the maximum, the time \[ T_M = \inf \{t \in [0, 1] : W_t = M\} \] is arcsine distributed.
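A Monte Carlo sketch of the first arcsine law, approximating the Wiener process by scaled random walks (path and step counts are arbitrary):

```python
# Approximate the Wiener process on [0, 1] by scaled random walks and
# check the first arcsine law via a sample quantile.
import numpy as np
from scipy.stats import arcsine

rng = np.random.default_rng(0)
n_paths, n_steps = 10_000, 1_000
steps = rng.standard_normal((n_paths, n_steps)) / np.sqrt(n_steps)
frac_positive = (steps.cumsum(axis=1) > 0).mean(axis=1)  # T_+ per path
print(np.quantile(frac_positive, 0.25), arcsine.ppf(0.25))  # ~ 0.146
```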
Lévy Distribution
The probability density function is defined as \[ f(x \mid \mu, c) = \sqrt{\frac{c}{2\pi}}\frac{e^{-\frac{c}{2(x-\mu)}}}{(x-\mu)^{3/2}}, \quad x \in (\mu, \infty). \] Here \(\mu\) is the location parameter and \(c\) the scale parameter. The CDF is defined as \[ F(x) = \text{erfc}\left(\sqrt{\frac{c}{2(x-\mu)}}\right). \] The Lévy distribution also arises from stochastic processes: the first time a standard Brownian motion hits a level \(a > 0\) is Lévy distributed with \(\mu = 0\) and \(c = a^2.\)
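A sketch checking the density and the erfc-based CDF against scipy.stats.levy, which uses loc = mu and scale = c (values are arbitrary):

```python
# Check the Levy density and CDF against scipy.stats.levy.
import numpy as np
from scipy.special import erfc
from scipy.stats import levy

mu, c, x = 0.0, 1.0, 2.0
pdf = np.sqrt(c / (2 * np.pi)) * np.exp(-c / (2 * (x - mu))) / (x - mu)**1.5
cdf = erfc(np.sqrt(c / (2 * (x - mu))))
print(pdf, levy.pdf(x, loc=mu, scale=c))  # should agree
print(cdf, levy.cdf(x, loc=mu, scale=c))
```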