Probability 5 -- Selected Distributions
This article collects several useful probability distributions, such as the Beta and Gamma families. Before introducing the distributions themselves, we need a few special functions.
Gamma Function: \[ \Gamma(x) = \int_0^{\infty} t^{x-1}e^{-t}\, dt, \quad \Re(x) > 0 \] Beta Function: \[ \text{B}(\alpha, \beta) = \int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\, dx, \quad \Re(\alpha), \Re(\beta) > 0 \] Error Function: \[ \text{erf}\, z = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt. \] Similarly, a complementary error function can be defined: \[ \text{erfc}\, z = 1 - \text{erf}\, z. \] Corollary: \[ \begin{aligned} \text{B}(\alpha, \beta) &= \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}, \\ \Gamma(n) &= (n-1)! \quad \text{for positive integers } n. \end{aligned} \]
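If SciPy is available, these identities are easy to sanity-check numerically; a minimal sketch (test values are arbitrary):

```python
# A minimal numerical sanity check of the identities above,
# using SciPy's special-function implementations.
from scipy.special import beta, erf, erfc, gamma

a, b = 2.5, 4.0  # arbitrary test values
print(beta(a, b), gamma(a) * gamma(b) / gamma(a + b))  # should match
print(gamma(5), 24)          # Gamma(n) = (n - 1)! for integer n
print(erf(1.0) + erfc(1.0))  # should print 1.0
```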
One-Dimensional Distributions
Beta Distribution
The Beta distribution is a generalization of the uniform distribution: it defines a probability density on a bounded interval. \[ f(x \mid \alpha, \beta) = \frac{1}{\text{B}(\alpha, \beta)} x^{\alpha-1}(1-x)^{\beta-1}, \quad x \in [0, 1] \] We denote this as \(X \sim \text{Beta}(\alpha, \beta)\); in particular, \(\text{Beta}(1, 1)\) is the uniform distribution on \([0, 1]\).
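A minimal sketch comparing the closed-form density against scipy.stats.beta (parameter values are arbitrary):

```python
# Evaluate the Beta density by hand and via scipy.stats.beta;
# alpha, beta_, and x are arbitrary test values.
from scipy.special import beta as beta_fn
from scipy.stats import beta

alpha, beta_, x = 2.0, 5.0, 0.3
manual = x**(alpha - 1) * (1 - x)**(beta_ - 1) / beta_fn(alpha, beta_)
print(manual, beta.pdf(x, alpha, beta_))  # should agree
```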
Gamma Distribution
The Gamma distribution is a generalization of the exponential distribution, with shape parameter \(k\) and scale parameter \(\theta\); the case \(k = 1\) recovers the exponential distribution. \[ f(x \mid k, \theta) = \frac{1}{\Gamma(k)\theta^k}x^{k-1}e^{-x/\theta}, \quad x \in (0, \infty) \] We denote this as \(X \sim \Gamma(k, \theta).\)
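A quick check of the \(k = 1\) reduction, assuming SciPy's shape/scale parametrization matches the one above:

```python
# scipy.stats.gamma uses shape a = k and scale = theta, matching the
# parametrization above; with k = 1 it reduces to an exponential.
from scipy.stats import expon, gamma

k, theta, x = 1.0, 2.0, 1.5
print(gamma.pdf(x, a=k, scale=theta), expon.pdf(x, scale=theta))  # equal
```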
Inverse Gamma Distribution
The inverse Gamma distribution is the distribution of the reciprocal of a Gamma-distributed random variable: if \(X \sim \Gamma(\alpha, 1/\beta)\), then \(1/X\) has the density \[ f(x \mid \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\left(\frac{1}{x}\right)^{\alpha+1}\exp(-\beta/x), \quad x \in (0, \infty). \]
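A Monte Carlo sketch of this reciprocal relationship, assuming NumPy and SciPy (sample size and parameters are arbitrary):

```python
# Sample X ~ Gamma(alpha, scale = 1/beta) and check that 1/X behaves
# like InvGamma(alpha, scale = beta); parameters are arbitrary.
import numpy as np
from scipy.stats import gamma, invgamma

alpha, beta = 3.0, 2.0
rng = np.random.default_rng(0)
x = gamma.rvs(a=alpha, scale=1.0 / beta, size=100_000, random_state=rng)
print((1.0 / x).mean(), invgamma.mean(a=alpha, scale=beta))  # ~ beta/(alpha-1)
```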
Normal Distribution
The normal distribution has probability density function \[ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad x \in (-\infty, \infty). \] We denote this as \(X \sim \mathcal{N}(\mu, \sigma^2).\)
The CDF of the standard normal distribution is defined as \[ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2}\, dt = \frac{1}{2}\left[1 + \text{erf}\left(\frac{x}{\sqrt{2}}\right) \right]. \]
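The erf identity can be checked directly:

```python
# Check the erf expression for the standard normal CDF.
import numpy as np
from scipy.special import erf
from scipy.stats import norm

x = 1.3
print(norm.cdf(x), 0.5 * (1 + erf(x / np.sqrt(2))))  # should match
```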
Multi-Dimensional Distributions
Dirichlet Distribution
The Dirichlet distribution is a multivariate generalization of the Beta distribution. Let \(\mathbf{x} = (x_1, x_2, \dots, x_K)^T\) and \(\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_K)^T\) with \[ \sum_{i=1}^K x_i = 1, \quad x_i \in [0, 1]. \] The probability density function is defined as \[ f(\mathbf{x} \mid \boldsymbol{\alpha}) = \frac{1}{\text{B}(\boldsymbol{\alpha})} \prod_{i=1}^K x_i^{\alpha_i-1}, \] where the multivariate Beta function \(\text{B}(\boldsymbol{\alpha})\) is defined as \[ \text{B}(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^K \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^K\alpha_i\right)}. \] We denote this as \(\mathbf{x} \sim \text{Dir}(\boldsymbol{\alpha}).\)
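A sketch evaluating this density with scipy.stats.dirichlet (the test point must lie on the probability simplex):

```python
# Evaluate the Dirichlet density by hand and via scipy.stats.dirichlet;
# x must lie on the simplex (entries in [0, 1] summing to 1).
import numpy as np
from scipy.special import gamma
from scipy.stats import dirichlet

alpha = np.array([2.0, 3.0, 4.0])
x = np.array([0.2, 0.3, 0.5])
B = gamma(alpha).prod() / gamma(alpha.sum())  # multivariate Beta function
print(np.prod(x**(alpha - 1)) / B, dirichlet.pdf(x, alpha))  # should agree
```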
Wishart Distribution
The Wishart distribution is a multivariate generalization of the Gamma distribution. Let \(\mathbf{X}\) be a \(p \times p\) symmetric positive definite matrix of random variables and \(\mathbf{V}\) a \(p \times p\) positive definite scale matrix. The probability density function is defined as \[ f_\mathbf{X}(\mathbf{X}) = \frac{1}{2^{np/2}|\mathbf{V}|^{n/2}\Gamma_p(n/2)}|\mathbf{X}|^{(n-p-1)/2}e^{-\frac{1}{2} \operatorname{tr}(\mathbf{V}^{-1}\mathbf{X})}, \] where the multivariate Gamma function \(\Gamma_p(n/2)\) is defined as \[ \Gamma_p(n/2) = \pi^{p(p-1)/4}\prod_{j=1}^p \Gamma\left(\frac{n}{2} - \frac{j-1}{2}\right). \] We denote this as \(\mathbf{X} \sim W_p(\mathbf{V}, n)\), where \(n \ge p\) is the number of degrees of freedom.
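A sketch evaluating the density with scipy.stats.wishart, which parametrizes by df = n and scale = V as above; the matrices below are arbitrary positive definite test values:

```python
# Evaluate the Wishart density with scipy.stats.wishart; here
# p = 2 and n = 5 >= p, with arbitrary positive definite matrices.
import numpy as np
from scipy.stats import wishart

V = np.array([[2.0, 0.5],
              [0.5, 1.0]])
X = np.array([[3.0, 0.2],
              [0.2, 2.0]])
print(wishart.pdf(X, df=5, scale=V))
```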
Inverse Wishart Distribution
Similar to the inverse Gamma distribution, we can define the inverse Wishart distribution: if \(\mathbf{X} \sim W_p(\mathbf{V}, n)\), then \(\mathbf{X}^{-1}\) follows an inverse Wishart distribution \(W_p^{-1}(\mathbf{V}^{-1}, n)\). Due to its complexity, we omit the density formula here.
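Although the formula is omitted here, SciPy implements the density as scipy.stats.invwishart; a Monte Carlo sketch of the inverse relationship (sizes and parameters are arbitrary):

```python
# If X ~ W_p(inv(V), n), then inv(X) ~ W_p^{-1}(V, n), whose mean is
# V / (n - p - 1); a Monte Carlo check with arbitrary parameters.
import numpy as np
from scipy.stats import wishart

p, n = 2, 7
V = np.eye(p)
rng = np.random.default_rng(0)
x = wishart.rvs(df=n, scale=np.linalg.inv(V), size=50_000, random_state=rng)
print(np.linalg.inv(x).mean(axis=0))  # empirical mean of inv(X)
print(V / (n - p - 1))                # theoretical inverse-Wishart mean
```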
Useful Distributions in Statistics
Chi-Square Distribution
The Chi-Square distribution is the distribution of a sum of squares of independent standard normal variables. Let \(X = \sum_{i=1}^k X_i^2\) where the \(X_i \sim \mathcal{N}(0, 1)\) are independent. The distribution of \(X\) is called the Chi-Square distribution with \(k\) degrees of freedom, denoted \(X \sim \chi^2(k)\). The probability density function is \[ f(x \mid k) = \frac{1}{\Gamma(k/2)2^{k/2}} x^{k/2-1}e^{-x/2}, \quad x > 0, \] which is the same as \(\Gamma(\frac{k}{2}, 2).\)
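The Gamma identification is easy to verify numerically:

```python
# chi2(k) should coincide with Gamma(k/2, theta = 2).
from scipy.stats import chi2, gamma

k, x = 4, 3.7
print(chi2.pdf(x, k), gamma.pdf(x, a=k / 2, scale=2))  # should match
```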
\(\text{F}\)-Distribution
Let \(S_1, S_2\) be independent Chi-Square random variables with \(d_1\) and \(d_2\) degrees of freedom, and let \(X = (S_1/d_1)/(S_2/d_2).\) The distribution of \(X\) is called the \(\text{F}\)-distribution, denoted \(X \sim \text{F}(d_1, d_2).\) The probability density function is \[ f(x \mid d_1, d_2) = \frac{1}{\text{B}(d_1/2, d_2/2)}\left(\frac{d_1}{d_2}\right)^{d_1/2}x^{d_1/2-1}\left(1+\frac{d_1}{d_2}x\right)^{-(d_1+d_2)/2}, \quad x > 0. \]
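A Monte Carlo sketch of this construction (parameters and sample size are arbitrary):

```python
# Simulate (S1/d1) / (S2/d2) from independent chi-square draws and
# compare a sample quantile with scipy.stats.f.
import numpy as np
from scipy.stats import chi2, f

d1, d2 = 5, 10
rng = np.random.default_rng(0)
s1 = chi2.rvs(d1, size=100_000, random_state=rng)
s2 = chi2.rvs(d2, size=100_000, random_state=rng)
x = (s1 / d1) / (s2 / d2)
print(np.quantile(x, 0.95), f.ppf(0.95, d1, d2))  # should be close
```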
t-Distribution
Let \(\nu\) be the number of degrees of freedom. The \(t\)-distribution arises as the distribution of \(Z/\sqrt{S/\nu}\), where \(Z \sim \mathcal{N}(0, 1)\) and \(S \sim \chi^2(\nu)\) are independent; its probability density function is \[ f(t) = \frac{\Gamma(\frac{\nu + 1}{2})}{\sqrt{\nu \pi} \Gamma(\frac{\nu}{2})}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2} = \frac{1}{\sqrt{\nu}\text{B}(1/2, \nu/2)}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}. \]
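A simulation sketch of this construction (the value of \(\nu\) and the sample size are arbitrary):

```python
# Simulate Z / sqrt(S / nu) and compare with scipy.stats.t.
import numpy as np
from scipy.stats import chi2, norm, t

nu = 5
rng = np.random.default_rng(0)
z = norm.rvs(size=100_000, random_state=rng)
s = chi2.rvs(nu, size=100_000, random_state=rng)
print(np.quantile(z / np.sqrt(s / nu), 0.9), t.ppf(0.9, nu))  # close
```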
Distributions and Stochastic Processes
Arcsine Distribution
The arcsine distribution is the probability distribution whose cumulative distribution function involves the arcsine and square root functions. The probability density function is defined as \[ f(x \mid a, b) = \frac{1}{\pi \sqrt{(x-a)(b-x)}}, \quad x \in (a, b). \] The CDF is \[ F(x) = \frac{2}{\pi}\arcsin\left(\sqrt{\frac{x-a}{b-a}}\right). \] The arcsine distribution is closely related to stochastic processes; Lévy's arcsine laws below state the relation.
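A sketch checking the density and CDF against scipy.stats.arcsine, whose standard form lives on \((0, 1)\) and is shifted with loc = a and scale = b - a:

```python
# Check the arcsine density and CDF against scipy.stats.arcsine.
import numpy as np
from scipy.stats import arcsine

a, b, x = 1.0, 3.0, 1.5
pdf = 1.0 / (np.pi * np.sqrt((x - a) * (b - x)))
cdf = (2.0 / np.pi) * np.arcsin(np.sqrt((x - a) / (b - a)))
print(pdf, arcsine.pdf(x, loc=a, scale=b - a))  # should agree
print(cdf, arcsine.cdf(x, loc=a, scale=b - a))
```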
First Arcsine Law
The proportion of time that the one-dimensional Wiener process is positive follows an arcsine distribution: \[ T_+ = |\{t \in [0,1] : W_t > 0\}| \] is arcsine distributed, where \(|\cdot|\) denotes Lebesgue measure.
Second Arcsine Law
The last time the Wiener process changes sign follows an arcsine distribution: \[ L = \sup \{t \in [0, 1]: W_t = 0\} \] is arcsine distributed.
Third Arcsine Law
The time at which the Wiener process achieves its maximum on \([0, 1]\) is arcsine distributed: writing \(M = \sup \{W_s : s \in [0, 1]\}\) for the maximum, the time \[ T_M = \inf \{t \in [0, 1] : W_t = M\} \] is arcsine distributed.
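A Monte Carlo sketch of the first arcsine law, approximating the Wiener process by scaled random walks (path and step counts are arbitrary):

```python
# Approximate the Wiener process on [0, 1] by scaled random walks and
# check the first arcsine law via a sample quantile.
import numpy as np
from scipy.stats import arcsine

rng = np.random.default_rng(0)
n_paths, n_steps = 10_000, 1_000
steps = rng.standard_normal((n_paths, n_steps)) / np.sqrt(n_steps)
frac_positive = (steps.cumsum(axis=1) > 0).mean(axis=1)  # T_+ per path
print(np.quantile(frac_positive, 0.25), arcsine.ppf(0.25))  # ~ 0.146
```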
Lévy Distribution
The probability density function is defined as \[ f(x \mid \mu, c) = \sqrt{\frac{c}{2\pi}}\frac{e^{-\frac{c}{2(x-\mu)}}}{(x-\mu)^{3/2}}, \quad x \in (\mu, \infty). \] Here \(\mu\) is the location parameter and \(c\) the scale parameter. The CDF is defined as \[ F(x) = \text{erfc}\left(\sqrt{\frac{c}{2(x-\mu)}}\right). \] The Lévy distribution also arises from stochastic processes: the first time a standard Brownian motion hits a level \(a > 0\) is Lévy distributed with \(\mu = 0\) and \(c = a^2.\)
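A sketch checking the density and the erfc-based CDF against scipy.stats.levy, which uses loc = mu and scale = c (values are arbitrary):

```python
# Check the Levy density and CDF against scipy.stats.levy.
import numpy as np
from scipy.special import erfc
from scipy.stats import levy

mu, c, x = 0.0, 1.0, 2.0
pdf = np.sqrt(c / (2 * np.pi)) * np.exp(-c / (2 * (x - mu))) / (x - mu)**1.5
cdf = erfc(np.sqrt(c / (2 * (x - mu))))
print(pdf, levy.pdf(x, loc=mu, scale=c))  # should agree
print(cdf, levy.cdf(x, loc=mu, scale=c))
```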