Random Walk Problems
These notes collect and solve several probability questions about the simple random walk.
Problem:
Assume an ant starts at position 0 on the integer line. At each time \(t = 0, 1, 2, \dots\) the ant moves forward (by \(+1\)) with probability \(p\) and backward (by \(-1\)) with probability \(1-p.\) What is the probability distribution of the ant's position?
Based on this fundamental problem, we can pose the following further questions.
- Starting from position \(x = 0,\) what is the probability of reaching position \(y\) after \(n\) steps?
- Starting from position \(x = 0,\) what is the expected first hitting time of position \(x = y?\)
- Starting from position \(x = 0,\) what is the expected first time the walk reaches \(x = -L\) or \(x = R?\)
- What are the expectations of \(W_n^2\) and \(W_n\) at time \(n\)? What if the random walk takes place in \(d\)-dimensional space?
Basic Analysis of Problem:
Characteristic Function
Let \(X_t\) be a random variable representing the movement of the ant at time \(t,\) with distribution \(P(X_t = 1) = p\) and \(P(X_t = -1) = 1-p.\) Denote by \(W_n\) the position of the ant after \(n\) steps, \[ W_n = \sum_{t=0}^{n-1} X_t. \] The characteristic function of \(X_t\) is defined as \(\varphi_{X_t}(\theta) = E[e^{i\theta X_t}],\) which evaluates to \[ \varphi_{X_t}(\theta) = p\cdot e^{i\theta} + (1-p)\cdot e^{-i\theta}. \] From here on we consider the symmetric case \(p = \frac{1}{2},\) for which \(\varphi_{X_t}(\theta) = \frac{e^{i\theta} + e^{-i\theta}}{2} = \cos(\theta).\)
Lemma 1: Let \(X_1, X_2, \dots, X_m\) be \(m\) independent random variables with characteristic functions \(\varphi_{X_i}(\theta).\) Then the characteristic function of \(W = \sum_{i=1}^m X_i\) is \[ \varphi_W(\theta) = \prod_{i=1}^m \varphi_{X_i}(\theta). \]
Since the steps \(X_0, \dots, X_{n-1}\) are i.i.d., Lemma 1 gives \[ \varphi_{W_n}(\theta) = \prod_{i=0}^{n-1} \varphi_{X_i}(\theta) = \cos^n(\theta). \]
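As a quick numerical sanity check, we can compute \(E[e^{i\theta W_n}]\) directly from the binomial distribution of the number of forward steps and compare it with \(\cos^n(\theta)\) (a minimal sketch; the helper name `char_fn_Wn` is ours):

```python
import cmath
import math

def char_fn_Wn(theta: float, n: int) -> complex:
    """E[exp(i*theta*W_n)] computed from the exact distribution of W_n.

    With p = 1/2, the number of forward steps K is Binomial(n, 1/2)
    and W_n = K - (n - K) = 2K - n, so we sum exp(i*theta*(2k - n))
    weighted by the binomial probabilities.
    """
    total = 0j
    for k in range(n + 1):
        prob = math.comb(n, k) * 0.5 ** n
        total += prob * cmath.exp(1j * theta * (2 * k - n))
    return total

n, theta = 7, 0.9
exact = char_fn_Wn(theta, n)
closed_form = math.cos(theta) ** n  # prediction from Lemma 1
```

The two values agree to floating-point precision for any \(n\) and \(\theta.\)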
Probability Mass Function
Lemma 2 (inversion formula): If \(X\) is an integer-valued random variable with characteristic function \(\varphi_X(\theta),\) then \(\varphi_X\) is \(2\pi\)-periodic and the probability mass function can be recovered as \[ P(X = x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ix\theta}\varphi_X(\theta)\, d\theta. \]
By Lemma 2 we get the probability mass function of \(W_n\) as \[ P(W_n = x) = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{-ix\theta} \cos^n(\theta)\, d\theta. \]
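Because \(W_n\) is integer-valued, its characteristic function is \(2\pi\)-periodic, so the inversion integral can be evaluated over the single period \([-\pi, \pi].\) The sketch below (helper name ours) approximates that integral with a midpoint Riemann sum and checks it against the direct binomial probability:

```python
import math

def pmf_via_inversion(x: int, n: int, grid: int = 20000) -> float:
    """P(W_n = x) via (1/(2*pi)) * integral over [-pi, pi] of
    exp(-i*x*theta) * cos(theta)**n d(theta).  By symmetry the
    imaginary part cancels, so we integrate the real part
    cos(x*theta) * cos(theta)**n with a midpoint Riemann sum."""
    h = 2 * math.pi / grid
    total = 0.0
    for k in range(grid):
        theta = -math.pi + (k + 0.5) * h
        total += math.cos(x * theta) * math.cos(theta) ** n
    return total * h / (2 * math.pi)

n, x = 10, 4
inverted = pmf_via_inversion(x, n)
direct = math.comb(n, (n + x) // 2) * 0.5 ** n  # C_n^{(n+x)/2} (1/2)^n
```

The midpoint rule over a full period is extremely accurate here, since the integrand is a trigonometric polynomial.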
Probability of Hitting and First Hitting
We denote by \(p_n(x, y)\) the probability of reaching position \(y\) after \(n\) steps when starting from position \(x.\) Reaching \(y\) requires exactly \(\frac{n+y-x}{2}\) forward steps and \(\frac{n-y+x}{2}\) backward steps, which is possible only when \(n+y-x\) is even and \(|y-x| \le n\) (otherwise \(p_n(x,y) = 0\)). In that case \[ p_n(x, y) = p_n(0, y-x) = C_{n}^{\frac{n+y-x}{2}}p^{\frac{n+y-x}{2}}(1-p)^{\frac{n-y+x}{2}} = C_{n}^{\frac{n+y-x}{2}}\left(\frac{1}{2}\right)^n.\] Now define the stopping time \(\tau_y = \inf \{n \ge 0 : W_n = y\},\) the first hitting time of position \(y.\) Write \(P_x(\tau_y = k)\) for the probability that the walk started at \(x\) first hits \(y\) at time \(k.\) The expectation of \(\tau_y\) is \[ E[\tau_y] = \sum_{k=1}^{\infty}k\,P_x(\tau_y = k). \] The probability generating function of \(\tau_y\) for the walk started at \(x\) is defined as \[ M_x(u) = \sum_{k = 0}^{\infty} P_x(\tau_y = k)\, u^k. \] Then \(E[\tau_y] = \frac{d}{du}M_x(u)\big|_{u = 1},\) understood as the limit as \(u \to 1^-.\)
Q1:
\[ p_n(x, y) = C_{n}^{\frac{n+y-x}{2}}\left(\frac{1}{2}\right)^n \quad \text{when } n+y-x \text{ is even and } |y-x| \le n; \text{ otherwise } p_n(x,y) = 0. \]
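The formula is easy to check with a short Monte Carlo simulation (a sketch; the function name and parameters are ours):

```python
import math
import random

def mc_reach_prob(n: int, y: int, trials: int = 200_000, seed: int = 1) -> float:
    """Monte Carlo estimate of p_n(0, y): simulate `trials` symmetric
    walks of n steps and count how often the endpoint equals y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pos = sum(rng.choice((-1, 1)) for _ in range(n))
        if pos == y:
            hits += 1
    return hits / trials

n, y = 8, 2
estimate = mc_reach_prob(n, y)
exact = math.comb(n, (n + y) // 2) * 0.5 ** n  # C_8^5 (1/2)^8 = 56/256
```

With 200,000 trials the estimate should match the exact value \(56/256 = 0.21875\) to about two decimal places.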
Q2:
Consider the probability generating function \[ M_x(u) = \sum_{k = 0}^{\infty} P_x(\tau_y = k)\, u^k. \] Directly we have \(P_y(\tau_y = 0) = 1,\) and conditioning on the first step gives \[\forall s < y,\ s \in \mathbb{Z}: \quad P_s(\tau_y = k) = \frac{1}{2}\left(P_{s-1}(\tau_y = k-1) + P_{s+1}(\tau_y = k-1)\right).\] Substituting this into the generating function yields the recurrence \[ \begin{array}{c} M_y(u) = 1,\\ M_i(u) = \frac{u}{2}\left[M_{i-1}(u)+M_{i+1}(u)\right], \quad i < y.\\ \end{array} \] The quantity we want is \(M_0(u).\) Besides \(M_y(u) = 1,\) we need one more value, \(M_{y-1}(u).\) Starting one step below \(y,\) the first hit can only occur at an odd time, with first-passage probabilities \(P_{y-1}(\tau_y = 2k-1) = \frac{1}{2k-1}C_{2k}^{k}(\frac{1}{2})^{2k}\) (a ballot-problem count), hence \[ \begin{aligned} M_{y-1}(u) &= \sum_{k = 1}^\infty P_{y-1}(\tau_y = 2k-1)u^{2k-1} \\ & = \sum_{k = 1}^\infty \frac{1}{2k-1}C_{2k}^k \left(\frac{1}{2}\right)^{2k}u^{2k-1} \\ & = \frac{1}{u} - \sqrt{\frac{1}{u^2} - 1}. \end{aligned} \] Solving the difference equation with the two boundary values \(M_y(u), M_{y-1}(u)\) (details omitted), we obtain \[ M_0(u) = \left(\frac{1}{u} - \sqrt{\frac{1}{u^2} - 1}\right)^y. \] Differentiating, \[ M_0'(u) = y\left(\frac{1}{u} - \sqrt{\frac{1}{u^2} - 1}\right)^{y-1}\left(-\frac{1}{u^{2}} + \frac{1}{u^{3}}\left(\frac{1}{u^2}-1\right)^{-\frac{1}{2}}\right) \longrightarrow \infty \quad \text{as } u \to 1^-, \] so \(E[\tau_y] = \infty.\)
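We can verify \(M_0(u) = \left(\frac{1}{u} - \sqrt{\frac{1}{u^2}-1}\right)^y\) numerically by summing the first-passage series directly (a sketch; the helper name is ours):

```python
import math

def pgf_first_passage(u: float, terms: int = 300) -> float:
    """Partial sum of M_{y-1}(u) = sum_k P_{y-1}(tau_y = 2k-1) u^(2k-1),
    using P_{y-1}(tau_y = 2k-1) = C(2k, k) / ((2k-1) * 4^k).
    For 0 < u < 1 the truncation error is negligible for large `terms`."""
    s = 0.0
    for k in range(1, terms + 1):
        s += math.comb(2 * k, k) / ((2 * k - 1) * 4 ** k) * u ** (2 * k - 1)
    return s

u, y = 0.9, 3
series = pgf_first_passage(u) ** y  # y independent one-level passages compose
closed = (1 / u - math.sqrt(1 / u ** 2 - 1)) ** y
```

The \(y\)-th power appears because reaching \(y\) from \(0\) is \(y\) successive independent one-level first passages, each with the same generating function.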
The expected first hitting time is infinite, even though the walk reaches \(y\) with probability 1 (note \(M_0(1) = 1\)): the symmetric random walk is recurrent, but only null recurrent.
Q3:
From Q2, we can define the first exit time \(\tau = \min(\tau_{-L}, \tau_R)\) for the walk started at position \(x,\) with probability generating function \[ M_x(u) = E_x[u^{\tau}] = \sum_{k = 0}^{\infty} P_x(\tau = k)\, u^k. \] The boundary values are \[ P_{-L}(\tau = 0) = 1, \qquad P_{R}(\tau = 0) = 1. \] Therefore the task is to solve the system of difference equations \[ \begin{aligned} &M_R(u) = 1, \quad M_{-L}(u) = 1, \\ &M_{k}(u) = \frac{u}{2}\left[M_{k-1}(u) + M_{k+1}(u)\right], \quad -L < k < R. \end{aligned} \] Omitting the tedious calculation, the expected exit time is \[ E_x[\tau] = (R-x)(x+L). \]
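A Monte Carlo check of \(E_x[\tau] = (R-x)(x+L)\) (a sketch; names and parameters are ours):

```python
import random

def mc_exit_time(x: int, L: int, R: int, trials: int = 40_000, seed: int = 7) -> float:
    """Monte Carlo estimate of the expected first time the symmetric
    walk started at x leaves the open interval (-L, R)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        pos, steps = x, 0
        while -L < pos < R:
            pos += rng.choice((-1, 1))
            steps += 1
        total += steps
    return total / trials

x, L, R = 1, 3, 4
estimate = mc_exit_time(x, L, R)
exact = (R - x) * (x + L)  # (4 - 1) * (1 + 3) = 12
```

Unlike the one-sided hitting time of Q2, this expectation is finite: the two absorbing barriers cut off the heavy tail.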
Q4:
Here we treat the \(d\)-dimensional case directly. Let \(X_t\) denote the step at time \(t.\) Each step moves one unit forward or backward along one of the \(d\) coordinate axes, chosen uniformly: \[ \forall i = 1,2,\dots,d, \quad P(X_t = e_i) = P(X_t = -e_i) = \frac{1}{2d}. \] Here \(e_i\) is the unit vector in dimension \(i.\) Below, \(W_n^2\) denotes the squared Euclidean norm \(\langle W_n, W_n\rangle,\) and \(W_n\) inside an expectation denotes the norm \(\|W_n\|.\)
The expectation of \(W_n^2:\) \[ E[W_n^2] = E\left[\left(\sum_{t=0}^{n-1} X_t\right)^2\right] = \sum_{t=0}^{n-1}E[\langle X_t,X_t\rangle] + 2\sum_{i<j}E[\langle X_i, X_j\rangle] = \sum_{t=0}^{n-1}E[\|X_t\|^2] = n, \] where the cross terms vanish because the steps are independent with mean zero, and each step satisfies \(\|X_t\|^2 = 1.\)
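The identity \(E[W_n^2] = n\) holds exactly for every dimension \(d\) and is easy to confirm by simulation (a sketch with our own names):

```python
import random

def mc_mean_square(n: int, d: int, trials: int = 20_000, seed: int = 3) -> float:
    """Monte Carlo estimate of E[|W_n|^2] for the simple walk on Z^d,
    where each step is +/- e_i with probability 1/(2d)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        pos = [0] * d
        for _ in range(n):
            i = rng.randrange(d)          # pick a coordinate axis
            pos[i] += rng.choice((-1, 1))  # step forward or backward on it
        total += sum(c * c for c in pos)
    return total / trials

n, d = 30, 3
estimate = mc_mean_square(n, d)  # theory: exactly n = 30
```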
The expectation of \(W_n:\)
Suppose \(W_n = (w_1, w_2, \dots, w_d).\) Interpret each component as \(w_i = x_i - y_i,\) where \(x_i\) is the number of steps forward in dimension \(i\) and \(y_i\) the number of steps backward in dimension \(i.\) Then \[ E[\|W_n\|] = E\left[\sqrt{\langle W_n, W_n\rangle}\right] = E\left[\sqrt{\sum_{i=1}^{d}w_i^2}\right]. \] We therefore need the distribution of the \(w_i.\) The counts \((x_1, y_1, \dots, x_d, y_d)\) satisfy \(\sum_{i=1}^d (x_i + y_i) = n\) and follow a multinomial distribution \(MN(n, p_1, p_2, \dots, p_{2d})\) with \(p_i = \frac{1}{2d}.\)
Lemma 3 (Poisson approximation): Let \(X_1, X_2, \dots, X_k\) follow the multinomial distribution \(MN(n, p_1, p_2, \dots, p_k)\) with \(X_1 + X_2 + \dots + X_k = n.\) When \(n\) is large and each \(p_i\) is small, the \(X_i\) are approximately independent with \(X_i \sim Poi(np_i).\)
By Lemma 3, approximately \(x_i, y_i \sim Poi(\frac{n}{2d}).\) For large \(\frac{n}{2d},\) the CLT (central limit theorem) gives \(Poi(\frac{n}{2d}) \approx N(\frac{n}{2d}, \frac{n}{2d}),\) and therefore \(w_i = x_i - y_i \sim N(0, \frac{n}{d})\) approximately.
Lemma 4: Let \(\{X_i\}_{i=1}^n\) be i.i.d. random variables with \(X_i \sim N(0, 1).\) Then \(Z = X_1^2 + \dots + X_n^2 \sim \chi^2(n)\) follows the chi-square distribution with \(n\) degrees of freedom, and \(\sqrt{Z} \sim Chi(n)\) follows the chi distribution, with mean \(E[\sqrt{Z}] = \sqrt{2}\,\frac{\Gamma(\frac{n+1}{2})}{\Gamma(\frac{n}{2})}.\)
\[ \|W_n\| = \sqrt{\sum_{i=1}^d w_i^2} = \sqrt{\frac{n}{d}}\sqrt{\sum_{i=1}^d \frac{d\, w_i^2}{n}}. \] Since each \(\sqrt{\frac{d}{n}}\,w_i\) is approximately \(N(0,1),\) Lemma 4 gives \(\sqrt{\frac{d}{n}}\,\|W_n\| \sim Chi(d)\) approximately, with mean \(\sqrt{2}\,\frac{\Gamma(\frac{d+1}{2})}{\Gamma(\frac{d}{2})}.\) Therefore \[ E[\|W_n\|] = \sqrt{\frac{n}{d}}\,E\left[\sqrt{\frac{d}{n}}\,\|W_n\|\right] \approx \sqrt{\frac{n}{d}}\cdot \sqrt{2}\,\frac{\Gamma(\frac{d+1}{2})}{\Gamma(\frac{d}{2})} = \sqrt{\frac{2n}{d}}\,\frac{\Gamma(\frac{d+1}{2})}{\Gamma(\frac{d}{2})}. \]
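The chi-distribution approximation can be checked by simulation. For \(n = 200, d = 3,\) the value \(\sqrt{2n/d}\,\Gamma(\frac{d+1}{2})/\Gamma(\frac{d}{2})\) should land within a few percent of the simulated mean (a sketch; names and parameters are ours):

```python
import math
import random

def mc_mean_norm(n: int, d: int, trials: int = 10_000, seed: int = 5) -> float:
    """Monte Carlo estimate of E[|W_n|] for the simple walk on Z^d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pos = [0] * d
        for _ in range(n):
            i = rng.randrange(d)
            pos[i] += rng.choice((-1, 1))
        total += math.sqrt(sum(c * c for c in pos))
    return total / trials

n, d = 200, 3
estimate = mc_mean_norm(n, d)
# Chi-distribution approximation: sqrt(2n/d) * Gamma((d+1)/2) / Gamma(d/2)
approx = math.sqrt(2 * n / d) * math.gamma((d + 1) / 2) / math.gamma(d / 2)
```

The remaining discrepancy reflects the Poisson and CLT approximations above, which improve as \(n/d\) grows.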