Stochastic Calculus

Sum of Centered Gaussians

Consider the sum of two independent random variables $X$ and $Y$, where each variable is normally distributed with zero mean. Denote the variance of $X$ as $\sigma^{2}_{X}$ and the variance of $Y$ as $\sigma^{2}_{Y}$: $$X \sim \mathcal{N}(0, \sigma^{2}_{X}) \quad ; \quad Y \sim \mathcal{N}(0, \sigma^{2}_{Y}).$$ Recall that the sum of $X$ and $Y$ is itself normally distributed, with variance $\sigma^{2}_{X} + \sigma^{2}_{Y}$.

\begin{equation} \label{eqn-1}\tag{1} Z = X + Y \implies Z \sim \mathcal{N}(0, \sigma^{2}_{X} + \sigma^{2}_{Y}). \end{equation}
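
As a quick numerical sanity check of Eq. (1), here is a minimal Monte Carlo sketch (Python/NumPy is assumed; the standard deviations, sample size, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_x, sigma_y = 0.7, 1.3          # arbitrary standard deviations
n = 1_000_000                        # number of Monte Carlo samples

x = rng.normal(0.0, sigma_x, n)      # X ~ N(0, sigma_x^2)
y = rng.normal(0.0, sigma_y, n)      # Y ~ N(0, sigma_y^2)
z = x + y                            # Z = X + Y

print(z.var())                       # sample variance of Z
print(sigma_x**2 + sigma_y**2)       # predicted variance: sigma_x^2 + sigma_y^2
```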

A Continuous Random Walk

Imagine a continuous-time random walk $W(t)$ for which observed differences between $W(t_{i})$ and $W(t_{j})$ are always normally distributed with zero mean:

\begin{equation} \label{eqn-2}\tag{2} W(t_j) - W(t_i) \sim \mathcal{N}(0, \sigma^{2}_{ij}). \end{equation}

Assume also that increments of the random walk over non-overlapping time intervals are uncorrelated with one another.

Consistency between Eq. (2) and Eq. (1), i.e., $$\forall ~ t_{i} < t_{j} < t_{k}, \quad \sigma^{2}_{ij} + \sigma^{2}_{jk} = \sigma^{2}_{ik},$$ dictates that $\sigma^{2}_{ij} \propto |t_j - t_i|$. Let us choose to define $W$ such that the constant of proportionality is 1, identifying this random walk as a Wiener process: $$W(t_{j}) - W(t_{i}) \sim \mathcal{N}\big(0, |t_{j} - t_{i}|\big).$$ For positive, infinitesimal differences in time, we may write

\begin{equation} \label{eqn-3}\tag{3} {\rm d}W \sim \mathcal{N}(0, {\rm d} t). \end{equation}
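
In practice, Eq. (3) says a Wiener path can be simulated by accumulating independent Gaussian increments, each with variance equal to the time step. A minimal sketch (the horizon, step count, and seed below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_steps = 1.0, 1_000
dt = T / n_steps

# Increments dW ~ N(0, dt), i.e. standard normals scaled by sqrt(dt).
dW = rng.normal(0.0, np.sqrt(dt), n_steps)

# W(0) = 0; the path is the cumulative sum of the increments.
W = np.concatenate(([0.0], np.cumsum(dW)))

print(W[-1])  # one sample of W(T) ~ N(0, T)
```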

Quadratic Variation

Recall that, for any random variable $Z$, $${\rm Var}[Z] \equiv \mathbb{E}[Z^{2}] - \mathbb{E}[Z]^{2}.$$ For a Wiener process, this implies $${\rm Var}[{\rm d}W] = \mathbb{E}[({\rm d}W)^{2}].$$ We may therefore rewrite Eq. (3) in terms of a quadratic variation: $$\mathbb{E}[({\rm d}W)^{2}] = {\rm d}t.$$ In fact, because ${\rm Var}[({\rm d}W)^{2}]$ is of order $({\rm d}t)^{2}$, the substitution $({\rm d}W)^{2} \to {\rm d}t$ holds not just in expectation but deterministically in the limit ${\rm d}t \to 0$. This observation has an important consequence: approximations of a process $X(t)$ depending on $W(t)$, to first order in ${\rm d}t$, must account for the quadratic variation in $W$. That is,

\begin{equation} \label{eqn-4}\tag{4} \begin{aligned} {\rm d}X(t) &\approx {\frac{{\partial} X}{{\partial} t}}{\rm d}t + {\frac{{\partial} X}{{\partial} W}}{\rm d}W(t) + \frac{1}{2} {\frac{{\partial}^{2} X}{{\partial} W^{2}}} ({\rm d}W(t))^{2} \\ &= \bigg( {\frac{{\partial} X}{{\partial} t}} + \underbrace{\frac{1}{2} {\frac{{\partial}^{2} X}{{\partial} W^{2}}}}_{\mkern-1.5em\text{quadratic dependence!}\mkern-1.5em} \bigg) {\rm d}t + {\frac{{\partial} X}{{\partial} W}}{\rm d}W(t). \end{aligned} \end{equation}
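
The quadratic variation itself is easy to probe numerically: over a fixed horizon $T$, the sum of squared increments $\sum (\Delta W)^{2}$ concentrates around $T$ as the step size shrinks. A minimal sketch (the step counts and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1.0

for n_steps in (10, 100, 10_000):
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), n_steps)
    # Quadratic variation: sum of squared increments over [0, T].
    qv = np.sum(dW**2)
    print(n_steps, qv)  # converges to T = 1.0 as dt -> 0
```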

Drift-Diffusion Processes

An Itô drift-diffusion process may be represented in terms of differentials in $t$ and $W$ as $${\rm d}X(t) = \mu_{X}(t){\rm d}t + \sigma_{X}(t) {\rm d}W(t),$$ where $\mu_{X}(t)$ and $\sigma_{X}(t)$ are given by

\begin{align} \mu_{X}(t) &= {\frac{{\partial} X}{{\partial} t}} + \frac{1}{2} {\frac{{\partial}^{2} X}{{\partial} W^{2}}}, \\ \sigma_{X}(t) &= {\frac{{\partial} X}{{\partial} W}}. \end{align}

The functions $\mu_{X}$ and $\sigma_{X}$ are deterministic (e.g., as functions of $t$ and the history of $X$), while $W(t)$ is stochastic (a Wiener process as described above). Regarding notation, observe that $\mu_{X}(t)\,{\rm d}t$ and $\sigma^{2}_{X}(t)\,{\rm d}t$ are the mean and variance, respectively, of the increment ${\rm d}X$, not of $X$ itself!
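
Such a process can be simulated directly from its differential form via the Euler–Maruyama scheme, replacing ${\rm d}t$ with a finite step and drawing ${\rm d}W \sim \mathcal{N}(0, {\rm d}t)$ at each step. A minimal sketch (the particular drift and diffusion functions below, a mean-reverting toy example, are placeholder choices):

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T, n_steps, rng):
    """Simulate dX = mu(t, x) dt + sigma(t, x) dW with the Euler-Maruyama scheme."""
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    t = 0.0
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))  # dW ~ N(0, dt)
        x[i + 1] = x[i] + mu(t, x[i]) * dt + sigma(t, x[i]) * dW
        t += dt
    return x

rng = np.random.default_rng(3)
# Placeholder drift/diffusion: a mean-reverting toy process.
path = euler_maruyama(mu=lambda t, x: -0.5 * x,
                      sigma=lambda t, x: 0.2,
                      x0=1.0, T=5.0, n_steps=5_000, rng=rng)
print(path[-1])
```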

Itô’s Lemma

Twice-differentiable functions applied to drift-diffusion stochastic processes also define drift-diffusion stochastic processes. For example, consider a twice-differentiable function $f(t, x)$ whose second argument is given by a stochastic process $X$, such that $${\rm d}X(t) = \mu_{X}(t){\rm d}t + \sigma_{X}(t) {\rm d}W(t).$$ We may express $F = f(t, X)$ as a drift-diffusion process: $${\rm d}F(t) = \mu_{F}(t){\rm d}t + \sigma_{F}(t) {\rm d}W(t).$$ To relate the factors $\mu_{F}, \mu_{X}, \sigma_{F}$, and $\sigma_{X}$, first Taylor expand $F$ according to Eq. (4), i.e., $${\rm d}F = \bigg( \Big({\frac{{\partial} F}{{\partial} t}}\Big)_{W} + \frac{1}{2} {\frac{{\partial}^{2} F}{{\partial} W^{2}}} \bigg) {\rm d}t + {\frac{{\partial} F}{{\partial} W}}{\rm d}W(t).$$ Next, apply the chain rule to second order, i.e.,

\begin{align} \frac{{\partial}F}{{\partial}W} &= \frac{{\partial}f}{{\partial}x} \frac{{\partial}X}{{\partial}W}; \quad \Big(\frac{{\partial}F}{{\partial}t}\Big)_{W} = \frac{\partial f}{\partial t} + \frac{{\partial}f}{{\partial}x} \frac{{\partial}X}{{\partial}t}; \\ \frac{{\partial}^{2} F}{{\partial} W^{2}} &= \frac{{\partial}^{2} f}{{\partial} x^{2}} \bigg(\frac{{\partial} X}{{\partial} W}\bigg)^{2} + \frac{{\partial} f}{{\partial} x} \frac{{\partial}^{2} X}{{\partial} W^{2}}. \end{align}

We see that $${\rm d}F = \bigg( \frac{{\partial}f}{{\partial}t} + \frac{{\partial}f}{{\partial}x} \bigg( \underbrace{ \frac{{\partial}X}{{\partial}t} + \frac{1}{2} \frac{{\partial}^{2} X}{{\partial} W^{2}}}_{\mu_{X}} \bigg) + \frac{1}{2} \frac{{\partial}^{2} f}{{\partial} x^{2}} \bigg(\underbrace{\frac{{\partial} X}{{\partial} W}}_{\sigma_{X}}\bigg)^{2} \bigg) {\rm d}t + \frac{{\partial}f}{{\partial}x} \underbrace{\frac{{\partial}X}{{\partial}W}}_{\sigma_{X}} {\rm d}W(t).$$ We have thus derived Itô’s Lemma: $$\mu_{F}(t) = {\frac{\partial f}{\partial t}} + {\frac{\partial f}{\partial x}} \mu_{X}(t) + {\frac{1}{2} \frac{\partial^{2} f}{\partial x^{2}}} \sigma^2_{X}(t) \quad ; \quad \sigma_{F}(t) = {\frac{\partial f}{\partial x}} \sigma_{X}(t).$$ Importantly, our result differs from the classical chain rule! $${\rm d}F = \frac{\partial f}{\partial t} {\rm d}t + \frac{\partial f}{\partial x} {\rm d}X + \underbrace{ \frac{1}{2} \frac{\partial^{2} f}{\partial x^{2}} \bigg(\frac{\partial X}{\partial W} \bigg)^{2} {\rm d}t }_{\text{non-classical term}}.$$
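
As a numerical check (a minimal sketch with NumPy; the special case $f(t, x) = x^{2}$ applied to $X = W$ is chosen only for illustration): Itô’s Lemma predicts ${\rm d}(W^{2}) = {\rm d}t + 2W\,{\rm d}W$, and integrating that SDE along a simulated path reproduces $W(T)^{2}$, whereas dropping the non-classical ${\rm d}t$ term would leave an error of roughly $T$.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_steps = 1.0, 100_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), n_steps)
W = np.concatenate(([0.0], np.cumsum(dW)))

# Integrate the Ito SDE dF = dt + 2 W dW predicted for F = W^2.
F_T = np.sum(dt + 2.0 * W[:-1] * dW)

print(F_T, W[-1]**2)  # agree closely; the dt term is essential
```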

Geometric Brownian Motion

Consider a stochastic process $X$ for which the proportional growth $\frac{{\rm d}X}{X}$ is an affine transformation of a Wiener process, i.e.,

\begin{equation} \label{eqn-7}\tag{7} {\rm d}X(t) = X(t) \bigg( \mu{\rm d}t + \sigma {\rm d}W(t) \bigg). \end{equation}

We provide two means of solving for $X(t)$:

By Itô’s Lemma

We have the drift-diffusion process $${\rm d}X = (\mu X) {\rm d}t + (\sigma X) {\rm d}W.$$ Let us apply Itô’s Lemma for the mapping $f(t, x) = \log x$, where $F = f(t, X)$ (note that $\frac{\partial f}{\partial t} = 0$): $${\rm d}F = \bigg( \frac{\partial f}{\partial x} (\mu X) + \frac{1}{2}\frac{\partial^{2} f}{\partial x^{2}} (\sigma X)^{2} \bigg) {\rm d}t + \frac{\partial f}{ \partial x} (\sigma X) {\rm d}W.$$ Substituting $$\frac{\partial f}{\partial x}\bigg\rvert_{x{=}X} = \frac{1}{X} \quad ; \quad \frac{\partial^{2} f}{\partial x^{2}}\bigg\rvert_{x{=}X} = -\frac{1}{X^{2}},$$ we obtain $${\rm d}F = \bigg( \mu - \frac{1}{2} \sigma^{2} \bigg) {\rm d}t + \sigma {\rm d}W.$$ For constant $\mu, \sigma$, this differential equation has solution $$F(t) = \bigg(\mu - \frac{1}{2}\sigma^{2}\bigg) t + \sigma W(t) + C,$$ where $C$ is given by boundary conditions (i.e., the value of $X(0)$). Exponentiating ($X = e^{F}$, so that $C = \log X_{0}$ with $W(0) = 0$), we conclude $$X(t) = X_{0} e^{\big(\mu - \frac{1}{2}\sigma^{2}\big) t + \sigma W(t)}.$$
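
As a numerical cross-check (a sketch; the parameter values and seed are arbitrary), the closed-form solution can be compared against a direct Euler–Maruyama discretization of Eq. (7) driven by the same noise:

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, X0 = 0.1, 0.3, 1.0
T, n_steps = 1.0, 100_000
dt = T / n_steps

dW = rng.normal(0.0, np.sqrt(dt), n_steps)
W_T = dW.sum()

# Closed-form solution at time T.
exact = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)

# Euler-Maruyama discretization of dX = X (mu dt + sigma dW) on the same noise.
x = X0
for dw in dW:
    x += x * (mu * dt + sigma * dw)

print(exact, x)  # should agree closely for small dt
```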

By Quadratic Variation

Another approach to solving for XX is to note that both the definition of geometric Brownian motion (Eq. (7)) and a Taylor expansion of XX (Eq. (4)) must be consistent.

\begin{align*} {\rm d}X(t) &= X(t) \bigg( \mu{\rm d}t + \sigma {\rm d}W(t) \bigg). \\ {\rm d}X(t) &= \bigg( {\frac{{\partial} X}{{\partial} t}} + \frac{1}{2} {\frac{{\partial}^{2} X}{{\partial} W^{2}}} \bigg) {\rm d}t + {\frac{{\partial} X}{{\partial} W}}{\rm d}W(t). \end{align*}

By the independence of ${\rm d}t$ and ${\rm d}W$, this provides two equations: $$\mu X = {\frac{{\partial} X}{{\partial} t}} + \frac{1}{2} {\frac{{\partial}^{2} X}{{\partial} W^{2}}} \quad ; \quad \sigma X = {\frac{{\partial} X}{{\partial} W}}.$$ From the second equation, we note that $$X \propto e^{\sigma W} \quad \text{ and therefore } \quad {\frac{{\partial}^{2} X}{{\partial} W^{2}}} = \sigma^{2} X.$$ Substituting into the first equation, it follows that $${\frac{{\partial} X}{{\partial} t}} = \bigg(\mu - \frac{1}{2} \sigma^{2}\bigg) X.$$ Recognizing an exponential as the solution class for $X(t)$ again, we arrive at the unique solution: $$X(t) = X_{0} e^{\big(\mu - \frac{1}{2}\sigma^{2}\big) t + \sigma W(t)}.$$

Properties

Recall that $W(t) \sim \mathcal{N}(0, t)$.

It follows that, for constant $\mu$ and $\sigma$, $X(t)$ follows a Galton (log-normal) distribution, i.e., $$X(t) \sim X_{0}\exp\left( \Big(\mu - \frac{1}{2}\sigma^{2}\Big) t + Z \sigma \sqrt{t} \right),$$ for $Z \sim \mathcal{N}(0, 1)$.

It follows that $$\mathbb{E}[X_{t}] = X_{0} e^{\mu t} \quad ; \quad {\rm Var}[X_{t}] = X_{0}^{2} e^{2 \mu t} \bigg(e^{(\sigma^{2} t)} - 1\bigg).$$
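
These moment formulas can be verified by sampling $X(t)$ from its log-normal form (a minimal sketch; parameter values and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, X0, t = 0.1, 0.3, 2.0, 1.5
n = 1_000_000

# Sample X(t) directly from the log-normal form, using W(t) ~ N(0, t).
Z = rng.normal(0.0, 1.0, n)
X_t = X0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * Z)

print(X_t.mean(), X0 * np.exp(mu * t))                                     # mean
print(X_t.var(), X0**2 * np.exp(2 * mu * t) * (np.exp(sigma**2 * t) - 1))  # variance
```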

vs Discrete-Time

Imagine investing in a security $X$, the valuation of which (e.g., relative to USD) grows by a ratio $r_{t} \sim \mathcal{N}(\tilde{\mu}, \tilde{\sigma}^{2})$ each period, for constants $\tilde{\mu}$ and $\tilde{\sigma}$.

Should we model the security according to geometric Brownian motion, which applies in the continuous-time limit, or is this inappropriate when changes happen in discrete time?

First, what is the statistical behavior of $X$ after $t$ periods when the process evolves discretely? For $$X_{t} = X_{0} \prod_{s=0}^{t-1}r_{s} \quad ; \quad r_{s} \sim \mathcal{N}(\tilde{\mu}, \tilde{\sigma}^{2}) \quad \text{(independently)}$$ we have $$\mathbb{E}[X_{t}] = X_{0} \tilde{\mu}^{t} \quad ; \quad {\rm Var}[X_{t}] = X_{0}^{2} \tilde{\mu}^{2t} \bigg(\Big(\frac{\tilde{\sigma}^{2}}{\tilde{\mu}^{2}} + 1\Big)^{t} - 1 \bigg).$$

The statistics of this process agree with those of geometric Brownian motion when we identify $$\mu = \log \tilde{\mu} \quad ; \quad \sigma^{2} = \log \left( \frac{\tilde{\sigma}^{2}}{\tilde{\mu}^{2}} + 1 \right).$$ In this sense, modeling the security with geometric Brownian motion is appropriate: under this identification, the first two moments match exactly at each period.
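
A Monte Carlo comparison of the discrete product process against the geometric-Brownian-motion moment formulas, under this identification, looks as follows (a sketch; the per-period parameters, horizon, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
mu_r, sigma_r = 1.05, 0.1      # per-period ratio: r ~ N(mu_r, sigma_r^2)
X0, t, n = 1.0, 10, 500_000    # t periods, n Monte Carlo paths

# Discrete-time process: X_t = X0 * product of i.i.d. normal ratios.
r = rng.normal(mu_r, sigma_r, (n, t))
X_t = X0 * r.prod(axis=1)

# GBM parameters identified above: mu = log(mu_r), sigma^2 = log(sigma_r^2 / mu_r^2 + 1).
mu = np.log(mu_r)
sigma2 = np.log(sigma_r**2 / mu_r**2 + 1)

print(X_t.mean(), X0 * np.exp(mu * t))                                    # means agree
print(X_t.var(), X0**2 * np.exp(2 * mu * t) * (np.exp(sigma2 * t) - 1))   # variances agree
```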