How to estimate the parameters of a geometric Brownian motion (GBM)? It seems rather simple but actually took me quite some time to solve it. The most intuitive way is by using the method of moments.

Estimation of ABM

First let us consider a simpler case, an arithmetic Brownian motion (ABM). The evolution is given by

\[ dS = \mu dt + \sigma dW. \]

By integrating both sides over \((t,t+T]\) we have

\[ \Delta \equiv S(t+T) - S(t) = \left(\mu - \frac{\sigma^2}{2}\right) T + \sigma W(T) \]

which follows a normal distribution with mean \((\mu - \sigma^2/2)T\) and variance \(\sigma^2 T\). That is to say, given \(T\) and i.i.d. observations \(\\{\Delta_1,\Delta_2,\ldots,\Delta_n\\}\) for different \(t\) values[1], with sample mean

\[ \hat{\mu}_{\Delta} = \frac{\sum_{i=1}^n\Delta_i}{n}\overset{p}{\to}\left(\mu - \frac{\sigma^2}{2}\right)T \]

and modified sample variance

\[ \hat{\sigma}_{\Delta}^2 = \frac{\sum_{i=1}^n (\Delta_i - \hat{\mu}_{\Delta})^2}{n-1} \overset{p}{\to} \sigma^2 T, \]

we have unbiased estimator for \(\mu\)

\[ \hat{\mu} = \frac{2\hat{\mu}_{\Delta} + \hat{\sigma}_{\Delta}^2}{2T} \]

and for \(\sigma^2\) we have

\[ \hat{\sigma}^2 = \frac{\hat{\sigma}_{\Delta}^2}{T}. \]

Now we prove the consistency. First we consider the variance of \(\hat{\mu}_{\Delta}\)

\[ Var(\hat{\mu}_{\Delta}) = \frac{Var(\Delta_1)}{n} = \frac{\sigma^2 T}{n} \]

and the variance of \(\hat{\sigma}_{\Delta}^2\)

\[ Var(\hat{\sigma}_{\Delta}^2) = E(\hat{\sigma}_{\Delta}^4) - E(\hat{\sigma}_{\Delta}^2)^2 = \frac{n E[(\Delta_1-\hat{\mu}_{\Delta})^4] + n(n-1) E[(\Delta_1-\hat{\mu}_{\Delta})^2]^2}{(n-1)^2} - \sigma^4T^2 = \frac{2\sigma^4T^2}{n}. \]

The variance of \(\hat{\mu}\) is therefore given by

\[ Var(\hat{\mu}) = \frac{4Var(\hat{\mu}_{\Delta}) + Var(\hat{\sigma}_{\Delta}^2)}{4T^2} = \frac{\sigma^2 (2 + \sigma^2T)}{2nT} \]

and the variance of \(\hat{\sigma}^2\) is given by

\[ Var(\hat{\sigma}^2) = \frac{Var(\hat{\sigma}_{\Delta}^2)}{T^2} = \frac{2\sigma^4}{n}. \]

So the two estimators are also both consistent. It should be noticed that there exists certain "trade-off" between the efficiency of \(\hat{\mu}_{\Delta}\) and \(\hat{\sigma}_{\Delta}^2\) by varying the value of \(T\).

Estimation of GBM

For a general GBM with drift \(\mu\) and diffusion \(\sigma\), we have PDE

\[ \frac{dS}{S} = \mu dt + \sigma dW, \]

so we can integrate[2] the both sides within \((t,t+T]\) for any \(t\) and get

\[ \Delta \equiv \ln S(t+T) - \ln S(t) = \left(\mu - \frac{\sigma^2}{2}\right) T + \sigma W(T). \]

The rest derivation is exactly the same.

Validation

Now we numerically validate this against monte Carlo simulation.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
import numpy as np
import pandas as pd
from scipy.stats import ttest_1samp as ttest

np.random.seed(1)

timespan = 10000
mu = 0.002
sigma = 0.06
T = 50
S0 = 10
n_sim = 20000
sigma2 = sigma**2
mu_hat_list = []
sigma2_hat_list = []

for i in range(n_sim):
# simulate log_S
log_S0 = np.log(S0)
log_St = mu - sigma2 / 2 + sigma * np.random.normal(size=timespan)
log_S = np.cumsum(np.append(log_S0, log_St))
# calculate delta
delta = log_S[T::T] - log_S[:-T:T]
n = len(delta)
# estimate mu and sigma2
mu_delta = delta.mean()
sigma2_delta = ((delta - mu_delta)**2).sum() / (n - 1)
mu_hat = (2 * mu_delta + sigma2_delta) / T / 2
sigma2_hat = sigma2_delta / T
mu_hat_list.append(mu_hat)
sigma2_hat_list.append(sigma2_hat)

E_mu_hat_mc = np.mean(mu_hat_list)
Var_mu_hat_mc = np.var(mu_hat_list)
Var_mu_hat_mm = sigma2 * (2 + sigma2 * T) / n / T / 2
E_sigma2_hat_mc = np.mean(sigma2_hat_list)
Var_sigma2_hat_mc = np.var(sigma2_hat_list)
Var_sigma2_hat_mm = 2 * sigma2**2 / n

result = pd.DataFrame([
[E_mu_hat_mc, mu, ttest(mu_hat_list, mu)[1]],
[Var_mu_hat_mc, Var_mu_hat_mm, '-'],
[E_sigma2_hat_mc, sigma2, ttest(sigma2_hat_list, sigma2)[1]],
[Var_sigma2_hat_mc, Var_sigma2_hat_mm, '-']
], columns = ['monte Carlo', 'Method of moment', 'P Value'],
index = ['E(mu_hat)', 'Var(mu_hat)', 'E(sigma2_hat)', 'Var(sigma2_hat)'])
result
Statistics monte Carlo Method of moment P Value
E(mu_hat) 1.994533e-03 2.000000e-03 0.222191
Var(mu_hat) 4.010866e-07 3.924000e-07 -
E(sigma2_hat) 3.596733e-03 3.600000e-03 0.201573
Var(sigma2_hat) 1.308537e-07 1.296000e-07 -

Now we may safely apply this estimation in application.


  1. 1.To assure that \(\{\Delta_i\}\) are i.i.d., we need time slots \((t,t+T]\) consecutive and have no overlay. Furthermore, in order to achieve the most efficient estimators for a given \(T\), it is clear that we opt for end-to-end slots over the total timespan. ↩︎
  2. 2.Although the LHS looks related to \(\ln(S)\), it's actually not. We need to use Itō calculus to derive this stochastic integral. ↩︎