Notes on Mathematical Market Microstructure


Following are my lecture notes from Prof. Yuri Balasanov’s course Mathematical Market Microstructure.\(\newcommand{1}[1]{\unicode{x1D7D9}_{\{#1\}}}\newcommand{Cov}{\text{Cov}}\newcommand{P}{\text{P}}\newcommand{E}{\text{E}}\newcommand{V}{\text{V}}\newcommand{bs}{\boldsymbol}\newcommand{R}{\mathbb{R}}\newcommand{rank}{\text{rank}}\newcommand{\norm}[1]{\left\lVert#1\right\rVert}\newcommand{diag}{\text{diag}}\newcommand{tr}{\text{tr}}\newcommand{braket}[1]{\left\langle#1\right\rangle}\newcommand{C}{\mathbb{C}}\)


In this section we start with an overview of market microstructure as a whole.

Definition of Market Microstructure

Maureen O’Hara defines market microstructure as

… the study of the process and outcomes of exchanging assets under explicit trading rules. While much of economics abstracts from the mechanics of trading, microstructure literature analyzes how specific trading mechanisms affect the price formation process.

which is most readily observed in high-frequency trading.

Frog’s Eye View


Principle of Ma

Ma (間) means emptiness, a spatial void, or an interval of space or time in Japanese. In the microstructure context, the Zen Principle of Ma emphasizes that the more “micro” we go into the data, the more randomness we observe.

Characteristics of Transactions Data

Characteristics of Nonsynchronous Trading Data

Example. Stocks A and B are independent, and stock A is traded more frequently than stock B. News arriving at the very end of the daily session is more likely to affect stock A than stock B on that day; stock B reacts more on the next day. Daily prices then show a 1-day lag between the two stocks that is due only to the difference in trading frequency, even though the two stocks are independent.


In this section, we will introduce a series of mathematical models that explain the abovementioned nonsynchronous characteristics.

Compound Poisson Model

Let \(r_t\) be the continuously compounded return at time \(t\). Assume that the \(r_t\) are i.i.d. latent variables with \(\E[r_t] = \mu\) and \(\V[r_t]=\sigma^2\). In each period \(t\), the probability that the asset is not traded is \(\pi\). Let \(r_t^0\) be the manifest (observed) return variable. If there is no trade at \(t\), then \(r_t^0 = 0\). If there is a trade at \(t\), then \(r_t^0\) is the cumulative return since the previous trade.

It can be shown that

\[ \begin{align} &\P[r_t^0=\textstyle{\sum_{i=0}^k} r_{t-i}] = \pi^k(1-\pi)^2\ (k=0,1,2,\ldots),\quad\E[r_t^0] = \mu,\\&\V[r_t^0]=\sigma^2+\frac{2\pi\mu^2}{1-\pi},\quad \Cov(r_t^0, r_{t-1}^0) = -\pi\mu^2. \end{align} \]

This simple model explains the negative lag-1 autocorrelation induced by nonsynchronous trading, which appears whenever \(\mu\neq 0\).
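The moment formulas can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes Gaussian latent returns (the model itself only requires i.i.d. returns with mean \(\mu\) and variance \(\sigma^2\)), and the function names and parameter values are illustrative choices.

```python
import random
from statistics import fmean


def simulate_observed_returns(n, mu, sigma, pi, seed=0):
    """Simulate n observed returns r_t^0 under the nontrading model.

    Assumption of this sketch: latent returns are N(mu, sigma^2).
    With probability pi there is no trade and the observed return is 0;
    otherwise it is the latent return accumulated since the last trade.
    """
    rng = random.Random(seed)
    observed = []
    pending = 0.0  # latent return accumulated over periods without a trade
    for _ in range(n):
        pending += rng.gauss(mu, sigma)
        if rng.random() < pi:   # no trade in this period
            observed.append(0.0)
        else:                   # trade: release the accumulated return
            observed.append(pending)
            pending = 0.0
    return observed


def sample_moments(x):
    """Return the sample mean, variance and lag-1 autocovariance of x."""
    m = fmean(x)
    var = fmean([(v - m) ** 2 for v in x])
    cov1 = fmean([(a - m) * (b - m) for a, b in zip(x[1:], x[:-1])])
    return m, var, cov1


mu, sigma, pi = 0.5, 1.0, 0.3  # illustrative values only
r_obs = simulate_observed_returns(200_000, mu, sigma, pi)
mean_hat, var_hat, cov_hat = sample_moments(r_obs)

# Theoretical values from the formulas above.
theory = (mu, sigma**2 + 2 * pi * mu**2 / (1 - pi), -pi * mu**2)
print("simulated:", mean_hat, var_hat, cov_hat)
print("theory:   ", *theory)
```

With a nonzero \(\mu\) the simulated lag-1 autocovariance should come out negative and close to \(-\pi\mu^2\), up to sampling error.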

Ordered Probit Model

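Let \(y_t\) be a latent variable depending on time, and let \(u_t\) be the observed variable. Assume \(u_t\) is an ordered categorical variable with \(k+1\) categories \(u^{(0)},\ldots,u^{(k)}\), separated by thresholds \(\theta_1<\cdots<\theta_k\):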

\[ u_t = \begin{cases} u^{(0)} & \text{if }y_t\in (-\infty,\theta_1),\\ u^{(i)} & \text{if }y_t\in [\theta_i,\theta_{i+1}),\ i=1,2,\ldots,k-1,\\ u^{(k)} & \text{if }y_t\in [\theta_k,\infty). \end{cases} \]

The latent variable \(y_t\) is modeled linearly as \(y_t=\bs{\beta}\bs{X}_t + \epsilon_t\), which gives

\[ \begin{align} \P[u_t=u^{(i)}\mid \bs{X}_t] &= \P[\theta_{i}\le \bs{\beta}\bs{X}_t + \epsilon_t < \theta_{i+1}\mid \bs{X}_t]\quad(\theta_0=-\infty,\ \theta_{k+1}=\infty)\\ &= \begin{cases} \Phi\!\left(\frac{\theta_1-\bs{\beta X}_t}{\sigma_t}\right) & i=0,\\ \Phi\!\left(\frac{\theta_{i+1}-\bs{\beta X}_t}{\sigma_t}\right) - \Phi\!\left(\frac{\theta_{i}-\bs{\beta X}_t}{\sigma_t}\right) & i=1,2,\ldots,k-1,\\ 1 - \Phi\!\left(\frac{\theta_{k}-\bs{\beta X}_t}{\sigma_t}\right) & i=k. \end{cases} \end{align} \]

Note that here we assume \(\epsilon_t\sim\mathcal{N}(0,\sigma_t^2)\) and therefore use the standard normal CDF \(\Phi(\cdot)\) as the inverse link function, which is why this is called a probit model.
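As a small illustration of the category probabilities, here is a sketch that evaluates them for one observation. The function name `ordered_probit_probs`, the constant `sigma`, and all numeric values are assumptions made for this example and do not come from the course material.

```python
from statistics import NormalDist


def ordered_probit_probs(x, beta, thresholds, sigma=1.0):
    """Return [P(u = u^(0)), ..., P(u = u^(k))] for a single observation.

    x, beta    : feature vector and coefficient vector of equal length
    thresholds : increasing cut points theta_1 < ... < theta_k
    sigma      : standard deviation of the Gaussian noise (assumed known)
    """
    mean = sum(b * xi for b, xi in zip(beta, x))
    phi = NormalDist().cdf  # standard normal CDF
    cdf = [phi((th - mean) / sigma) for th in thresholds]
    # Pad with 0 and 1, i.e. use theta_0 = -inf and theta_{k+1} = +inf.
    bounds = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(bounds[:-1], bounds[1:])]


# Illustrative inputs: here beta . x = 0 and k = 3 thresholds.
probs = ordered_probit_probs([1.0, 2.0], [0.5, -0.25], [-1.0, 0.0, 1.0])
print(probs)
```

Because consecutive differences of the CDF telescope, the \(k+1\) probabilities always sum to one, which is a convenient sanity check when fitting the thresholds.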

Decomposition Model

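Assume the price change \(y_i = P_{t_i} - P_{t_{i-1}}\) can be decomposed into the product of three components, \(y_i = A_i D_i S_i\): an indicator \(A_i\in\{0,1\}\) of whether the price changes, the direction \(D_i\in\{+1,-1\}\) of the change given \(A_i=1\), and the size \(S_i\in\{1,2,\ldots\}\) of the change given \(A_i=1\).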

Specifically, for \(p_i=\P[A_i=1]\) we let

\[ \ln\left(\frac{p_i}{1-p_i}\right) = \bs{\beta X}_i\Rightarrow p_i = \frac{\exp(\bs{\beta X}_i)}{1 + \exp(\bs{\beta X}_i)}. \]

For \(\delta_i=\P[D_i=1\mid A_i=1]\) we let

\[ \ln\left(\frac{\delta_i}{1-\delta_i}\right) = \bs{\gamma Z}_i\Rightarrow \delta_i = \frac{\exp(\bs{\gamma Z}_i)}{1 + \exp(\bs{\gamma Z}_i)}. \]

For \(S_i\) we let

\[ S_i\mid (D_i,A_i=1)\sim 1 + g(\lambda_{u,i})\1{D_i=+1} + g(\lambda_{d,i})\1{D_i=-1} \]

where \(g(\lambda_{\xi,i})\) denotes a geometric random variable with parameter \(\lambda_{\xi,i}\), which is modeled through

\[ \ln\left(\frac{\lambda_{\xi,i}}{1-\lambda_{\xi,i}}\right) = \bs{\theta}_\xi\bs{W}_i\Rightarrow \lambda_{\xi,i} = \frac{\exp(\bs{\theta}_\xi\bs{W}_i)}{1 + \exp(\bs{\theta}_\xi\bs{W}_i)}, \quad \xi=u,d. \]

Example. We can choose the features as follows:

\[ \bs{X}_i = (1, A_{i-1}),\ \bs{Z}_i=(1,D_{i-1})\ \text{and}\ \bs{W}_i = (1,S_{i-1}). \]

from which we can train a simple decomposition model using in-sample data.
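To make the training step concrete, the sketch below evaluates the log-likelihood of a sequence of price changes under this feature choice. Several points are assumptions of the sketch, not statements from the notes: price changes are integers (for example, ticks); the geometric variable has support \(\{0,1,2,\ldots\}\) with \(\P[g=m]=\lambda(1-\lambda)^m\); \(D_{i-1}=0\) and \(S_{i-1}=0\) when the previous price did not change; and all function names and coefficient values are hypothetical.

```python
import math


def logistic(z):
    """Inverse logit: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(coef, feats):
    return sum(c * f for c, f in zip(coef, feats))


def sign(v):
    return (v > 0) - (v < 0)


def change_prob(y, y_prev, beta, gamma, theta_u, theta_d):
    """P[y_i = y | y_{i-1} = y_prev] with X=(1,A_prev), Z=(1,D_prev), W=(1,S_prev)."""
    a_prev = 1 if y_prev != 0 else 0
    d_prev = sign(y_prev)   # assumed 0 when there was no previous change
    s_prev = abs(y_prev)    # assumed 0 when there was no previous change
    p = logistic(dot(beta, (1, a_prev)))          # P[A_i = 1]
    if y == 0:
        return 1.0 - p
    delta = logistic(dot(gamma, (1, d_prev)))     # P[D_i = +1 | A_i = 1]
    if y > 0:
        lam = logistic(dot(theta_u, (1, s_prev)))
        return p * delta * lam * (1.0 - lam) ** (y - 1)
    lam = logistic(dot(theta_d, (1, s_prev)))
    return p * (1.0 - delta) * lam * (1.0 - lam) ** (-y - 1)


def log_likelihood(changes, beta, gamma, theta_u, theta_d):
    """Sum of log conditional probabilities, conditioning on the first change."""
    pairs = zip(changes[:-1], changes[1:])
    return sum(
        math.log(change_prob(y, y_prev, beta, gamma, theta_u, theta_d))
        for y_prev, y in pairs
    )


# Hypothetical coefficients and a toy in-sample sequence of tick changes.
params = dict(beta=(-0.5, 1.0), gamma=(0.0, -0.7),
              theta_u=(0.8, -0.1), theta_d=(0.8, -0.1))
ll = log_likelihood([0, 1, -2, 0, 0, 3], **params)
print(ll)
```

Maximizing `log_likelihood` over the four coefficient vectors, with any numerical optimizer, would give in-sample estimates. The factorization also shows that \(\bs{\beta}\), \(\bs{\gamma}\) and \(\bs{\theta}_\xi\) enter separate terms of the likelihood.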

To Be Continued

  1. One way to cope with this discrepancy is to allow infinite volatility.
  2. Thanks to Heisenberg, we can gauge this uncertainty in quantum mechanics.
  3. Microwave links are faster and easier to deploy, but suffer from lower bandwidth and sensitivity to weather conditions.