# Notes on Mathematical Market Microstructure

### 2019-10-04


# Introduction

In this section we start with an overview of market microstructure as a whole.

## Definition of Market Microstructure

Maureen O’Hara defines market microstructure as

… the study of the process and outcomes of exchanging assets under explicit trading rules. While much of economics abstracts from the mechanics of trading, microstructure literature analyzes how specific trading mechanisms affect the price formation process.

These effects are most visible in high-frequency trading.

## Frog’s Eye View

• (Fundamental Assumption) The Central Limit Theorem does not work. Price is not observable unless there is a trade, so neither the number nor the size of price movements during a period of time is guaranteed. In fact, no matter how many points we sample from historical data, the distribution of price jumps has fatter tails than the normal distribution, which means the CLT is not working.1
• (Price Formation and Discovery) The last price is not necessarily an indicator of where the price has now formed. Also, price discovery is a destructive experiment involving a unique counterparty.
• (Uncertainty Principle) As in quantum mechanics, we can never simultaneously measure price and its volatility as manifested in a derivative product. Price is treated as a distribution rather than a number.2
• (The Two Slits Experiment) An order which passed through the previous slit may pass again or be submitted as one of the following: hit, lift, or join. This activity affects the state of the trader’s decision at subsequent times.
• (Technology) Colocated servers; GPS antennas for timing; fiber optics vs. microwave3; Field-Programmable Gate Array (FPGA) and Graphics Processing Unit (GPU); big data.
• (Regulation) Spoofing (also see figure below); Rule 610 (locking the market); Dodd-Frank Act.
• (Future) Direct Market Access (DMA); dark pools; cost of connectivity; speed of light.

## Principle of Ma

Ma (間) means emptiness, a spatial void, or an interval of space or time in Japanese. In the microstructure context, the Zen Principle of Ma emphasizes that the more “micro” we go into the data, the more randomness we observe.

## Characteristics of Transactions Data

• Randomly spaced time intervals (Principle of Ma). Trading intensity contains important information.
• Discrete-valued prices: prices can only be multiples of the tick size.
• Diurnal patterns: periodic intensity. For example, high at the beginning and at the end of the trading session.
• To observe microstructure, the time resolution currently needs to be in microseconds.

## Characteristics of Nonsynchronous Trading Data

• Cross-correlation between stock returns at lag 1
• Autocorrelation at lag 1 in portfolio returns
• (Bid-Ask Bounce) Negative autocorrelations in returns of a single stock

Example. Stocks A and B are independent, and stock A is traded more frequently than B. News arriving at the very end of the day session is more likely to affect stock A than B on that day; stock B will react more the next day. Daily prices then show a 1-day lag due to the difference in trading frequency, even though the two stocks are independent.
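The lead-lag effect in this example can be reproduced with a small simulation. The sketch below is only an illustration under assumptions that are not part of these notes: both stocks respond to a common daily news shock, stock A always reflects it on the same day, and stock B records it one day late with a hypothetical probability `p_delay`. The function names and all parameter values are made up for the example.

```python
import numpy as np


def simulate_lagged_news(n_days=20000, p_delay=0.5, seed=0):
    """Simulate daily returns of a frequently traded stock A and a thinly traded stock B.

    Assumption (illustrative only): each day carries one news shock. A reflects
    it the same day; B reflects it the next day with probability p_delay.
    """
    rng = np.random.default_rng(seed)
    news = rng.normal(0.0, 1.0, n_days)      # common news shock per day
    noise_a = rng.normal(0.0, 1.0, n_days)   # idiosyncratic part of A
    noise_b = rng.normal(0.0, 1.0, n_days)   # idiosyncratic part of B
    delayed = rng.random(n_days) < p_delay   # True: B misses today's news

    ret_a = news + noise_a
    ret_b = noise_b + np.where(delayed, 0.0, news)
    # News that B missed on day t shows up in B's return on day t + 1.
    ret_b[1:] += np.where(delayed[:-1], news[:-1], 0.0)
    return ret_a, ret_b


def cross_corr(x, y, lag):
    """Sample correlation between x_t and y_{t+lag} for lag >= 0."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return float(np.corrcoef(x, y)[0, 1])


if __name__ == "__main__":
    a, b = simulate_lagged_news()
    print("corr(A_t, B_{t+1}):", cross_corr(a, b, 1))
    print("corr(B_t, A_{t+1}):", cross_corr(b, a, 1))
```

Under these assumptions the lag-1 cross-correlation from A to B is `p_delay / 2` in theory, while the correlation in the opposite direction is zero, so A appears to lead B in daily data.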

# Models

In this section, we will introduce a series of mathematical models that explain the abovementioned nonsynchronous characteristics.

## Compound Poisson Model

Let $$r_t$$ be the continuously compounded return at time $$t$$. Assume that the $$r_t$$ are i.i.d. latent variables with $$\E[r_t] = \mu$$ and $$\V[r_t]=\sigma^2$$. For each $$t$$, the probability that the asset is not traded is $$\pi$$. Let $$r_t^0$$ be the manifest (observed) return variable. If there is no trade at $$t$$, then $$r_t^0 = 0$$. If there is a trade at $$t$$, then $$r_t^0$$ is the cumulative return since the previous trade.

It can be shown that

\begin{align} &\P[r_t^0=\textstyle{\sum_{i=0}^k} r_{t-i}] = \pi^k(1-\pi)^2\ (k\ge 0),\quad\E[r_t^0] = \mu,\\&\V[r_t^0]=\sigma^2+\frac{2\pi\mu^2}{1-\pi},\quad \Cov(r_t^0, r_{t-1}^0) = -\pi\mu^2. \end{align}

This simple model explains the negative lag-1 autocorrelation induced by nonsynchronous trading: the autocovariance $$-\pi\mu^2$$ is negative whenever $$\mu\neq 0$$ and $$\pi>0$$.
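The moments above can be checked numerically. The following is a minimal sketch, assuming normally distributed latent returns (the notes only require i.i.d. returns with mean $$\mu$$ and variance $$\sigma^2$$); the function names and the parameter values, chosen large so the effect is visible rather than realistic, are hypothetical.

```python
import numpy as np


def simulate_observed_returns(n, mu, sigma, pi, seed=0):
    """Simulate manifest returns r_t^0 under the non-trading model.

    Assumption: latent returns are N(mu, sigma^2). In each period the asset
    fails to trade with probability pi, independently across periods.
    """
    rng = np.random.default_rng(seed)
    latent = rng.normal(mu, sigma, n)
    no_trade = rng.random(n) < pi
    observed = np.zeros(n)
    pending = 0.0  # latent return accumulated since the last trade
    for t in range(n):
        pending += latent[t]
        if not no_trade[t]:
            observed[t] = pending  # a trade reveals the cumulative return
            pending = 0.0
    return observed


def lag1_autocov(x):
    """Sample lag-1 autocovariance."""
    xc = x - x.mean()
    return float(np.mean(xc[1:] * xc[:-1]))


if __name__ == "__main__":
    mu, sigma, pi = 1.0, 1.0, 0.3  # illustrative values only
    obs = simulate_observed_returns(200_000, mu, sigma, pi)
    print("mean    :", obs.mean(), "theory:", mu)
    print("variance:", obs.var(), "theory:", sigma**2 + 2 * pi * mu**2 / (1 - pi))
    print("autocov :", lag1_autocov(obs), "theory:", -pi * mu**2)
```

With a long enough sample, the sample mean, variance, and lag-1 autocovariance should land close to the theoretical values printed next to them.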

## Ordered Probit Model

Let $$y_t$$ be a latent variable depending on time, and let $$u_t$$ be the observed variable. Assume $$u_t$$ is an ordered categorical variable with $$k+1$$ categories, determined by thresholds $$\theta_1<\theta_2<\cdots<\theta_k$$:

$u_t = \begin{cases} u^{(0)} & \text{if }y_t\in (-\infty,\theta_1),\\ u^{(i)} & \text{if }y_t\in [\theta_i,\theta_{i+1}),\ i=1,2,\ldots,k-1,\\ u^{(k)} & \text{if }y_t\in [\theta_k,\infty). \end{cases}$

The variable $$y_t$$ is modeled by the linear model $$y_t=\bs{\beta}\bs{X}_t + \epsilon_t$$, which gives

\begin{align} \P[u_t=u^{(i)}\mid \bs{X}_t] &= \P[\theta_{i}\le \bs{\beta}\bs{X}_t + \epsilon_t < \theta_{i+1}\mid \bs{X}_t]\quad(\theta_0=-\infty,\ \theta_{k+1}=\infty)\\ &= \begin{cases} \Phi\!\left(\frac{\theta_1-\bs{\beta X}_t}{\sigma_t}\right) & i=0,\\ \Phi\!\left(\frac{\theta_{i+1}-\bs{\beta X}_t}{\sigma_t}\right) - \Phi\!\left(\frac{\theta_{i}-\bs{\beta X}_t}{\sigma_t}\right) & i=1,2,\ldots,k-1,\\ 1 - \Phi\!\left(\frac{\theta_{k}-\bs{\beta X}_t}{\sigma_t}\right) & i=k. \end{cases} \end{align}

Note that here we assume $$\epsilon_t\sim\mathcal{N}(0,\sigma_t^2)$$ and thus apply the standard normal CDF $$\Phi(\cdot)$$ as the link function, which is why this is called a probit model.
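The category probabilities are differences of normal CDF values at consecutive thresholds, which a few lines of code make concrete. This is a minimal sketch rather than an estimation routine: the function names are hypothetical, the linear predictor $$\bs{\beta X}_t$$ is passed in as a single number `xb`, and the thresholds are made-up values.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb, thresholds, sigma=1.0):
    """Return [P(u = u^(0)), ..., P(u = u^(k))].

    xb         : value of the linear predictor beta . X_t
    thresholds : increasing list [theta_1, ..., theta_k]
    sigma      : standard deviation of the error term
    """
    cdf = [norm_cdf((theta - xb) / sigma) for theta in thresholds]
    # Pad with 0 and 1, i.e. theta_0 = -inf and theta_{k+1} = +inf.
    edges = [0.0] + cdf + [1.0]
    return [edges[i + 1] - edges[i] for i in range(len(edges) - 1)]


if __name__ == "__main__":
    # Three categories (for example: price down, unchanged, up).
    probs = ordered_probit_probs(xb=0.2, thresholds=[-1.0, 1.0])
    print(probs, sum(probs))
```

Because the probabilities telescope, they always sum to one regardless of `xb` and `sigma`.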

## Decomposition Model

Assume the price change $$y_i = P_{t_i} - P_{t_{i-1}}$$ can be decomposed into the product of three components, $$y_i = A_i D_i S_i$$:

• Indicator of price change $$A_i\in\{0,1\}$$.
• Direction of price change $$D_i\in\{-1,+1\}$$.
• Size of price change $$S_i\in\mathbb{N}_+$$.

Specifically, for $$p_i=\P[A_i=1]$$ we let

$\ln\left(\frac{p_i}{1-p_i}\right) = \bs{\beta X}_i\Rightarrow p_i = \frac{\exp(\bs{\beta X}_i)}{1 + \exp(\bs{\beta X}_i)}.$

For $$\delta_i=\P[D_i=1\mid A_i=1]$$ we let

$\ln\left(\frac{\delta_i}{1-\delta_i}\right) = \bs{\gamma Z}_i\Rightarrow \delta_i = \frac{\exp(\bs{\gamma Z}_i)}{1 + \exp(\bs{\gamma Z}_i)}.$

For $$S_i$$ we let

$S_i\mid (D_i,A_i=1)\sim 1 + g(\lambda_{u,i})\1{D_i=+1} + g(\lambda_{d,i})\1{D_i=-1}$

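where $$g(\lambda_{\xi,i})$$ denotes a geometric random variable with parameter $$\lambda_{\xi,i}$$ (taking values in $$\{0,1,2,\ldots\}$$, so that $$S_i\ge 1$$), and $$\lambda_{\xi,i}$$ is estimated from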

$\ln\left(\frac{\lambda_{\xi,i}}{1-\lambda_{\xi,i}}\right) = \bs{\theta}_\xi\bs{W}_i\Rightarrow \lambda_{\xi,i} = \frac{\exp(\bs{\theta}_\xi\bs{W}_i)}{1 + \exp(\bs{\theta}_\xi\bs{W}_i)}, \quad \xi=u,d.$

Example. We can choose the features as follows:

$\bs{X}_i = (1, A_{i-1}),\ \bs{Z}_i=(1,D_{i-1})\ \text{and}\ \bs{W}_i = (1,S_{i-1}).$

from which we can train a simple decomposition model using in-sample data.
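To see how the three components combine, the sketch below draws price changes from the decomposition model with the features of the example. It is an illustration only: the coefficient values are hypothetical (not estimates from data), and it assumes that $$D_{i-1}$$ and $$S_{i-1}$$ are set to 0 when the previous price change was zero, a convention these notes do not spell out.

```python
import numpy as np


def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample_price_change(a_prev, d_prev, s_prev, beta, gamma, theta_u, theta_d, rng):
    """Draw one price change y_i = A_i * D_i * S_i, measured in ticks."""
    p = logistic(beta[0] + beta[1] * a_prev)        # P[A_i = 1]
    if rng.random() >= p:
        return 0                                    # no price change
    delta = logistic(gamma[0] + gamma[1] * d_prev)  # P[D_i = +1 | A_i = 1]
    d = 1 if rng.random() < delta else -1
    theta = theta_u if d == 1 else theta_d
    lam = logistic(theta[0] + theta[1] * s_prev)
    # NumPy's geometric has support {1, 2, ...}, which matches 1 + g(lambda).
    s = int(rng.geometric(lam))
    return d * s


def simulate_changes(n, beta, gamma, theta_u, theta_d, seed=0):
    """Simulate n consecutive price changes, feeding each draw into the next."""
    rng = np.random.default_rng(seed)
    changes = []
    a_prev, d_prev, s_prev = 0, 0, 0  # assumed initial state: no prior change
    for _ in range(n):
        y = sample_price_change(a_prev, d_prev, s_prev,
                                beta, gamma, theta_u, theta_d, rng)
        changes.append(y)
        # Assumed convention: D and S are 0 when there was no change.
        a_prev, d_prev, s_prev = int(y != 0), int(np.sign(y)), abs(y)
    return changes


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    path = simulate_changes(20, beta=(-1.0, 1.0), gamma=(0.0, -0.5),
                            theta_u=(1.5, 0.1), theta_d=(1.5, 0.1))
    print(path)
```

In practice the coefficients would come from fitting the three logistic regressions on in-sample data; here they are fixed by hand so that the sampling step is easy to follow.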

# To Be Continued

1. One solution to cope with this discrepancy is to allow infinite volatility. ↩︎
2. Thanks to Heisenberg, we can gauge this uncertainty in quantum mechanics. ↩︎
3. Microwave signals travel faster and are easier to deploy, but suffer from lower bandwidth and sensitivity to weather conditions. ↩︎