Price prediction is generally considered very difficult for any equity; the price spread between two exchanges, however, may be easier to predict. We look into the price spread between two of the largest Bitcoin exchanges: Bitstamp and BTC-E.

## Step 2: Data Reading and Cleaning

Data source: http://api.bitcoincharts.com/v1/csv/

(11044600, 3)
(29688118, 3)
1315922016 5.800000000000
0 1315922024 5.83
1 1315922029 5.90
2 1315922034 6.00
3 1315924373 5.95
4 1315924504 5.88

The files have no header row, so the first trade ends up in the column names. The columns are actually the timestamp (a 10-digit Unix time in seconds), the price and the volume.

For illustration, we use only the price as the indicator and drop the volume.
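As a rough sketch of this step, the snippet below reads a header-less trades CSV, names the three columns, and converts the Unix timestamps to datetimes. The function name `load_trades`, the column names, and the in-memory sample are my own assumptions; the timestamps and prices come from the preview above, while the volume values are placeholders.

```python
import io

import pandas as pd


def load_trades(path_or_buffer):
    """Read a header-less trades CSV and return 'time' and 'price' columns."""
    raw = pd.read_csv(
        path_or_buffer,
        header=None,  # the file has no header row
        names=["timestamp", "price", "volume"],  # assumed column names
    )
    # Timestamps are Unix seconds (10 digits).
    raw["time"] = pd.to_datetime(raw["timestamp"], unit="s")
    return raw[["time", "price"]]


# In-memory stand-in for a downloaded file; volumes are placeholders.
sample_csv = io.StringIO(
    "1315922016,5.80,1.0\n"
    "1315922024,5.83,1.0\n"
    "1315922029,5.90,1.0\n"
)

trades = load_trades(sample_csv)
print(trades)
```

Passing `header=None` keeps the first trade as data instead of turning it into column names.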

time price
0 2011-09-13 13:53:44 5.83
1 2011-09-13 13:53:49 5.90
2 2011-09-13 13:53:54 6.00
3 2011-09-13 14:32:53 5.95
4 2011-09-13 14:35:04 5.88

Here I simply add two columns, "date" and "hour", then group by the combination of the two and compute the hourly average price.
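One possible way to get the same hourly averages is sketched below. Instead of separate "date" and "hour" columns, it floors each timestamp to the hour and groups on that, which should be equivalent for this purpose. The function name `hourly_mean` is an assumption, and the sample uses only the five trades shown in the preview above.

```python
import pandas as pd


def hourly_mean(trades):
    """Average the trade prices within each clock hour."""
    # Flooring to the hour plays the role of the combined date + hour key.
    hour = trades["time"].dt.floor("h")
    return trades.groupby(hour)["price"].mean()


# The five trades from the preview above.
trades = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2011-09-13 13:53:44",
                "2011-09-13 13:53:49",
                "2011-09-13 13:53:54",
                "2011-09-13 14:32:53",
                "2011-09-13 14:35:04",
            ]
        ),
        "price": [5.83, 5.90, 6.00, 5.95, 5.88],
    }
)

hourly = hourly_mean(trades)
print(hourly)
```

With the full dataset, each hour contains many more trades than this five-row sample, so the averages will differ from the ones printed here.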

Below I reindex the two datasets so that they share the same hourly index.
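A minimal sketch of such a reindexing is given below. It assumes both inputs are hourly price series with a datetime index, puts them on a complete hourly grid over their common time span, and forward-fills hours without trades. The forward fill is my assumption; the function name `align_hourly` and the sample values are made up for illustration.

```python
import pandas as pd


def align_hourly(bitstamp, btce):
    """Put two hourly price series on one shared, gap-free hourly index."""
    start = max(bitstamp.index.min(), btce.index.min())
    end = min(bitstamp.index.max(), btce.index.max())
    full_index = pd.date_range(start, end, freq="h")
    combined = pd.DataFrame(
        {
            "bitstamp": bitstamp.reindex(full_index),
            "btce": btce.reindex(full_index),
        }
    )
    # Assumption: an hour without trades keeps the last known price.
    return combined.ffill()


# Made-up example: Bitstamp has no trades at 15:00 and 16:00.
bitstamp = pd.Series(
    [5.91, 5.87, 5.65],
    index=pd.to_datetime(
        ["2011-09-13 13:00", "2011-09-13 14:00", "2011-09-13 17:00"]
    ),
)
btce = pd.Series(
    [5.60, 5.98, 5.56, 5.48, 5.32],
    index=pd.date_range("2011-09-13 13:00", periods=5, freq="h"),
)

aligned = align_hourly(bitstamp, btce)
print(aligned)
```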

bitstamp btce
2011-09-13 13:00:00 5.9100 5.601000
2011-09-13 14:00:00 5.8675 5.981700
2011-09-13 15:00:00 5.6500 5.563333
2011-09-13 16:00:00 5.6500 5.481429
2011-09-13 17:00:00 5.6500 5.316154

## Step 3: Data Analysis

We only look at data from August 2014 through January 2015. The data before January 2015 (August through December 2014) form the training set, and the following month, January 2015, is held out as the testing set.
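The split could look roughly like the sketch below. It runs on synthetic hourly prices, since the real data are not reproduced here. The sign convention for the spread (Bitstamp minus BTC-E) and all variable names are assumptions on my part.

```python
import numpy as np
import pandas as pd

# Synthetic hourly prices standing in for the aligned exchange data.
index = pd.date_range("2014-07-01", "2015-02-28 23:00", freq="h")
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    {
        "bitstamp": 500.0 + rng.normal(size=len(index)),
        "btce": 495.0 + rng.normal(size=len(index)),
    },
    index=index,
)

# Assumed sign convention: Bitstamp price minus BTC-E price.
spread = prices["bitstamp"] - prices["btce"]

# Date-string slices on a DatetimeIndex include the whole end day.
train = spread.loc["2014-08-01":"2014-12-31"]
test = spread.loc["2015-01-01":"2015-01-31"]

print(len(train), len(test))
print(train.describe())
```

The training window covers 153 days of hourly data, which is consistent with the 3672 observations in the summary statistics.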

count    3672.000000
mean        5.159550
std         4.879880
min       -25.431669
25%         2.789031
50%         4.658016
75%         7.054407
max        49.270572
dtype: float64

Now let's plot the price spread.

0.0

Truncate data that fall outside the $3\sigma$ boundaries. These points are thought to be outliers.
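One way to read "truncate" is clipping, as in the sketch below: values beyond mean ± 3 standard deviations are pulled back to the boundary, so the number of observations stays the same. This is only my interpretation; the notebook may have handled the outliers differently (for instance by replacing them with other values). The function name `clip_3sigma` and the synthetic data are assumptions.

```python
import numpy as np
import pandas as pd


def clip_3sigma(series, n_sigma=3.0):
    """Clip values to [mean - n_sigma * std, mean + n_sigma * std]."""
    mean, std = series.mean(), series.std()
    return series.clip(lower=mean - n_sigma * std, upper=mean + n_sigma * std)


# Synthetic spread with one injected outlier.
rng = np.random.default_rng(0)
spread = pd.Series(rng.normal(5.0, 1.0, size=1000))
spread.iloc[10] = 60.0

clipped = clip_3sigma(spread)
print(spread.max(), clipped.max())
```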

count    3672.000000
mean        5.156679
std         3.686989
min        -8.397757
25%         2.806244
50%         4.697575
75%         7.080410
max        19.488954
dtype: float64

3.1396679727739154e-06

Smooth the data with a rolling window of 24 hours.
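A rolling smooth of this kind might be written as below. I assume a rolling mean, and I set `min_periods=1` so that the first 23 hours are not lost, which would be consistent with the count staying at 3672 after smoothing. Both choices are assumptions, as is the function name `smooth_24h`.

```python
import pandas as pd


def smooth_24h(series, window=24):
    """Rolling mean over the last `window` observations (hours)."""
    # min_periods=1 averages over whatever is available at the start.
    return series.rolling(window=window, min_periods=1).mean()


# Simple ramp to make the effect easy to check by hand.
series = pd.Series(range(48), dtype=float)
smoothed = smooth_24h(series)
print(smoothed.iloc[[0, 23, 47]])
```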

count    3672.000000
mean        5.161474
std         3.214953
min        -3.364170
25%         2.917758
50%         4.664176
75%         6.803972
max        16.700286
dtype: float64

8.4737772982965124e-05

The ACF decays slowly and the PACF cuts off after lag 3, which suggests an AR(3) structure. We try ARIMA(3,0,0) first and, if it is non-stationary, use ARIMA(3,1,0).
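To make the ACF/PACF reasoning concrete, here is a small NumPy-only sketch. It computes the sample ACF directly and estimates the PACF at lag k as the last coefficient of a least-squares AR(k) fit, then applies both to a simulated AR(3) series. The function names, the simulated coefficients (0.5, 0.2, 0.1), and the series length are all made up for illustration and are unrelated to the fitted model.

```python
import numpy as np


def sample_acf(x, nlags):
    """Sample autocorrelation at lags 0..nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    acf = [1.0]
    for k in range(1, nlags + 1):
        acf.append(np.dot(x[k:], x[:-k]) / denom)
    return np.array(acf)


def sample_pacf(x, nlags):
    """PACF at lag k, taken as the last coefficient of an OLS AR(k) fit."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    pacf = [1.0]
    for k in range(1, nlags + 1):
        # Column j holds the series lagged by j, aligned with the target x[k:].
        lagged = np.column_stack([x[k - j : n - j] for j in range(1, k + 1)])
        coefs, *_ = np.linalg.lstsq(lagged, x[k:], rcond=None)
        pacf.append(coefs[-1])
    return np.array(pacf)


# Simulate a stationary AR(3) process with made-up coefficients.
rng = np.random.default_rng(0)
n = 20000
noise = rng.normal(size=n)
x = np.zeros(n)
for t in range(3, n):
    x[t] = 0.5 * x[t - 1] + 0.2 * x[t - 2] + 0.1 * x[t - 3] + noise[t]

acf = sample_acf(x, 6)
pacf = sample_pacf(x, 6)
print(np.round(acf, 3))
print(np.round(pacf, 3))
```

For an AR(3) process the PACF should be close to zero beyond lag 3, while the ACF tails off gradually.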

ARIMA(3,0,0) not stationary, using ARIMA(3,1,0)

ARIMA Model Results
==============================================================================
Dep. Variable:                    D.y   No. Observations:                 3671
Model:                 ARIMA(3, 1, 0)   Log Likelihood                4041.229
Method:                       css-mle   S.D. of innovations              0.080
Date:                Thu, 20 Apr 2017   AIC                          -8072.458
Time:                        21:04:19   BIC                          -8041.417
Sample:                    08-01-2014   HQIC                         -8061.407
- 12-31-2014
==============================================================================
coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const        9.24e-05      0.009      0.010      0.992      -0.018       0.018
ar.L1.D.y      0.7278      0.016     44.224      0.000       0.696       0.760
ar.L2.D.y      0.0560      0.020      2.750      0.006       0.016       0.096
ar.L3.D.y      0.0748      0.016      4.546      0.000       0.043       0.107
Roots
=============================================================================
Real           Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
AR.1            1.1284           -0.0000j            1.1284           -0.0000
AR.2           -0.9381           -3.3109j            3.4412           -0.2939
AR.3           -0.9381           +3.3109j            3.4412            0.2939
-----------------------------------------------------------------------------
8.4737772982965124e-05
2.0080126845985315

Normality test.

NormaltestResult(statistic=105.33826651992865, pvalue=1.3368603987900652e-23)
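The output above has the form returned by `scipy.stats.normaltest`, and a p-value this small rejects normality. A minimal sketch of such a check is shown below, using placeholder data rather than the actual series, which I presume to be the model residuals. The 0.05 threshold is an assumed convention.

```python
import numpy as np
from scipy import stats

# Placeholder "residuals": centered exponential noise, clearly non-normal.
rng = np.random.default_rng(0)
residuals = rng.standard_exponential(2000) - 1.0

# D'Agostino-Pearson test; the null hypothesis is that the data are normal.
result = stats.normaltest(residuals)
is_normal = result.pvalue >= 0.05  # assumed significance level

print(result.statistic, result.pvalue, is_normal)
```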

## Step 4: Prediction

The fitted ARIMA(3,1,0) model is

$\Delta y_t=\phi_0 + \phi_1 \Delta y_{t-1} + \phi_2 \Delta y_{t-2} + \phi_3 \Delta y_{t-3} + \epsilon_t$

which is equivalent to

$y_t - y_{t-1}=\phi_0 + \phi_1 y_{t-1} - \phi_1 y_{t-2} + \phi_2 y_{t-2} - \phi_2 y_{t-3} + \phi_3 y_{t-3} - \phi_3 y_{t-4} + \epsilon_t,$

and this simplifies to

$y_t=\phi_0 + (1 + \phi_1) y_{t-1} + (\phi_2-\phi_1) y_{t-2} + (\phi_3-\phi_2) y_{t-3} - \phi_3 y_{t-4} + \epsilon_t.$

So $\hat y_t$ is given by

$\hat y_t=\hat\phi_0 + (1 + \hat\phi_1) y_{t-1} + (\hat\phi_2-\hat\phi_1) y_{t-2} + (\hat\phi_3-\hat\phi_2) y_{t-3} - \hat\phi_3 y_{t-4}.$
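The formula for $\hat y_t$ can be sketched as a one-step-ahead predictor. The coefficient values below are copied from the model summary; taking the `const` row at face value as $\hat\phi_0$ is an assumption on my part, since some libraries report the mean of the differenced series there instead. The function name `predict_next` and the example history are made up.

```python
import numpy as np


def predict_next(y, phi0, phi1, phi2, phi3):
    """One-step-ahead ARIMA(3,1,0) prediction from the last four levels."""
    y1, y2, y3, y4 = y[-1], y[-2], y[-3], y[-4]
    return (
        phi0
        + (1 + phi1) * y1
        + (phi2 - phi1) * y2
        + (phi3 - phi2) * y3
        - phi3 * y4
    )


# Coefficients from the model summary (const taken as phi0: an assumption).
PHI0, PHI1, PHI2, PHI3 = 9.24e-05, 0.7278, 0.0560, 0.0748

history = np.array([5.0, 5.1, 5.3, 5.2])  # made-up spread values
y_hat = predict_next(history, PHI0, PHI1, PHI2, PHI3)

# Cross-check against the differenced form of the same model.
d = np.diff(history)
y_hat_diff = history[-1] + PHI0 + PHI1 * d[-1] + PHI2 * d[-2] + PHI3 * d[-3]

print(y_hat, y_hat_diff)
```

Both forms give the same number, which is a quick way to confirm the algebra in the derivation.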

MSE training is 0.006472583700667411
MSE testing is 0.008814919494937557
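For reference, the mean squared error used here can be computed as in the short sketch below. The function name `mse` and the toy numbers are assumptions for illustration only.

```python
import numpy as np


def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))


# Toy numbers: two exact predictions and one that is off by 2.
example = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(example)
```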

Suspiciously good in terms of MSE. There may be a problem somewhere... Let's look closer.