Today a quant posed me a question:

If I had a sorted timeseries, how would I know if it was ordered correctly? What if it's in reverse?

After an interesting conversation about how I would approach the problem, he informed me that a straightforward way is to fit a GARCH model: the fit should be much better when the timeseries is ordered in the right direction.

I was not convinced, so I decided to try it out and see whether this is true. This code uses the ARCH package written by Prof. Kevin Sheppard from Oxford.

A brief introduction to GARCH follows. A simple regression model can be defined as follows:

$$ r_{t} = m_{t} + \sqrt{h_{t}} \epsilon_{t} $$

Here $m_{t}$ is the mean and $\epsilon_{t}$ is a shock with unit variance, so $h_{t}$ is the variance of the residual $r_{t} - m_{t}$. This variance is modeled explicitly using a GARCH model as below:

$$ h_{t+1} = \omega + \alpha(r_{t} - m_{t})^2 + \beta h_{t} = \omega + \alpha h_{t} \epsilon^{2}_{t} + \beta h_{t} $$

The intuition behind the GARCH model is fairly simple. The model asserts that the best predictor of the variance in the next period is a weighted average of the following:

Long-run average variance. ($\omega$)
Variance predicted for this period. ($h_{t}$)
New information in this period that is captured by the most recent squared residual. ($h_{t} \epsilon^{2}_{t}$)

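Thus, the parameters that need to be estimated are $\omega$, $\alpha$, and $\beta$, and the inputs are the previous forecast ($h_{t}$) and the residual ($\epsilon_{t}$). Strictly speaking, $\omega$ is the long-run average variance multiplied by its weight $1-\alpha-\beta$, so the long-run average variance is $\omega/(1-\alpha-\beta)$ and the long-run volatility is its square root, $\sqrt{\omega/(1-\alpha-\beta)}$. This requires $\alpha + \beta < 1$.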
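To make the recursion concrete, here is a minimal sketch that simulates the model above in plain Python and checks that the sample variance settles near $\omega/(1-\alpha-\beta)$. The function name `simulate_garch11` and the parameter values are my own illustrative assumptions, not estimates from the Google data.

```python
import math
import random

def simulate_garch11(n, omega, alpha, beta, mean=0.0, seed=0):
    """Simulate r_t = m + sqrt(h_t) * eps_t with a GARCH(1,1) variance h_t."""
    rng = random.Random(seed)
    h = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns, variances = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)      # standard normal shock
        r = mean + math.sqrt(h) * eps
        returns.append(r)
        variances.append(h)
        # h_{t+1} = omega + alpha * (r_t - m_t)^2 + beta * h_t
        h = omega + alpha * (r - mean) ** 2 + beta * h
    return returns, variances

# Illustrative parameters (assumed, not estimated)
omega, alpha, beta = 2e-5, 0.10, 0.80
returns, variances = simulate_garch11(20000, omega, alpha, beta)

long_run_var = omega / (1.0 - alpha - beta)
sample_var = sum(r * r for r in returns) / len(returns)
print(f"long-run variance   : {long_run_var:.6e}")
print(f"sample variance     : {sample_var:.6e}")
print(f"long-run volatility : {math.sqrt(long_run_var):.4%}")
```

The sample variance will not match the long-run value exactly, but with a long enough simulated series the two should be close.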

Importing all necessary modules

In [35]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn
import os

from arch import arch_model
from statsmodels.tsa.stattools import acf, pacf

Reading the Google timeseries

In [36]:

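data_dir = os.path.join(os.getcwd(), 'inputs')  # avoid shadowing the built-in dir()
filename = 'stk_GOOG.csv'
readFile = os.path.join(data_dir, filename)
print(readFile)

# Parse the 'Date' column as the index. If the CSV stores dates day-first,
# dayfirst=True may be needed for the index to come out in calendar order.
stk_df = pd.read_csv(readFile, index_col='Date', parse_dates=True)
stk_df.info()
stk_df.head()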

/home/randlow/github/blog/content/articles/Econometrics/inputs/stk_GOOG.csv
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1931 entries, 2010-12-31 to 2018-08-31
Data columns (total 6 columns):
Open         1931 non-null float64
High         1931 non-null float64
Low          1931 non-null float64
Close        1931 non-null float64
Adj Close    1931 non-null float64
Volume       1931 non-null int64
dtypes: float64(5), int64(1)
memory usage: 105.6 KB

Out[36]:

	Open	High	Low	Close	Adj Close	Volume
Date
2010-12-31	296.441925	297.276489	294.102142	295.065887	295.065887	3098500
2011-03-01	296.312775	300.838348	296.312775	300.222351	300.222351	4761100
2011-04-01	300.853241	301.131439	298.121002	299.114563	299.114563	3672700
2011-05-01	298.096161	303.193024	298.086243	302.567078	302.567078	5097500
2011-06-01	303.366882	307.216858	303.053925	304.767792	304.767792	4142300

Converting prices to returns

In [37]:

stk_price = stk_df['Adj Close']
stk_ret = stk_price.pct_change().dropna()
stk_ret.plot(title='Google Returns')

Out[37]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f8780606080>
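For reference, pct_change() computes the simple return $r_{t} = P_{t}/P_{t-1} - 1$. A small plain-Python sketch using the first three adjusted closes from the table above shows the same arithmetic; the variable names are only for illustration.

```python
# First three adjusted closes from the table above
prices = [295.065887, 300.222351, 299.114563]

# Simple return: r_t = P_t / P_{t-1} - 1, which is what pct_change() computes
simple_returns = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]
print(simple_returns)
```

The first price has no predecessor, which is why dropna() is needed after pct_change() and why 1931 prices give 1930 returns.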

Calculating ACF

In [44]:

# Lags 1..31 from statsmodels and from pandas, compared side by side
stk_ret_acf_1 = acf(stk_ret, nlags=31)[1:]
stk_ret_acf_2 = [stk_ret.autocorr(i) for i in range(1, 32)]

test_df = pd.DataFrame([stk_ret_acf_1, stk_ret_acf_2]).T
test_df.columns = ['Statsmodels Autocorr', 'Pandas Autocorr']  # same order as the rows above
test_df.index += 1
test_df.plot(kind='bar')

Out[44]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f8780168710>

Fitting a GARCH model

In [54]:

# Defaults: constant mean, GARCH(1,1) volatility, normal errors
am = arch_model(stk_ret)
res = am.fit(update_freq=5)
print(res.summary())

Optimization terminated successfully.    (Exit mode 0)
            Current function value: -5454.19535778409
            Iterations: 4
            Function evaluations: 39
            Gradient evaluations: 3
                     Constant Mean - GARCH Model Results                      
==============================================================================
Dep. Variable:              Adj Close   R-squared:                      -0.000
Mean Model:             Constant Mean   Adj. R-squared:                 -0.000
Vol Model:                      GARCH   Log-Likelihood:                5454.20
Distribution:                  Normal   AIC:                          -10900.4
Method:            Maximum Likelihood   BIC:                          -10878.1
                                        No. Observations:                 1930
Date:                Wed, Dec 05 2018   Df Residuals:                     1926
Time:                        05:33:53   Df Model:                            4
                                 Mean Model                                 
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
mu         8.8455e-04  3.710e-04      2.384  1.710e-02 [1.575e-04,1.612e-03]
                              Volatility Model                              
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega      6.6146e-05  8.934e-06      7.404  1.319e-13 [4.864e-05,8.366e-05]
alpha[1]       0.2000  7.609e-02      2.628  8.579e-03   [5.086e-02,  0.349]
beta[1]        0.5000  7.239e-02      6.907  4.945e-12     [  0.358,  0.642]
============================================================================

Covariance estimator: robust
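The fitted coefficients can be turned into more interpretable quantities. The sketch below plugs the values printed in the summary above into the standard GARCH(1,1) formulas for persistence, long-run volatility and shock half-life. The 252-day annualization is my assumption about trading days per year.

```python
import math

# Coefficients copied from the forward GARCH(1,1) summary above
omega, alpha, beta = 6.6146e-05, 0.2000, 0.5000

persistence = alpha + beta                          # share of variance carried forward
long_run_var = omega / (1.0 - persistence)          # unconditional daily variance
long_run_vol = math.sqrt(long_run_var)              # unconditional daily volatility
half_life = math.log(0.5) / math.log(persistence)   # periods for a variance shock to halve

print(f"persistence          : {persistence:.4f}")
print(f"long-run daily vol   : {long_run_vol:.4%}")
print(f"annualized (252 days): {long_run_vol * math.sqrt(252):.2%}")  # 252 is assumed
print(f"half-life (periods)  : {half_life:.2f}")
```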

Reversing the time-series

In [40]:

stk_ret_reverse = stk_ret.iloc[::-1]
plt.plot(stk_ret_reverse.values)
plt.title('Google Returns (Reversed)')

Out[40]:

Text(0.5,1,'Google Returns (Reversed)')

In [45]:

# Same comparison as before, on the reversed series
stk_ret_acf_1_rev = acf(stk_ret_reverse, nlags=31)[1:]
stk_ret_acf_2_rev = [stk_ret_reverse.autocorr(i) for i in range(1, 32)]

test_df = pd.DataFrame([stk_ret_acf_1_rev, stk_ret_acf_2_rev]).T
test_df.columns = ['Statsmodels Autocorr', 'Pandas Autocorr']  # same order as the rows above
test_df.index += 1
test_df.plot(kind='bar')

Out[45]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f8780086668>
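The autocorrelations of the reversed series should match those of the forward series, because the sample autocorrelation at lag k multiplies the same pairs of observations whichever way the series is read. A minimal plain-Python sketch of the standard estimator, run on random stand-in data rather than the Google returns, illustrates this; `sample_acf` is a hypothetical helper, not a library function.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 1..max_lag using the standard estimator."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]                 # demeaned series
    denom = sum(v * v for v in d)             # lag-0 sum of squares
    return [sum(d[t] * d[t + k] for t in range(n - k)) / denom
            for k in range(1, max_lag + 1)]

rng = random.Random(1)
series = [rng.gauss(0.0, 1.0) for _ in range(500)]   # stand-in for the returns

forward = sample_acf(series, 5)
backward = sample_acf(series[::-1], 5)

# Largest difference between forward and reversed autocorrelations
print(max(abs(f - b) for f, b in zip(forward, backward)))
```

Any difference is floating-point rounding only, so the ACF on its own cannot tell the two orderings apart.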

In [55]:

am_rev = arch_model(stk_ret_reverse) 
res_rev = am_rev.fit(update_freq=5)
print(res_rev.summary())

Iteration:      5,   Func. Count:     57,   Neg. LLF: -5434.537980550844
Iteration:     10,   Func. Count:    101,   Neg. LLF: -5435.012878187394
Optimization terminated successfully.    (Exit mode 0)
            Current function value: -5435.012878206995
            Iterations: 12
            Function evaluations: 112
            Gradient evaluations: 10
                     Constant Mean - GARCH Model Results                      
==============================================================================
Dep. Variable:              Adj Close   R-squared:                      -0.001
Mean Model:             Constant Mean   Adj. R-squared:                 -0.001
Vol Model:                      GARCH   Log-Likelihood:                5435.01
Distribution:                  Normal   AIC:                          -10862.0
Method:            Maximum Likelihood   BIC:                          -10839.8
                                        No. Observations:                 1930
Date:                Wed, Dec 05 2018   Df Residuals:                     1926
Time:                        05:34:03   Df Model:                            4
                                 Mean Model                                 
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
mu         1.2909e-03  4.417e-04      2.923  3.468e-03 [4.253e-04,2.157e-03]
                              Volatility Model                              
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega      2.4191e-05  1.189e-06     20.343  5.308e-92 [2.186e-05,2.652e-05]
alpha[1]       0.1227  6.650e-02      1.845  6.497e-02  [-7.615e-03,  0.253]
beta[1]        0.7851  4.163e-02     18.859  2.464e-79     [  0.703,  0.867]
============================================================================

Covariance estimator: robust
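The AIC and BIC in the two summaries follow directly from the log-likelihood, the number of parameters and the number of observations. A short sketch using the values printed above reproduces them; `aic_bic` is a hypothetical helper written for this post.

```python
import math

def aic_bic(loglik, n_params, n_obs):
    """AIC = 2k - 2*LL and BIC = k*ln(n) - 2*LL; lower is better for both."""
    aic = 2 * n_params - 2 * loglik
    bic = n_params * math.log(n_obs) - 2 * loglik
    return aic, bic

# Values copied from the two summaries above (4 parameters, 1930 observations)
ll_forward, ll_reverse = 5454.19535778409, 5435.012878206995
aic_f, bic_f = aic_bic(ll_forward, 4, 1930)
aic_r, bic_r = aic_bic(ll_reverse, 4, 1930)

print(f"forward: AIC={aic_f:.1f}  BIC={bic_f:.1f}")
print(f"reverse: AIC={aic_r:.1f}  BIC={bic_r:.1f}")
print(f"log-likelihood gap (forward - reverse): {ll_forward - ll_reverse:.2f}")
```

Since both fits have the same number of parameters and observations, the AIC and BIC gaps are simply minus twice the log-likelihood gap.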

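So a GARCH model can be estimated on both the forward and the reversed timeseries, and I'm still not sure what that quant meant. Whether or not a time series is reversed, it is still the same set of returns, so it is not obvious how the fit of a GARCH model would reveal the direction. The forward fit does have the higher log-likelihood (5454.20 versus 5435.01) and the lower AIC (-10900.4 versus -10862.0), but the gap is small relative to the size of the log-likelihood. The forward estimates also deserve some caution: the optimizer stopped after 4 iterations with alpha[1] = 0.2000 and beta[1] = 0.5000, values that are suspiciously round and may mean it barely moved from its starting point, which can happen when returns are left as small decimals rather than rescaled to percentages. The reversed series took more iterations to converge (12 versus 4). As for significance, the p-values for omega and beta[1] are extremely small in the reverse case, but alpha[1] is not significant at the 5% level (p = 0.065), so it is not the case that every parameter becomes more significant.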
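One reason the direction could matter at all is that the GARCH likelihood, unlike the set of returns, depends on ordering: each variance is built from the observations that came before it. The sketch below evaluates a Gaussian GARCH(1,1) log-likelihood at fixed parameters on a simulated series and on its reverse. It is a simplification under my own assumptions: the parameters are illustrative, the data are simulated rather than Google returns, and the variance is initialized at its unconditional value, which is not necessarily how the ARCH package initializes it.

```python
import math
import random

def garch11_loglik(returns, mu, omega, alpha, beta):
    """Gaussian log-likelihood of a constant-mean GARCH(1,1) at fixed parameters."""
    h = omega / (1.0 - alpha - beta)   # initialize at the unconditional variance
    loglik = 0.0
    for r in returns:
        e = r - mu
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(h) + e * e / h)
        h = omega + alpha * e * e + beta * h   # variance for the next observation
    return loglik

# Simulate a series from known (illustrative) parameters
rng = random.Random(42)
mu, omega, alpha, beta = 0.0, 2e-5, 0.10, 0.80
h = omega / (1.0 - alpha - beta)
simulated = []
for _ in range(5000):
    r = mu + math.sqrt(h) * rng.gauss(0.0, 1.0)
    simulated.append(r)
    h = omega + alpha * (r - mu) ** 2 + beta * h

ll_forward = garch11_loglik(simulated, mu, omega, alpha, beta)
ll_reverse = garch11_loglik(simulated[::-1], mu, omega, alpha, beta)
print(f"forward log-likelihood: {ll_forward:.2f}")
print(f"reverse log-likelihood: {ll_reverse:.2f}")
```

On average over many simulated samples, the correctly ordered series cannot score worse than its reverse at the true parameters, but for any single sample the gap may be small. That would make this a noisy signal rather than a decisive test.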

Fitting a GJR-GARCH model (Forward)

In [52]:

# o=1 adds one asymmetric term, turning GARCH(1,1) into GJR-GARCH
am_gjr = arch_model(stk_ret, p=1, o=1, q=1)
res_gjr = am_gjr.fit(update_freq=5, disp='off')
print(res_gjr.summary())

                   Constant Mean - GJR-GARCH Model Results                    
==============================================================================
Dep. Variable:              Adj Close   R-squared:                      -0.000
Mean Model:             Constant Mean   Adj. R-squared:                 -0.000
Vol Model:                  GJR-GARCH   Log-Likelihood:                5454.52
Distribution:                  Normal   AIC:                          -10899.0
Method:            Maximum Likelihood   BIC:                          -10871.2
                                        No. Observations:                 1930
Date:                Wed, Dec 05 2018   Df Residuals:                     1925
Time:                        05:33:16   Df Model:                            5
                                 Mean Model                                 
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
mu         8.6411e-04  3.594e-04      2.404  1.621e-02 [1.596e-04,1.569e-03]
                              Volatility Model                              
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega      6.6234e-05  1.033e-05      6.415  1.412e-10 [4.600e-05,8.647e-05]
alpha[1]       0.2000      0.137      1.461      0.144  [-6.830e-02,  0.468]
gamma[1]       0.0500      0.144      0.347      0.729     [ -0.233,  0.333]
beta[1]        0.4750  9.420e-02      5.043  4.591e-07     [  0.290,  0.660]
============================================================================

Covariance estimator: robust
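The extra gamma[1] coefficient is what distinguishes GJR-GARCH from plain GARCH: it adds weight to the squared residual only when the residual is negative, so losses can raise next-period variance more than gains of the same size. Here is a minimal sketch of one step of that recursion, using the coefficients from the summary above. The helper name, the starting variance and the 2% shock are my own assumptions for illustration.

```python
def gjr_garch_next_variance(h, resid, omega, alpha, gamma, beta):
    """One step of h_{t+1} = omega + (alpha + gamma * 1[resid < 0]) * resid^2 + beta * h_t."""
    asymmetry = gamma if resid < 0 else 0.0   # extra weight applies to negative residuals only
    return omega + (alpha + asymmetry) * resid ** 2 + beta * h

# Coefficients copied from the forward GJR-GARCH summary above
omega, alpha, gamma, beta = 6.6234e-05, 0.2000, 0.0500, 0.4750
h_today = 2.2e-4   # assumed current variance, for illustration only
shock = 0.02       # assumed 2% surprise, applied in either direction

h_after_gain = gjr_garch_next_variance(h_today, +shock, omega, alpha, gamma, beta)
h_after_loss = gjr_garch_next_variance(h_today, -shock, omega, alpha, gamma, beta)
print(f"next variance after +2%: {h_after_gain:.6e}")
print(f"next variance after -2%: {h_after_loss:.6e}")
```

With gamma[1] estimated at 0.0500 and a p-value of 0.729, the asymmetry in the forward fit is small and not statistically significant.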

Fitting a GJR-GARCH model (Reverse)

In [53]:

# Same GJR-GARCH specification, fitted to the reversed returns
am_gjr_rev = arch_model(stk_ret_reverse, p=1, o=1, q=1)
res_gjr_rev = am_gjr_rev.fit(update_freq=5, disp='off')
print(res_gjr_rev.summary())

                   Constant Mean - GJR-GARCH Model Results                    
==============================================================================
Dep. Variable:              Adj Close   R-squared:                      -0.000
Mean Model:             Constant Mean   Adj. R-squared:                 -0.000
Vol Model:                  GJR-GARCH   Log-Likelihood:                5439.29
Distribution:                  Normal   AIC:                          -10868.6
Method:            Maximum Likelihood   BIC:                          -10840.7
                                        No. Observations:                 1930
Date:                Wed, Dec 05 2018   Df Residuals:                     1925
Time:                        05:33:20   Df Model:                            5
                                 Mean Model                                 
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
mu         1.1064e-03  3.408e-04      3.247  1.168e-03 [4.385e-04,1.774e-03]
                               Volatility Model                              
=============================================================================
                 coef    std err          t      P>|t|       95.0% Conf. Int.
-----------------------------------------------------------------------------
omega      2.2049e-05  9.655e-12  2.284e+06      0.000  [2.205e-05,2.205e-05]
alpha[1]       0.0100  1.187e-02      0.842      0.400 [-1.327e-02,3.327e-02]
gamma[1]       0.1000  5.319e-02      1.880  6.012e-02   [-4.260e-03,  0.204]
beta[1]        0.8400  2.399e-02     35.019 1.140e-268      [  0.793,  0.887]
=============================================================================

Covariance estimator: robust