Model validation for time series regression models
This is an overview of the diagnostic and performance tests that should be performed to ensure the validity of a time series ARIMAX regression model.
In the above equation, $R_{t}$ is the differenced original series. For example, if we have stock prices ($P_{t}$), we perform a differencing operation of order 1 (i.e., $R_{t} = P_{t} - P_{t-1}$).
A time series is integrated of order $d$ if $(1-L)^{d} X_{t}$ is a stationary process, where $L$ is the lag operator and $1-L$ is the first difference (i.e., $(1-L)X_{t}=X_{t}-X_{t-1}$). For a time series integrated of order $d=2$, the differencing is as follows:
$$(1-L)^{2} X_{t} = X_{t} - 2X_{t-1} + X_{t-2}$$
The values of $p$ and $q$ are the orders of the autoregressive (AR) and moving average (MA) terms. The value of $r$ denotes the number of independent variable (X) terms included in the model.
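The differencing operations described above can be checked numerically. The following is a minimal sketch using NumPy, assuming a synthetic series built by cumulatively summing white noise twice (so it is integrated of order 2 by construction); the variable names are illustrative and not from the original model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical I(2) series: white noise cumulatively summed twice.
noise = rng.normal(size=200)
x = np.cumsum(np.cumsum(noise))

# First difference: (1 - L) X_t = X_t - X_{t-1}
d1 = np.diff(x, n=1)

# Second difference: (1 - L)^2 X_t = X_t - 2 X_{t-1} + X_{t-2}
d2 = np.diff(x, n=2)
d2_manual = x[2:] - 2 * x[1:-1] + x[:-2]

# Differencing twice recovers the underlying white noise (minus the first two points).
print(np.allclose(d2, d2_manual), np.allclose(d2, noise[2:]))
```

Each differencing pass shortens the series by one observation, which is worth keeping in mind when aligning the differenced dependent variable with the independent variables.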
The assumptions of the ARIMAX regression model are:
 Stationarity

The stochastic process underlying the time series is time invariant (i.e., constant mean and variance through time). Stationarity testing is important for both the independent and dependent variables, because regressing one non-stationary variable on another can lead to spurious regressions.
 Normality

Error term is normally distributed.
 Independence

Error terms are statistically independent.
 Homoscedasticity

Error term has constant variance for all observations.
 Lack of multicollinearity

No excessive correlation between independent variables.
Data Diagnostics
The data used for modelling should be evaluated for the following:
 Compliance with relevant regulatory requirements

Often these requirements cover the minimum data length for different types of portfolios, whether the data period is representative of the economic cycle, and the use of data proxies (e.g., BCC 13-5 [Conservatism to risk parameters in Advanced Approaches], BCC 14-3 [Selection of reference data periods and data deficiencies]).
 Outliers, missing or special values.

Outliers or influential data points should be identified (e.g., using Cook's distance), and model performance should be evaluated with these points excluded.
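As an illustration of the outlier check, here is a minimal sketch that computes Cook's distance for an ordinary least squares fit with NumPy and then refits the model without the flagged points. The data are synthetic with one injected outlier, and the $4/n$ cut-off is a common rule of thumb assumed here for illustration; it is not prescribed by the text above.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance per observation for an OLS fit of y on X (X includes the intercept)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverage = diagonal of the hat matrix H = X (X'X)^-1 X' = X pinv(X)
    leverage = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
    s2 = resid @ resid / (n - p)  # residual variance estimate
    return resid**2 / (p * s2) * leverage / (1.0 - leverage) ** 2

rng = np.random.default_rng(1)
n = 60
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
y[10] += 15.0  # injected outlier (hypothetical data)

X = np.column_stack([np.ones(n), x])
d = cooks_distance(X, y)
flagged = np.flatnonzero(d > 4 / n)  # assumed rule-of-thumb threshold

# Compare coefficients with and without the flagged observations.
beta_all = np.linalg.lstsq(X, y, rcond=None)[0]
keep = np.setdiff1d(np.arange(n), flagged)
beta_clean = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
print("flagged:", flagged)
print("all data:", beta_all, "excluding flagged:", beta_clean)
```

A large shift between the two coefficient vectors suggests the model is sensitive to a handful of observations, which is the situation the exclusion analysis is meant to surface.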
Model identification
These tests evaluate how well a regression model fits the data. The tests are formal regression statistics and descriptive fit statistics all of which assess the statistical significance of the independent variables individually and as a whole.
Test 
Description 

Autocorrelation (ACF) 
Autocorrelation describes the dependence (i.e., relationship) between the current observation and an observation at a prior time step. The dependence captured by the ACF includes both direct and indirect dependence. To decide the number of lags for the MA term, look at the significant spikes in the ACF plot.
Partial autocorrelation (PACF) 
Partial autocorrelation describes only the direct dependence between an observation and its lag. The partial autocorrelation at lag $k$ is the correlation that results after removing the effect of any correlations due to the terms at shorter lags. To decide the number of lags for the AR term, look at the spikes in the PACF plot.
Stationarity 
If the ACF/PACF values decay to zero quickly, the series can be considered stationary. If the ACF/PACF values decrease slowly or oscillate, the series may be non-stationary and transformations may need to be applied to produce a stationary series (e.g., first/second differencing, log transformation).
Seasonality 
If there are spikes in the autocorrelation values of the ACF/PACF plot at regular intervals (e.g., lags 12, 24, 36, ... for monthly data; lags 4, 8, 12, ... for quarterly data), there is seasonality. Transformations may need to be applied to remove the seasonality effect (i.e., seasonal differencing).
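To make the ACF/PACF reading concrete, the sketch below computes both by hand with NumPy on a simulated AR(1) series. The PACF at lag $k$ is estimated here as the last coefficient of an OLS regression on the first $k$ lags, which is one of several accepted estimators; the helper names, the simulated coefficient of 0.7, and the $\pm 1.96/\sqrt{n}$ significance band are assumptions for illustration.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation for lags 0..nlags."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = x @ x
    return np.array([1.0] + [x[k:] @ x[:-k] / denom for k in range(1, nlags + 1)])

def sample_pacf(x, nlags):
    """PACF at lag k = coefficient on x_{t-k} when regressing x_t on its first k lags."""
    x = np.asarray(x, dtype=float)
    out = [1.0]
    for k in range(1, nlags + 1):
        # Columns: x_{t-1}, ..., x_{t-k} aligned with the target x_t, t = k..n-1
        lags = np.column_stack([x[k - j : len(x) - j] for j in range(1, k + 1)])
        design = np.column_stack([np.ones(len(lags)), lags])
        coef, *_ = np.linalg.lstsq(design, x[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)

# Simulated AR(1) process with coefficient 0.7 (hypothetical data).
rng = np.random.default_rng(0)
n = 1000
shocks = rng.normal(size=n)
series = np.zeros(n)
for t in range(1, n):
    series[t] = 0.7 * series[t - 1] + shocks[t]

acf_vals = sample_acf(series, nlags=10)
pacf_vals = sample_pacf(series, nlags=10)

# Approximate 95% band for judging which spikes are significant.
band = 1.96 / np.sqrt(n)
significant_pacf_lags = np.flatnonzero(np.abs(pacf_vals[1:]) > band) + 1
print(np.round(acf_vals[:4], 2), np.round(pacf_vals[:4], 2), significant_pacf_lags)
```

For an AR(1) process the ACF decays geometrically while the PACF should be large at lag 1 and close to zero afterwards, which is the pattern that points to one AR term.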
Stationarity
 Strict stationary.

A time series whose statistical properties (e.g., mean, variance, autocorrelation) are all constant over time.
 Trend stationary

Time series has no unit root but exhibits a trend. If the trend is removed from the trend stationary series, it becomes strict stationary.
 Difference stationary

Time series that can be made stationary via differencing.
Test 
Description 

Augmented Dickey-Fuller (ADF)
Detects whether a time series can be made strict stationary (or trend stationary) via differencing (removing the trend). It is parametric and requires selecting the number of lags used to account for serial correlation. The null hypothesis is that the process has a unit root.
Phillips-Perron (PP)
Detects whether a time series can be made strict stationary via differencing. It is non-parametric and improves on the ADF test by correcting for autocorrelation and heteroscedasticity (i.e., HAC-type corrections). It requires large datasets.
Kwiatkowski-Phillips-Schmidt-Shin (KPSS)
Tests whether a time series is stationary, as opposed to having a unit root. Under the null, the time series can be strict stationary or trend stationary. The null hypothesis is that the process is trend-stationary.
Cointegration
Although regressing two non-stationary variables against each other generally leads to spurious regressions, it is acceptable if the variables are cointegrated. Two non-stationary processes (e.g., $X_1$ and $X_2$) are cointegrated when there is a vector (i.e., the cointegration vector) that combines them into a stationary process. Basically, the stochastic trends in $X_1$ and $X_2$ are the same and can be cancelled out using the cointegration vector.
It is possible to difference these non-stationary variables, but doing so often results in a loss of information regarding their long-run relationship. Thus, regressing two non-stationary variables that are cointegrated may be preferable.
Test 
Description 

Engle-Granger
Tests each time series for a unit root (i.e., non-stationarity) using an ADF test. If each time series has a unit root, an OLS regression is run between the time series to obtain the residuals. The residuals are then tested using ADF and, if they are stationary, the original time series are cointegrated. The time series being considered must be of the same order of integration.
Phillips-Ouliaris
This is an improvement on the Engle-Granger test that accounts for the variability in the residuals, since they are estimates rather than actual parameter values. It is also invariant to the normalization of the cointegration relationship.
Johansen test
This is an improvement on the Engle-Granger test that avoids having to choose a dependent variable and avoids carrying errors from one step to the next. The Johansen test can detect multiple cointegrating vectors.
Structural breaks
Test 
Description 

Chow Test 
The null hypothesis is that there is no structural break in the data. On a graphical or theoretical basis, the data is split into two samples and a regression is run on each sample. The Chow test is used to evaluate whether the model parameters from the two samples are statistically similar. Evidence of a structural break means that the model may need to be estimated using a different specification (e.g., spline functions) or different data (e.g., data subsets, data exclusions).
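The Chow statistic compares the residual sum of squares (RSS) of the pooled regression with the combined RSS of the two sub-sample regressions: $F = \frac{(RSS_{pooled} - (RSS_1 + RSS_2))/k}{(RSS_1 + RSS_2)/(n - 2k)}$, where $k$ is the number of parameters and $n$ is the total number of observations. The sketch below implements this with NumPy on synthetic data; the break point at observation 100 and the helper names are assumptions for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def chow_f_stat(X, y, break_idx):
    """Chow F statistic for a single known break at row break_idx."""
    n, k = X.shape
    rss_pooled = rss(X, y)
    rss_split = rss(X[:break_idx], y[:break_idx]) + rss(X[break_idx:], y[break_idx:])
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
noise = rng.normal(size=n)

# Hypothetical data: one series with stable parameters, one with a break at t = 100.
y_stable = 1.0 + 0.5 * x + noise
y_break = np.where(np.arange(n) < 100, 1.0 + 0.5 * x, 3.0 + 2.0 * x) + noise

f_stable = chow_f_stat(X, y_stable, 100)
f_break = chow_f_stat(X, y_break, 100)
print(f"stable: F = {f_stable:.2f}, break: F = {f_break:.2f}")
```

Under the null, the statistic follows an $F(k, n-2k)$ distribution, so a p-value can be obtained from, for example, `scipy.stats.f.sf(f_break, k, n - 2 * k)` if SciPy is available.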