Timeseries forecasting
Timeseries forecasting can generally be split into two categories:
1) Signal processing. Signal processing is typically what is used in engineering and econometrics. ARIMA/GARCH models attempt to separate the 'signal' from the noise and extrapolate the signal into the future. Famous models for interest rates are one-factor short-rate models (e.g., the Vasicek and Cox-Ingersoll-Ross (CIR) models). Both allow for mean reversion:
Vasicek: $dr_t = a(b-r_t)dt+\sigma dW_t$
$a$: speed of reversion
$b$: long-term mean level
$\sigma$: volatility
CIR: $dr_t = a(b-r_t)dt+\sigma \sqrt{r_t} dW_t$
$\sigma \sqrt{r_t}$: the volatility shrinks as $r_t$ approaches zero, which keeps the rate from becoming negative (for $a, b > 0$).
2) Curve fitting. Curve fitting is used in models like Facebook's Prophet, Nelson-Siegel-Svensson models (e.g., for the term structure of interest rates), and spline models. Curve fitting models are very popular in industry.
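To make the Vasicek and CIR dynamics above more concrete, here is a minimal sketch that simulates one path of each with an Euler-Maruyama discretization. The helper name `simulate_short_rate`, the parameter values, and the daily step size are assumptions chosen for illustration, not something taken from the original analysis. For CIR, the discretized scheme can dip slightly below zero even though the continuous model does not, so the sketch clips the rate at zero inside the square root.

```python
import numpy as np

def simulate_short_rate(r0, a, b, sigma, n_steps=252, dt=1 / 252,
                        model="vasicek", seed=0):
    """Simulate one short-rate path with an Euler-Maruyama scheme.

    a: speed of reversion, b: long-term mean level, sigma: volatility.
    """
    if model not in ("vasicek", "cir"):
        raise ValueError("model must be 'vasicek' or 'cir'")
    rng = np.random.default_rng(seed)
    rates = np.empty(n_steps + 1)
    rates[0] = r0
    for t in range(n_steps):
        r = rates[t]
        dW = rng.normal(0.0, np.sqrt(dt))  # Brownian increment over one step
        if model == "vasicek":
            diffusion = sigma
        else:
            # Clip at zero so the square root stays defined in discrete time
            diffusion = sigma * np.sqrt(max(r, 0.0))
        rates[t + 1] = r + a * (b - r) * dt + diffusion * dW
    return rates

# Illustrative parameters only
vasicek_path = simulate_short_rate(0.03, a=0.5, b=0.04, sigma=0.01, model="vasicek")
cir_path = simulate_short_rate(0.03, a=0.5, b=0.04, sigma=0.05, model="cir")
print(vasicek_path[-1], cir_path[-1])
```

Setting `sigma=0` removes the noise term, so the path moves monotonically from `r0` toward `b`, which is a quick way to see the mean-reversion term at work.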
This post compares timeseries forecasting models from traditional econometrics with machine-learning models.
Importing data science modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Preprocessing function to normalize data
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
scaler = MinMaxScaler(feature_range=(0,1))
Processing stock data
- Read in the data, parsing the first column as dates and using it as the index
df = pd.read_csv('AAPL.csv',parse_dates=[0],index_col=[0])
df.head()
df_price = df['Adj Close']
df_price.head()
df_price.index
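Since `AAPL.csv` may not be at hand, the loading step can be tried on a small in-memory CSV. This is only a sketch: the column layout and the numbers below are made up for illustration and are assumed to resemble a typical daily price export.

```python
import io
import pandas as pd

# Made-up rows standing in for the real price file
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,10.0,11.0,9.5,10.5,10.4,1000
2020-01-03,10.5,11.5,10.0,11.0,10.9,1200
2020-01-06,11.0,12.0,10.5,11.5,11.4,900
"""

# parse_dates=[0] parses the first column as dates; index_col=[0] makes it the index
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=[0], index_col=[0])
sample_price = sample["Adj Close"]

print(sample.index)
print(sample_price.head())
```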
df_price.plot()
plt.figure(figsize=(32,16))
plt.plot(df_price)
arr_price = df_price.values
train = arr_price[:1500]
test = arr_price[1500:]  # start at 1500 so that no observation is skipped
arr_price[:5]
train[:5]
test[:5]
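For timeseries, the split is chronological: the earliest observations are used for training and the remainder for testing, with no shuffling. A minimal sketch of that idea follows; `chronological_split` is a hypothetical helper, and the toy array stands in for `arr_price`. Using the same boundary index on both slices ensures every observation lands in exactly one part.

```python
import numpy as np

def chronological_split(values, n_train):
    """Split a 1-D series into contiguous train/test parts without shuffling."""
    values = np.asarray(values)
    if not 0 < n_train < len(values):
        raise ValueError("n_train must be between 1 and len(values) - 1")
    # Same boundary on both slices, so nothing is dropped or duplicated
    return values[:n_train], values[n_train:]

prices = np.arange(10, dtype=float)  # toy stand-in for arr_price
train_part, test_part = chronological_split(prices, n_train=7)
print(len(train_part), len(test_part))
```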
### Scaling dataset
scaler = MinMaxScaler(feature_range=(0,1))
scaled_data = scaler.fit_transform(arr_price.reshape(-1, 1))  # MinMaxScaler expects a 2-D array of shape (n_samples, n_features)
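One common variant, shown here as a sketch rather than as what this post does, is to fit the scaler on the training portion only and reuse it for the test portion, so that the test range does not leak into the preprocessing. The toy prices and the names `price_scaler`, `train_prices`, and `test_prices` are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

prices = np.linspace(100.0, 200.0, 11)  # toy stand-in for arr_price
train_prices, test_prices = prices[:8], prices[8:]

price_scaler = MinMaxScaler(feature_range=(0, 1))
# Reshape to (n_samples, 1): the scaler works column-wise on 2-D input
train_scaled = price_scaler.fit_transform(train_prices.reshape(-1, 1))
# Reuse the training min/max; test values may fall outside [0, 1]
test_scaled = price_scaler.transform(test_prices.reshape(-1, 1))

# Map scaled values back to the original price units
restored = price_scaler.inverse_transform(train_scaled).ravel()
print(train_scaled.ravel(), test_scaled.ravel())
```

Because the test prices here exceed the training maximum, their scaled values are greater than 1, which is expected with a scaler fitted on the training data alone.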