Introduction to time series and stationarity
FORECASTING USING ARIMA MODELS IN PYTHON
James Fulton
Climate informatics researcher
Motivation
Time series are everywhere
Science
Technology
Business
Finance
Policy
Course content
You will learn
Structure of ARIMA models
How to fit an ARIMA model
How to optimize the model
How to make forecasts
How to calculate uncertainty in predictions
Loading and plotting
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('time_series.csv', index_col='date', parse_dates=True)
date values
2019-03-11 5.734193
2019-03-12 6.288708
2019-03-13 5.205788
2019-03-14 3.176578
Trend
fig, ax = plt.subplots()
df.plot(ax=ax)
plt.show()
Seasonality
Cyclicality
White noise
A white noise series has uncorrelated values
Heads, heads, heads, tails, heads, tails, ...
0.1, -0.3, 0.8, 0.4, -0.5, 0.9, ...
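As a rough illustration of "uncorrelated values", the sketch below draws Gaussian white noise with NumPy and measures the correlation between each value and the one before it, which should be close to zero. A random walk built from the same noise is included for contrast. The seed, the sample size and the helper name lag1_autocorr are assumptions made for this example.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample correlation between the series and itself shifted by one step."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-1], x[1:])[0, 1]

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
white_noise = rng.normal(size=5000)     # independent draws, mean 0, variance 1
random_walk = np.cumsum(white_noise)    # running sum of the noise, for contrast

r_noise = lag1_autocorr(white_noise)
r_walk = lag1_autocorr(random_walk)
print(f"white noise lag-1 autocorrelation: {r_noise:.3f}")
print(f"random walk lag-1 autocorrelation: {r_walk:.3f}")
```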
Stationarity
Stationary vs. not stationary
Trend stationary: Trend is zero
Variance is constant
Autocorrelation is constant
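The first two criteria can be checked informally by comparing summary statistics across different stretches of a series. The sketch below is a crude check on synthetic data, not a formal test: it compares the mean and variance of the first and second half of three hypothetical series. The helper half_stats and all series parameters are assumptions for illustration.

```python
import numpy as np

def half_stats(x):
    """Return (mean, variance) for the first and second half of a series."""
    x = np.asarray(x, dtype=float)
    first, second = x[: len(x) // 2], x[len(x) // 2:]
    return (first.mean(), first.var()), (second.mean(), second.var())

rng = np.random.default_rng(1)
n = 2000
noise = rng.normal(size=n)                    # stationary: no trend, constant variance
trending = noise + 0.01 * np.arange(n)        # not stationary: upward trend
growing_var = noise * np.linspace(1, 5, n)    # not stationary: variance increases

for name, series in [("noise", noise), ("trending", trending), ("growing_var", growing_var)]:
    (m1, v1), (m2, v2) = half_stats(series)
    print(f"{name:12s} mean {m1:6.2f} -> {m2:6.2f}   variance {v1:6.2f} -> {v2:6.2f}")
```

For the stationary series both halves should give similar numbers; the trending series shows a shift in the mean, and the third series shows a jump in variance.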
Train-test split
# Train data - all data up to the end of 2018
df_train = df.loc[:'2018']
# Test data - all data from 2019 onwards
df_test = df.loc['2019':]
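The split above relies on partial-string indexing of a DatetimeIndex, where both slice endpoints are inclusive. The sketch below reproduces it on a hypothetical daily series standing in for time_series.csv (the dates and the column name are assumptions) and confirms that the two pieces are chronological and do not overlap.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series spanning 2017-2019, standing in for time_series.csv
dates = pd.date_range("2017-01-01", "2019-12-31", freq="D")
df = pd.DataFrame({"values": np.arange(len(dates), dtype=float)}, index=dates)

# Partial-string slicing: '2018' covers the whole year, endpoints inclusive
df_train = df.loc[:"2018"]     # everything up to and including 31 Dec 2018
df_test = df.loc["2019":]      # everything from 1 Jan 2019 onwards

print(df_train.index.max(), df_test.index.min())
print(len(df_train), len(df_test), len(df))
```

Unlike a random split, this keeps the test data strictly later in time than the training data, which is what a forecast has to deal with.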
Let's practice!
Making time series stationary
Overview
Statistical tests for stationarity
Making a dataset stationary
The augmented Dickey-Fuller test
Tests for trend non-stationarity
Null hypothesis: the time series is non-stationary
Applying the adfuller test
from statsmodels.tsa.stattools import adfuller
results = adfuller(df['close'])
Interpreting the test result
print(results)
(-1.34, 0.60, 23, 1235, {'1%': -3.435, '5%': -2.863, '10%': -2.568}, 10782.87)
0th element is the test statistic (-1.34)
More negative means more likely to be stationary
1st element is the p-value (0.60)
If the p-value is small (commonly below 0.05) → reject the null hypothesis, i.e. reject non-stationarity.
4th element is a dictionary of critical values of the test statistic
1 https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.adfuller.html
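The reading of the result tuple can be written down as a small helper. This is a sketch that assumes the tuple layout shown on this slide (statistic, p-value, lags used, number of observations, critical values, information criterion). The function name interpret_adf and the 0.05 threshold are assumptions for illustration, not part of statsmodels.

```python
def interpret_adf(results, alpha=0.05):
    """Summarise a result tuple in the layout printed by adfuller (hypothetical helper)."""
    test_stat, p_value, used_lags, n_obs, critical_values = results[:5]
    return {
        "test_statistic": test_stat,
        "p_value": p_value,
        # Reject the null hypothesis (non-stationary) only if the p-value is below alpha
        "reject_non_stationary": p_value < alpha,
        # To reject at a given level, the statistic must be more negative than its critical value
        "below_critical": {level: test_stat < cv for level, cv in critical_values.items()},
    }

# The result tuple printed on this slide
results = (-1.34, 0.60, 23, 1235, {'1%': -3.435, '5%': -2.863, '10%': -2.568}, 10782.87)
summary = interpret_adf(results)
print(summary["reject_non_stationary"])   # False: non-stationarity cannot be rejected
print(summary["below_critical"])
```

Here -1.34 is less negative than all three critical values and the p-value is 0.60, so both readings agree that the series should be treated as non-stationary.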
The value of plotting
Plotting time series can stop you making wrong assumptions
Making a time series stationary
Taking the difference
Difference: $\Delta y_t = y_t - y_{t-1}$
Taking the difference
df_stationary = df.diff()
city_population
date
1969-09-30 NaN
1970-03-31 -0.116156
1970-09-30 0.050850
1971-03-31 -0.153261
1971-09-30 0.108389
Taking the difference
df_stationary = df.diff().dropna()
city_population
date
1970-03-31 -0.116156
1970-09-30 0.050850
1971-03-31 -0.153261
1971-09-30 0.108389
1972-03-31 -0.029569
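To see what diff() does, the sketch below applies it to a hypothetical random walk (the running sum of white noise), which is a standard example of a series that differencing makes stationary. The data, seed and column name are assumptions. It also checks that diff() matches subtracting the series shifted by one step.

```python
import numpy as np
import pandas as pd

# Hypothetical series: a random walk, i.e. the running sum of white noise
rng = np.random.default_rng(2)
steps = rng.normal(size=200)
df = pd.DataFrame({"values": np.cumsum(steps)},
                  index=pd.date_range("2020-01-01", periods=200, freq="D"))

# First difference y_t - y_{t-1}; the first row has no predecessor, so it is NaN
df_stationary = df.diff().dropna()

# diff() gives the same result as subtracting the series shifted by one step
manual = (df - df.shift(1)).dropna()
print(np.allclose(df_stationary, manual))
print(len(df), len(df_stationary))
```

Differencing the random walk recovers the white noise steps it was built from, and the differenced series is one row shorter than the original.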
Other transforms
Examples of other transforms
Take the log
np.log(df)
Take the square root
np.sqrt(df)
Take the proportional change
df.shift(1)/df
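The sketch below applies these transforms to a hypothetical series that grows by 10% per step, where their effect is easy to read off. The data are an assumption for illustration. pandas' pct_change() is shown alongside the ratio as a related built-in that computes (y_t - y_{t-1}) / y_{t-1}.

```python
import numpy as np
import pandas as pd

# Hypothetical strictly positive series growing by 10% per step
df = pd.DataFrame({"values": 100 * 1.1 ** np.arange(6)})

log_df = np.log(df)          # turns exponential growth into a straight line
sqrt_df = np.sqrt(df)        # milder transform for damping growing variance
ratio = df.shift(1) / df     # previous value over current value, as listed above
pct = df.pct_change()        # (y_t - y_{t-1}) / y_{t-1}

print(log_df.diff().dropna().round(4))   # constant at every step: log(1.1)
print(ratio.dropna().round(4))           # constant at every step: 1 / 1.1
print(pct.dropna().round(4))             # constant at every step: 0.1
```

The log and square root require non-negative data (strictly positive for the log), which is why the example series is kept above zero.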
Let's practice!
Intro to AR, MA and ARMA models
AR models
Autoregressive (AR) model
AR(1) model:
$y_t = a_1 y_{t-1} + \epsilon_t$
AR(2) model:
$y_t = a_1 y_{t-1} + a_2 y_{t-2} + \epsilon_t$
AR(p) model:
$y_t = a_1 y_{t-1} + a_2 y_{t-2} + \dots + a_p y_{t-p} + \epsilon_t$
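The AR(p) recursion can be simulated directly with NumPy, as in the minimal sketch below. The helper simulate_ar, the zero initial values and the coefficient 0.8 are assumptions chosen for illustration. For a stationary AR(1) the lag-1 autocorrelation is approximately a_1, which gives a quick sanity check.

```python
import numpy as np

def simulate_ar(coefs, n, rng, sigma=1.0):
    """Simulate y_t = a_1 y_{t-1} + ... + a_p y_{t-p} + eps_t, starting from zeros."""
    p = len(coefs)
    eps = rng.normal(scale=sigma, size=n)     # white noise shocks
    y = np.zeros(n + p)                       # p leading zeros act as initial values
    for t in range(p, n + p):
        # y[t-p:t][::-1] is (y_{t-1}, ..., y_{t-p}), matching (a_1, ..., a_p)
        y[t] = np.dot(coefs, y[t - p:t][::-1]) + eps[t - p]
    return y[p:]

rng = np.random.default_rng(3)
y = simulate_ar([0.8], n=5000, rng=rng)       # AR(1) with a_1 = 0.8

lag1 = np.corrcoef(y[:-1], y[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1:.2f}")
```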
MA models
Moving average (MA) model
MA(1) model:
$y_t = m_1 \epsilon_{t-1} + \epsilon_t$
MA(2) model:
$y_t = m_1 \epsilon_{t-1} + m_2 \epsilon_{t-2} + \epsilon_t$
MA(q) model:
$y_t = m_1 \epsilon_{t-1} + m_2 \epsilon_{t-2} + \dots + m_q \epsilon_{t-q} + \epsilon_t$
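An MA(q) series is a weighted sum of the current and the previous q shocks, so it can be sketched as a convolution of white noise with the weights (1, m_1, ..., m_q). The helpers simulate_ma and autocorr and the coefficient 0.5 are assumptions for illustration. For an MA(1), the lag-1 autocorrelation is m_1 / (1 + m_1^2), and correlations at longer lags are zero.

```python
import numpy as np

def simulate_ma(coefs, n, rng, sigma=1.0):
    """Simulate y_t = m_1 eps_{t-1} + ... + m_q eps_{t-q} + eps_t."""
    q = len(coefs)
    eps = rng.normal(scale=sigma, size=n + q)   # q extra shocks feed the first lags
    # Weights (1, m_1, ..., m_q) applied to (eps_t, eps_{t-1}, ..., eps_{t-q})
    weights = np.concatenate(([1.0], coefs))
    return np.convolve(eps, weights, mode="valid")

def autocorr(x, lag):
    """Sample correlation between the series and itself shifted by `lag` steps."""
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

rng = np.random.default_rng(4)
y = simulate_ma([0.5], n=20000, rng=rng)        # MA(1) with m_1 = 0.5

# Expect roughly 0.5 / (1 + 0.25) = 0.4 at lag 1 and roughly zero at lag 2
print(f"lag 1: {autocorr(y, 1):.2f}, lag 2: {autocorr(y, 2):.2f}")
```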
ARMA models
Autoregressive moving-average (ARMA) model
ARMA = AR + MA
ARMA(1,1) model:
$y_t = a_1 y_{t-1} + m_1 \epsilon_{t-1} + \epsilon_t$
ARMA(p, q)
p is order of AR part
q is order of MA part
Creating ARMA data
$y_t = 0.5 y_{t-1} + 0.2 \epsilon_{t-1} + \epsilon_t$
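from statsmodels.tsa.arima_process import arma_generate_sample
# Lag-polynomial form: the leading 1 is the zero-lag term; AR coefficients take the opposite sign
ar_coefs = [1, -0.5]
ma_coefs = [1, 0.2]
# scale is the standard deviation of the noise (called sigma in older statsmodels releases)
y = arma_generate_sample(ar_coefs, ma_coefs, nsample=100, scale=0.5)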
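The coefficient lists passed to arma_generate_sample are easy to get wrong, because the AR signs are flipped relative to the model equation. Moving the AR terms to the left-hand side gives y_t - a_1 y_{t-1} - ... = eps_t + m_1 eps_{t-1} + ..., and each list holds the coefficients of one side, starting with the zero-lag term. The helper below is a hypothetical convenience function, not part of statsmodels, that performs this conversion.

```python
def to_lag_polynomials(ar_params, ma_params):
    """Convert model coefficients (a_1..a_p, m_1..m_q) to lag-polynomial lists.

    Hypothetical helper: prepends the zero-lag 1 to both lists and negates
    the AR coefficients, which sit on the left-hand side of the equation.
    """
    ar_coefs = [1.0] + [-a for a in ar_params]
    ma_coefs = [1.0] + list(ma_params)
    return ar_coefs, ma_coefs

# y_t = 0.5 y_{t-1} + 0.2 eps_{t-1} + eps_t
ar_coefs, ma_coefs = to_lag_polynomials(ar_params=[0.5], ma_params=[0.2])
print(ar_coefs, ma_coefs)   # [1.0, -0.5] [1.0, 0.2]
```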
Fitting an ARMA model
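# The ARMA class was removed in statsmodels 0.13; ARIMA with d=0 specifies an ARMA model
from statsmodels.tsa.arima.model import ARIMA
# Instantiate model object; order=(p, d, q), so (1, 0, 1) is ARMA(1,1)
model = ARIMA(y, order=(1, 0, 1))
# Fit model
results = model.fit()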
Let's practice!