Dowjones time series with ARIMA
Visualisation of dowjones time series
The dowjones time series
has a trend: the time plot shows an upward trend, and the ACF shows very high correlations that decay only slowly as the lag \(k\) increases; the PACF shows a very high correlation between \(y_t\) and \(y_{t-1}\), while the PACF at all other lags \(k>1\) is approximately 0,
has no seasonality,
and has a noise component.
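The three diagnostic plots described above can be produced with base R. The sketch below is only illustrative: since the dowjones data are not loaded here, it simulates a stand-in series of the same length with `arima.sim` (the ARIMA(1,1,0) order, the coefficient 0.5 and the noise level are assumptions chosen to resemble the fits reported later in this section).

```r
# Stand-in for the dowjones series (assumption): a simulated ARIMA(1,1,0)
# path. n = 77 innovations give 78 values after un-differencing.
set.seed(1)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5),
               n = 77, sd = sqrt(0.15))

op <- par(mfrow = c(3, 1))                      # three stacked panels
plot(y, main = "Time plot")                     # look for trend / seasonality
acf_y  <- acf(y,  lag.max = 20, main = "ACF")   # slow decay suggests a trend
pacf_y <- pacf(y, lag.max = 20, main = "PACF")  # one dominant spike at lag 1
par(op)

# Lag-1 values: acf() stores lag 0 first, pacf() starts at lag 1.
c(acf_lag1 = acf_y$acf[2], pacf_lag1 = pacf_y$acf[1])
```

Replacing `y` with the real series should give the pattern described in the text; the two lag-1 values printed at the end coincide by construction, since the PACF and the ACF are equal at lag 1.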
Fitting a constant model to the time series: \[ y_t=c+\epsilon_t \]
and visualising the residuals \(\lbrace \epsilon_t\rbrace_{t=1,\cdots,n}\) as follows:
Series: dowjones
ARIMA(0,0,0) with non-zero mean
Coefficients:
         mean
     115.6833
s.e.   0.6194
sigma^2 estimated as 30.32: log likelihood=-243.23
AIC=490.46 AICc=490.62 BIC=495.17
Removing the trend in dowjones by differencing
To remove the trend, first-order differencing is applied:
Series: dowjones
ARIMA(0,1,0)
sigma^2 estimated as 0.1979: log likelihood=-46.86
AIC=95.73 AICc=95.78 BIC=98.07
Note that the model fitted here has no constant (\(c=0\)): \[ y_t=y_{t-1}+\epsilon_t \]
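Under this model the residuals are simply the first differences of the series. A minimal base-R sketch of that step follows; the series `y` is a simulated stand-in (an assumption, not the dowjones data), and `sigma2_rw` is a hypothetical name for the resulting rough estimate of the noise variance.

```r
# Stand-in series (assumption): simulated ARIMA(1,1,0) path of 78 values.
set.seed(1)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5),
               n = 77, sd = sqrt(0.15))

dy <- diff(y)                     # dy_t = y_t - y_{t-1}; one observation is lost
sigma2_rw <- mean(dy^2)           # mean squared residual when c = 0
y_back <- diffinv(dy, xi = y[1])  # cumulative sums undo the differencing

length(y) - length(dy)            # number of observations lost
```

Because `diffinv` recovers the original series exactly, differencing loses no information other than the starting level.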
The PACF of the residuals shows a non-zero PACF(1), whereas \(PACF(k)\approx 0\) for all \(k>1\). The ACF decreases quickly to 0 as \(k\) increases. An AR(1) model can therefore be suggested to fit these residuals.
Suggesting ARIMA(1,1,0)
Series: dowjones
ARIMA(1,1,0)
Coefficients:
         ar1
      0.4992
s.e.  0.1001
sigma^2 estimated as 0.1515: log likelihood=-36.19
AIC=76.38 AICc=76.54 BIC=81.07
The time plot, ACF and PACF of the residuals reveal no further structure. So the final model identified using visualisation is \[ (y_t-y_{t-1})=\phi_1 (y_{t-1}-y_{t-2}) +\epsilon_t \] with estimated coefficient \(\hat{\phi}_1=0.4992\).
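The final equation says that ARIMA(1,1,0) is an AR(1) model, without a constant, for the differenced series. The sketch below illustrates this with the base-R `arima` function on a simulated stand-in series (an assumption, not the dowjones data); `fit_a` and `fit_b` are hypothetical names, and `method = "ML"` is chosen here only so that both fits use the same estimation method.

```r
# Stand-in series (assumption): simulated ARIMA(1,1,0) path of 78 values.
set.seed(1)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5),
               n = 77, sd = sqrt(0.15))

# (a) ARIMA(1,1,0) fitted directly to the series
fit_a <- arima(y, order = c(1, 1, 0), method = "ML")
# (b) AR(1) without a mean, fitted to the differenced series
fit_b <- arima(diff(y), order = c(1, 0, 0), include.mean = FALSE, method = "ML")

phi_a <- unname(coef(fit_a)["ar1"])
phi_b <- unname(coef(fit_b)["ar1"])
c(phi_a = phi_a, phi_b = phi_b)   # the two estimates should nearly coincide

# Residual check: sample ACF of the residuals of fit (a)
res_acf <- acf(residuals(fit_a), plot = FALSE)
```

Plotting `res_acf` (or calling `acf` with its default `plot = TRUE`) gives the kind of residual ACF that the text inspects visually.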
Using AIC and BIC
Goodness-of-fit criteria associated with ARIMA models are given in the R output. Moving from the ARIMA(0,0,0) model to the ARIMA(0,1,0) model and then to the ARIMA(1,1,0) model, we observe:
a decrease in BIC (495.17, then 98.07, then 81.07), which indicates an improvement in the goodness of fit of the model to the time series data,
a decrease in AIC (490.46, then 95.73, then 76.38), which indicates the same improvement.
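The criteria in the R output can be recomputed from the reported log likelihoods, which makes explicit how each one penalises the number of parameters. The sketch below uses the usual definitions AIC = -2 logL + 2k, AICc = AIC + 2k(k+1)/(n-k-1) and BIC = -2 logL + k log(n). Two points are inferred rather than stated in the source: k counts the estimated coefficients plus one for the noise variance, and n is 78 for the undifferenced fit (the forecasts start at index 79) and 77 after one difference.

```r
# Log likelihoods copied from the R output in this section.
fits <- data.frame(
  model  = c("ARIMA(0,0,0) + mean", "ARIMA(0,1,0)", "ARIMA(1,1,0)", "ARIMA(1,1,1)"),
  loglik = c(-243.23, -46.86, -36.19, -34.69),
  k      = c(2, 1, 2, 3),       # coefficients + sigma^2 (inferred)
  n      = c(78, 77, 77, 77)    # observations used (inferred)
)

fits$AIC  <- -2 * fits$loglik + 2 * fits$k
fits$AICc <- fits$AIC + 2 * fits$k * (fits$k + 1) / (fits$n - fits$k - 1)
fits$BIC  <- -2 * fits$loglik + fits$k * log(fits$n)

print(fits, digits = 5)
```

The recomputed values agree with the printed ones up to the rounding of the log likelihoods. Since log(77) is larger than 2, BIC charges more per parameter than AIC, which is why the two criteria can rank ARIMA(1,1,0) and ARIMA(1,1,1) differently.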
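Using an automatic model search in R, a model with a slightly lower AIC is found, ARIMA(1,1,1) (AIC 75.38 against 76.38), but our ARIMA(1,1,0) remains the best when BIC is used (81.07 against 82.41). The values listed in the search trace match the AICc of each fitted model (for example 76.54 for ARIMA(1,1,0) and 75.71 for ARIMA(1,1,1)).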
ARIMA(2,1,2) with drift : 81.9124
ARIMA(0,1,0) with drift : 90.60466
ARIMA(1,1,0) with drift : 76.62097
ARIMA(0,1,1) with drift : 81.31565
ARIMA(0,1,0) : 95.78286
ARIMA(2,1,0) with drift : 77.78036
ARIMA(1,1,1) with drift : 77.27229
ARIMA(2,1,1) with drift : 79.78963
ARIMA(1,1,0) : 76.54332
ARIMA(2,1,0) : 76.97825
ARIMA(1,1,1) : 75.70703
ARIMA(0,1,1) : 83.58248
ARIMA(2,1,1) : 79.20627
ARIMA(1,1,2) : 76.577
ARIMA(0,1,2) : 79.96298
ARIMA(2,1,2) : 78.66149
Best model: ARIMA(1,1,1)
Series: dowjones
ARIMA(1,1,1)
Coefficients:
         ar1      ma1
      0.8510  -0.5263
s.e.  0.1383   0.2548
sigma^2 estimated as 0.1474: log likelihood=-34.69
AIC=75.38 AICc=75.71 BIC=82.41
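The trace above has the format printed by `auto.arima` from the forecast package, which is presumably the automatic function used, although the source does not name it. The sketch below is a different and simpler technique: an exhaustive grid over p and q with d fixed at 1, using only the base-R `arima` function, on a simulated stand-in series (an assumption, not the dowjones data). It does not reproduce the search strategy or the drift terms of the trace.

```r
# Stand-in series (assumption): simulated ARIMA(1,1,0) path of 78 values.
set.seed(1)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5),
               n = 77, sd = sqrt(0.15))

# All ARIMA(p,1,q) candidates with p, q in {0, 1, 2}
grid <- expand.grid(p = 0:2, q = 0:2)
grid$AIC <- NA_real_
grid$BIC <- NA_real_

for (i in seq_len(nrow(grid))) {
  fit <- tryCatch(
    arima(y, order = c(grid$p[i], 1, grid$q[i]), method = "ML"),
    error = function(e) NULL          # skip candidates that fail to fit
  )
  if (!is.null(fit)) {
    grid$AIC[i] <- AIC(fit)
    grid$BIC[i] <- BIC(fit)
  }
}

best_aic <- grid[which.min(grid$AIC), ]   # model preferred by AIC
best_bic <- grid[which.min(grid$BIC), ]   # model preferred by BIC
print(grid[order(grid$AIC), ])
```

As in the text, the AIC and BIC winners need not be the same model, because BIC penalises each extra parameter more heavily.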
Exercises: investigate R commands Arima, arima, arima.sim. Read https://otexts.com/fpp2/arima-r.html
Forecasts and prediction intervals
Once the best model is selected, it can be used for computing forecasts and prediction intervals:
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
79       120.8456 120.3469 121.3444 120.0829 121.6084
80       120.6538 119.7550 121.5526 119.2792 122.0284
81       120.5580 119.3057 121.8103 118.6428 122.4732
82       120.5102 118.9480 122.0724 118.1210 122.8994
83       120.4863 118.6501 122.3226 117.6781 123.2946
84       120.4744 118.3929 122.5560 117.2909 123.6579
85       120.4685 118.1643 122.7727 116.9445 123.9925
86       120.4655 117.9568 122.9742 116.6288 124.3022
87       120.4640 117.7656 123.1624 116.3372 124.5909
88       120.4633 117.5873 123.3393 116.0649 124.8617
89       120.4629 117.4196 123.5063 115.8085 125.1173
90       120.4627 117.2607 123.6648 115.5656 125.3599
91       120.4627 117.1094 123.8160 115.3342 125.5911
92       120.4626 116.9646 123.9606 115.1128 125.8124
93       120.4626 116.8256 124.0996 114.9003 126.0249
[ reached 'max' / getOption("max.print") -- omitted 5 rows ]
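The printed forecasts appear consistent with the ARIMA(1,1,0) fit reported earlier, and this can be checked by hand. For that model, each forecast increment is \(\phi_1\) times the previous one, and the h-step forecast variance is \(\sigma^2\sum_{j=0}^{h-1}\psi_j^2\) with \(\psi_j = 1+\phi_1+\cdots+\phi_1^j\). The sketch below applies these standard formulas to the first five rows of the table; that the table comes from the ARIMA(1,1,0) fit is an inference from the numbers, not something the source states.

```r
phi    <- 0.4992   # ar1 estimate of the ARIMA(1,1,0) fit
sigma2 <- 0.1515   # its estimated noise variance

# First five rows of the forecast table (horizons 1 to 5)
point <- c(120.8456, 120.6538, 120.5580, 120.5102, 120.4863)
lo80  <- c(120.3469, 119.7550, 119.3057, 118.9480, 118.6501)

# Successive forecast increments should shrink by a factor close to phi
steps  <- diff(point)
ratios <- steps[-1] / steps[-length(steps)]

# Forecast standard errors from the psi weights of ARIMA(1,1,0)
h      <- seq_along(point)
psi    <- cumsum(phi^(h - 1))          # psi_0, ..., psi_4
se     <- sqrt(sigma2 * cumsum(psi^2)) # se at horizons 1, ..., 5
half80 <- qnorm(0.9) * se              # half-width of the 80% interval

round(rbind(ratios = c(NA, NA, ratios),
            half80 = half80,
            table  = point - lo80), 4)
```

The increments die out geometrically, which is why the point forecasts level off near 120.46, while the interval half-widths keep growing with the horizon because the differenced model accumulates uncertainty.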