Visualisation of dowjones time series

require(fma)  # provides dowjones and attaches the forecast package (tsdisplay, Arima, auto.arima, forecast)
tsdisplay(dowjones)

The dowjones time series

  • has a trend: the time plot shows an upward trend, and the ACF shows very high correlations that decrease only slowly as the lag \(k\) increases; the PACF shows a very high correlation between \(y_t\) and \(y_{t-1}\), while the PACF at all other lags \(k>1\) is approximately 0,

  • has no seasonality,

  • and has a noise component.

Fitting a constant model to the time series: \[ y_t=c+\epsilon_t \]

and visualising the residuals \(\lbrace \epsilon_t\rbrace_{t=1,\ldots,n}\) as follows:

require(fma)
Arima(dowjones,order=c(0,0,0))
Series: dowjones 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
          mean
      115.6833
s.e.    0.6194

sigma^2 estimated as 30.32:  log likelihood=-243.23
AIC=490.46   AICc=490.62   BIC=495.17
tsdisplay(Arima(dowjones,order=c(0,0,0))$residuals)

Removing the trend in dowjones by differencing

To remove the trend, first-order differencing is applied:

require(fma)
Arima(dowjones,order=c(0,1,0))
Series: dowjones 
ARIMA(0,1,0) 

sigma^2 estimated as 0.1979:  log likelihood=-46.86
AIC=95.73   AICc=95.78   BIC=98.07
tsdisplay(Arima(dowjones,order=c(0,1,0))$residuals)

Note that the model fitted here has no constant (\(c=0\)): \[ y_t=y_{t-1}+\epsilon_t \]
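As a minimal sketch of what first-order differencing does, the base R snippet below builds a random walk from simulated noise and checks that `diff()` recovers that noise, which is the model above with \(c=0\). The data are simulated (not the dowjones series), and the seed and the names `e`, `y` and `d` are illustrative assumptions.

```r
# Simulated illustration (not the dowjones data): a random walk y_t = y_{t-1} + e_t
set.seed(1)                 # arbitrary seed, for reproducibility only
e <- rnorm(100)             # white noise e_1, ..., e_100
y <- cumsum(e)              # random walk with y_1 = e_1

# First-order differencing: d_t = y_t - y_{t-1}, for t = 2, ..., 100
d <- diff(y)

# Differencing the random walk recovers the noise terms e_2, ..., e_100
isTRUE(all.equal(d, e[-1]))

# One observation is lost by differencing
length(d) == length(y) - 1
```

On real data the differences are not exactly white noise, which is why the residuals of ARIMA(0,1,0) are inspected with tsdisplay.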

The PACF of these residuals is non-zero at lag 1, whereas \(PACF(k)\approx 0\) for all \(k>1\). The ACF decreases quickly towards 0 as \(k\) increases. An AR(1) model is therefore suggested for these residuals.
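To see why this pattern points to an AR(1), the theoretical ACF and PACF of an AR(1) process can be computed with `ARMAacf` from base R. The sketch below uses an illustrative coefficient of 0.5 (an assumption, not a fitted value) and then simulates a sample with `arima.sim` for comparison.

```r
# Theoretical ACF/PACF of an AR(1) process with an illustrative coefficient
phi <- 0.5   # assumed value, chosen for illustration only

# ACF at lags 0..5: decays geometrically as phi^k
acf_theory <- as.numeric(ARMAacf(ar = phi, lag.max = 5))

# PACF at lags 1..5: equal to phi at lag 1, then zero
pacf_theory <- as.numeric(ARMAacf(ar = phi, lag.max = 5, pacf = TRUE))

round(acf_theory, 4)
round(pacf_theory, 4)

# A simulated AR(1) sample shows the same pattern, up to sampling noise
set.seed(1)   # arbitrary seed
x <- arima.sim(model = list(ar = phi), n = 500)
print(pacf(x, lag.max = 5, plot = FALSE))
```

A quickly decaying ACF together with a PACF that cuts off after lag 1 is the signature being read off the residual plots here.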

Suggesting ARIMA(1,1,0)

require(fma)
Arima(dowjones,order=c(1,1,0))
Series: dowjones 
ARIMA(1,1,0) 

Coefficients:
         ar1
      0.4992
s.e.  0.1001

sigma^2 estimated as 0.1515:  log likelihood=-36.19
AIC=76.38   AICc=76.54   BIC=81.07
tsdisplay(Arima(dowjones,order=c(1,1,0))$residuals)

The time plot, ACF and PACF of the residuals show no further structure to model. So the final model identified using visualisation is \[ (y_t-y_{t-1})=\phi_1 (y_{t-1}-y_{t-2}) +\epsilon_t \]
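For reference, the model can be expanded in terms of \(y_t\) itself. The lines below are a short worked rearrangement, using the estimate ar1 = 0.4992 from the R output above as \(\hat\phi_1\) (the hat notation is an addition for illustration).

```latex
\begin{align*}
y_t - y_{t-1} &= \phi_1 (y_{t-1} - y_{t-2}) + \epsilon_t \\
y_t &= (1+\phi_1)\, y_{t-1} - \phi_1\, y_{t-2} + \epsilon_t \\
\text{with } \hat\phi_1 = 0.4992: \quad
y_t &= 1.4992\, y_{t-1} - 0.4992\, y_{t-2} + \epsilon_t
\end{align*}
```

Written this way, ARIMA(1,1,0) is an AR(2) recursion whose coefficients sum to 1, which is what makes the series non-stationary in level.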

Using AIC and BIC

Goodness-of-fit criteria associated with ARIMA models are given in the R output. Going from the ARIMA(0,0,0) model to the ARIMA(0,1,0) model and then to the ARIMA(1,1,0) model, we observe:

  • a decrease of the BIC values (495.17, 98.07, 81.07), which indicates an improvement in the goodness of fit of the model to the time series data,

  • a decrease of the AIC values (490.46, 95.73, 76.38), which indicates the same improvement.

Strictly speaking, these criteria are only comparable between models with the same order of differencing, so the comparison with ARIMA(0,0,0) is indicative only.
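The criteria printed in the R output can be recomputed by hand from the log-likelihood. The sketch below does this for ARIMA(1,1,0), assuming the usual definitions with \(k\) counting the AR coefficient and \(\sigma^2\), and assuming 78 observations (the forecasts later in these notes start at index 79). The variable names are illustrative.

```r
# Recompute AIC, AICc and BIC for ARIMA(1,1,0) from the printed log-likelihood
loglik <- -36.19   # from the R output (rounded as printed)
k      <- 2        # estimated parameters: ar1 and sigma^2
n      <- 78 - 1   # 78 observations, one lost to first-order differencing

aic  <- -2 * loglik + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)
bic  <- -2 * loglik + k * log(n)

# Agrees with the printed AIC=76.38, AICc=76.54, BIC=81.07
round(c(AIC = aic, AICc = aicc, BIC = bic), 2)
```

BIC penalises each extra parameter by \(\log(n)\approx 4.34\) instead of 2, which is why it is less favourable than AIC to models with more parameters.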

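Using the automatic function auto.arima in R, a slightly better model is found according to AICc (the default criterion of auto.arima): ARIMA(1,1,1), with AICc 75.71 against 76.54 and AIC 75.38 against 76.38. The improvement is marginal, and our ARIMA(1,1,0) remains the best model when BIC is chosen (81.07 against 82.41).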

require(fma)
auto.arima(dowjones,trace=TRUE)

 ARIMA(2,1,2) with drift         : 81.9124
 ARIMA(0,1,0) with drift         : 90.60466
 ARIMA(1,1,0) with drift         : 76.62097
 ARIMA(0,1,1) with drift         : 81.31565
 ARIMA(0,1,0)                    : 95.78286
 ARIMA(2,1,0) with drift         : 77.78036
 ARIMA(1,1,1) with drift         : 77.27229
 ARIMA(2,1,1) with drift         : 79.78963
 ARIMA(1,1,0)                    : 76.54332
 ARIMA(2,1,0)                    : 76.97825
 ARIMA(1,1,1)                    : 75.70703
 ARIMA(0,1,1)                    : 83.58248
 ARIMA(2,1,1)                    : 79.20627
 ARIMA(1,1,2)                    : 76.577
 ARIMA(0,1,2)                    : 79.96298
 ARIMA(2,1,2)                    : 78.66149

 Best model: ARIMA(1,1,1)                    
Series: dowjones 
ARIMA(1,1,1) 

Coefficients:
         ar1      ma1
      0.8510  -0.5263
s.e.  0.1383   0.2548

sigma^2 estimated as 0.1474:  log likelihood=-34.69
AIC=75.38   AICc=75.71   BIC=82.41
Arima(dowjones,order=c(1,1,1))
Series: dowjones 
ARIMA(1,1,1) 

Coefficients:
         ar1      ma1
      0.8510  -0.5263
s.e.  0.1383   0.2548

sigma^2 estimated as 0.1474:  log likelihood=-34.69
AIC=75.38   AICc=75.71   BIC=82.41

Exercises: investigate the R commands Arima, arima and arima.sim, and read https://otexts.com/fpp2/arima-r.html

Forecasts and prediction intervals

Once the best model is selected, it can be used for computing forecasts and prediction intervals:

require(fma)
plot(forecast(Arima(dowjones,order=c(1,1,0)),h=20,level = c(80, 95)))

forecast(Arima(dowjones,order=c(1,1,0)),h=20,level = c(80, 95))
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
79       120.8456 120.3469 121.3444 120.0829 121.6084
80       120.6538 119.7550 121.5526 119.2792 122.0284
81       120.5580 119.3057 121.8103 118.6428 122.4732
82       120.5102 118.9480 122.0724 118.1210 122.8994
83       120.4863 118.6501 122.3226 117.6781 123.2946
84       120.4744 118.3929 122.5560 117.2909 123.6579
85       120.4685 118.1643 122.7727 116.9445 123.9925
86       120.4655 117.9568 122.9742 116.6288 124.3022
87       120.4640 117.7656 123.1624 116.3372 124.5909
88       120.4633 117.5873 123.3393 116.0649 124.8617
89       120.4629 117.4196 123.5063 115.8085 125.1173
90       120.4627 117.2607 123.6648 115.5656 125.3599
91       120.4627 117.1094 123.8160 115.3342 125.5911
92       120.4626 116.9646 123.9606 115.1128 125.8124
93       120.4626 116.8256 124.0996 114.9003 126.0249
 [ reached 'max' / getOption("max.print") -- omitted 5 rows ]
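The forecast table can be reproduced approximately by hand, which shows where the numbers come from. The sketch below is a base R reconstruction under stated assumptions: it starts from the two first printed point forecasts, sets future errors to zero, and builds Gaussian intervals from the \(\psi\)-weights of ARIMA(1,1,0). All inputs are the rounded values printed above, so results agree only to about three decimals. The names `f`, `psi`, `se` and `out` are illustrative.

```r
# Values copied from the ARIMA(1,1,0) output above (rounded as printed)
phi    <- 0.4992    # ar1 estimate
sigma2 <- 0.1515    # innovation variance estimate
f79    <- 120.8456  # point forecast for t = 79
f80    <- 120.6538  # point forecast for t = 80

# Point forecasts: f_t = f_{t-1} + phi * (f_{t-1} - f_{t-2}), future errors set to 0
h <- 20
f <- numeric(h)
f[1:2] <- c(f79, f80)
for (i in 3:h) f[i] <- f[i - 1] + phi * (f[i - 1] - f[i - 2])

# Forecast standard errors from the psi-weights of ARIMA(1,1,0):
# psi_j = 1 + phi + ... + phi^j, Var(h) = sigma2 * (psi_0^2 + ... + psi_{h-1}^2)
psi <- cumsum(phi^(0:(h - 1)))
se  <- sqrt(sigma2 * cumsum(psi^2))

# 80% and 95% prediction intervals under a Gaussian assumption
out <- data.frame(t = 79:(78 + h), point = f,
                  lo80 = f - qnorm(0.90) * se,  hi80 = f + qnorm(0.90) * se,
                  lo95 = f - qnorm(0.975) * se, hi95 = f + qnorm(0.975) * se)
round(head(out, 3), 3)
```

The point forecasts converge geometrically to a constant (about 120.46 here), while the intervals keep widening with the horizon because the differenced model accumulates uncertainty in the level.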