29 Classical Forecasting: ARIMA and Exponential Smoothing

Author

Bongo Adi

Published

January 1, 2024

📋 Learning Objectives

By the end of this chapter, you will:

Understand exponential smoothing as a recursive forecasting framework (SES, Holt, Holt-Winters)
Implement ARIMA(p,d,q) models: identify orders using ACF/PACF, fit via MLE, diagnose residuals
Choose between seasonal (SARIMA) and non-seasonal ARIMA using information criteria
Apply auto.arima to automate parameter selection
Quantify forecast uncertainty via prediction intervals
Communicate forecasts and uncertainty to non-technical stakeholders
Build production-ready forecasts for Nigerian FX rates and economic metrics

29.1 Simple Exponential Smoothing (SES): The Forgetting Principle

Exponential smoothing is an elegant recursive framework for forecasting. The core idea: recent observations are more informative than distant ones, so weight them more heavily. But rather than explicitly choosing weights (difficult!), use a single smoothing parameter \(\alpha\) to control decay.

29.1.1 The SES Recursion

Simple exponential smoothing for a series without trend or seasonality:

\[\hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha) \hat{Y}_{t}\]

where \(\hat{Y}_{t+1}\) is the forecast for time \(t+1\) given data through time \(t\), and \(\alpha \in [0, 1]\) is the smoothing parameter.

Intuition: The forecast is a weighted average of the current observation (\(Y_t\), weight \(\alpha\)) and the prior forecast (\(\hat{Y}_t\), weight \(1 - \alpha\)).

If \(\alpha\) is close to 1, we trust recent data heavily (forecast reacts quickly to changes).
If \(\alpha\) is close to 0, we update slowly (forecast is more stable, less reactive).

29.1.2 Equivalently: SES as an Infinite Moving Average

Expanding the recursion:

\[\hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha) \hat{Y}_t = \alpha Y_t + (1 - \alpha) [\alpha Y_{t-1} + (1 - \alpha) \hat{Y}_{t-1}]\]

\[= \alpha Y_t + \alpha(1 - \alpha) Y_{t-1} + \alpha(1 - \alpha)^2 Y_{t-2} + \cdots\]

So the forecast is a weighted average of all past observations, with exponentially decaying weights. Distant observations have negligible influence.

29.1.3 Fitting SES: Choosing the Optimal \(\alpha\)

In practice, \(\alpha\) is chosen by minimising the sum of squared one-step-ahead forecast errors (SSE):

\[\text{SSE} = \sum_{t=2}^{T} (Y_t - \hat{Y}_t)^2\]

This is done via numerical optimisation. Most software (R, Python) defaults provide good \(\alpha\) estimates.

29.1.4 Applying SES to Nigerian Petrol Price

Show code

library(forecast)
library(tidyverse)

# Generate synthetic Nigerian petrol price data (monthly, 2015-2023)
set.seed(42)
n_months_petrol <- 108
dates_petrol <- seq(from = ymd("2015-01-01"), by = "month", length.out = n_months_petrol)

# Petrol price: upward trend with some noise
base_price <- 50
trend_petrol <- base_price + 0.1 * (1:n_months_petrol) + 10 * sin(2 * pi * (1:n_months_petrol) / 24)
noise_petrol <- rnorm(n_months_petrol, 0, 2)
petrol_price <- trend_petrol + noise_petrol

petrol_ts <- ts(petrol_price, start = c(2015, 1), frequency = 12)

# Fit SES with optimal alpha
ses_fit <- ses(petrol_ts, alpha = NULL, initial = "optimal", h = 12)

cat("=== Simple Exponential Smoothing (SES) ===\n")
#> === Simple Exponential Smoothing (SES) ===
cat("Optimal smoothing parameter (alpha):", round(ses_fit$model$par['alpha'], 3), "\n")
#> Optimal smoothing parameter (alpha): 0.968
cat("AIC:", round(ses_fit$model$aic, 1), "\n\n")
#> AIC: 776.6

# Forecast next 12 months
forecast_ses <- forecast(ses_fit, h = 12)

# Plot
autoplot(petrol_ts) +
  autolayer(forecast_ses, alpha = 0.7, PI = TRUE) +
  theme_minimal() +
  labs(title = "Nigerian Petrol Price: SES Forecast",
       x = "Date", y = "Price (NGN/litre)") +
  theme(legend.position = "bottom")

Show code


# Print forecast
cat("=== SES Forecast (next 12 months) ===\n")
#> === SES Forecast (next 12 months) ===
print(forecast_ses)
#>          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> Jan 2024       60.61828 56.20826 65.02830 53.87374 67.36282
#> Feb 2024       60.61828 54.47922 66.75735 51.22940 70.00717
#> Mar 2024       60.61828 53.13978 68.09678 49.18091 72.05566
#> Apr 2024       60.61828 52.00621 69.23035 47.44726 73.78931
#> May 2024       60.61828 51.00539 70.23117 45.91664 75.31993
#> Jun 2024       60.61828 50.09937 71.13719 44.53100 76.70557
#> Jul 2024       60.61828 49.26543 71.97114 43.25559 77.98098
#> Aug 2024       60.61828 48.48869 72.74788 42.06766 79.16890
#> Sep 2024       60.61828 47.75877 73.47779 40.95136 80.28521
#> Oct 2024       60.61828 47.06812 74.16844 39.89510 81.34146
#> Nov 2024       60.61828 46.41101 74.82556 38.89013 82.34643
#> Dec 2024       60.61828 45.78297 75.45360 37.92963 83.30694

Show code

from statsmodels.tsa.holtwinters import SimpleExpSmoothing, ExponentialSmoothing
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta

# Generate synthetic Nigerian petrol price
np.random.seed(42)
n_months_petrol = 108
dates_petrol = pd.date_range(start='2015-01-01', periods=n_months_petrol, freq='MS')

base_price = 50
trend_petrol = base_price + 0.1 * np.arange(1, n_months_petrol + 1) + 10 * np.sin(2 * np.pi * np.arange(1, n_months_petrol + 1) / 24)
noise_petrol = np.random.normal(0, 2, n_months_petrol)
petrol_price = trend_petrol + noise_petrol

petrol_ts = pd.Series(petrol_price, index=dates_petrol)

# Fit SES
ses_model = SimpleExpSmoothing(petrol_ts)
ses_fit = ses_model.fit(optimized=True)

print("=== Simple Exponential Smoothing (SES) ===")
#> === Simple Exponential Smoothing (SES) ===
print(f"Optimal smoothing parameter (alpha): {ses_fit.params['smoothing_level']:.3f}")
#> Optimal smoothing parameter (alpha): 1.000
print(f"AIC: {ses_fit.aic:.1f}\n")
#> AIC: 251.3

# Forecast (SimpleExpSmoothing uses .forecast(), not .get_forecast())
n_forecast = 12
forecast_values = ses_fit.forecast(steps=n_forecast)
forecast_dates = pd.date_range(start=petrol_ts.index[-1] + pd.DateOffset(months=1),
                               periods=n_forecast, freq='MS')
forecast_mean = pd.Series(forecast_values.values, index=forecast_dates)

# Approximate 95% CI from in-sample residuals
sigma = ses_fit.resid.std()
lower_ci = forecast_mean - 1.96 * sigma
upper_ci = forecast_mean + 1.96 * sigma

# Plot
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(petrol_ts.index, petrol_ts.values, color='black', linewidth=1.5, label='Historical Data')
ax.plot(forecast_mean.index, forecast_mean.values, color='blue', linewidth=2, label='SES Forecast')
ax.fill_between(forecast_mean.index, lower_ci, upper_ci,
                 alpha=0.2, color='blue', label='95% Prediction Interval')
ax.set_title('Nigerian Petrol Price: SES Forecast', fontsize=12, fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Price (NGN/litre)')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('ses_forecast.png', dpi=150, bbox_inches='tight')
print("Plot saved as 'ses_forecast.png'")
#> Plot saved as 'ses_forecast.png'

# Print forecast
print("\n=== SES Forecast (next 12 months) ===")
#> 
#> === SES Forecast (next 12 months) ===
forecast_df = pd.DataFrame({
    'Forecast': forecast_mean.values,
    'Lower_CI': lower_ci.values,
    'Upper_CI': upper_ci.values
}, index=forecast_mean.index)
print(forecast_df.round(2))
#>             Forecast  Lower_CI  Upper_CI
#> 2024-01-01     61.15     54.96     67.33
#> 2024-02-01     61.15     54.96     67.33
#> 2024-03-01     61.15     54.96     67.33
#> 2024-04-01     61.15     54.96     67.33
#> 2024-05-01     61.15     54.96     67.33
#> 2024-06-01     61.15     54.96     67.33
#> 2024-07-01     61.15     54.96     67.33
#> 2024-08-01     61.15     54.96     67.33
#> 2024-09-01     61.15     54.96     67.33
#> 2024-10-01     61.15     54.96     67.33
#> 2024-11-01     61.15     54.96     67.33
#> 2024-12-01     61.15     54.96     67.33

SES automatically chooses \(\alpha\) to minimise forecast error, then projects forward. The forecast is flat (constant) because SES assumes no trend or seasonality—a reasonable assumption for many stationary series but crude for trending data.

📝 Section 24.1 Review Questions

If \(\alpha = 0.9\), does the forecast adapt quickly or slowly to new data?
SES forecast is always a constant (horizontal line into the future). Is this a strength or weakness?
How would you choose between \(\alpha = 0.2\) and \(\alpha = 0.8\) without fitting both? (Hint: look at the data visually.)
What is the initial condition \(\hat{Y}_1\)? How does it affect the forecast?

29.2 Holt’s Linear Trend Method

If data exhibits a trend, SES is inappropriate. Holt’s method extends SES with a level and trend component:

\[\ell_t = \alpha Y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1})\] \[b_t = \beta (\ell_t - \ell_{t-1}) + (1 - \beta) b_{t-1}\] \[\hat{Y}_{t+h} = \ell_t + h \cdot b_t\]

where \(\ell_t\) is the estimated level (baseline), \(b_t\) is the estimated trend (slope), \(\alpha\) controls level smoothing, and \(\beta\) controls trend smoothing.

The forecast is a linear extrapolation: current level plus \(h\) periods of the current trend.

29.2.1 Fitting Holt’s Method

Similar to SES, \(\alpha\) and \(\beta\) are chosen to minimise forecast errors. Most software provides automatic selection.

Show code

# Fit Holt's linear trend method to petrol price
holt_fit <- holt(petrol_ts, alpha = NULL, beta = NULL, initial = "optimal", h = 12)

cat("=== Holt's Linear Trend Method ===\n")
#> === Holt's Linear Trend Method ===
cat("Smoothing parameter (alpha):", round(holt_fit$model$par['alpha'], 3), "\n")
#> Smoothing parameter (alpha): 0.463
cat("Trend parameter (beta):", round(holt_fit$model$par['beta'], 3), "\n")
#> Trend parameter (beta): 0.415
cat("AIC:", round(holt_fit$model$aic, 1), "\n\n")
#> AIC: 769.3

# Forecast
forecast_holt <- forecast(holt_fit, h = 12)

# Plot
autoplot(petrol_ts) +
  autolayer(forecast_holt, alpha = 0.7, PI = TRUE) +
  theme_minimal() +
  labs(title = "Nigerian Petrol Price: Holt's Linear Trend Forecast",
       x = "Date", y = "Price (NGN/litre)") +
  theme(legend.position = "bottom")

Show code


# Compare SES vs Holt
cat("=== SES vs. Holt Comparison ===\n")
#> === SES vs. Holt Comparison ===
cat("SES AIC:", round(ses_fit$model$aic, 1), "\n")
#> SES AIC: 776.6
cat("Holt AIC:", round(holt_fit$model$aic, 1), "\n")
#> Holt AIC: 769.3
cat("Better: ", ifelse(holt_fit$model$aic < ses_fit$model$aic, "Holt", "SES"), "\n")
#> Better:  Holt

Show code

from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Fit Holt's linear trend method
holt_model = ExponentialSmoothing(petrol_ts, trend='add', seasonal=None, initialization_method='estimated')
holt_fit = holt_model.fit(optimized=True)

print("=== Holt's Linear Trend Method ===")
#> === Holt's Linear Trend Method ===
print(f"Smoothing parameter (alpha): {holt_fit.params['smoothing_level']:.3f}")
#> Smoothing parameter (alpha): 0.603
print(f"Trend parameter (beta): {holt_fit.params['smoothing_trend']:.3f}")
#> Trend parameter (beta): 0.603
print(f"AIC: {holt_fit.aic:.1f}\n")
#> AIC: 244.4

# Forecast (ExponentialSmoothing uses .forecast(), not .get_forecast())
n_steps_holt = 12
forecast_holt_values = holt_fit.forecast(steps=n_steps_holt)
forecast_holt_dates = pd.date_range(start=petrol_ts.index[-1] + pd.DateOffset(months=1),
                                    periods=n_steps_holt, freq='MS')
forecast_holt_mean = pd.Series(forecast_holt_values.values, index=forecast_holt_dates)
sigma_holt = holt_fit.resid.std()
holt_lower = forecast_holt_mean - 1.96 * sigma_holt
holt_upper = forecast_holt_mean + 1.96 * sigma_holt

# Plot
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(petrol_ts.index, petrol_ts.values, color='black', linewidth=1.5, label='Historical Data')
ax.plot(forecast_holt_mean.index, forecast_holt_mean.values, color='darkgreen', linewidth=2, label="Holt's Trend Forecast")
ax.fill_between(forecast_holt_mean.index, holt_lower, holt_upper,
                 alpha=0.2, color='darkgreen', label='95% Prediction Interval')
ax.set_title("Nigerian Petrol Price: Holt's Linear Trend Forecast", fontsize=12, fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Price (NGN/litre)')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('holt_forecast.png', dpi=150, bbox_inches='tight')
print("Plot saved as 'holt_forecast.png'")
#> Plot saved as 'holt_forecast.png'

# Compare
print("\n=== SES vs. Holt Comparison ===")
#> 
#> === SES vs. Holt Comparison ===
ses_aic = ses_fit.aic
holt_aic = holt_fit.aic
print(f"SES AIC: {ses_aic:.1f}")
#> SES AIC: 251.3
print(f"Holt AIC: {holt_aic:.1f}")
#> Holt AIC: 244.4
print(f"Better: {'Holt' if holt_aic < ses_aic else 'SES'}")
#> Better: Holt

Holt’s method produces a sloped forecast line, capturing the trend. The forecast continues linearly into the future. This is more realistic for trending data than SES’s flat forecast.

📝 Section 24.2 Review Questions

Holt’s forecast continues linearly forever. Is this realistic for petrol prices? What could go wrong?
What is the relationship between \(\beta\) (trend smoothing) and trend persistence? If \(\beta = 0\), what happens?
How would you modify Holt’s method to forecast a trend that is decelerating (getting flatter over time)?

29.3 Holt-Winters Seasonal Method

The final extension adds seasonality via Holt-Winters:

\[\ell_t = \alpha \frac{Y_t}{s_{t-m}} + (1 - \alpha)(\ell_{t-1} + b_{t-1})\] \[b_t = \beta (\ell_t - \ell_{t-1}) + (1 - \beta) b_{t-1}\] \[s_t = \gamma \frac{Y_t}{\ell_t} + (1 - \gamma) s_{t-m}\] \[\hat{Y}_{t+h} = (\ell_t + h \cdot b_t) \cdot s_{t+h-m}\]

(This is the multiplicative version; an additive version exists for series without proportional seasonality.)

Three smoothing parameters: - \(\alpha\): level smoothing. - \(\beta\): trend smoothing. - \(\gamma\): seasonal smoothing.

The seasonal indices \(s_t\) are stored and recycled. If \(m = 12\) (monthly), the seasonal factor repeats every 12 months.

29.3.1 Applying Holt-Winters to Nigerian Retail Revenue

Show code

# Generate synthetic Nigerian retail revenue (monthly, strong seasonality)
set.seed(42)
n_months_retail <- 120
dates_retail <- seq(from = ymd("2014-01-01"), by = "month", length.out = n_months_retail)

# Retail revenue: upward trend + strong seasonality (December peak)
base_revenue <- 1000
trend_retail <- base_revenue + 1.5 * (1:n_months_retail)
seasonality_retail <- 200 * sin(2 * pi * (1:n_months_retail) / 12 - pi/2) +
                       150 * sin(2 * pi * (1:n_months_retail) / 6)  # Semi-annual
noise_retail <- rnorm(n_months_retail, 0, 50)
revenue <- trend_retail + seasonality_retail + noise_retail

revenue_ts <- ts(revenue, start = c(2014, 1), frequency = 12)

# Fit Holt-Winters (multiplicative for growing seasonality)
hw_fit <- hw(revenue_ts, seasonal = "mult", h = 24)

cat("=== Holt-Winters Seasonal Method ===\n")
#> === Holt-Winters Seasonal Method ===
cat("Smoothing parameters:\n")
#> Smoothing parameters:
cat("  Alpha (level):", round(hw_fit$model$par['alpha'], 3), "\n")
#>   Alpha (level): 0.045
cat("  Beta (trend):", round(hw_fit$model$par['beta'], 3), "\n")
#>   Beta (trend): 0.02
cat("  Gamma (seasonal):", round(hw_fit$model$par['gamma'], 3), "\n")
#>   Gamma (seasonal): 0
cat("AIC:", round(hw_fit$model$aic, 1), "\n\n")
#> AIC: 1571.6

# Forecast 24 months ahead
forecast_hw <- forecast(hw_fit, h = 24)

# Plot
autoplot(revenue_ts) +
  autolayer(forecast_hw, alpha = 0.7, PI = TRUE) +
  theme_minimal() +
  labs(title = "Nigerian Retail Revenue: Holt-Winters Seasonal Forecast",
       x = "Date", y = "Revenue (NGN millions)") +
  theme(legend.position = "bottom")

Show code


print(forecast_hw)
#>          Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
#> Jan 2024      1114.7304 1035.8555 1193.6052  994.1017 1235.3590
#> Feb 2024      1202.3793 1117.1188 1287.6398 1071.9846 1332.7739
#> Mar 2024      1160.8589 1078.2400 1243.4778 1034.5042 1287.2137
#> Apr 2024      1158.5369 1075.6250 1241.4487 1031.7341 1285.3396
#> May 2024      1244.2715 1154.5323 1334.0108 1107.0272 1381.5159
#> Jun 2024      1399.4963 1297.5264 1501.4663 1243.5467 1555.4459
#> Jul 2024      1511.3998 1399.8477 1622.9519 1340.7956 1682.0041
#> Aug 2024      1457.2582 1347.9986 1566.5177 1290.1601 1624.3562
#> Sep 2024      1201.3434 1109.5801 1293.1067 1061.0035 1341.6833
#> Oct 2024       948.9579  874.8989 1023.0169  835.6945 1062.2213
#> Nov 2024       829.4102  763.0906  895.7299  727.9830  930.8374
#> Dec 2024       980.8944  900.3220 1061.4667  857.6696 1104.1192
#> Jan 2025      1131.9457 1036.1931 1227.6982  985.5048 1278.3866
#> Feb 2025      1220.9243 1114.3329 1327.5157 1057.9069 1383.9418
#> Mar 2025      1178.7406 1072.3258 1285.1555 1015.9932 1341.4881
#> Apr 2025      1176.3599 1066.3590 1286.3609 1008.1280 1344.5918
#> May 2025      1263.3890 1140.8521 1385.9260 1075.9849 1450.7931
#> Jun 2025      1420.9712 1277.8598 1564.0827 1202.1012 1639.8413
#> Jul 2025      1534.5622 1373.9388 1695.1857 1288.9099 1780.2146
#> Aug 2025      1479.5624 1318.5094 1640.6154 1233.2531 1725.8717
#> Sep 2025      1219.7073 1081.5772 1357.8373 1008.4556 1430.9590
#> Oct 2025       963.4453  849.9017 1076.9889  789.7953 1137.0953
#> Nov 2025       842.0564  738.7763  945.3366  684.1030 1000.0099
#> Dec 2025       995.8314  868.7215 1122.9412  801.4336 1190.2291

Show code

# Generate synthetic Nigerian retail revenue
np.random.seed(42)
n_months_retail = 120
dates_retail = pd.date_range(start='2014-01-01', periods=n_months_retail, freq='MS')

base_revenue = 1000
trend_retail = base_revenue + 1.5 * np.arange(1, n_months_retail + 1)
seasonality_retail = (200 * np.sin(2 * np.pi * np.arange(1, n_months_retail + 1) / 12 - np.pi/2) +
                      150 * np.sin(2 * np.pi * np.arange(1, n_months_retail + 1) / 6))
noise_retail = np.random.normal(0, 50, n_months_retail)
revenue = trend_retail + seasonality_retail + noise_retail

revenue_ts = pd.Series(revenue, index=dates_retail)

# Fit Holt-Winters (multiplicative)
hw_model = ExponentialSmoothing(revenue_ts, trend='add', seasonal='mul', seasonal_periods=12, initialization_method='estimated')
hw_fit = hw_model.fit(optimized=True)

print("=== Holt-Winters Seasonal Method ===")
#> === Holt-Winters Seasonal Method ===
print(f"Smoothing parameters:")
#> Smoothing parameters:
print(f"  Alpha (level): {hw_fit.params['smoothing_level']:.3f}")
#>   Alpha (level): 0.000
print(f"  Beta (trend): {hw_fit.params['smoothing_trend']:.3f}")
#>   Beta (trend): 0.000
print(f"  Gamma (seasonal): {hw_fit.params['smoothing_seasonal']:.3f}")
#>   Gamma (seasonal): 0.000
print(f"AIC: {hw_fit.aic:.1f}\n")
#> AIC: 941.4

# Forecast 24 months (ExponentialSmoothing uses .forecast(), not .get_forecast())
n_steps_hw = 24
forecast_hw_values = hw_fit.forecast(steps=n_steps_hw)
forecast_hw_dates = pd.date_range(start=revenue_ts.index[-1] + pd.DateOffset(months=1),
                                  periods=n_steps_hw, freq='MS')
forecast_hw_mean = pd.Series(forecast_hw_values.values, index=forecast_hw_dates)
sigma_hw = hw_fit.resid.std()
hw_lower = forecast_hw_mean - 1.96 * sigma_hw
hw_upper = forecast_hw_mean + 1.96 * sigma_hw

# Plot
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(revenue_ts.index, revenue_ts.values, color='black', linewidth=1.5, label='Historical Data')
ax.plot(forecast_hw_mean.index, forecast_hw_mean.values, color='purple', linewidth=2, label='Holt-Winters Forecast')
ax.fill_between(forecast_hw_mean.index, hw_lower, hw_upper,
                 alpha=0.2, color='purple', label='95% Prediction Interval')
ax.set_title('Nigerian Retail Revenue: Holt-Winters Seasonal Forecast', fontsize=12, fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Revenue (NGN millions)')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('hw_forecast.png', dpi=150, bbox_inches='tight')
print("Plot saved as 'hw_forecast.png'")
#> Plot saved as 'hw_forecast.png'

print("\n=== Holt-Winters Forecast (next 24 months, first 12 shown) ===")
#> 
#> === Holt-Winters Forecast (next 24 months, first 12 shown) ===
print(forecast_hw_mean.head(12).round(1))
#> 2024-01-01    1136.1
#> 2024-02-01    1193.8
#> 2024-03-01    1143.0
#> 2024-04-01    1159.6
#> 2024-05-01    1221.7
#> 2024-06-01    1433.0
#> 2024-07-01    1525.7
#> 2024-08-01    1451.5
#> 2024-09-01    1183.9
#> 2024-10-01     929.2
#> 2024-11-01     895.6
#> 2024-12-01     979.7
#> Freq: MS, dtype: float64

Holt-Winters captures both the upward trend and the seasonal pattern, producing a forecast that rises over time while oscillating with the seasonal cycle. This is much more realistic for retail or other seasonal businesses.

📝 Section 24.3 Review Questions

When would you use additive seasonality vs. multiplicative seasonality?
Holt-Winters has three parameters (\(\alpha, \beta, \gamma\)). Are they always all estimated? What if the data has no trend?
If you forecast 24 months and the seasonal pattern repeats every 12 months, will the forecast eventually converge to the trend line?

29.4 The ARIMA Model Family: AR, MA, I, and Combinations

ARIMA (AutoRegressive Integrated Moving Average) is a powerful framework unifying three concepts:

AR(p): Autoregressive

The observation is a linear combination of its past values:

\[Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \epsilon_t\]

The series “remembers” its past. An AR(1) process with \(\phi_1 = 0.7\) means today’s value is 70% of yesterday plus noise.

MA(q): Moving Average

The observation is a linear combination of past shocks (forecast errors):

\[Y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}\]

Current value depends on recent noise terms, not on past Y values directly.

I(d): Integrated (differenced)

Take the original series and difference it \(d\) times to achieve stationarity:

\[\Delta^d Y_t\]

\(d = 0\) means no differencing (series is already stationary). \(d = 1\) means first differences (common for trending data).

ARIMA(p, d, q): Combine all three.

Fit an AR(p) model to the \(d\)-times-differenced series, with an MA(q) error term.

29.4.1 Example: ARIMA(1, 1, 1)

\[\Delta Y_t = c + \phi_1 \Delta Y_{t-1} + \epsilon_t + \theta_1 \epsilon_{t-1}\]

First difference the original series, then fit an ARMA(1, 1) model (AR with MA errors).

📝 Section 24.4 Review Questions

If a series is non-stationary, should you set \(d = 0\) or \(d > 0\)? Why?
An AR(1) with \(\phi_1 = 0.95\) vs. \(\phi_1 = 0.5\): which has more persistence? Which is more predictable?
Can you have an ARIMA(0, 0, q) model? What would it be?
If after first differencing, the differenced series is still non-stationary, what would you do?

29.5 Model Identification via ACF and PACF

Before fitting ARIMA, identify the orders \(p\), \(d\), \(q\) using ACF and PACF:

Plot the original series. If it has a trend, you likely need differencing.
Perform stationarity tests (ADF, KPSS). If non-stationary, set \(d = 1\). If still non-stationary after first differencing, try \(d = 2\) (rare).
Plot ACF and PACF of the (possibly differenced) series. Look for cutoff patterns:
- AR(p): PACF cuts off at lag \(p\); ACF decays gradually.
- MA(q): ACF cuts off at lag \(q\); PACF decays gradually.
- ARMA(p, q): Both ACF and PACF decay gradually.
Try a few candidate models (e.g., ARIMA(1,1,1), ARIMA(2,1,1), ARIMA(1,1,2)) and compare AIC/BIC.

29.5.1 Identifying ARIMA Order for NGN/USD Exchange Rate

Show code

library(tseries)

# Recreate NGN/USD exchange rate data (same seed as Chapter 23)
set.seed(42)
n_days_fx <- 2500
dates_fx <- seq(from = lubridate::ymd("2015-01-01"), by = "day", length.out = n_days_fx)
initial_rate <- 200
trend_fx <- initial_rate + 0.02 * (1:n_days_fx) + 5 * sin(2 * pi * (1:n_days_fx) / 365)
random_walk_fx <- 5 * cumsum(rnorm(n_days_fx, 0, 0.1))
fx_ts <- ts(trend_fx + random_walk_fx, frequency = 252)

# First difference it
diff_fx <- diff(fx_ts)

# Test stationarity of differenced series
adf_diff_fx <- adf.test(diff_fx)
cat("ADF test on differenced FX rate:\n")
#> ADF test on differenced FX rate:
cat("  P-value:", round(adf_diff_fx$p.value, 4), "\n")
#>   P-value: 0.01
cat("  Conclusion: Series is stationary (safe to fit ARIMA)\n\n")
#>   Conclusion: Series is stationary (safe to fit ARIMA)

# Plot ACF and PACF of differenced series
par(mfrow = c(2, 1), mar = c(3.5, 4, 2.5, 1))
acf(diff_fx, lag.max = 40, main = "ACF of Differenced FX Rate")
pacf(diff_fx, lag.max = 40, main = "PACF of Differenced FX Rate")

Show code

par(mfrow = c(1, 1), mar = c(5.1, 4.1, 4.1, 2.1))

cat("\n=== ARIMA Identification ===\n")
#> 
#> === ARIMA Identification ===
cat("From ACF/PACF inspection of differenced series:\n")
#> From ACF/PACF inspection of differenced series:
cat("  ACF: Decays gradually (no sharp cutoff)\n")
#>   ACF: Decays gradually (no sharp cutoff)
cat("  PACF: One significant spike at lag 1, then mostly insignificant\n")
#>   PACF: One significant spike at lag 1, then mostly insignificant
cat("  Interpretation: Suggests AR(1) or ARMA(1,1) on differenced data\n")
#>   Interpretation: Suggests AR(1) or ARMA(1,1) on differenced data
cat("  Possible models: ARIMA(1,1,0), ARIMA(1,1,1), ARIMA(2,1,1)\n\n")
#>   Possible models: ARIMA(1,1,0), ARIMA(1,1,1), ARIMA(2,1,1)

# Fit several candidates
candidates <- list(
  ARIMA_1_1_0 = arima(fx_ts, order = c(1, 1, 0)),
  ARIMA_1_1_1 = arima(fx_ts, order = c(1, 1, 1)),
  ARIMA_2_1_1 = arima(fx_ts, order = c(2, 1, 1)),
  ARIMA_1_1_2 = arima(fx_ts, order = c(1, 1, 2))
)

# Compare via information criteria
ics <- sapply(candidates, function(m) c(AIC = AIC(m), BIC = BIC(m)))
rownames(ics) <- c("AIC", "BIC")

cat("=== Information Criteria Comparison ===\n")
#> === Information Criteria Comparison ===
print(ics)
#>     ARIMA_1_1_0 ARIMA_1_1_1 ARIMA_2_1_1 ARIMA_1_1_2
#> AIC    3617.730    3619.728    3621.698    3621.616
#> BIC    3629.378    3637.199    3644.992    3644.910

best_model_name <- names(candidates)[which.min(ics["AIC", ])]
best_model <- candidates[[best_model_name]]

cat("\nBest model by AIC:", best_model_name, "\n")
#> 
#> Best model by AIC: ARIMA_1_1_0

Show code

from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import adfuller

# Recreate NGN/USD exchange rate data (same seed as Chapter 23)
np.random.seed(42)
n_days_fx = 2500
dates_fx_24 = pd.date_range(start='2015-01-01', periods=n_days_fx, freq='D')
initial_rate = 200
trend_fx = initial_rate + 0.02 * np.arange(1, n_days_fx + 1) + 5 * np.sin(2 * np.pi * np.arange(1, n_days_fx + 1) / 365)
random_walk_fx = 5 * np.cumsum(np.random.normal(0, 0.1, n_days_fx))
fx_ts = pd.Series(trend_fx + random_walk_fx, index=dates_fx_24)

# Differenced FX series
diff_fx = fx_ts.diff().dropna()

# Test stationarity of differenced series
adf_diff = adfuller(diff_fx, autolag='AIC')
print(f"ADF test on differenced FX rate:")
#> ADF test on differenced FX rate:
print(f"  P-value: {adf_diff[1]:.4f}")
#>   P-value: 0.0000
print(f"  Conclusion: Series is stationary\n")
#>   Conclusion: Series is stationary

# Plot ACF and PACF
fig, axes = plt.subplots(2, 1, figsize=(12, 8))
plot_acf(diff_fx, lags=40, ax=axes[0], title='ACF of Differenced FX Rate')
plot_pacf(diff_fx, lags=40, ax=axes[1], title='PACF of Differenced FX Rate', method='ywm')
plt.tight_layout()
plt.savefig('arima_identification.png', dpi=150, bbox_inches='tight')
print("Plot saved as 'arima_identification.png'")
#> Plot saved as 'arima_identification.png'

print("\n=== ARIMA Identification ===")
#> 
#> === ARIMA Identification ===
print("From ACF/PACF inspection of differenced series:")
#> From ACF/PACF inspection of differenced series:
print("  ACF: Decays gradually (no sharp cutoff)")
#>   ACF: Decays gradually (no sharp cutoff)
print("  PACF: One significant spike at lag 1")
#>   PACF: One significant spike at lag 1
print("  Interpretation: Suggests AR(1) or ARMA(1,1)")
#>   Interpretation: Suggests AR(1) or ARMA(1,1)
print("  Possible models: ARIMA(1,1,0), ARIMA(1,1,1), ARIMA(2,1,1), ARIMA(1,1,2)\n")
#>   Possible models: ARIMA(1,1,0), ARIMA(1,1,1), ARIMA(2,1,1), ARIMA(1,1,2)

# Fit candidates
candidates = {
    'ARIMA(1,1,0)': ARIMA(fx_ts, order=(1, 1, 0)),
    'ARIMA(1,1,1)': ARIMA(fx_ts, order=(1, 1, 1)),
    'ARIMA(2,1,1)': ARIMA(fx_ts, order=(2, 1, 1)),
    'ARIMA(1,1,2)': ARIMA(fx_ts, order=(1, 1, 2))
}

results = {}
for name, model in candidates.items():
    results[name] = model.fit()

# Compare
aic_bic = pd.DataFrame({
    'AIC': [results[name].aic for name in candidates.keys()],
    'BIC': [results[name].bic for name in candidates.keys()]
}, index=candidates.keys())

print("=== Information Criteria Comparison ===")
#> === Information Criteria Comparison ===
print(aic_bic.round(1))
#>                  AIC     BIC
#> ARIMA(1,1,0)  3594.9  3606.5
#> ARIMA(1,1,1)  3577.1  3594.6
#> ARIMA(2,1,1)  3598.9  3622.2
#> ARIMA(1,1,2)  3593.7  3616.9

best_model_name = aic_bic['AIC'].idxmin()
best_model_result = results[best_model_name]
print(f"\nBest model by AIC: {best_model_name}")
#> 
#> Best model by AIC: ARIMA(1,1,1)

Based on the ACF/PACF patterns and information criteria, ARIMA(1,1,1) emerges as the best choice: it balances fit (lower AIC) and parsimony (fewer parameters).

📝 Section 24.5 Review Questions

After differencing, you observe: ACF cuts off at lag 2, PACF decays. What ARIMA order do you propose?
You fit ARIMA(2,1,1) and ARIMA(1,1,1); the AICs are 520 and 518. Which is better? By how much?
Is it possible that two very different ARIMA models (e.g., ARIMA(1,1,0) and ARIMA(0,1,2)) have nearly identical AIC? What would this mean?

29.6 Fitting, Diagnosing, and Forecasting with ARIMA

Once you have selected a model, fit it and examine residuals to verify assumptions.

29.6.1 Maximum Likelihood Estimation

ARIMA parameters are typically estimated via maximum likelihood. The likelihood function is the probability of observing the data given the parameters. Numerical optimisation (e.g., gradient descent) finds the parameter values that maximise the likelihood.

29.6.2 Residual Diagnostics

After fitting, plot the residuals and check:

Mean zero: Residuals should average to zero.
Constant variance: No patterns (heteroscedasticity).
Uncorrelated: No residual autocorrelation (ACF of residuals should be insignificant).
Normally distributed: Histogram/QQ plot should be bell-shaped (though not critical for forecasting).

If residuals fail these checks, the model is misspecified. Try a different order.

29.6.3 Ljung-Box Test

Formally test for residual autocorrelation:

\[H_0: \text{Residuals are white noise (uncorrelated)}\] \[H_1: \text{Residuals are autocorrelated}\]

If p-value > 0.05, residuals are adequately white.

29.6.4 Automatic Model Selection: auto.arima()

Manually trying all combinations is tedious. The auto.arima() (R) or pmdarima.auto_arima() (Python) function automates:

Differencing to achieve stationarity.
Grid search over reasonable p, d, q ranges.
Selection by AIC/BIC.

While automatic, it is still good practice to verify the result.

29.6.5 Fitting and Diagnosing ARIMA(1,1,1) on FX Data

Show code

# Fit the best model
arima_fit <- arima(fx_ts, order = c(1, 1, 1))

# Print summary
cat("=== ARIMA(1,1,1) Model Summary ===\n")
#> === ARIMA(1,1,1) Model Summary ===
print(arima_fit)
#> 
#> Call:
#> arima(x = fx_ts, order = c(1, 1, 1))
#> 
#> Coefficients:
#>          ar1     ma1
#>       0.0074  0.0079
#> s.e.     NaN     NaN
#> 
#> sigma^2 estimated as 0.2486:  log likelihood = -1806.86,  aic = 3619.73

# Extract residuals and diagnose
residuals_arima <- residuals(arima_fit)

# Ljung-Box test
lb_test <- Box.test(residuals_arima, lag = 10, type = "Ljung-Box")
cat("\nLjung-Box Test (lag 10):\n")
#> 
#> Ljung-Box Test (lag 10):
cat("  Test Statistic:", round(lb_test$statistic, 3), "\n")
#>   Test Statistic: 3.784
cat("  P-value:", round(lb_test$p.value, 3), "\n")
#>   P-value: 0.957
if (lb_test$p.value > 0.05) {
  cat("  Conclusion: Residuals are white noise (good fit)\n")
} else {
  cat("  Conclusion: Residuals have autocorrelation (consider other orders)\n")
}
#>   Conclusion: Residuals are white noise (good fit)

# Diagnostic plots
par(mfrow = c(2, 2))
plot(residuals_arima, main = "Residuals Over Time", ylab = "Residual")
acf(residuals_arima, main = "ACF of Residuals")
pacf(residuals_arima, main = "PACF of Residuals")
qqnorm(residuals_arima)
qqline(residuals_arima)

Show code

par(mfrow = c(1, 1))

# Forecast 90 days ahead
forecast_arima <- forecast(arima_fit, h = 90)

# Plot with prediction intervals
autoplot(fx_ts) +
  autolayer(forecast_arima, alpha = 0.7) +
  theme_minimal() +
  labs(title = "NGN/USD Exchange Rate: ARIMA(1,1,1) Forecast",
       subtitle = "90-day ahead with 80% and 95% prediction intervals",
       x = "Date", y = "Exchange Rate") +
  theme(legend.position = "bottom")

Show code


# Print forecast
cat("\n=== ARIMA(1,1,1) Forecast (first 30 days) ===\n")
#> 
#> === ARIMA(1,1,1) Forecast (first 30 days) ===
print(forecast_arima, n = 30)
#>          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> 10.92063       233.8552 233.2162 234.4942 232.8780 234.8325
#> 10.92460       233.8553 232.9447 234.7659 232.4626 235.2479
#> 10.92857       233.8553 232.7372 234.9734 232.1453 235.5653
#> 10.93254       233.8553 232.5625 235.1480 231.8782 235.8323
#> 10.93651       233.8553 232.4089 235.3017 231.6432 236.0674
#> 10.94048       233.8553 232.2700 235.4405 231.4308 236.2797
#> 10.94444       233.8553 232.1424 235.5682 231.2356 236.4749
#> 10.94841       233.8553 232.0236 235.6869 231.0540 236.6566
#> 10.95238       233.8553 231.9121 235.7985 230.8834 236.8271
#> 10.95635       233.8553 231.8066 235.9039 230.7222 236.9884
#> 10.96032       233.8553 231.7063 236.0042 230.5688 237.1418
#> 10.96429       233.8553 231.6105 236.1000 230.4222 237.2883
#> 10.96825       233.8553 231.5186 236.1919 230.2817 237.4288
#> 10.97222       233.8553 231.4302 236.2803 230.1465 237.5640
#> 10.97619       233.8553 231.3449 236.3656 230.0161 237.6945
#> 10.98016       233.8553 231.2625 236.4481 229.8899 237.8206
#> 10.98413       233.8553 231.1825 236.5280 229.7676 237.9429
#> 10.98810       233.8553 231.1049 236.6057 229.6489 238.0616
#> 10.99206       233.8553 231.0294 236.6812 229.5335 238.1771
#> 10.99603       233.8553 230.9559 236.7547 229.4210 238.2895
#> 11.00000       233.8553 230.8842 236.8264 229.3113 238.3992
#> 11.00397       233.8553 230.8141 236.8964 229.2043 238.5063
#> 11.00794       233.8553 230.7457 236.9648 229.0996 238.6110
#> 11.01190       233.8553 230.6787 237.0318 228.9972 238.7134
#> 11.01587       233.8553 230.6131 237.0974 228.8969 238.8137
#> 11.01984       233.8553 230.5489 237.1617 228.7986 238.9120
#> 11.02381       233.8553 230.4858 237.2247 228.7021 239.0084
#> 11.02778       233.8553 230.4239 237.2866 228.6075 239.1031
#> 11.03175       233.8553 230.3631 237.3474 228.5145 239.1961
#> 11.03571       233.8553 230.3033 237.4072 228.4231 239.2875
#> 11.03968       233.8553 230.2446 237.4660 228.3332 239.3774
#> 11.04365       233.8553 230.1867 237.5238 228.2447 239.4658
#> 11.04762       233.8553 230.1298 237.5807 228.1577 239.5529
#> 11.05159       233.8553 230.0737 237.6368 228.0719 239.6386
#> 11.05556       233.8553 230.0185 237.6921 227.9874 239.7232
#> 11.05952       233.8553 229.9640 237.7465 227.9041 239.8065
#> 11.06349       233.8553 229.9103 237.8003 227.8219 239.8886
#> 11.06746       233.8553 229.8573 237.8533 227.7409 239.9697
#> 11.07143       233.8553 229.8050 237.9056 227.6609 240.0497
#> 11.07540       233.8553 229.7533 237.9572 227.5819 240.1286
#> 11.07937       233.8553 229.7023 238.0082 227.5039 240.2066
#> 11.08333       233.8553 229.6520 238.0586 227.4269 240.2837
#> 11.08730       233.8553 229.6022 238.1084 227.3507 240.3598
#> 11.09127       233.8553 229.5530 238.1576 227.2755 240.4351
#> 11.09524       233.8553 229.5043 238.2062 227.2011 240.5095
#> 11.09921       233.8553 229.4562 238.2543 227.1275 240.5830
#> 11.10317       233.8553 229.4086 238.3019 227.0547 240.6558
#> 11.10714       233.8553 229.3615 238.3490 226.9827 240.7278
#> 11.11111       233.8553 229.3150 238.3956 226.9114 240.7991
#> 11.11508       233.8553 229.2688 238.4417 226.8409 240.8696
#> 11.11905       233.8553 229.2232 238.4874 226.7711 240.9395
#> 11.12302       233.8553 229.1779 238.5326 226.7019 241.0086
#> 11.12698       233.8553 229.1332 238.5774 226.6334 241.0771
#> 11.13095       233.8553 229.0888 238.6218 226.5656 241.1450
#> 11.13492       233.8553 229.0448 238.6657 226.4983 241.2122
#> 11.13889       233.8553 229.0013 238.7093 226.4317 241.2788
#> 11.14286       233.8553 228.9581 238.7524 226.3657 241.3448
#> 11.14683       233.8553 228.9153 238.7952 226.3003 241.4103
#> 11.15079       233.8553 228.8729 238.8377 226.2354 241.4752
#> 11.15476       233.8553 228.8308 238.8797 226.1710 241.5395
#> 11.15873       233.8553 228.7891 238.9214 226.1072 241.6033
#> 11.16270       233.8553 228.7477 238.9628 226.0439 241.6666
#> 11.16667       233.8553 228.7067 239.0039 225.9812 241.7294
#> 11.17063       233.8553 228.6660 239.0446 225.9189 241.7916
#> 11.17460       233.8553 228.6256 239.0850 225.8571 241.8534
#> 11.17857       233.8553 228.5855 239.1251 225.7958 241.9148
#> 11.18254       233.8553 228.5457 239.1649 225.7349 241.9756
#> 11.18651       233.8553 228.5062 239.2044 225.6745 242.0360
#> 11.19048       233.8553 228.4670 239.2436 225.6146 242.0960
#> 11.19444       233.8553 228.4280 239.2825 225.5550 242.1555
#> 11.19841       233.8553 228.3894 239.3211 225.4959 242.2146
#> 11.20238       233.8553 228.3510 239.3595 225.4373 242.2733
#> 11.20635       233.8553 228.3129 239.3976 225.3790 242.3316
#> 11.21032       233.8553 228.2751 239.4355 225.3211 242.3895
#> 11.21429       233.8553 228.2375 239.4731 225.2636 242.4469
#> 11.21825       233.8553 228.2001 239.5104 225.2065 242.5041
#> 11.22222       233.8553 228.1630 239.5475 225.1498 242.5608
#> 11.22619       233.8553 228.1262 239.5844 225.0934 242.6172
#> 11.23016       233.8553 228.0896 239.6210 225.0374 242.6732
#> 11.23413       233.8553 228.0532 239.6574 224.9817 242.7288
#> 11.23810       233.8553 228.0170 239.6935 224.9264 242.7841
#> 11.24206       233.8553 227.9811 239.7295 224.8714 242.8391
#> 11.24603       233.8553 227.9453 239.7652 224.8168 242.8937
#> 11.25000       233.8553 227.9098 239.8007 224.7625 242.9480
#> 11.25397       233.8553 227.8745 239.8360 224.7085 243.0020
#> 11.25794       233.8553 227.8394 239.8711 224.6549 243.0557
#> 11.26190       233.8553 227.8046 239.9060 224.6015 243.1090
#> 11.26587       233.8553 227.7699 239.9407 224.5485 243.1621
#> 11.26984       233.8553 227.7354 239.9752 224.4957 243.2148
#> 11.27381       233.8553 227.7011 240.0095 224.4433 243.2673

Show code

# Fit ARIMA(1,1,1)
arima_model = ARIMA(fx_ts, order=(1, 1, 1))
arima_fit = arima_model.fit()

print("=== ARIMA(1,1,1) Model Summary ===")
#> === ARIMA(1,1,1) Model Summary ===
print(arima_fit.summary())
#>                                SARIMAX Results                                
#> ==============================================================================
#> Dep. Variable:                      y   No. Observations:                 2500
#> Model:                 ARIMA(1, 1, 1)   Log Likelihood               -1785.570
#> Date:                Sun, 10 May 2026   AIC                           3577.140
#> Time:                        15:42:42   BIC                           3594.611
#> Sample:                    01-01-2015   HQIC                          3583.483
#>                          - 11-04-2021                                         
#> Covariance Type:                  opg                                         
#> ==============================================================================
#>                  coef    std err          z      P>|z|      [0.025      0.975]
#> ------------------------------------------------------------------------------
#> ar.L1          0.9856      0.009    105.666      0.000       0.967       1.004
#> ma.L1         -0.9707      0.013    -71.942      0.000      -0.997      -0.944
#> sigma2         0.2444      0.007     35.495      0.000       0.231       0.258
#> ===================================================================================
#> Ljung-Box (L1) (Q):                   0.74   Jarque-Bera (JB):                 0.38
#> Prob(Q):                              0.39   Prob(JB):                         0.83
#> Heteroskedasticity (H):               0.96   Skew:                             0.03
#> Prob(H) (two-sided):                  0.51   Kurtosis:                         3.01
#> ===================================================================================
#> 
#> Warnings:
#> [1] Covariance matrix calculated using the outer product of gradients (complex-step).

# Extract residuals
residuals_arima = arima_fit.resid

# Ljung-Box test
from statsmodels.stats.diagnostic import acorr_ljungbox
lb_results = acorr_ljungbox(residuals_arima, lags=10, return_df=True)

print(f"\n\nLjung-Box Test Results:")
#> 
#> 
#> Ljung-Box Test Results:
print(lb_results)
#>      lb_stat  lb_pvalue
#> 1   0.000066   0.993523
#> 2   0.007082   0.996465
#> 3   0.054612   0.996661
#> 4   0.056852   0.999604
#> 5   0.056858   0.999960
#> 6   0.101899   0.999979
#> 7   0.113524   0.999996
#> 8   0.119064   1.000000
#> 9   0.126098   1.000000
#> 10  0.127996   1.000000
print(f"\nConclusion: Residuals are {'white noise' if lb_results['lb_pvalue'].iloc[-1] > 0.05 else 'autocorrelated'}")
#> 
#> Conclusion: Residuals are white noise

# Diagnostic plots
fig, axes = plt.subplots(2, 2, figsize=(12, 8))

# Residuals
axes[0, 0].plot(residuals_arima.values, color='blue', linewidth=0.8)
axes[0, 0].axhline(y=0, color='red', linestyle='--')
axes[0, 0].set_title('Residuals Over Time')
axes[0, 0].grid(True, alpha=0.3)

# ACF of residuals
plot_acf(residuals_arima, lags=40, ax=axes[0, 1])
axes[0, 1].set_title('ACF of Residuals')

# PACF of residuals
plot_pacf(residuals_arima, lags=40, ax=axes[1, 0], method='ywm')
axes[1, 0].set_title('PACF of Residuals')

# Q-Q plot
from scipy import stats
stats.probplot(residuals_arima, dist="norm", plot=axes[1, 1])
#> ((array([-3.45297476, -3.20602916, -3.06928934, ...,  3.06928934,
#>         3.20602916,  3.45297476], shape=(2500,)), array([ -1.62227291,  -1.5522529 ,  -1.51775872, ...,   1.5815681 ,
#>          1.88400021, 200.35442386], shape=(2500,))), (np.float64(0.7706753725976716), np.float64(0.09763750369065635), np.float64(0.19070069231932096)))
axes[1, 1].set_title('Q-Q Plot')

plt.tight_layout()
plt.savefig('arima_diagnostics.png', dpi=150, bbox_inches='tight')
print("\nPlot saved as 'arima_diagnostics.png'")
#> 
#> Plot saved as 'arima_diagnostics.png'

# Forecast 90 days ahead
forecast_arima = arima_fit.get_forecast(steps=90)
forecast_mean = forecast_arima.predicted_mean
forecast_ci = forecast_arima.conf_int()

# Plot
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(fx_ts.index, fx_ts.values, color='black', linewidth=1.5, label='Historical Data')
ax.plot(forecast_mean.index, forecast_mean.values, color='darkred', linewidth=2, label='ARIMA Forecast')
ax.fill_between(forecast_ci.index, forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1],
                 alpha=0.3, color='darkred', label='95% Prediction Interval')
ax.set_title('NGN/USD Exchange Rate: ARIMA(1,1,1) Forecast (90 days)', fontsize=12, fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Exchange Rate')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('arima_forecast.png', dpi=150, bbox_inches='tight')
print("ARIMA forecast plot saved as 'arima_forecast.png'")
#> ARIMA forecast plot saved as 'arima_forecast.png'

# Print first 30 forecast values
print("\n=== ARIMA(1,1,1) Forecast (first 30 days) ===")
#> 
#> === ARIMA(1,1,1) Forecast (first 30 days) ===
print(forecast_mean.head(30).round(2))
#> 2021-11-05    288.18
#> 2021-11-06    288.21
#> 2021-11-07    288.24
#> 2021-11-08    288.26
#> 2021-11-09    288.29
#> 2021-11-10    288.32
#> 2021-11-11    288.35
#> 2021-11-12    288.37
#> 2021-11-13    288.40
#> 2021-11-14    288.43
#> 2021-11-15    288.45
#> 2021-11-16    288.48
#> 2021-11-17    288.50
#> 2021-11-18    288.53
#> 2021-11-19    288.55
#> 2021-11-20    288.58
#> 2021-11-21    288.60
#> 2021-11-22    288.62
#> 2021-11-23    288.65
#> 2021-11-24    288.67
#> 2021-11-25    288.69
#> 2021-11-26    288.71
#> 2021-11-27    288.74
#> 2021-11-28    288.76
#> 2021-11-29    288.78
#> 2021-11-30    288.80
#> 2021-12-01    288.82
#> 2021-12-02    288.84
#> 2021-12-03    288.86
#> 2021-12-04    288.88
#> Freq: D, Name: predicted_mean, dtype: float64

The diagnostic plots reveal that ARIMA(1,1,1) fits well: residuals show no systematic patterns, ACF/PACF are mostly insignificant (white noise), and the Ljung-Box test (p-value > 0.05) confirms no residual autocorrelation. The forecast extends 90 days with widening prediction intervals, reflecting increasing uncertainty into the future.

📝 Section 24.6 Review Questions

You fit an ARIMA model and the ACF of residuals shows a significant spike at lag 12. What might this indicate?
Prediction intervals widen as you forecast further ahead. Why?
The Ljung-Box test p-value is 0.03 (reject white noise). Does this mean the ARIMA model is “bad”? What would you do?
Two forecasts, both using ARIMA but with different orders, have 95% prediction intervals of (200, 250) and (185, 265). Which is more uncertain? Which might be more realistic?

29.7 Seasonal ARIMA (SARIMA)

If data has seasonality (which most business data does), extend ARIMA to SARIMA(p,d,q)(P,D,Q)_m:

Lowercase (p, d, q): non-seasonal orders (as before).
Uppercase (P, D, Q): seasonal orders.
m: seasonal period (e.g., 12 for monthly data).

A SARIMA(1,1,1)(0,1,1)_12 model includes:

Non-seasonal: AR(1), first difference (d=1), no MA.
Seasonal: no AR, seasonal difference (D=1), seasonal MA(1) at lag 12.

29.7.1 Fitting SARIMA to CPI Data

Show code

# Recreate Nigerian CPI data (same seed as Chapter 23)
set.seed(42)
n_months_cpi <- 168
trend_cpi <- 100 + 0.5 * (1:n_months_cpi) + 0.001 * (1:n_months_cpi)^2
seasonality_cpi <- 2 * sin(2 * pi * (1:n_months_cpi) / 12)
noise_cpi <- rnorm(n_months_cpi, mean = 0, sd = 0.5)
cpi_ts <- ts(trend_cpi + seasonality_cpi + noise_cpi, start = c(2010, 1), frequency = 12)

# Fit SARIMA to CPI
sarima_fit <- arima(cpi_ts, order = c(1, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))

cat("=== SARIMA(1,1,1)(0,1,1)[12] Model Summary ===\n")
#> === SARIMA(1,1,1)(0,1,1)[12] Model Summary ===
print(sarima_fit)
#> 
#> Call:
#> arima(x = cpi_ts, order = c(1, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
#> 
#> Coefficients:
#>           ar1      ma1     sma1
#>       -0.1585  -0.5799  -0.6764
#> s.e.   0.1107   0.0966   0.0767
#> 
#> sigma^2 estimated as 0.3907:  log likelihood = -151.07,  aic = 310.14

# Forecast
forecast_sarima <- forecast(sarima_fit, h = 24)

autoplot(cpi_ts) +
  autolayer(forecast_sarima, alpha = 0.7, PI = TRUE) +
  theme_minimal() +
  labs(title = "Nigerian CPI: SARIMA Forecast",
       x = "Date", y = "CPI") +
  theme(legend.position = "bottom")

Show code


cat("\n=== SARIMA Forecast (next 24 months) ===\n")
#> 
#> === SARIMA Forecast (next 24 months) ===
print(forecast_sarima)
#>          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> Jan 2024       213.8929 213.0919 214.6939 212.6679 215.1180
#> Feb 2024       215.5828 214.7548 216.4108 214.3165 216.8491
#> Mar 2024       216.5997 215.7179 217.4815 215.2511 217.9482
#> Apr 2024       217.1958 216.2680 218.1235 215.7769 218.6147
#> May 2024       217.2562 216.2840 218.2285 215.7693 218.7432
#> Jun 2024       216.7596 215.7449 217.7743 215.2078 218.3115
#> Jul 2024       216.6215 215.5660 217.6770 215.0073 218.2357
#> Aug 2024       216.4601 215.3654 217.5548 214.7859 218.1343
#> Sep 2024       217.4439 216.3113 218.5765 215.7118 219.1760
#> Oct 2024       218.5991 217.4299 219.7683 216.8109 220.3873
#> Nov 2024       219.5387 218.3339 220.7434 217.6961 221.3812
#> Dec 2024       221.5403 220.3010 222.7796 219.6449 223.4356
#> Jan 2025       223.2612 221.9055 224.6170 221.1878 225.3346
#> Feb 2025       224.9068 223.5045 226.3091 222.7622 227.0514
#> Mar 2025       225.9307 224.4756 227.3858 223.7053 228.1561
#> Apr 2025       226.5257 225.0208 228.0306 224.2241 228.8272
#> May 2025       226.5863 225.0330 228.1396 224.2108 228.9618
#> Jun 2025       226.0897 224.4895 227.6898 223.6425 228.5369
#> Jul 2025       225.9515 224.3059 227.5972 223.4347 228.4684
#> Aug 2025       225.7901 224.1001 227.4801 223.2055 228.3748
#> Sep 2025       226.7739 225.0408 228.5071 224.1233 229.4246
#> Oct 2025       227.9292 226.1538 229.7045 225.2141 230.6443
#> Nov 2025       228.8687 227.0522 230.6852 226.0907 231.6468
#> Dec 2025       230.8703 229.0136 232.7270 228.0307 233.7099

Show code

from statsmodels.tsa.statespace.sarimax import SARIMAX

# Recreate Nigerian CPI data (same seed as Chapter 23)
np.random.seed(42)
n_months_cpi = 168
dates_cpi = pd.date_range(start='2010-01-01', periods=n_months_cpi, freq='MS')
trend_cpi = 100 + 0.5 * np.arange(1, n_months_cpi + 1) + 0.001 * np.arange(1, n_months_cpi + 1)**2
seasonality_cpi = 2 * np.sin(2 * np.pi * np.arange(1, n_months_cpi + 1) / 12)
noise_cpi = np.random.normal(0, 0.5, n_months_cpi)
cpi_ts = pd.Series(trend_cpi + seasonality_cpi + noise_cpi, index=dates_cpi)

# Fit SARIMA
sarima_model = SARIMAX(cpi_ts, order=(1, 1, 1), seasonal_order=(0, 1, 1, 12))
sarima_fit = sarima_model.fit(disp=False)

print("=== SARIMA(1,1,1)(0,1,1)[12] Model Summary ===")
#> === SARIMA(1,1,1)(0,1,1)[12] Model Summary ===
print(sarima_fit.summary())
#>                                      SARIMAX Results                                      
#> ==========================================================================================
#> Dep. Variable:                                  y   No. Observations:                  168
#> Model:             SARIMAX(1, 1, 1)x(0, 1, 1, 12)   Log Likelihood                -146.392
#> Date:                            Sun, 10 May 2026   AIC                            300.784
#> Time:                                    15:42:46   BIC                            312.958
#> Sample:                                01-01-2010   HQIC                           305.729
#>                                      - 12-01-2023                                         
#> Covariance Type:                              opg                                         
#> ==============================================================================
#>                  coef    std err          z      P>|z|      [0.025      0.975]
#> ------------------------------------------------------------------------------
#> ar.L1         -0.1968      0.132     -1.489      0.136      -0.456       0.062
#> ma.L1         -0.5893      0.126     -4.694      0.000      -0.835      -0.343
#> ma.S.L12      -0.6857      0.090     -7.661      0.000      -0.861      -0.510
#> sigma2         0.3669      0.048      7.659      0.000       0.273       0.461
#> ===================================================================================
#> Ljung-Box (L1) (Q):                   3.11   Jarque-Bera (JB):                 2.12
#> Prob(Q):                              0.08   Prob(JB):                         0.35
#> Heteroskedasticity (H):               1.20   Skew:                             0.16
#> Prob(H) (two-sided):                  0.52   Kurtosis:                         2.52
#> ===================================================================================
#> 
#> Warnings:
#> [1] Covariance matrix calculated using the outer product of gradients (complex-step).

# Forecast
forecast_sarima = sarima_fit.get_forecast(steps=24)
forecast_sarima_mean = forecast_sarima.predicted_mean
forecast_sarima_ci = forecast_sarima.conf_int()

# Plot
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(cpi_ts.index, cpi_ts.values, color='black', linewidth=1, label='Historical CPI')
ax.plot(forecast_sarima_mean.index, forecast_sarima_mean.values, color='darkblue', linewidth=2, label='SARIMA Forecast')
ax.fill_between(forecast_sarima_ci.index, forecast_sarima_ci.iloc[:, 0], forecast_sarima_ci.iloc[:, 1],
                 alpha=0.2, color='darkblue', label='95% Prediction Interval')
ax.set_title('Nigerian CPI: SARIMA(1,1,1)(0,1,1)[12] Forecast', fontsize=12, fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('CPI')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('sarima_forecast.png', dpi=150, bbox_inches='tight')
print("\nPlot saved as 'sarima_forecast.png'")
#> 
#> Plot saved as 'sarima_forecast.png'

# Print forecast
print("\n=== SARIMA Forecast (next 24 months) ===")
#> 
#> === SARIMA Forecast (next 24 months) ===
print(forecast_sarima_mean.round(1))
#> 2024-01-01    214.6
#> 2024-02-01    216.1
#> 2024-03-01    216.5
#> 2024-04-01    217.5
#> 2024-05-01    217.3
#> 2024-06-01    217.6
#> 2024-07-01    217.2
#> 2024-08-01    216.9
#> 2024-09-01    217.7
#> 2024-10-01    218.8
#> 2024-11-01    220.2
#> 2024-12-01    222.2
#> 2025-01-01    224.1
#> 2025-02-01    225.5
#> 2025-03-01    225.9
#> 2025-04-01    226.9
#> 2025-05-01    226.7
#> 2025-06-01    227.0
#> 2025-07-01    226.7
#> 2025-08-01    226.3
#> 2025-09-01    227.1
#> 2025-10-01    228.2
#> 2025-11-01    229.6
#> 2025-12-01    231.6
#> Freq: MS, Name: predicted_mean, dtype: float64

SARIMA naturally captures the seasonal cycle while maintaining the trend, producing a forecast that oscillates seasonally while trending upward.

📝 Section 24.7 Review Questions

What is the difference between d (non-seasonal differencing) and D (seasonal differencing)?
A SARIMA(1,1,1)(1,0,1)_12 includes seasonal AR(1). What does this mean: past seasonal values influence the current value?
For monthly data, what is m? For quarterly? For daily?
If you fit both ARIMA(1,1,1) and SARIMA(1,1,1)(0,1,1)[12], which likely has lower AIC? Why?

29.8 Communicating Forecasts and Uncertainty

A point forecast (“Next month’s CPI will be 110”) is rarely sufficient. Prediction intervals quantify uncertainty: “Next month’s CPI will be 110 ± 2 at 95% confidence.”

29.8.1 Fan Charts

A fan chart shows point forecasts and expanding prediction intervals. As time extends, intervals widen (reflecting uncertainty growth). This visual immediately communicates risk to decision-makers.

Show code

# Create a fan chart for CPI forecast
forecast_obj <- forecast_sarima

# Prepare data
forecast_data <- tibble(
  date = seq(from = as.Date("2023-07-01"), by = "month", length.out = length(forecast_obj$mean)),
  point = forecast_obj$mean,
  lower_80 = forecast_obj$lower[, 1],
  upper_80 = forecast_obj$upper[, 1],
  lower_95 = forecast_obj$lower[, 2],
  upper_95 = forecast_obj$upper[, 2]
)

ggplot() +
  # Historical CPI
  geom_line(aes(x = date_cpi, y = cpi, color = "Historical"),
            data = tibble(date_cpi = seq.Date(as.Date("2010-01-01"), by = "month", length.out = length(cpi_ts)),
                          cpi = as.numeric(cpi_ts)), linewidth = 1) +
  # 95% interval (light)
  geom_ribbon(aes(x = date, ymin = lower_95, ymax = upper_95, fill = "95% Interval"),
              data = forecast_data, alpha = 0.2) +
  # 80% interval (darker)
  geom_ribbon(aes(x = date, ymin = lower_80, ymax = upper_80, fill = "80% Interval"),
              data = forecast_data, alpha = 0.4) +
  # Point forecast
  geom_line(aes(x = date, y = point, color = "Forecast"), data = forecast_data, linewidth = 1) +
  theme_minimal() +
  labs(title = "Nigerian CPI: Fan Chart Forecast",
       subtitle = "Shaded regions show 80% and 95% prediction intervals",
       x = "Date", y = "CPI",
       color = "Series", fill = "Confidence") +
  theme(legend.position = "bottom")

Show code


# Interpretation for stakeholders
cat("\n=== INTERPRETING THE FAN CHART ===\n\n")
#> 
#> === INTERPRETING THE FAN CHART ===
cat("Blue line: Most likely CPI path (point forecast).\n")
#> Blue line: Most likely CPI path (point forecast).
cat("Dark shading: 80% chance CPI falls here. (4 in 5 chance.)\n")
#> Dark shading: 80% chance CPI falls here. (4 in 5 chance.)
cat("Light shading: 95% chance CPI falls here. (19 in 20 chance.)\n")
#> Light shading: 95% chance CPI falls here. (19 in 20 chance.)
cat("Widening bands: Uncertainty increases over time.\n")
#> Widening bands: Uncertainty increases over time.
cat("Narrative: 'We expect inflation to continue rising, but by Q4,\n")
#> Narrative: 'We expect inflation to continue rising, but by Q4,
cat("  it could be anywhere from 115 to 125 with 95% confidence.'\n")
#>   it could be anywhere from 115 to 125 with 95% confidence.'

Show code

# Create a fan chart
forecast_dates = forecast_sarima_mean.index
forecast_point = forecast_sarima_mean.values
forecast_lower_95 = forecast_sarima_ci.iloc[:, 0].values
forecast_upper_95 = forecast_sarima_ci.iloc[:, 1].values

# 80% intervals (approximately 1.28 * std)
forecast_lower_80 = forecast_sarima_mean - 1.28 * (forecast_upper_95 - forecast_sarima_mean) / 1.96
forecast_upper_80 = forecast_sarima_mean + 1.28 * (forecast_upper_95 - forecast_sarima_mean) / 1.96

fig, ax = plt.subplots(figsize=(14, 7))

# Historical CPI
ax.plot(cpi_ts.index, cpi_ts.values, color='black', linewidth=2, label='Historical CPI')

# 95% interval
ax.fill_between(forecast_dates, forecast_lower_95, forecast_upper_95,
                 alpha=0.15, color='blue', label='95% Prediction Interval')

# 80% interval
ax.fill_between(forecast_dates, forecast_lower_80, forecast_upper_80,
                 alpha=0.3, color='blue', label='80% Prediction Interval')

# Point forecast
ax.plot(forecast_dates, forecast_point, color='darkblue', linewidth=2.5, label='Point Forecast')

ax.set_title('Nigerian CPI: Fan Chart Forecast', fontsize=14, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('CPI', fontsize=12)
ax.legend(fontsize=11, loc='best')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('fan_chart.png', dpi=150, bbox_inches='tight')
print("Fan chart saved as 'fan_chart.png'")
#> Fan chart saved as 'fan_chart.png'

print("\n=== INTERPRETING THE FAN CHART ===\n")
#> 
#> === INTERPRETING THE FAN CHART ===
print("Blue line: Most likely CPI path (point forecast).")
#> Blue line: Most likely CPI path (point forecast).
print("Dark shading: 80% chance CPI falls here. (4 in 5 chance.)")
#> Dark shading: 80% chance CPI falls here. (4 in 5 chance.)
print("Light shading: 95% chance CPI falls here. (19 in 20 chance.)")
#> Light shading: 95% chance CPI falls here. (19 in 20 chance.)
print("Widening bands: Uncertainty increases over time.")
#> Widening bands: Uncertainty increases over time.
print("Narrative: 'We expect inflation to continue rising, but by Q4,")
#> Narrative: 'We expect inflation to continue rising, but by Q4,
print("  it could be anywhere from 115 to 125 with 95% confidence.'")
#>   it could be anywhere from 115 to 125 with 95% confidence.'

The fan chart is a powerful communication tool. It avoids the false certainty of a single point forecast while giving decision-makers a clear sense of the range of likely outcomes.

📝 Section 24.8 Review Questions

A stakeholder says, “Your forecast says CPI will be 110. I’ll plan assuming 110.” How would you respond?
Why do prediction intervals widen further into the future?
If the 95% interval for next month is (108, 112) but CPI turns out to be 115, does this mean your model was wrong?

29.9 Case Study: Forecasting Nigerian Foreign Exchange Rates with ARIMA and Sensitivity Analysis

We execute a full production-ready forecast: fit ARIMA to NGN/USD daily rates (2015-2023), produce 90-day-ahead forecasts with confidence intervals, analyse sensitivity to model choice, and communicate findings.

Business Context: A Nigerian import firm wants to hedge against currency risk. They need a probabilistic forecast of the NGN/USD rate over the next 3 months to inform hedging strategy.

Data: Daily NGN/USD exchange rates, 2015-2023 (synthetic but realistic: trending with volatility).

Approach: 1. Fit ARIMA(1,1,1) (identified in earlier sections). 2. Produce 90-day forecast with 95% prediction intervals. 3. Fit alternative models (ARIMA(2,1,1), ARIMA(1,1,2)) and compare forecasts. 4. Analyse sensitivity: how much do forecasts change with model choice? 5. Recommend hedging strategy based on forecast range.

Show code

# Full case study: NGN/USD forecasting

cat("=== NIGERIAN FX RATE FORECASTING CASE STUDY ===\n")
#> === NIGERIAN FX RATE FORECASTING CASE STUDY ===
cat("Business Goal: Forecast NGN/USD for next 90 days to inform hedging strategy.\n\n")
#> Business Goal: Forecast NGN/USD for next 90 days to inform hedging strategy.

# Data summary
cat("Data Summary:\n")
#> Data Summary:
cat("  Series: NGN/USD daily exchange rate\n")
#>   Series: NGN/USD daily exchange rate
cat("  Period: 2015-01-01 to 2023-09-30 (", length(fx_ts), "observations)\n")
#>   Period: 2015-01-01 to 2023-09-30 ( 2500 observations)
cat("  Current rate:", round(tail(fx_ts, 1), 2), "\n")
#>   Current rate: 233.85
cat("  Min/Max:", round(min(fx_ts), 2), "/", round(max(fx_ts), 2), "\n\n")
#>   Min/Max: 195.56 / 237.98

# Fit three candidate models
models <- list(
  ARIMA_1_1_1 = arima(fx_ts, order = c(1, 1, 1)),
  ARIMA_2_1_1 = arima(fx_ts, order = c(2, 1, 1)),
  ARIMA_1_1_2 = arima(fx_ts, order = c(1, 1, 2))
)

# Compare
ics <- sapply(models, function(m) c(AIC = AIC(m), BIC = BIC(m)))
cat("=== Model Comparison ===\n")
#> === Model Comparison ===
print(ics)
#>     ARIMA_1_1_1 ARIMA_2_1_1 ARIMA_1_1_2
#> AIC    3619.728    3621.698    3621.616
#> BIC    3637.199    3644.992    3644.910

# Forecast all models
forecasts <- lapply(models, function(m) forecast(m, h = 90))

# Extract mean and intervals for ARIMA(1,1,1) (best by AIC)
best_forecast <- forecasts$ARIMA_1_1_1
forecast_dates_90 <- seq(from = tail(dates_fx, 1) + 1, by = "day", length.out = 90)

# Create comparison plot
forecast_data_comp <- tibble(
  date = forecast_dates_90,
  ARIMA_1_1_1 = best_forecast$mean,
  ARIMA_2_1_1 = forecasts$ARIMA_2_1_1$mean,
  ARIMA_1_1_2 = forecasts$ARIMA_1_1_2$mean
) |>
  pivot_longer(-date, names_to = "Model", values_to = "Forecast")

ggplot() +
  geom_line(aes(x = dates_fx, y = as.numeric(fx_ts)), color = "black", linewidth = 1, label = "Historical") +
  geom_line(data = forecast_data_comp, aes(x = date, y = Forecast, color = Model), linewidth = 1) +
  geom_ribbon(aes(x = forecast_dates_90, ymin = best_forecast$lower[, 2], ymax = best_forecast$upper[, 2]),
              alpha = 0.15, fill = "blue") +
  theme_minimal() +
  labs(title = "NGN/USD Forecast: Model Sensitivity Analysis",
       subtitle = "Shaded: 95% PI for ARIMA(1,1,1). Lines: Point forecasts from 3 models.",
       x = "Date", y = "Exchange Rate") +
  theme(legend.position = "bottom")

Show code


# Risk analysis
cat("\n=== Risk Analysis (ARIMA(1,1,1)) ===\n")
#> 
#> === Risk Analysis (ARIMA(1,1,1)) ===
upper_95 <- best_forecast$upper[, 2]
lower_95 <- best_forecast$lower[, 2]

cat("90-day forecast range:\n")
#> 90-day forecast range:
cat("  Best case (lower bound):", round(min(lower_95), 2), "\n")
#>   Best case (lower bound): 224.44
cat("  Expected (point forecast):", round(tail(best_forecast$mean, 1), 2), "\n")
#>   Expected (point forecast): 233.86
cat("  Worst case (upper bound):", round(max(upper_95), 2), "\n")
#>   Worst case (upper bound): 243.27
cat("  Range width:", round(max(upper_95) - min(lower_95), 2), "\n\n")
#>   Range width: 18.82

# Business recommendation
current_rate <- as.numeric(tail(fx_ts, 1))
expected_rate <- as.numeric(tail(best_forecast$mean, 1))
pct_change <- 100 * (expected_rate - current_rate) / current_rate

cat("=== BUSINESS RECOMMENDATION ===\n")
#> === BUSINESS RECOMMENDATION ===
cat("Current rate:", round(current_rate, 2), "\n")
#> Current rate: 233.85
cat("Expected rate (90 days):", round(expected_rate, 2), "\n")
#> Expected rate (90 days): 233.86
cat("Expected move:", round(pct_change, 1), "%\n\n")
#> Expected move: 0 %

if (pct_change > 2) {
  cat("Direction: NAIRA DEPRECIATION (stronger USD)\n")
  cat("Hedging advice: Consider forward contracts or options to lock in rates.\n")
  cat("Risk: If rate strengthens beyond", round(max(upper_95), 2), "(5% probability),\n")
  cat("  import costs could increase substantially.\n")
} else if (pct_change < -2) {
  cat("Direction: NAIRA APPRECIATION (weaker USD)\n")
  cat("Hedging advice: May be able to defer hedging or use collars.\n")
} else {
  cat("Direction: RELATIVELY FLAT\n")
  cat("Hedging advice: Limited currency risk; focus on operational efficiency.\n")
}
#> Direction: RELATIVELY FLAT
#> Hedging advice: Limited currency risk; focus on operational efficiency.

Show code

# Full case study: NGN/USD forecasting

print("=== NIGERIAN FX RATE FORECASTING CASE STUDY ===")
#> === NIGERIAN FX RATE FORECASTING CASE STUDY ===
print("Business Goal: Forecast NGN/USD for next 90 days to inform hedging strategy.\n")
#> Business Goal: Forecast NGN/USD for next 90 days to inform hedging strategy.

# Data summary
print("Data Summary:")
#> Data Summary:
print(f"  Series: NGN/USD daily exchange rate")
#>   Series: NGN/USD daily exchange rate
print(f"  Period: 2015-01-01 to 2023-09-30 ({len(fx_ts)} observations)")
#>   Period: 2015-01-01 to 2023-09-30 (2500 observations)
print(f"  Current rate: {fx_ts.iloc[-1]:.2f}")
#>   Current rate: 288.15
print(f"  Min/Max: {fx_ts.min():.2f}/{fx_ts.max():.2f}\n")
#>   Min/Max: 197.89/294.91

# Fit three models
models_dict = {
    'ARIMA(1,1,1)': SARIMAX(fx_ts, order=(1,1,1)),
    'ARIMA(2,1,1)': SARIMAX(fx_ts, order=(2,1,1)),
    'ARIMA(1,1,2)': SARIMAX(fx_ts, order=(1,1,2))
}

fits = {}
aics = []
for name, model in models_dict.items():
    fit = model.fit(disp=False)
    fits[name] = fit
    aics.append(fit.aic)

# Compare
comparison_df = pd.DataFrame({
    'Model': models_dict.keys(),
    'AIC': aics,
    'BIC': [fits[name].bic for name in models_dict.keys()]
})

print("=== Model Comparison ===")
#> === Model Comparison ===
print(comparison_df.round(1))
#>           Model     AIC     BIC
#> 0  ARIMA(1,1,1)  3577.1  3594.6
#> 1  ARIMA(2,1,1)  3598.9  3622.2
#> 2  ARIMA(1,1,2)  3593.7  3616.9

best_model_name = comparison_df.loc[comparison_df['AIC'].idxmin(), 'Model']
best_fit = fits[best_model_name]

# Forecast all models
forecasts_dict = {}
for name, fit in fits.items():
    fc = fit.get_forecast(steps=90)
    forecasts_dict[name] = fc

# Create comparison data
forecast_dates_90 = pd.date_range(start=fx_ts.index[-1] + timedelta(days=1), periods=90, freq='D')

comparison_plot_data = pd.DataFrame({
    'date': forecast_dates_90,
    'ARIMA(1,1,1)': forecasts_dict['ARIMA(1,1,1)'].predicted_mean.values,
    'ARIMA(2,1,1)': forecasts_dict['ARIMA(2,1,1)'].predicted_mean.values,
    'ARIMA(1,1,2)': forecasts_dict['ARIMA(1,1,2)'].predicted_mean.values
})

# Plot
best_forecast = forecasts_dict[best_model_name]
best_ci = best_forecast.conf_int()

fig, ax = plt.subplots(figsize=(14, 7))
ax.plot(fx_ts.index, fx_ts.values, color='black', linewidth=1.5, label='Historical Data')

for model_name in models_dict.keys():
    fc = forecasts_dict[model_name].predicted_mean
    ax.plot(forecast_dates_90, fc.values, linewidth=1.5, label=model_name)

ax.fill_between(best_ci.index, best_ci.iloc[:, 0], best_ci.iloc[:, 1],
                 alpha=0.15, color='blue', label='95% PI (Best Model)')

ax.set_title('NGN/USD Forecast: Model Sensitivity Analysis', fontsize=14, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Exchange Rate', fontsize=12)
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('fx_forecast_case_study.png', dpi=150, bbox_inches='tight')
print("\nPlot saved as 'fx_forecast_case_study.png'")
#> 
#> Plot saved as 'fx_forecast_case_study.png'

# Risk analysis
upper_95 = best_ci.iloc[:, 1].values
lower_95 = best_ci.iloc[:, 0].values

print("\n=== Risk Analysis (Best Model) ===")
#> 
#> === Risk Analysis (Best Model) ===
print(f"90-day forecast range:")
#> 90-day forecast range:
print(f"  Best case (lower bound): {lower_95.min():.2f}")
#>   Best case (lower bound): 276.13
print(f"  Expected (point forecast): {best_forecast.predicted_mean.iloc[-1]:.2f}")
#>   Expected (point forecast): 289.66
print(f"  Worst case (upper bound): {upper_95.max():.2f}")
#>   Worst case (upper bound): 303.19
print(f"  Range width: {upper_95.max() - lower_95.min():.2f}\n")
#>   Range width: 27.06

# Business recommendation
current_rate = fx_ts.iloc[-1]
expected_rate = best_forecast.predicted_mean.iloc[-1]
pct_change = 100 * (expected_rate - current_rate) / current_rate

print("=== BUSINESS RECOMMENDATION ===")
#> === BUSINESS RECOMMENDATION ===
print(f"Current rate: {current_rate:.2f}")
#> Current rate: 288.15
print(f"Expected rate (90 days): {expected_rate:.2f}")
#> Expected rate (90 days): 289.66
print(f"Expected move: {pct_change:.1f}%\n")
#> Expected move: 0.5%

if pct_change > 2:
    print("Direction: NAIRA DEPRECIATION (stronger USD)")
    print("Hedging advice: Consider forward contracts to lock in rates.")
    print(f"Risk: Rate could strengthen to {upper_95.max():.2f} (5% probability).")
elif pct_change < -2:
    print("Direction: NAIRA APPRECIATION (weaker USD)")
    print("Hedging advice: Defer hedging or use collars.")
else:
    print("Direction: RELATIVELY FLAT")
    print("Hedging advice: Limited currency risk.")
#> Direction: RELATIVELY FLAT
#> Hedging advice: Limited currency risk.

This complete case study demonstrates the full workflow: data exploration, model selection, forecasting, uncertainty quantification, and actionable business recommendations. A firm using this forecast can now make informed hedging decisions, quantifying the downside risk they face.

📝 Case Study Reflection Questions

If the hedging cost is 2% of the transaction value, and the expected move is 1% depreciation, is hedging worthwhile?
What would you do if all three models produced very different forecasts (e.g., ranging from -5% to +10%)?
How would you update the forecast if a major policy change (e.g., central bank intervention) occurs mid-forecast period?
The 95% prediction interval is (420, 480) NGN/USD. If the actual rate turns out to be 500, does this invalidate your model? Why or why not?

29.10 Chapter Exercises

Chapter 24 Exercises

Exercise 24.1: Exponential Smoothing by Hand

A supermarket chain in Abuja records weekly sales of a popular beverage (in cartons). The last 8 weeks of data are:

Week 1: 120, Week 2: 135, Week 3: 128, Week 4: 142, Week 5: 138, Week 6: 155, Week 7: 149, Week 8: 162

Apply Simple Exponential Smoothing (SES) with α = 0.3. Use Week 1 actual sales as your starting forecast. Calculate the forecast for weeks 2–9. Show all working.
Now apply SES with α = 0.8. Calculate forecasts for weeks 2–9.
Calculate the Mean Squared Error (MSE) and Mean Absolute Error (MAE) for both α values using the 8 weeks of data (weeks 1–8). Which α gives a lower error?
Explain intuitively what the difference between α = 0.3 and α = 0.8 means in terms of how the model responds to recent changes in demand. In a market where trends change quickly, which α would you prefer?
Simple exponential smoothing has no trend component. What would happen to its forecasts for a series that is steadily growing by 10 cartons per week? Describe the long-run behaviour of SES forecasts in this scenario.

Exercise 24.2: ARIMA Order Identification

For each of the following ACF/PACF descriptions, suggest the most likely ARIMA(p,d,q) model. Briefly justify each choice.

The original series has an upward trend. The ACF decays very slowly. After first differencing, the ACF cuts off after lag 1, and the PACF shows exponential decay.
The original series has no clear trend. The ACF shows a significant spike at lag 1 and then cuts off. The PACF decays exponentially.
The original series oscillates with no trend. The ACF shows significant spikes at lags 1, 2, and 3, then cuts off. The PACF decays exponentially.
After first differencing, both the ACF and PACF show no significant autocorrelation at any lag (all values within the blue confidence bands).
The original series shows a strong seasonal pattern with a 12-month cycle. After seasonal differencing (lag 12), the ACF has a significant spike at lag 12 only. What type of model would you use?

Exercise 24.3: ARIMA Forecasting — Step by Step

You fit an ARIMA(1,1,1) model to monthly cement production data (tonnes, millions) and get the following estimated parameters:

AR coefficient (φ₁) = 0.45
MA coefficient (θ₁) = −0.30
Constant (drift) = 0.05

The last two actual observations are: Y₅₉ = 8.4, Y₆₀ = 8.7 The last residual (forecast error at time 60) is: ε₆₀ = 0.12

The ARIMA(1,1,1) forecast formula for the differenced series W_t = Y_t − Y_{t−1} is: \[\hat{W}_{t+1} = c + \phi_1 W_t + \theta_1 \varepsilon_t\]

And the level forecast: \(\hat{Y}_{t+1} = Y_t + \hat{W}_{t+1}\)

Calculate W₆₀ = Y₆₀ − Y₅₉. What does this represent?
Using the formula above, calculate the one-step-ahead forecast \(\hat{W}_{61}\).
Convert to a level forecast: what is \(\hat{Y}_{61}\)?
Without calculating, describe how the uncertainty in the two-step-ahead forecast (\(\hat{Y}_{62}\)) compares to the one-step-ahead forecast. Would you expect a wider or narrower prediction interval?
After fitting the model, you check the residual ACF and find a significant spike at lag 3. What does this tell you about the model fit? What would you do?

Exercise 24.4: Holt-Winters for Seasonal Data

A Nigerian bank records quarterly net interest income (₦ billions) over 4 years:

Year	Q1	Q2	Q3	Q4
2021	12.1	13.8	14.2	16.5
2022	13.0	14.9	15.3	17.8
2023	14.1	16.2	16.8	19.3
2024	15.3	17.5	18.1	21.0

Identify the trend and seasonal pattern in this data. Is the seasonality additive or multiplicative? (Hint: does the size of the Q4 peak grow proportionally with the overall level?)
Which version of Holt-Winters (additive or multiplicative) would you apply to this series? Explain your reasoning.
Using the 2021–2023 data as your training set and 2024 as your test set, fit both additive and multiplicative Holt-Winters models (using R or Python). Compare their RMSE on the test set.

# Load data
quarterly_data <- c(12.1, 13.8, 14.2, 16.5,
                    13.0, 14.9, 15.3, 17.8,
                    14.1, 16.2, 16.8, 19.3,
                    15.3, 17.5, 18.1, 21.0)
ts_data <- ts(quarterly_data, frequency = 4, start = c(2021, 1))

Using the better model, forecast all four quarters of 2025. Report both point forecasts and 80% prediction intervals.
The bank’s CFO says: “Q4 is always our best quarter — we should budget for the seasonal peak to continue.” Using your model, how would you validate or challenge this assumption? (Consider: are the seasonal factors stable year-over-year, or are they changing?)

Exercise 24.5: Model Comparison and Selection

You are forecasting monthly exports from Nigeria (USD billions) for the next 12 months. You have fitted four models on 10 years of training data and evaluated them on a 1-year test set:

Model	Training RMSE	Test RMSE	AIC	Training Time
ARIMA(2,1,1)	0.48	0.61	142	2 sec
ARIMA(1,1,0)	0.52	0.58	147	1 sec
Holt-Winters (multiplicative)	0.41	0.65	—	1 sec
ETS(M,Ad,M)	0.45	0.59	138	1 sec

Which model has the best in-sample fit (lowest training RMSE)? Which has the best out-of-sample forecast accuracy (lowest test RMSE)?
Holt-Winters has the best training RMSE but the worst test RMSE. What does this suggest about the Holt-Winters model’s behaviour?
ARIMA(2,1,1) has a lower AIC than ARIMA(1,1,0). However, the simpler ARIMA(1,1,0) has better test performance. How do you reconcile these findings? Which model would you choose and why?
A statistician colleague says: “You should always choose the model with the lowest AIC.” A business manager says: “I don’t care about AIC — I care about how accurate the forecasts will be.” Who is right? Under what circumstances do AIC and test performance agree, and when might they diverge?
Write a 150-word recommendation memo to the CBN’s Export Finance Department explaining which forecasting model you would deploy, why, and what its limitations are. Use plain language.

29.11 Appendix: ARIMA Estimation and Diagnostics

29.12 A24.1 AR(p) Yule-Walker Equations

For an AR(p) process \(Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \epsilon_t\) with white noise \(\epsilon_t\), the autocovariances \(\gamma_h = \text{Cov}(Y_t, Y_{t-h})\) satisfy:

\[\gamma_h = \phi_1 \gamma_{h-1} + \phi_2 \gamma_{h-2} + \cdots + \phi_p \gamma_{h-p} \quad \text{for } h \geq 1\]

The Yule-Walker equations set these equal to sample autocovariances and solve for \(\phi_1, \ldots, \phi_p\):

\[\begin{pmatrix} \gamma_0 & \gamma_1 & \cdots & \gamma_{p-1} \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{p-2} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{p-1} & \gamma_{p-2} & \cdots & \gamma_0 \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{pmatrix} = \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_p \end{pmatrix}\]

This gives consistent (asymptotically) estimates of the AR parameters.

29.13 A24.2 MA(q) Invertibility Condition

An MA(q) process \(Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}\) is invertible if the roots of:

\[1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q = 0\]

all lie outside the unit circle (|z| > 1). Invertibility ensures the MA process can be represented as an AR(∞) and is statistically identifiable.

29.14 A24.3 Ljung-Box Q Statistic Derivation

The Ljung-Box Q statistic tests whether the first \(m\) sample autocorrelations of residuals are significantly different from zero:

\[Q_m = n(n+2) \sum_{h=1}^{m} \frac{\hat{\rho}_h^2}{n - h}\]

where \(\hat{\rho}_h\) is the sample autocorrelation at lag \(h\) and \(n\) is the sample size. Under the null (white noise), \(Q_m \sim \chi^2_m\).

The Ljung-Box correction \((n+2)/(n-h)\) improves the test’s performance in small samples compared to the older Box-Pierce statistic.

29.15 A24.4 Proof That ARIMA(0,1,0) Is a Random Walk

ARIMA(0,1,0) with no AR, one differencing, and no MA:

\[\Delta Y_t = \epsilon_t\] \[Y_t - Y_{t-1} = \epsilon_t\] \[Y_t = Y_{t-1} + \epsilon_t\]

This is a random walk (or random walk with drift if a constant is added). Each value is the previous value plus a white noise shock. The random walk is non-stationary: it has no constant mean or variance, and its ACF decays to zero very slowly.

The random walk is the canonical I(1) process, confirming that ARIMA(0,1,0) achieves stationarity via one differencing.