ArimaX

ArimaX stands for AutoRegressive Integrated Moving Average with eXogenous variables.

The autoregressive part (AR) forecasts the variable of interest from its own past values, called lags. Lags are simply historical demand values: Lag 1 is the demand value in the previous period, Lag 2 is the demand value two periods ago, and so on. With daily data, Lag 1 is yesterday’s demand; with weekly data, Lag 1 is last week’s demand.
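
As a minimal sketch of how lags line up with the original series, the following plain-Python example builds Lag 1 and Lag 2 from a short demand history. The `lag` helper and the demand values are hypothetical and used only for illustration.

```python
def lag(series, k):
    """Return the series shifted back by k periods.

    The first k positions have no history, so they are filled with None.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = len(series)
    return [None] * min(k, n) + list(series[:max(n - k, 0)])


# Hypothetical daily demand, oldest value first.
demand = [120, 135, 128, 150, 142]

lag_1 = lag(demand, 1)  # yesterday's demand for each day
lag_2 = lag(demand, 2)  # demand from two days earlier

print(lag_1)  # [None, 120, 135, 128, 150]
print(lag_2)  # [None, None, 120, 135, 128]
```

If the same values were weekly totals, `lag_1` would hold last week's demand instead.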

The integrated part (I) represents the differencing of raw observations to make the time series stationary. A time series is said to be weakly stationary when its mean, variance, and autocorrelation structure do not change over time. For example, a series with an upward trend is non-stationary because its mean changes over time. If a time series is non-stationary, the Arima algorithm takes differences between consecutive values to form a new time series. Differencing removes trends of this kind, and it can be applied more than once if a single pass does not make the series stationary.
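
The differencing step can be sketched in a few lines of plain Python. The `difference` helper and the sample values are hypothetical; the example only shows that differencing a steadily rising series leaves a series with a constant level.

```python
def difference(series, order=1):
    """Difference consecutive values; repeat `order` times."""
    if order < 0:
        raise ValueError("order must be non-negative")
    values = list(series)
    for _ in range(order):
        # Each pass shortens the series by one observation.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical demand with a steady upward trend.
trending = [100, 110, 120, 130, 140]
print(difference(trending))  # [10, 10, 10, 10]

# A curved trend may need a second pass.
curved = [1, 4, 9, 16, 25]
print(difference(curved, order=2))  # [2, 2, 2]
```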

The moving average part (MA) uses past forecast errors in a regression-like model. A past forecast error is the difference between the actual value and the fitted value for that period. The moving average part of an Arima algorithm is therefore different from a simple moving average, which averages past demand values.
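
To illustrate the distinction, the sketch below computes past forecast errors and a weighted sum of the two most recent errors, then contrasts it with a simple moving average of demand. All numbers, including the `thetas` weights, are hypothetical; in practice the weights are estimated when the model is fitted.

```python
# Hypothetical actual and fitted demand, oldest value first.
actual = [100, 104, 99, 107]
fitted = [98, 105, 101, 104]

# Past forecast errors: actual minus fitted.
errors = [a - f for a, f in zip(actual, fitted)]  # [2, -1, -2, 3]

# Hypothetical moving average coefficients for the last two errors.
thetas = [0.5, 0.3]

# MA contribution: weighted sum of the most recent errors (most recent first).
ma_contribution = thetas[0] * errors[-1] + thetas[1] * errors[-2]

# A simple moving average, by contrast, averages the demand values themselves.
simple_moving_average = sum(actual[-2:]) / 2

print(ma_contribution)        # approximately 0.9
print(simple_moving_average)  # 103.0
```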

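The external part (X) indicates the algorithm's ability to incorporate external features. After the time series is differenced to become stationary, the linear regression equation for ArimaX is expressed as:

$$Y_t = \alpha_1 Y_{t-1} + \dots + \alpha_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \beta_1 X_{1,t} + \dots + \beta_k X_{k,t}$$

In this equation, $Y_t$ is the differenced demand in period $t$, $\varepsilon_t$ is the error in period $t$, and $X_{1,t}$ to $X_{k,t}$ are the values of the causal variables in period $t$: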

  • The last k terms of the equation represent the k causal variables.

  • $Y_{t-1}$ to $Y_{t-p}$ are the first p lags.

  • $\varepsilon_{t-1}$ to $\varepsilon_{t-q}$ are the first q moving average terms.

  • The alphas, thetas, and betas are the coefficients of the autoregressive terms, the moving average terms, and the features, respectively.
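
The terms described above can be combined into a one-step forecast, as in the following minimal sketch. It assumes the series has already been differenced, that the coefficients are already known, and that the current-period error is set to its expected value of zero. The function name, argument names, and all numbers are hypothetical.

```python
def arimax_one_step(lags, errors, causals, alphas, thetas, betas):
    """One-step forecast from the three groups of terms.

    lags:    most recent first (value one period ago, two periods ago, ...)
    errors:  past forecast errors, most recent first
    causals: values of the causal variables for the forecast period
    """
    if (len(lags) != len(alphas)
            or len(errors) != len(thetas)
            or len(causals) != len(betas)):
        raise ValueError("each coefficient list must match its input list in length")
    ar_part = sum(a * y for a, y in zip(alphas, lags))
    ma_part = sum(t * e for t, e in zip(thetas, errors))
    causal_part = sum(b * x for b, x in zip(betas, causals))
    return ar_part + ma_part + causal_part


# Hypothetical model with p = 2 lags, q = 1 error term, and k = 1 causal variable.
forecast = arimax_one_step(
    lags=[10.0, 8.0],
    errors=[1.5],
    causals=[1.0],
    alphas=[0.6, 0.2],
    thetas=[0.4],
    betas=[2.0],
)
print(forecast)  # approximately 10.2
```

Because the forecast is on the differenced scale, it would still need to be added back to the last observed demand to obtain a forecast in the original units.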

The Arima algorithm can also accommodate seasonal lags, seasonal differencing, and seasonal moving average terms. For example, if the data is weekly and has annual seasonality, seasonal lag 1 is the demand value that occurred in the same week of the previous year. Seasonal differencing is the difference between the current demand and the demand that occurred in the same week of the previous year. Seasonal moving average term 1 is the error between the demand that occurred in the same week of the previous year and the corresponding fitted value. The Arima variation that includes seasonal terms is known as SARIMA.
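
Seasonal differencing can be sketched the same way as ordinary differencing, except that each value is compared with the value one season earlier. The example assumes weekly data with a 52-week season; the `seasonal_difference` helper and the generated demand values are hypothetical.

```python
def seasonal_difference(series, period):
    """Subtract from each value the value observed one season earlier."""
    if period < 1:
        raise ValueError("period must be a positive integer")
    return [series[i] - series[i - period] for i in range(period, len(series))]


# Hypothetical weekly demand: the same yearly pattern, 5 units higher in year two.
year_one = [100 + (week % 10) for week in range(52)]
year_two = [value + 5 for value in year_one]
weekly_demand = year_one + year_two

# Each week of year two is compared with the same week of year one.
seasonal_diff = seasonal_difference(weekly_demand, period=52)
print(seasonal_diff[:3])  # [5, 5, 5]
```

The repeating yearly pattern cancels out, leaving only the year-over-year change.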

This algorithm is useful when:

  • The time series length is medium to long (greater than 50 points).

  • Causal data is available (ArimaX can use causal data).

  • The time series is non-stationary. ArimaX can accommodate this by first making it stationary.

ArimaX should not be used when the number of causals is higher than the number of data points in the demand history.
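
The guidance above can be expressed as a simple pre-check, sketched below. The `arimax_is_suitable` helper is hypothetical; it encodes only the two numeric rules stated here (more than 50 data points, and no more causals than data points) and is not a substitute for evaluating forecast accuracy.

```python
def arimax_is_suitable(n_points, n_causals, min_points=50):
    """Rule-of-thumb check based on history length and number of causals."""
    if n_points < 0 or n_causals < 0:
        raise ValueError("counts must be non-negative")
    long_enough = n_points > min_points      # medium to long history
    causals_ok = n_causals <= n_points       # not more causals than data points
    return long_enough and causals_ok


print(arimax_is_suitable(n_points=104, n_causals=3))  # True
print(arimax_is_suitable(n_points=30, n_causals=3))   # False
print(arimax_is_suitable(n_points=60, n_causals=80))  # False
```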

Last modified: Friday May 12, 2023
