Input data

Demand Modeler requires a number of tables populated with the data to be processed, along with parameters that determine how the data is manipulated.

The following diagram summarizes the types of tables needed to execute the forecasting process:

For more information about the structure and content of the input data, see the Demand Modeler Data Dictionary.

Data preparation

Before forecasting can proceed, the raw data read from the input tables must be transformed into a format that the forecasting process can use. Ultimately, all of the data is combined into large tables referred to as feature matrices. These tables include one row for each timestamp of each time series, and each row includes all of its feature values. One feature matrix exists for the historical data, and another feature matrix exists for the dates to be forecast.

As part of the steps for building the feature matrices, additional processing is performed to aggregate data, impute missing values, and provide a certain amount of data validation.
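The shape of a feature matrix and the forward-fill imputation step can be sketched as follows. This is an illustrative miniature, not the actual Demand Modeler schema: the column names (sku, ts, demand) and the two-day horizon are assumptions for the example.

```python
from datetime import date, timedelta

# Illustrative historical rows for one time series, with one missing value.
history = [
    {"sku": "A1", "ts": date(2023, 1, 1), "demand": 10.0},
    {"sku": "A1", "ts": date(2023, 1, 2), "demand": None},   # gap to impute
    {"sku": "A1", "ts": date(2023, 1, 3), "demand": 14.0},
]

def build_matrices(rows, horizon=2):
    """Build one matrix for the historical data and one for the dates to forecast."""
    hist, last = [], None
    for r in rows:
        # Impute a missing value by carrying the last observed value forward.
        v = r["demand"] if r["demand"] is not None else last
        last = v
        hist.append({**r, "demand": v})
    # The future matrix has one row per timestamp to be forecast, target unknown.
    last_ts = rows[-1]["ts"]
    future = [{"sku": rows[-1]["sku"], "ts": last_ts + timedelta(days=h), "demand": None}
              for h in range(1, horizon + 1)]
    return hist, future

hist, future = build_matrices(history)
```

Each row carries all of its feature values; in the real process, related time series, events, and master data would be joined in as additional columns.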

Target time series

Table: target_time_series

This table contains the main set of historical data for the time series to be forecast. It is keyed by a number of dimensions, such as city or sku, and by a timestamp column that records the date of each historical data point; together, these columns link each record to corresponding data in other tables. It also contains a column holding the historical value, which should be present for every row to ensure accurate forecasts.

Related time series

Tables: mapped_features, trend_cloud_features

These tables contain additional data related to the target time series. For example, if the target time series contains the sales of products, there might be a related time series that has the price of the product on each date, which may be useful in predicting the demand for the product. These related time series are joined together with the target time series via the dimensions and timestamps.
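The join described above amounts to a left join on the shared dimensions and timestamps. A minimal sketch, with illustrative names (sku, ts, price) that are not the exact Demand Modeler columns:

```python
from datetime import date

# Target rows keyed by (sku, ts); a related price series keyed the same way.
target = [{"sku": "A1", "ts": date(2023, 1, 1), "demand": 11},
          {"sku": "A1", "ts": date(2023, 1, 2), "demand": 12}]
price = {("A1", date(2023, 1, 1)): 9.99,
         ("A1", date(2023, 1, 2)): 8.49}

# Left-join the related series onto the target via dimensions and timestamp.
joined = [{**row, "price": price.get((row["sku"], row["ts"]))} for row in target]
```

Rows with no matching related value simply get a missing price, to be handled during imputation.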

Events

Table: event_info

This table contains date-based occurrences that may have an impact on the forecast of the target time series. For example, holidays or promotions might affect the demand for a product, so for each timestamp in the target time series it might help to know whether any events occurred at the same time. These events are joined together with the target time series via the timestamps.
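Because events are keyed only by date, the join attaches each event to every row sharing its timestamp. A hedged sketch, with invented event names:

```python
from datetime import date

# Illustrative events keyed by timestamp (e.g. a holiday applies to all series).
events = {date(2023, 12, 25): "christmas"}
target = [{"sku": "A1", "ts": date(2023, 12, 24), "demand": 50},
          {"sku": "A1", "ts": date(2023, 12, 25), "demand": 5}]

# Flag each target row with the event (if any) occurring at its timestamp.
flagged = [{**r, "event": events.get(r["ts"]), "has_event": r["ts"] in events}
           for r in target]
```

The resulting event columns can then serve as features when fitting the forecast.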

Master data

Tables: dim_1_master, dim_2_master, dim_3_master, dim_1_dim_2_master, dim_1_dim_3_master, dim_2_dim_3_master, dim_1_dim_2_dim_3_master

These tables are used to specify additional characteristics for each of the dimensions and their combinations. For example, a dimension that is a product id might have additional characteristics, such as category, that can be captured via these tables and used as categorical features in the forecasting process. These tables, if present, are joined to the target time series table via the corresponding dimension columns.
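Joining master data onto the target rows is a simple dimension-keyed lookup. The attribute names below (category) are illustrative, not part of the documented schema:

```python
# Hypothetical dim_1_master content: extra attributes for each product id.
dim_1_master = {"A1": {"category": "beverages"},
                "B2": {"category": "snacks"}}
target = [{"sku": "A1", "demand": 10},
          {"sku": "B2", "demand": 7}]

# Join master attributes onto each target row via the dimension column.
enriched = [{**r, **dim_1_master.get(r["sku"], {})} for r in target]
```

Rows whose dimension value has no master entry are left unchanged rather than dropped.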

Feature configuration

Tables: feature_groups, feature_assignment, feature_availability_prediction_time

These tables can be used to configure features in the forecasting process.

Algorithm overrides

Tables: time_series_groups, algorithm_overrides

These tables can be used to override the algorithms run for specific LODs. The LODs and algorithms specified in those tables take precedence over the process described in Specifying which algorithm to use.

Parameters

Tables: input_parameters, advanced_params

Demand Modeler uses two parameter tables. The input_parameters table contains the more commonly modified parameters; the advanced_params table is used to override the usual defaults for specific cases.
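The override relationship between the two tables can be sketched as a keyed lookup: an advanced_params entry for a specific case wins, otherwise the input_parameters value is used. The key structure (Entity, Entity Subtype, Parameter name) mirrors the lookups described below; the concrete values are illustrative only.

```python
# Illustrative contents of the two parameter tables.
input_parameters = {"Forecast_Strategy": "AutoTuning", "Effort_Level": "Low"}
advanced_params = {("Effort Level", "Low", "Algorithms"): ["Naive", "AutoArima"]}

def get_param(name, entity=None, subtype=None):
    """Return the advanced override for this case if present, else the input value."""
    override = advanced_params.get((entity, subtype, name))
    return override if override is not None else input_parameters.get(name)

algos = get_param("Algorithms", entity="Effort Level", subtype="Low")
```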

Specifying which algorithm to use

Demand Modeler provides a lot of flexibility for choosing the forecasting algorithms to be used in the forecasting process. This section outlines the process it follows in reading the parameters to ultimately decide what to use.

If input parameter Forecast_Strategy is set to AutoTuning:

  • If input parameter Effort_Level is set to Low

    • Look for an advanced parameter where Entity='Effort Level', Entity Subtype='Low' and Parameter name='Algorithms'

      • If found, use those algorithms.

      • If not found, use Naive, Seasonal Naive, Simple Moving Average, AutoArima, and Exponential Smoothing.

  • If input parameter Effort_Level is set to Medium

    • Look for an advanced parameter where Entity='Effort Level', Entity Subtype='Medium' and Parameter name='Algorithms'

      • If found, use those algorithms.

      • If not found, use Stochastic Gradient Boosting (Global), ArimaX, AutoArima, and Exponential Smoothing.

  • If input parameter Effort_Level is set to High

    • Look for an advanced parameter where Entity='Effort Level', Entity Subtype='High' and Parameter name='Algorithms'

      • If found, use those algorithms.

      • If not found, use Naive, Seasonal Naive, Simple Moving Average, Prophet, Stochastic Gradient Boosting (Global), ArimaX, AutoArima, and Exponential Smoothing.

If input parameter Forecast_Strategy is NOT set to AutoTuning:

  • Read input parameter Algorithms.

    • If values are present, use those algorithms.

    • If no values are present, use Prophet.

Note that if any algorithm overrides are specified at the LOD level via the algorithm_overrides table, those overrides take precedence over the algorithms specified via the parameters.
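The selection steps above, including the LOD-level override, can be sketched as one function. The default algorithm lists mirror the documented fallbacks; the table contents and the lod argument are illustrative assumptions:

```python
# Documented per-effort-level fallback algorithm lists.
EFFORT_DEFAULTS = {
    "Low": ["Naive", "Seasonal Naive", "Simple Moving Average",
            "AutoArima", "Exponential Smoothing"],
    "Medium": ["Stochastic Gradient Boosting (Global)", "ArimaX",
               "AutoArima", "Exponential Smoothing"],
    "High": ["Naive", "Seasonal Naive", "Simple Moving Average", "Prophet",
             "Stochastic Gradient Boosting (Global)", "ArimaX",
             "AutoArima", "Exponential Smoothing"],
}

def choose_algorithms(params, advanced, lod_overrides=None, lod=None):
    # LOD-level overrides from algorithm_overrides win over everything else.
    if lod_overrides and lod in lod_overrides:
        return lod_overrides[lod]
    if params.get("Forecast_Strategy") == "AutoTuning":
        level = params.get("Effort_Level")
        # An advanced parameter for this effort level, if present, wins.
        override = advanced.get(("Effort Level", level, "Algorithms"))
        return override or EFFORT_DEFAULTS[level]
    # Not AutoTuning: use the Algorithms input parameter, else Prophet.
    return params.get("Algorithms") or ["Prophet"]

algos = choose_algorithms(
    {"Forecast_Strategy": "AutoTuning", "Effort_Level": "Low"}, {})
```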

Last modified: Friday May 12, 2023
