Input data
Demand Modeler requires a number of tables populated with the data to be processed, along with parameters that determine how the data is manipulated.
The following diagram summarizes the types of tables needed to execute the forecasting process:
For more information about the structure and content of the input data, see the Demand Modeler Data Dictionary.
Data preparation
Before forecasting can proceed, the raw data read from the input tables must be transformed into a format that the forecasting process can use. Ultimately, all of the data is combined into large tables referred to as feature matrices. These tables include one row for each timestamp of each time series, and each row includes all of its feature values. One feature matrix exists for the historical data, and another feature matrix exists for the dates to be forecast.
As part of the steps for building the feature matrices, additional processing is performed to aggregate data, impute missing values, and provide a certain amount of data validation.
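The exact preparation steps are internal to Demand Modeler, but the following sketch illustrates the general idea using pandas. The column names, daily frequency, and zero-fill imputation are assumptions for illustration; the real structures are defined in the Demand Modeler Data Dictionary.

```python
import pandas as pd

# Hypothetical column names; the real ones are defined in the Data Dictionary.
DIMENSIONS = ["city", "sku"]
TIMESTAMP, TARGET = "timestamp", "target_value"

def build_historical_matrix(target_ts: pd.DataFrame) -> pd.DataFrame:
    """One row per (dimension combination, timestamp), with calendar gaps filled."""
    filled = []
    for keys, grp in target_ts.groupby(DIMENSIONS):
        # Re-index each series onto a complete daily calendar; this is where
        # aggregation/imputation would happen (zero-fill is illustrative only).
        grp = grp.set_index(TIMESTAMP).sort_index()
        calendar = pd.date_range(grp.index.min(), grp.index.max(), freq="D")
        grp = grp.reindex(calendar)
        grp[TARGET] = grp[TARGET].fillna(0)
        for col, val in zip(DIMENSIONS, keys):
            grp[col] = val                     # restore the dimension key columns
        filled.append(grp.rename_axis(TIMESTAMP).reset_index())
    return pd.concat(filled, ignore_index=True)

def build_future_matrix(hist: pd.DataFrame, horizon_days: int) -> pd.DataFrame:
    """One row per (dimension combination, future timestamp) to be forecast."""
    last_dates = hist.groupby(DIMENSIONS)[TIMESTAMP].max().reset_index()
    blocks = []
    for _, row in last_dates.iterrows():
        dates = pd.date_range(row[TIMESTAMP] + pd.Timedelta(days=1),
                              periods=horizon_days, freq="D")
        block = pd.DataFrame({TIMESTAMP: dates})
        for col in DIMENSIONS:
            block[col] = row[col]
        blocks.append(block)
    return pd.concat(blocks, ignore_index=True)
```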
Target time series
Table: target_time_series
This table contains the main set of historical data for the time series to be forecast. It is keyed by a number of dimensions, such as city or sku, together with a timestamp column that records the date of each historical data point; these keys link each record to corresponding data in the other tables. It also contains a column with the historical value, which should be present on every row to ensure more accurate forecasts.
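As a rough illustration of the kind of validation applied during data preparation, a check on this table might resemble the sketch below. The column names are hypothetical placeholders.

```python
import pandas as pd

def validate_target_time_series(df: pd.DataFrame,
                                dimensions=("city", "sku"),
                                timestamp="timestamp",
                                target="target_value"):
    """Basic sanity checks on the target_time_series table (illustrative only)."""
    key_cols = list(dimensions) + [timestamp]
    missing = [c for c in key_cols + [target] if c not in df.columns]
    if missing:
        raise ValueError(f"target_time_series is missing columns: {missing}")
    # Each dimension/timestamp combination should appear at most once.
    if df.duplicated(subset=key_cols).any():
        raise ValueError("duplicate dimension/timestamp combinations found")
    # The historical value should be present on every row for best results.
    if df[target].isna().any():
        raise ValueError("rows with a missing historical value were found")
```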
Related time series
Tables: mapped_features, trend_cloud_features
These tables contain additional data related to the target time series. For example, if the target time series contains the sales of products, there might be a related time series that has the price of the product on each date, which may be useful in predicting the demand for the product. These related time series are joined together with the target time series via the dimensions and timestamps.
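For illustration, a related price series could be joined onto the target history as shown below. The frames and column names are hypothetical.

```python
import pandas as pd

# Hypothetical target history and a related price series,
# both keyed by the same dimensions plus a timestamp.
target = pd.DataFrame({
    "city": ["NYC", "NYC"],
    "sku": ["A", "A"],
    "timestamp": pd.to_datetime(["2023-01-01", "2023-01-02"]),
    "sales": [10, 12],
})
price = pd.DataFrame({
    "city": ["NYC", "NYC"],
    "sku": ["A", "A"],
    "timestamp": pd.to_datetime(["2023-01-01", "2023-01-02"]),
    "price": [4.99, 3.99],
})

# Related series join onto the target via the dimensions and the timestamp;
# a left join keeps every historical row even when a related value is missing.
features = target.merge(price, on=["city", "sku", "timestamp"], how="left")
```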
Events
Table: event_info
This table contains date-based occurrences that may have an impact on the forecast of the target time series. For example, holidays or promotions might affect the demand for a product, so for each timestamp in the target time series it might help to know whether any events occurred at the same time. These events are joined together with the target time series via the timestamps.
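For illustration (with hypothetical data), an event table can be joined on the timestamp and turned into an indicator feature:

```python
import pandas as pd

# Hypothetical target rows and event rows (one event row per occurrence date).
target = pd.DataFrame({
    "sku": ["A", "A"],
    "timestamp": pd.to_datetime(["2023-11-23", "2023-11-24"]),
    "sales": [10, 30],
})
events = pd.DataFrame({
    "timestamp": pd.to_datetime(["2023-11-24"]),
    "event_name": ["Black Friday"],
})

# Events join to the target on the timestamp only; rows with no matching event
# get a null event_name, which can be converted into a simple indicator feature.
with_events = target.merge(events, on="timestamp", how="left")
with_events["has_event"] = with_events["event_name"].notna().astype(int)
```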
Master data
Tables: dim_1_master, dim_2_master, dim_3_master, dim_1_dim_2_master, dim_1_dim_3_master, dim_2_dim_3_master, dim_1_dim_2_dim_3_master
These tables are used to specify additional characteristics for each of the dimensions and their combinations. For example, a dimension that is a product id might have additional characteristics, such as category, that can be captured via these tables and used as categorical features in the forecasting process. These tables, if present, are joined to the target time series table via the corresponding dimension columns.
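For illustration (again with hypothetical columns), a master table joins on the dimension alone, so its attributes repeat across every timestamp of the series and can act as categorical features:

```python
import pandas as pd

# Hypothetical target rows and one of the dim_*_master tables,
# keyed here by a product id dimension.
target = pd.DataFrame({
    "sku": ["A", "B"],
    "timestamp": pd.to_datetime(["2023-01-01", "2023-01-01"]),
    "sales": [10, 7],
})
dim_1_master = pd.DataFrame({
    "sku": ["A", "B"],
    "category": ["beverages", "snacks"],
})

# Master data joins on the dimension column alone (no timestamp), so each
# attribute repeats on every row of its series.
with_master = target.merge(dim_1_master, on="sku", how="left")
with_master["category"] = with_master["category"].astype("category")
```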
Feature configuration
Tables: feature_groups, feature_assignment, feature_availability_prediction_time
These tables can be used to configure features in the forecasting process.
Algorithm overrides
Tables: time_series_groups, algorithm_overrides
These tables can be used to override the algorithms run for specific LODs. The LODs and algorithms specified in these tables take precedence over the process described in Specifying which algorithms to use.
Parameters
Tables: input_parameters, advanced_params
Demand Modeler uses two parameter tables. The input_parameters table contains the more commonly modified parameters. The advanced_params table is used to override the usual defaults for specific cases.
Specifying which algorithms to use
Demand Modeler provides considerable flexibility in choosing which forecasting algorithms to run. This section outlines how the parameters are read to decide which algorithms to use.
- If input parameter Forecast_Strategy is set to AutoTuning:
  - If input parameter Effort_Level is set to Low:
    - Look for an advanced parameter where Entity='Effort Level', Entity Subtype='Low', and Parameter name='Algorithms'.
      - If found, use those algorithms.
      - If not found, use Naive, Seasonal Naive, Simple Moving Average, AutoArima, and Exponential Smoothing.
  - If input parameter Effort_Level is set to Medium:
    - Look for an advanced parameter where Entity='Effort Level', Entity Subtype='Medium', and Parameter name='Algorithms'.
      - If found, use those algorithms.
      - If not found, use Stochastic Gradient Boosting (Global), ArimaX, AutoArima, and Exponential Smoothing.
  - If input parameter Effort_Level is set to High:
    - Look for an advanced parameter where Entity='Effort Level', Entity Subtype='High', and Parameter name='Algorithms'.
      - If found, use those algorithms.
      - If not found, use Naive, Seasonal Naive, Simple Moving Average, Prophet, Stochastic Gradient Boosting (Global), ArimaX, AutoArima, and Exponential Smoothing.
- If input parameter Forecast_Strategy is NOT set to AutoTuning:
  - Read input parameter Algorithms.
    - If values are present, use those algorithms.
    - If no values are present, use Prophet.
Note that if any algorithm overrides are specified at the LOD level via the algorithm_overrides table, those overrides take precedence over the algorithms specified via the parameters.
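The decision process above can be summarized by the following sketch. The dictionary and list representations of the parameter tables, and the "Value" field on the advanced parameter rows, are assumptions for illustration only, not the actual storage format.

```python
# Default algorithm lists per effort level, matching the process described above.
DEFAULTS = {
    "Low": ["Naive", "Seasonal Naive", "Simple Moving Average",
            "AutoArima", "Exponential Smoothing"],
    "Medium": ["Stochastic Gradient Boosting (Global)", "ArimaX",
               "AutoArima", "Exponential Smoothing"],
    "High": ["Naive", "Seasonal Naive", "Simple Moving Average", "Prophet",
             "Stochastic Gradient Boosting (Global)", "ArimaX",
             "AutoArima", "Exponential Smoothing"],
}

def select_algorithms(input_parameters, advanced_params, lod_overrides=None, lod=None):
    """Resolve the algorithm list for one LOD (data structures are illustrative)."""
    # Overrides from time_series_groups / algorithm_overrides win outright.
    if lod_overrides and lod in lod_overrides:
        return lod_overrides[lod]

    if input_parameters.get("Forecast_Strategy") == "AutoTuning":
        effort = input_parameters.get("Effort_Level")
        # Look for an advanced parameter overriding the defaults for this effort level.
        for row in advanced_params:
            if (row.get("Entity") == "Effort Level"
                    and row.get("Entity Subtype") == effort
                    and row.get("Parameter name") == "Algorithms"):
                return [a.strip() for a in row["Value"].split(",")]  # "Value" is assumed
        return DEFAULTS[effort]

    # Not AutoTuning: use the Algorithms input parameter, else fall back to Prophet.
    algorithms = input_parameters.get("Algorithms", "")
    return [a.strip() for a in algorithms.split(",")] if algorithms else ["Prophet"]

# Example: AutoTuning with Low effort and no advanced-parameter override.
print(select_algorithms({"Forecast_Strategy": "AutoTuning", "Effort_Level": "Low"}, []))
# ['Naive', 'Seasonal Naive', 'Simple Moving Average', 'AutoArima', 'Exponential Smoothing']
```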