Demand Clustering

A new home for the Coupa Supply Chain documentation

Starting with Supply Chain 42, our documentation will be located on Coupa Compass. The help here will continue to be accessible for the foreseeable future, but will no longer be updated. If you need a Compass user account, contact your company’s Designated Support Contact (DSC).

You can access new and updated Supply Chain documentation in the following location on Compass:

https://compass.coupa.com/en-us/products/supply-chain-design-and-planning

Available for licensed Demand Guru users only, the Demand Guru Clustering action allows you to run cluster definitions created in Demand Guru’s Time Series Forecasting Workbench and save the output data that those definitions generate.

When you use Demand Guru to create demand clusters, the clusters you define are saved within the workbench. However, the output generated when you run those cluster definitions in Demand Guru is held only in memory and is not saved. Use this action to run the cluster definitions as configured in Demand Guru, and generate output that can be saved in Data Guru.

On the Connections tab:

Provide a name and description.
Select the Demand Guru Workbench that contains the cluster definitions.
Select the clusters to be run when the action is executed.
Select output options, including the prefix to be used for output table names.

The most common application for this action is using demand cluster output as input to other actions in Data Guru.

Execute Demand Guru cluster definition

Drag the Demand Guru Clustering icon onto the design surface.
Enter a Name and a Description to identify this action.
For Input, select a Workbench created in Demand Guru that contains cluster definitions. Note that all workbenches are listed here, regardless of whether they are configured to use clusters.
For Cluster Definitions, choose the cluster definitions to be included in the output. The list includes all cluster definitions for the selected workbench, with two additional choices that are always available here:
- The Select All option reflects one of three states, depending on how many clusters in this list are already selected:
  - Selected (traditional check mark) means that all clusters in the list are selected.
  - Semi-selected (solid square) means that some but not all clusters in the list are selected.
  - Unselected (empty) means that none of the clusters in the list are selected.
  As you make or change selections in this list, the state of the Select All checkbox is updated to reflect your current selections.
  When you select this checkbox:
  - If the checkbox is already selected or semi-selected, all clusters in the list become unselected.
  - If the checkbox is unselected, all clusters become selected.
- The Automatic cluster definition option allows you to let the application determine the relevant features and number of clusters.
To choose the cluster definitions to be included in the output:
- Select Run All Cluster Definitions to generate output data for all clusters in the workbench.
- Select Run Selected Cluster Definitions to run specific clusters, then check the box next to each cluster to be included.
  If you elect to run selected clusters, those clusters are selected by default the next time you open the action.
For Output, choose your output database and table options:
- For Database Connection, select the database to which the output tables will be written.
- For Output Table Prefix, enter a string to be used as the prefix for the output table names. The tables that are created are described under Demand Guru Clustering output tables.
- For Output Mode, indicate whether the output tables should be deleted after macro execution.
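
The three-state behavior of the Select All checkbox described for Cluster Definitions can be sketched in code. This is only an illustration of the rules listed above, not the application's implementation; the names `SelectAllState`, `select_all_state`, and `click_select_all` are hypothetical.

```python
from enum import Enum


class SelectAllState(Enum):
    SELECTED = "selected"            # check mark: every cluster is selected
    SEMI_SELECTED = "semi-selected"  # solid square: some, but not all
    UNSELECTED = "unselected"        # empty: none selected


def select_all_state(selected: dict[str, bool]) -> SelectAllState:
    """Derive the Select All state from the per-cluster selections."""
    count = sum(selected.values())
    if selected and count == len(selected):
        return SelectAllState.SELECTED
    if count > 0:
        return SelectAllState.SEMI_SELECTED
    return SelectAllState.UNSELECTED


def click_select_all(selected: dict[str, bool]) -> dict[str, bool]:
    """Return the selections after the Select All checkbox is clicked."""
    state = select_all_state(selected)
    # Selected or semi-selected clears everything; unselected selects everything.
    new_value = state is SelectAllState.UNSELECTED
    return {name: new_value for name in selected}


# Hypothetical cluster names, used only for this example.
clusters = {"Cluster A": True, "Cluster B": False, "Cluster C": False}
print(select_all_state(clusters).value)  # semi-selected
clusters = click_select_all(clusters)    # semi-selected -> all cleared
print(select_all_state(clusters).value)  # unselected
clusters = click_select_all(clusters)    # unselected -> all selected
print(select_all_state(clusters).value)  # selected
```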

Demand Guru Clustering output tables

When you execute the Demand Guru Clustering action, you create a set of output tables similar to those created in Coupa’s Demand Guru. This output is based on the last saved clusters in the Demand Guru Workbench you are referencing in this action.

Cluster Feature

The Demand Guru Clustering action outputs two cluster feature tables with the same columns:

Scaled - The data is normalized between -1 and 1.
Unscaled - The data is not normalized.
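
To illustrate the difference between the two tables, the sketch below rescales one feature column so that its values fall between -1 and 1. The documentation does not state which normalization method the action uses; linear min-max scaling is assumed here, and the function name `scale_to_unit_range` and the sample values are hypothetical.

```python
def scale_to_unit_range(values: list[float]) -> list[float]:
    """Min-max scale values linearly so the minimum maps to -1 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; mapping it to 0 is an assumption of this sketch.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


unscaled_mean = [10.0, 20.0, 40.0, 50.0]   # hypothetical Mean values for four time series
print(scale_to_unit_range(unscaled_mean))  # [-1.0, -0.5, 0.5, 1.0]
```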

Cluster Model

The name of the cluster definition.

Time Stamp

The date and time at which the cluster was run by this action.

Time Series

The user-specified name of the time series; for example, SKU identifiers, customer identifiers, or any of the "Group by" tags used in the data being clustered.

Cluster

The cluster to which the time series belongs after execution of the cluster model.

Representative Time Series

Indicates if the time series is representative of the cluster in which it is classified.

Seasonality

A coefficient value based on the periodicity of the most prominent seasonal period in a time series. Low values indicate low-frequency (long-period) seasonality, while high values indicate high-frequency (short-period) seasonality. An absence of seasonality in the data is assumed to indicate extremely high frequency (noisy data) and has a high value associated with it.

Trend

An index of the strength of a trend. High positive values indicate a strong upward trend, high negative values indicate a strong downward trend, and values close to zero indicate that the trend is flat.

Mean

Mean of the demand values in a time series.

Variance

Variance of the demand values in a time series.

Auto Correlation

Represents the extent of dependence on past demand values. While calculating this score, 10 lags are considered. A higher dependence of the time series on its past 10 values results in a higher auto correlation score.
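
As a rough illustration of a score built from 10 lags, the sketch below computes the sample autocorrelation at lags 1 through 10 and sums their squares. How the action actually combines the 10 lags is not documented here; the sum of squares is an assumption, and the names `autocorrelation` and `acf10_score` are hypothetical.

```python
def autocorrelation(series: list[float], lag: int) -> float:
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: no measurable dependence
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def acf10_score(series: list[float], max_lag: int = 10) -> float:
    """Sum of squared autocorrelations over lags 1..max_lag (assumed aggregation)."""
    lags = range(1, min(max_lag, len(series) - 1) + 1)
    return sum(autocorrelation(series, k) ** 2 for k in lags)


rising = [float(t) for t in range(24)]  # each value depends strongly on the previous ones
print(acf10_score(rising))
```

A series whose values closely follow its recent past, such as the steadily rising one above, produces a higher score than a series with no such dependence.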

Lumpiness

The variability of variance of each period in a time series. Conceptually, divide the time series into multiple sections, calculate the variance of each section, and then calculate the variance of these variance values. Low values indicate that the variance of the time series does not change much across its different sections, while high values indicate the variance is changing a lot across different sections of a time series.
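
The "variance of variances" idea can be sketched as follows. The section size used by the action is not stated, so the `section_size` parameter, like the function name `lumpiness`, is an assumption made for illustration.

```python
from statistics import pvariance


def lumpiness(series: list[float], section_size: int = 4) -> float:
    """Variance of the per-section variances of a time series."""
    # Split the series into consecutive, non-overlapping sections of equal size;
    # a trailing partial section is dropped.
    sections = [
        series[i:i + section_size]
        for i in range(0, len(series) - section_size + 1, section_size)
    ]
    section_variances = [pvariance(section) for section in sections]
    return pvariance(section_variances)


steady = [10.0, 12.0, 10.0, 12.0, 10.0, 12.0, 10.0, 12.0]  # same spread in every section
lumpy = [10.0, 10.0, 10.0, 10.0, 0.0, 40.0, 0.0, 40.0]     # calm section, then a volatile one
print(lumpiness(steady))  # 0.0
print(lumpiness(lumpy))   # 40000.0
```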

Level Shift

The maximum absolute value of mean values of slices of a time series, when a rolling window of size 1 is used. Seasonal time series are divided into multiple slices, with the slice length equal to its most prominent period. For non-seasonal time series, this slice length is equal to a fixed constant.

The first slice is rolled by the window size (= 1 here), and the maximum absolute value of the mean of the rolled slice gives the value of the level shift.

Intuitively, this value represents the maximum “level” in a section of a time series.

Variance Change

The maximum absolute value of variance values of slices of a time series, when a rolling window of size 1 is used. Seasonal time series are divided into multiple slices, with the slice length equal to its most prominent period. For non-seasonal time series, this slice length is equal to a fixed constant.

The first slice is rolled by the window size (= 1 here), and the maximum absolute value of the variance of the rolled slice gives the value of the variance change.

Intuitively, this value represents the maximum variance in a section of a time series.
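
Level Shift and Variance Change share the same rolling procedure and differ only in the statistic taken on each slice. The sketch below follows the descriptions above literally, reading the window size of 1 as "advance the slice one period at a time". It is an interpretation rather than the action's implementation; the function names are hypothetical, and the slice length is passed in explicitly because the fixed constant used for non-seasonal series is not given.

```python
from statistics import fmean, pvariance
from typing import Callable


def max_rolling_statistic(
    series: list[float],
    slice_length: int,
    statistic: Callable[[list[float]], float],
) -> float:
    """Largest absolute statistic over slices rolled forward one period at a time."""
    values = [
        abs(statistic(series[start:start + slice_length]))
        for start in range(len(series) - slice_length + 1)  # step of 1 period
    ]
    return max(values)


def level_shift(series: list[float], slice_length: int) -> float:
    return max_rolling_statistic(series, slice_length, fmean)


def variance_change(series: list[float], slice_length: int) -> float:
    return max_rolling_statistic(series, slice_length, pvariance)


demand = [2.0, 2.0, 2.0, 2.0, 10.0, 12.0, 10.0, 12.0]
print(level_shift(demand, slice_length=4))      # 11.0
print(variance_change(demand, slice_length=4))  # 20.75
```

In this example the highest level is reached by the final slice, while the highest variance occurs in the slice that straddles the jump from low to high demand.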

Crossing Points

The number of times a time series crosses the midpoint of its range, where the range is the maximum value minus the minimum value.
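
A minimal sketch of this count is shown below. The function name `crossing_points` is hypothetical, and treating a value exactly equal to the midpoint as "not above" is an assumption of the sketch.

```python
def crossing_points(series: list[float]) -> int:
    """Count how often the series crosses the midpoint of its range."""
    midpoint = (max(series) + min(series)) / 2
    above = [value > midpoint for value in series]
    # A crossing occurs whenever consecutive periods fall on opposite sides.
    return sum(1 for prev, curr in zip(above, above[1:]) if prev != curr)


demand = [1.0, 9.0, 2.0, 8.0, 7.0, 3.0]  # midpoint of the range is 5.0
print(crossing_points(demand))           # 4
```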

Linearity

Strength of the linearity component of a time series, with high values indicating strong linearity, and positive values indicating an upward trending time series.

Curvature

The strength of a trend's curvature component in a time series. Positive values indicate a convex-shaped time series, while negative values indicate a concave-shaped time series.

Peak

Strength of the highest point on the seasonal component of a time series.

Trough

Value of the lowest point on the seasonal component of a time series.

Entropy

A measure of the forecastability of a time series. It reveals the degree of difficulty associated with forecasting a specific time series, based on only the demand values. Low values indicate that the time series is relatively easy to forecast, and higher values indicate increased difficulty.

Spikiness

The strength of spikes of residuals in a time series. The seasonality and trend components are first removed, and then the value is calculated over the residual component.

Flat Spots

The length of flat spots in a time series. To arrive at this value, a time series is broken down into multiple discrete levels. Then, an analysis is made to determine the number of periods for which the time series maintains the same level. The overall length of these periods provides the value for this feature. A lower value means that the time series is changing its discretized levels more often, and a higher value means that the time series is changing its levels less often.
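
The discretize-then-measure procedure can be sketched as follows. The number of levels and the way run lengths are summarized are not specified above; this sketch assumes 10 equal-width levels and reports the longest run at a single level. The function name `flat_spots` is hypothetical.

```python
from itertools import groupby


def flat_spots(series: list[float], levels: int = 10) -> int:
    """Length of the longest run of periods spent at one discretized level."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return len(series)  # a constant series never changes level
    width = (hi - lo) / levels
    # Map each value to a level index 0..levels-1; the maximum falls in the top level.
    binned = [min(int((value - lo) / width), levels - 1) for value in series]
    return max(len(list(run)) for _, run in groupby(binned))


flat_then_active = [0.0, 0.0, 0.0, 0.0, 50.0, 100.0, 50.0, 100.0]
always_changing = [0.0, 100.0, 0.0, 100.0]
print(flat_spots(flat_then_active))  # 4
print(flat_spots(always_changing))   # 1
```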

Model ID

The internal ID of the cluster model.

Last modified: Thursday December 19, 2024
