Machine Learning model explainability
Tree-based models like XGBoost use multiple decision trees to make predictions. Depending on the number of features, such models can become very complex, and understanding their results may be challenging. We use the Shapley Additive Explanations (SHAP) approach to explain the prediction paths and to quantify the magnitude and direction of each feature's contribution to the predicted cost. These contributions are referred to as the "scores" of a feature.
SHAP
SHAP stands for Shapley Additive Explanations, a method that explains model predictions using Shapley values from game theory. The downside of this approach is its computational cost when the model has a large number of features. To overcome this, we use FastTreeSHAP, an efficient implementation for computing feature contributions; more information is available in the FastTreeSHAP documentation.
For the default Machine Learning model (XGBoost), FastTreeSHAP is used to compute the local feature scores for every path.
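Below is a minimal sketch of this setup, assuming the open-source fasttreeshap package (whose TreeExplainer mirrors the shap API); the feature names, data, and model parameters are illustrative placeholders, not the actual implementation.

```python
# Sketch: fit the default model on a log-transformed cost target and build a
# FastTreeSHAP explainer. All data and column names below are hypothetical.
import numpy as np
import pandas as pd
import xgboost
import fasttreeshap  # pip install fasttreeshap

X = pd.DataFrame({
    "Truckload_Distance": [3406.24, 120.5, 890.0],
    "NumberSitesMFG":     [1, 2, 1],
    "TotalFlowUnitQty":   [0.222, 5.1, 1.7],
})
cost = np.array([5200.0, 310.0, 1450.0])  # transportation cost per path

# Log-transform the business objective before fitting (see the note on
# transformations below).
model = xgboost.XGBRegressor(n_estimators=100).fit(X, np.log(cost))

# FastTreeSHAP exposes the same interface as shap.TreeExplainer.
explainer = fasttreeshap.TreeExplainer(model)
```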
Local Scores
Local scores are the contributions of individual features to the predicted cost of a single path. By default, they are calculated for the XGBoost model using the FastTreeSHAP approach.
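Continuing the sketch above, the local scores for a single path are the row of SHAP values corresponding to that path; the index 0 used here is an arbitrary illustration.

```python
# Local scores: per-feature SHAP values for one path (one row of X), in log-space.
shap_values = explainer.shap_values(X)  # shape: (n_paths, n_features)
local_scores_path = pd.Series(shap_values[0], index=X.columns)
print(local_scores_path)  # contribution of each feature for the first path
```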
Interpretation of the score
Below is an example showing the local scores of features contributing to the transportation cost for Path ID “3”.
| Path ID | Feature Name | Local Score | Feature Value |
| --- | --- | --- | --- |
| 3 | Truckload_Distance | 0.7323 | 3406.24 |
| 3 | NumberSitesMFG | 0.246 | 1 |
| 3 | TotalFlowUnitQty | -0.03 | 0.222 |
By default, a logarithmic transformation is applied to the business objective before fitting the Machine Learning model. For interpretability, the local scores generated for the features are reported in log-space, while the true value of the business objective and the Machine Learning model's predicted value are converted back to the original scale by applying the inverse transformation (exponentiation).

A positive local score contributes to an increase in cost; a negative local score contributes to a reduction in cost. In this example, larger truckload distances increase transportation cost, whereas larger flow unit quantities are associated with lower transportation cost.

To derive the model's predicted value from the local scores, the following calculation is performed:
Model Predicted Value = Exp(Sum of Individual Local Scores) × Mean Prediction
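A minimal sketch of this reconstruction, continuing the example above, is shown below. Treating "Mean Prediction" as the exponentiated SHAP base value (the model's average prediction in log-space) is an assumption made for illustration.

```python
# Reconstruct the original-scale prediction for one path from its local scores.
base_value = explainer.expected_value          # average prediction in log-space
mean_prediction = np.exp(base_value)           # assumed "Mean Prediction"

predicted_cost = np.exp(shap_values[0].sum()) * mean_prediction
# Equivalent to exponentiating the full log-space prediction:
# np.exp(base_value + shap_values[0].sum())
```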
Global Scores
An aggregated score of feature contributions across all paths of a model is referred to as a Global Score. This is calculated by grouping the individual local feature scores and averaging them across all paths, so the sign of a global score indicates a feature's overall direction of impact.
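A minimal sketch of this aggregation, continuing the example above, is shown below; averaging the signed local scores (rather than their absolute values) is assumed here so that the sign carries direction, consistent with the table that follows.

```python
# Global scores: average each feature's local scores across all paths,
# keeping the sign so the direction of impact is preserved.
global_scores = pd.DataFrame(shap_values, columns=X.columns).mean(axis=0)
print(global_scores)
```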
Interpretation of the score
| Feature Name | Global Score |
| --- | --- |
| UniqueModeNumber | 1.10 |
| CustomerServiceDistance | -1.03 |
| NumberSitesMFG | 1.001 |
A positive global score indicates that a feature, on average, contributes to an increase in cost, and a negative global score indicates a contribution to a decrease in cost.