DeepBridge Resilience Report

Model: Test Classification Model
Date: 2025-04-21 18:34:24
Type: RandomForestClassifier

Distribution Shift Resilience Analysis: Test Classification Model

This report analyzes how well the model maintains performance when distribution shifts occur between baseline and target data.

Resilience Score: 82% (Good, on a Poor / Fair / Moderate / Good / Excellent scale; higher scores indicate better model resilience to distribution shifts)

Good resilience with limited performance degradation under distribution shifts.
  • Performance Gap: 12.0% (average performance decline on shifted data)
  • Distribution Shift: 0.14 (average distribution shift magnitude)
  • Base Performance: 91.0% (model performance on baseline data)
  • Outlier Sensitivity: 0.01 (model sensitivity to outlier samples)

Model Information

Model Type: RandomForestClassifier
Evaluation Metric: AUC
Baseline Dataset: Original Dataset
Target Dataset: Shifted Dataset
Shift Scenarios: 7 scenarios
Critical Features: Data not available
Analysis Completed: 2025-04-21 18:34:24

Overview

This overview provides a high-level assessment of the model's resilience to distribution shifts between baseline and target distributions.

Distribution Shift Resilience Analysis

This analysis evaluates how the model's performance changes when data distributions shift. A resilient model maintains consistent performance despite distribution shifts between training and deployment data.
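The headline quantities in this analysis can be sketched as follows. The report does not state the exact formulas DeepBridge uses, so the definitions below (gap as baseline score minus target score, resilience as the fraction of baseline performance retained) are illustrative assumptions:

```python
# Illustrative sketch of the headline resilience quantities; the exact
# formulas DeepBridge uses are not shown in this report.
def performance_gap(baseline_score: float, target_score: float) -> float:
    """Absolute drop in the evaluation metric (e.g. AUC) under shift."""
    return baseline_score - target_score

def resilience_score(baseline_score: float, target_score: float) -> float:
    """Fraction of baseline performance retained on shifted data (assumed definition)."""
    return target_score / baseline_score

gap = performance_gap(0.91, 0.79)        # ~0.12, matching the 12.0% average gap above
retained = resilience_score(0.91, 0.79)  # fraction of baseline AUC retained
```

A resilient model keeps `gap` small (and `retained` close to 1) across all shift scenarios, not just on average.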

Key Findings

  • Overall Resilience: Good (82%)
  • Performance Gap: Moderate (12.0%)
  • Feature Sensitivity: 5 highly sensitive features
  • Most Affected Scenario: 30% shift with PSI metric (22.0% gap)

Recommendations

  • Focus on feature engineering for the 5 most sensitive features to improve resilience
  • Prioritize addressing the "30% shift with PSI metric" scenario which shows the largest performance gap
  • Model demonstrates good resilience; continue monitoring with expanded distribution shift scenarios

Performance Gap Analysis

⚠️ This chart shows the performance gap between baseline and target distributions, indicating the model's resilience to distribution shifts.

[Chart placeholder: baseline vs. target performance per scenario; Average Gap, Maximum Gap, and Resilience Score values populate when the chart data loads.]

Distribution Shift Analysis

📈 This chart compares the distribution of a feature between baseline and target datasets, showing how the feature distribution has shifted.

[Chart placeholder: baseline and target distributions with their difference area; KL Divergence, JS Distance, and Wasserstein Distance values populate when the chart data loads.]
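The most affected scenario above is measured with PSI (Population Stability Index). A minimal PSI implementation, assuming bins derived from the baseline sample and a small epsilon to guard against empty bins, might look like:

```python
import numpy as np

def psi(baseline, target, bins=10, eps=1e-6):
    """Population Stability Index between two 1-D samples.

    Bins are derived from the baseline sample. A common rule of thumb:
    < 0.1 minor shift, 0.1-0.25 moderate, > 0.25 significant.
    """
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_counts, _ = np.histogram(baseline, bins=edges)
    t_counts, _ = np.histogram(target, bins=edges)
    b_pct = np.clip(b_counts / len(baseline), eps, None)  # eps avoids log(0)
    t_pct = np.clip(t_counts / len(target), eps, None)
    return float(np.sum((t_pct - b_pct) * np.log(t_pct / b_pct)))

rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, 5000)
shifted = rng.normal(0.3, 1.0, 5000)  # mean shifted by 0.3 standard deviations
print(psi(base, base))     # ~0 for identical samples
print(psi(base, shifted))  # clearly positive under shift
```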

Feature Impact Analysis

📊 This chart shows the intensity of distribution shifts across features, using statistical distance metrics to quantify the severity of shifts.

[Chart placeholder: per-feature shift intensity, grouped into High, Medium, and Low Intensity.]

Key Insights

Performance Impact 📉
The model experiences a moderate impact when exposed to distribution shifts, with an average performance gap of 12.0%.

Distribution Shift 🔄
The dataset exhibits minor distribution shifts, with an average distance metric of 0.14.

Sensitive Features ⚠️
5 features show high sensitivity to distribution shifts; these features account for the largest observed performance impacts.

Scenario Analysis 🔍
7 different shift scenarios were analyzed to evaluate the model's resilience under various conditions.

Shift Results

This table shows model performance across different distribution shift scenarios.

Shift Scenario | Baseline Performance | Target Performance | Performance Gap | Shift Magnitude | Resilience Score
[Table placeholder: per-scenario shift results populate when the data loads.]
Average Gap: 12.0% | Max Gap: 22.0% | Overall Resilience: 82.0%
ℹ️ Performance Gap represents the decrease in performance when shifting from baseline to target distribution. Lower values indicate better resilience.
📊 Shift Magnitude quantifies the statistical distance between baseline and target distributions. Higher values indicate more severe distribution shifts.

This table analyzes how distribution shifts in each feature impact model performance.

Feature | Type | Feature Importance | Shift Magnitude | Performance Impact | Resilience Impact | Shift Sensitivity
[Table placeholder: per-feature impact results populate when the data loads.]

Understanding Feature Impact Metrics

Feature Importance
The importance of the feature to the model's predictions, derived from the model's internal metrics or from permutation importance.
Shift Magnitude
How much the feature's distribution has changed between baseline and target datasets, measured using statistical distance metrics.
Performance Impact
The direct impact on model performance when this feature's distribution shifts, holding other features constant.
Resilience Impact
The overall contribution of this feature to resilience issues, combining importance, shift magnitude, and performance impact.
Shift Sensitivity
How sensitive the model's predictions are to shifts in this feature's distribution, normalized across features.
⚠️ Features highlighted in red have high resilience impact and should be prioritized for mitigation strategies.
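The Performance Impact metric described above (the effect of shifting one feature while holding others constant) can be approximated with a simple perturbation loop. The synthetic dataset, model, and one-standard-deviation mean shift below are stand-ins for demonstration, not the report's actual setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

# Stand-in data and model; substitute your own baseline set and fitted model.
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
base_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

impact = {}
for j in range(X.shape[1]):
    X_shift = X.copy()
    X_shift[:, j] += X[:, j].std()  # shift only feature j by one std dev
    auc = roc_auc_score(y, model.predict_proba(X_shift)[:, 1])
    impact[j] = base_auc - auc      # positive = performance lost to the shift

print(sorted(impact.items(), key=lambda kv: -kv[1]))  # most sensitive first
```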

Resilience Recommendations

🔄 Regular Monitoring (Medium priority)

Set up ongoing monitoring of feature distributions in production to detect shifts early.
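A lightweight version of this recommendation: compare each feature's recent production window against the training baseline with a two-sample Kolmogorov-Smirnov test and alert when the statistic crosses a threshold. The feature names and the 0.1 threshold below are illustrative assumptions to be tuned per deployment:

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_shifts(baseline, production, feature_names, threshold=0.1):
    """Flag features whose production distribution drifted from baseline.

    Uses the per-feature KS statistic; the 0.1 threshold is an assumed
    starting point, not a universal cutoff.
    """
    alerts = []
    for j, name in enumerate(feature_names):
        stat, _ = ks_2samp(baseline[:, j], production[:, j])
        if stat > threshold:
            alerts.append((name, round(float(stat), 3)))
    return alerts

rng = np.random.default_rng(1)
base = rng.normal(size=(2000, 3))
prod = base.copy()
prod[:, 2] += 0.5  # simulate drift in the third feature only
print(detect_shifts(base, prod, ["age", "income", "score"]))
```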

Detailed Analysis

This section provides detailed metrics and performance analysis for each shift scenario.

Filter Scenarios

Baseline Performance: 91.0% (performance on baseline distribution)
Target Performance: 79.0% (performance on target distribution)
Performance Gap: 12.0% (drop in performance due to distribution shift)
Resilience Score: 82.0% (normalized resilience to distribution shifts)

Performance by Metric

This table provides detailed performance metrics for each distribution shift scenario across multiple evaluation metrics.

Shift Scenario | Accuracy (Baseline / Target / Gap) | F1 Score (Baseline / Target / Gap) | AUC (Baseline / Target / Gap) | Composite Score
[Table placeholder: detailed per-scenario metrics populate when the data loads.]

Metrics Summary

Metric | Avg. Baseline | Avg. Target | Avg. Gap | Max Gap | Resilience
Accuracy | - | - | - | - | -
F1 Score | - | - | - | - | -
AUC | - | - | - | - | -
Composite | - | - | - | - | -
📊 Composite Score is a weighted average of all evaluation metrics, providing an overall measure of model performance.
ℹ️ Gap values are highlighted based on severity: green (minor), yellow (moderate), orange (significant), and red (severe).
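One plausible form of the Composite Score described above is a weighted mean of the per-metric scores. The report does not state the actual weights, so the values below are illustrative assumptions:

```python
def composite_score(metrics: dict, weights: dict) -> float:
    """Weighted average of evaluation metrics (assumed definition)."""
    total = sum(weights.values())
    return sum(metrics[m] * w for m, w in weights.items()) / total

metrics = {"accuracy": 0.88, "f1": 0.85, "auc": 0.91}      # example values
weights = {"accuracy": 1.0, "f1": 1.0, "auc": 2.0}         # assumed weighting
print(round(composite_score(metrics, weights), 4))
```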

Group Analysis

This analysis shows how different data groups or classes are affected by distribution shifts.

Sample-level Analysis

This analysis examines individual samples that are most affected by distribution shifts.

Sample ID | Baseline Prediction | Target Prediction | Prediction Change | Key Features | Actions
[Table placeholder: sample-level results populate when the data loads.]

Distribution Analysis

This section analyzes the distribution shifts between baseline and target datasets.

Filter Features

Distribution Comparison

📈 This chart compares the distribution of a feature between baseline and target datasets, showing how the feature distribution has shifted.

[Chart placeholder: baseline and target distributions with their difference area; KL Divergence, JS Distance, and Wasserstein Distance values populate when the chart data loads.]

Statistical Distance Metrics

KL Divergence: -
JS Distance: -
Wasserstein: -
Hellinger: -

Distribution Statistics

Statistic | Baseline | Target | Change
Mean | - | - | -
Median | - | - | -
Std Dev | - | - | -
IQR | - | - | -

Feature Shift Intensity

📊 This chart shows the intensity of distribution shifts across features, using statistical distance metrics to quantify the severity of shifts.

[Chart placeholder: per-feature shift intensity, grouped into High, Medium, and Low Intensity.]

Feature Distance Metrics

This table provides detailed statistical distance metrics between baseline and target distributions for each feature.

Feature | Type | KL Divergence | JS Distance | Wasserstein Distance | Hellinger Distance | Shift Severity
[Table placeholder: per-feature distance metrics populate when the data loads.]

Distance Metrics Interpretation

Metric | Range | Minor Shift | Moderate Shift | Significant Shift | Description
KL Divergence | [0, ∞) | 0 - 0.5 | 0.5 - 2.0 | > 2.0 | Measures information gain when updating from baseline to target distribution
JS Distance | [0, 1] | 0 - 0.2 | 0.2 - 0.4 | > 0.4 | Symmetric measure of similarity between distributions
Wasserstein | [0, ∞) | 0 - 0.1 | 0.1 - 0.3 | > 0.3 | Earth mover's distance between distributions
Hellinger | [0, 1] | 0 - 0.2 | 0.2 - 0.5 | > 0.5 | Probabilistic measure of distribution similarity
⚠️ Different distance metrics may be more appropriate for different feature types. For numerical features, Wasserstein distance is often more interpretable. For categorical features, KL divergence or JS distance are typically used.
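All four distances in the table can be computed with standard SciPy/NumPy calls. The two small discrete distributions below are only for demonstration:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy, wasserstein_distance

p = np.array([0.1, 0.4, 0.5])  # baseline pmf over 3 bins
q = np.array([0.3, 0.3, 0.4])  # target pmf over the same bins

kl = entropy(p, q)                 # KL(p || q), in [0, inf)
js = jensenshannon(p, q, base=2)   # JS distance, in [0, 1]
hellinger = np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))  # in [0, 1]

# Wasserstein compares (weighted) samples on a support rather than raw pmfs:
support = np.array([0.0, 1.0, 2.0])
w = wasserstein_distance(support, support, u_weights=p, v_weights=q)

print(kl, js, hellinger, w)
```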

Feature Correlation Analysis

This analysis examines how feature correlations change between baseline and target distributions.

[Chart placeholders: Baseline Correlation, Target Correlation, and Correlation Difference matrices populate when the data loads.]
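Comparing correlation structure between the two datasets can be done directly with pandas: compute each correlation matrix and take the absolute elementwise difference. The synthetic drift below (inducing a new a-b correlation in the target) is only to make the difference visible:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
baseline = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["a", "b", "c"])
target = baseline.copy()
target["b"] = target["b"] + 0.8 * target["a"]  # induce an a-b correlation shift

corr_base = baseline.corr()
corr_target = target.corr()
corr_diff = (corr_target - corr_base).abs()    # per-pair change in correlation

print(corr_diff.round(2))  # large entries mark pairs whose relationship shifted
```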