
Model Validation Report

Uncertainty Analysis Report

Uncertainty Score
{# Score gauge: arc length proportional to the score; stroke green above 0.9, yellow above 0.7, orange above 0.5, red otherwise #}
{{ ((uncertainty_score if uncertainty_score is not none else 0) * 100) | round(1) }}%
{{ (coverage if coverage is not none else 0) | round(3) }} Coverage
{{ (mean_width if mean_width is not none else 0) | round(3) }} Mean Width
{% if (uncertainty_score|default(0)) > 0.9 %} Excellent uncertainty quantification {% elif (uncertainty_score|default(0)) > 0.7 %} Good uncertainty quantification {% elif (uncertainty_score|default(0)) > 0.5 %} Moderate uncertainty quantification {% else %} Needs improvement in uncertainty quantification {% endif %}
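For reference, the qualitative bands above can be expressed as a small Python helper. The thresholds mirror the template logic; the function name is only illustrative.

```python
def describe_uncertainty_score(score):
    """Map an uncertainty score in [0, 1] to the qualitative band used above."""
    score = score if score is not None else 0  # mirrors the template's default(0)
    if score > 0.9:
        return "Excellent uncertainty quantification"
    elif score > 0.7:
        return "Good uncertainty quantification"
    elif score > 0.5:
        return "Moderate uncertainty quantification"
    return "Needs improvement in uncertainty quantification"
```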
Model Information
Type: {{ model_type|default('Unknown') }}
Features: {{ features|length|default(0) }}
Primary Metric: {{ metric|default('Accuracy')|upper }}
{% if sensitive_features %}
Sensitive Features: {{ sensitive_features|length|default(0) }}
{% endif %}
{% if report_data and report_data.alternative_models %}
Alternative Models: {{ report_data.alternative_models|length|default(0) }}
{% endif %}
Test Summary
Uncertainty Score: {{ (uncertainty_score if uncertainty_score is not none else 0)|round(4) }}
Coverage: {{ (coverage if coverage is not none else 0)|round(4) }}
Mean Width: {{ (mean_width if mean_width is not none else 0)|round(4) }}
{% if cal_size %}
Calibration Size: {{ cal_size }}
{% endif %}

Test Information

Test Type: {{ test_type|capitalize }} (Static report)
Model Type: {{ model_type }} (Algorithm)
Features: {{ features|length }} (Total features)
{% if sensitive_features %}
Sensitive Features: {{ sensitive_features|length }} (For resilience analysis)
{% endif %}

Test Configuration

Generation Time: {{ timestamp }}
{% if sensitive_features %}
Sensitive Features: {{ sensitive_features|join(', ') }}
{% endif %}
Metric: {{ metric|default('Accuracy') }}
Report Type: Static (non-interactive)

Uncertainty Metrics

Uncertainty Quantification Metrics

{# Metrics table: one row per model; metrics beyond the three standard ones become extra columns, in sorted key order so header and rows stay aligned #}
Model | Uncertainty Score | Coverage | Mean Width{% if metrics %}{% for metric_name in metrics|sort %}{% if metric_name not in ['uncertainty_score', 'coverage', 'mean_width'] %} | {{ metric_name|title }}{% endif %}{% endfor %}{% endif %}
{{ model_name }} | {{ "%.4f"|format(uncertainty_score if uncertainty_score is not none else 0) }} | {{ "%.4f"|format(coverage if coverage is not none else 0) }} | {{ "%.4f"|format(mean_width if mean_width is not none else 0) }}{% if metrics %}{% for metric_name in metrics|sort %}{% if metric_name not in ['uncertainty_score', 'coverage', 'mean_width'] %} | {{ "%.4f"|format(metrics[metric_name] if metrics[metric_name] is not none else 0) }}{% endif %}{% endfor %}{% endif %}
{% if report_data.alternative_models %}{% for alt_model_name, alt_model_data in report_data.alternative_models.items() %}
{{ alt_model_name }} | {{ "%.4f"|format(alt_model_data.uncertainty_score if alt_model_data.uncertainty_score is not none else 0) }} | {{ "%.4f"|format(alt_model_data.coverage if alt_model_data.coverage is not none else 0) }} | {{ "%.4f"|format(alt_model_data.mean_width if alt_model_data.mean_width is not none else 0) }}{% if metrics and alt_model_data.metrics %}{% for metric_name in metrics|sort %}{% if metric_name not in ['uncertainty_score', 'coverage', 'mean_width'] %} | {{ "%.4f"|format(alt_model_data.metrics.get(metric_name) if alt_model_data.metrics.get(metric_name) is not none else 0) }}{% endif %}{% endfor %}{% endif %}
{% endfor %}{% endif %}
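The table above expects a rendering context roughly shaped like the sketch below. The keys follow the template variables used in this report; the model names and numeric values are purely illustrative.

```python
# Illustrative rendering context for the metrics table (all values are made up).
context = {
    "model_name": "primary_model",
    "uncertainty_score": 0.87,
    "coverage": 0.94,
    "mean_width": 0.42,
    "metrics": {
        "uncertainty_score": 0.87,
        "coverage": 0.94,
        "mean_width": 0.42,
        "mse": 0.031,  # any extra metric becomes an additional column
    },
    "report_data": {
        "alternative_models": {
            "baseline_model": {
                "uncertainty_score": 0.79,
                "coverage": 0.90,
                "mean_width": 0.55,
                "metrics": {"mse": 0.045},
            },
        },
    },
}
```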
{% if charts.model_comparison %}

Model Comparison

Compares uncertainty metrics across different models or model configurations.

Model comparison
{% endif %}

Overview

Uncertainty Score

{{ "%.4f"|format(uncertainty_score if uncertainty_score is not none else 0) }}
Higher is better

Coverage

{{ "%.4f"|format(coverage if coverage is not none else 0) }}
Target: Expected coverage

Mean Width

{{ "%.4f"|format(mean_width if mean_width is not none else 0) }}
Lower is better
{% if cal_size %}

Calibration Size

{{ cal_size }}
Samples used for calibration
{% endif %}
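To illustrate how the calibration set, coverage, and mean width fit together, here is a minimal split-conformal sketch. It assumes a regression model with a scikit-learn-style predict method and a held-out calibration set; the report's actual interval construction may differ.

```python
import numpy as np

def conformal_intervals(model, X_cal, y_cal, X_test, y_test, alpha=0.1):
    """Split-conformal intervals plus empirical coverage and mean width (alpha = miscoverage)."""
    # Nonconformity scores on the held-out calibration set.
    cal_scores = np.abs(y_cal - model.predict(X_cal))
    n = len(cal_scores)
    # Calibration quantile with the usual finite-sample correction.
    q = np.quantile(cal_scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    preds = model.predict(X_test)
    lower, upper = preds - q, preds + q
    coverage = float(np.mean((y_test >= lower) & (y_test <= upper)))
    mean_width = float(np.mean(upper - lower))
    return lower, upper, coverage, mean_width
```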

Coverage vs Expected Coverage

Compares actual coverage with expected coverage at different alpha (confidence) levels. Closer to the diagonal line indicates better calibration.

{% if charts.coverage_vs_expected %}
Coverage vs Expected Coverage
{% else %}

No coverage data available for visualization.

{% endif %}

Interval Width vs Coverage

Shows the relationship between interval width and coverage. Efficient uncertainty estimates achieve higher coverage with narrower intervals.

{% if charts.width_vs_coverage %}
Width vs Coverage
{% else %}

No width vs coverage data available for visualization.

{% endif %}

Performance Gap by Alpha Level

Shows gaps between expected and actual coverage at different alpha levels. Values close to zero indicate well-calibrated uncertainty.

{% if charts.performance_gap_by_alpha %}
Performance Gap by Alpha
{% else %}

No performance gap data available for visualization.

{% endif %}
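The three charts above (coverage vs expected coverage, width vs coverage, and the performance gap) all derive from the same per-alpha sweep. A minimal sketch, treating alpha as the miscoverage rate (expected coverage = 1 − alpha) and reusing the split-conformal construction from the earlier example; both are assumptions, not the report's internal API.

```python
import numpy as np

def per_alpha_summary(model, X_cal, y_cal, X_test, y_test, alphas=(0.05, 0.1, 0.2, 0.3)):
    """Expected coverage, actual coverage, mean width, and coverage gap for each alpha level."""
    cal_scores = np.abs(y_cal - model.predict(X_cal))
    preds = model.predict(X_test)
    n = len(cal_scores)
    rows = []
    for alpha in alphas:
        q = np.quantile(cal_scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
        lower, upper = preds - q, preds + q
        actual = float(np.mean((y_test >= lower) & (y_test <= upper)))
        rows.append({
            "alpha": alpha,
            "expected_coverage": 1.0 - alpha,   # the diagonal in the coverage plot
            "actual_coverage": actual,
            "mean_width": float(np.mean(upper - lower)),
            "gap": (1.0 - alpha) - actual,      # values near zero = well calibrated
        })
    return rows
```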

Uncertainty Metrics Overview

Shows key uncertainty metrics for the model, including uncertainty score, coverage, and mean width.

{% if charts.uncertainty_metrics %}
Uncertainty metrics overview
{% else %}

No uncertainty metrics data available for visualization.

{% endif %}

Feature Analysis

{% if charts.feature_importance %}

Feature Importance for Uncertainty

Shows the most important features affecting model uncertainty. Features with higher importance have greater impact on prediction intervals.

Feature importance for uncertainty
{% endif %} {% if charts.feature_reliability %}

Feature Reliability Analysis

Shows feature reliability scores, indicating which features are most consistent in their impact on uncertainty quantification.

Feature reliability analysis
{% endif %} {% if charts.interval_widths_comparison %}

Interval Widths Distribution

Shows the distribution of prediction interval widths across the dataset. Narrower intervals with proper coverage indicate more efficient uncertainty quantification.

Interval widths distribution
{% endif %} {% if report_data.psi_scores %}

Feature PSI Scores

Population Stability Index (PSI) scores measure the stability of feature distributions between calibration and test sets.

Feature | PSI Score
{% for feature, psi in report_data.psi_scores|dictsort(by='value', reverse=true) %}
{{ feature }} | {{ "%.4f"|format(psi) }}
{% endfor %}
{% endif %} {% if feature_importance %}

Feature Importance for Uncertainty

Feature | Importance
{% for feature, importance in feature_importance|dictsort(by='value', reverse=true) %}
{{ feature }} | {{ "%.4f"|format(importance) }}
{% endfor %}
{% endif %}
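One plausible way to attribute uncertainty to individual features (not necessarily how the importances above were computed) is permutation importance on the mean interval width: shuffle one feature at a time and measure how much the average width moves. `interval_fn` is a hypothetical callable returning (lower, upper) arrays for the rows of X.

```python
import numpy as np

def uncertainty_feature_importance(interval_fn, X, feature_names, seed=0):
    """Permutation-style importance: shift in mean interval width when a feature is shuffled."""
    rng = np.random.default_rng(seed)
    lower, upper = interval_fn(X)
    base_width = np.mean(upper - lower)
    importances = {}
    for j, name in enumerate(feature_names):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, j])  # shuffles the j-th column in place
        lo, hi = interval_fn(X_perm)
        importances[name] = float(abs(np.mean(hi - lo) - base_width))
    return dict(sorted(importances.items(), key=lambda kv: kv[1], reverse=True))
```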
{% if charts.residual_distribution or charts.feature_residual_correlation %}

Residual Analysis

{% if charts.residual_distribution %}

Model Residual Distribution

Shows the distribution of residuals (prediction errors) across different datasets, helping identify biases under stress conditions.

Residual distribution
{% endif %} {% if charts.feature_residual_correlation %}

Feature-Residual Correlation

Shows which features are most correlated with model errors, helping identify potential areas for model improvement.

Feature-residual correlation
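As a hedged sketch of the underlying computation, this uses Pearson correlation between each feature and the absolute residuals; the report may use a different correlation or residual definition.

```python
import numpy as np

def feature_residual_correlation(X, y_true, y_pred, feature_names):
    """Correlation of each feature with the absolute prediction error."""
    abs_residuals = np.abs(y_true - y_pred)
    corrs = {}
    for j, name in enumerate(feature_names):
        # np.corrcoef returns a 2x2 matrix; [0, 1] is the feature-vs-residual correlation.
        corrs[name] = float(np.corrcoef(X[:, j], abs_residuals)[0, 1])
    return dict(sorted(corrs.items(), key=lambda kv: abs(kv[1]), reverse=True))
```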
{% endif %}
{% endif %} {% if charts.distance_metrics_comparison or charts.feature_distance_heatmap %}

Distribution Metrics Analysis

{% if charts.distance_metrics_comparison %}

Distance Metrics Comparison by Alpha

Compares different distance metrics (PSI, WD1, KS, etc.) across alpha levels, showing how distribution shift is captured by different metrics.

Distance metrics comparison by alpha
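The per-feature metrics behind this chart can be reproduced with standard tools: WD1 is the 1-Wasserstein distance and KS the two-sample Kolmogorov–Smirnov statistic, while `psi` refers to the earlier PSI sketch. A minimal example, assuming 1-D arrays for one feature from the calibration and test sets:

```python
from scipy.stats import wasserstein_distance, ks_2samp

def feature_distances(cal_col, test_col):
    """Distribution-shift metrics for one feature between calibration and test samples."""
    return {
        "WD1": float(wasserstein_distance(cal_col, test_col)),  # 1-Wasserstein distance
        "KS": float(ks_2samp(cal_col, test_col).statistic),     # Kolmogorov-Smirnov statistic
        # "PSI": psi(cal_col, test_col),                        # see the PSI sketch above
    }
```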
{% endif %} {% if charts.feature_distance_heatmap %}

Feature Distance Heatmap by Metric

Shows the distribution shift of each feature as measured by different metrics, visualizing which features are most affected by different types of distribution shifts.

Feature distance heatmap by metric
{% endif %}
{% endif %} {% if charts.model_comparison_chart or charts.model_resilience_scores %}

Model Comparison

{% if charts.model_comparison_chart %}

Model Resilience Comparison

Compares resilience performance across different models under increasing stress levels. Models with more gradual decline are more resilient.

Model resilience comparison
{% endif %} {% if charts.model_resilience_scores %}

Resilience Scores by Model

Compares the overall resilience score for each model. Higher scores indicate better performance under distribution shifts.

Model resilience scores
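The exact formula for the resilience score is not spelled out here; one common (assumed) definition is the fraction of baseline performance retained on shifted data, which the sketch below uses with a generic higher-is-better `metric_fn`.

```python
def resilience_score(metric_fn, y_base, pred_base, y_shifted, pred_shifted):
    """Assumed definition: performance on shifted data relative to baseline, clipped to [0, 1]."""
    baseline = metric_fn(y_base, pred_base)
    shifted = metric_fn(y_shifted, pred_shifted)
    if baseline <= 0:
        return 0.0  # degenerate baseline; the ratio is not meaningful
    return max(0.0, min(1.0, shifted / baseline))
```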
{% endif %}
{% endif %}