{% if favicon_base64 %} {% endif %}

Model Validation Report

Performance Analysis Report

Robustness Score
{% if (robustness_score|default(0)) > 0.9 %} stroke="#28a745" {% elif (robustness_score|default(0)) > 0.7 %} stroke="#ffc107" {% elif (robustness_score|default(0)) > 0.5 %} stroke="#fd7e14" {% else %} stroke="#dc3545" {% endif %} stroke-width="10" stroke-dasharray="{{ (robustness_score|default(0) * 314) }} 314" transform="rotate(-90 60 60)" > {{ (robustness_score|default(0) * 100) | round(1) }}%
{{ (base_score|default(0) * 100) | round(1) }}% Base Score
{{ (raw_impact|default(0) * 100) | round(2) }}% Impact
{% if (robustness_score|default(0)) > 0.9 %} Excellent resistance to perturbations {% elif (robustness_score|default(0)) > 0.7 %} Good resistance to perturbations {% elif (robustness_score|default(0)) > 0.5 %} Moderate resistance to perturbations {% else %} Needs improvement in robustness {% endif %}
Model Information
Type: {{ model_type|default('Unknown') }}
Features: {{ (features|default([]))|length }}
Primary Metric: {{ metric|default('Score')|upper }}
Critical Features: {{ (feature_subset|default([]))|length }}
{% if report_data and report_data.alternative_models %}
Alternative Models: {{ report_data.alternative_models|length }}
{% endif %}
Test Summary
Perturbation Levels: {% if report_data and report_data.raw and report_data.raw.by_level %}{{ report_data.raw.by_level|length }}{% else %}0{% endif %}
Iterations Per Level: {{ iterations|default(10) }}
Max Impact Level:
{% set max_level = {'level': '0', 'impact': 0} %}
{% if report_data and report_data.raw and report_data.raw.by_level %}
  {% for level, level_data in report_data.raw.by_level.items() %}
    {% if level_data.overall_result and level_data.overall_result.all_features and level_data.overall_result.all_features.impact and level_data.overall_result.all_features.impact > max_level.impact %}
      {% set _ = max_level.update({'level': level, 'impact': level_data.overall_result.all_features.impact}) %}
    {% endif %}
  {% endfor %}
{% endif %}
{{ (max_level.impact * 100)|round(2) }}% at {{ max_level.level }}
{% if feature_subset %}
Feature Subset Impact:
{% if report_data and report_data.feature_subset_max_impact and report_data.feature_subset_max_impact.value > 0 %}
  {{ (report_data.feature_subset_max_impact.value * 100)|round(2) }}% at {{ report_data.feature_subset_max_impact.level }}
{% else %}
  {% set max_subset_impact = {'level': '0', 'impact': 0} %}
  {% if report_data and report_data.raw and report_data.raw.by_level %}
    {% for level, level_data in report_data.raw.by_level.items() %}
      {% if level_data.overall_result and level_data.overall_result.feature_subset and level_data.overall_result.feature_subset.impact and level_data.overall_result.feature_subset.impact > max_subset_impact.impact %}
        {% set _ = max_subset_impact.update({'level': level, 'impact': level_data.overall_result.feature_subset.impact}) %}
      {% endif %}
    {% endfor %}
  {% endif %}
  {% if max_subset_impact.impact > 0 %}
    {{ (max_subset_impact.impact * 100)|round(2) }}% at {{ max_subset_impact.level }}
  {% else %}
    0.00%
  {% endif %}
{% endif %}
{% endif %}

Test Information

Test Type

{{ test_type|capitalize }}
Static report

Model Type

{{ model_type }}
Algorithm

Features

{{ features|length }}
Total features

Iterations

{{ iterations|default(10) }}
Per perturbation

Test Configuration

Generation Time {{ timestamp }}
Feature Subset {{ feature_subset_display }}
Metric {{ metric }}
Report Type Static (non-interactive)

Performance Metrics

Model Metrics Comparison

Model Base {{ metric|capitalize }} Robustness Score Avg. Impact
{% if report_data.metrics_details %}
  {% for metric_name in report_data.metrics_details|sort %}
    {{ metric_name|title }}
  {% endfor %}
{% else %}
  {% set primary_metrics = report_data.metrics %}
  {% if primary_metrics %}
    {% for metric_name, metric_value in primary_metrics.items() %}
      {% if metric_name != "base_score" and metric_name != "robustness_score" %}
        {{ metric_name|title }}
      {% endif %}
    {% endfor %}
  {% endif %}
{% endif %}

{{ model_name }} {{ "%.4f"|format(base_score) }} {{ "%.4f"|format(robustness_score) }} {{ "%.4f"|format(raw_impact) }}
{% if report_data.metrics_details %}
  {% for metric_name in report_data.metrics_details|sort %}
    {{ "%.4f"|format(report_data.metrics_details[metric_name]|default(0)) }}
  {% endfor %}
{% else %}
  {% set primary_metrics = report_data.metrics %}
  {% if primary_metrics %}
    {% for metric_name, metric_value in primary_metrics.items() %}
      {% if metric_name != "base_score" and metric_name != "robustness_score" %}
        {{ "%.4f"|format(metric_value|default(0)) }}
      {% endif %}
    {% endfor %}
  {% endif %}
{% endif %}

{% if report_data and report_data.alternative_models %}
  {% for model_name, model_data in report_data.alternative_models|dictsort %}
    {{ model_name }} {{ "%.4f"|format(model_data.base_score|default(0)) }} {{ "%.4f"|format(model_data.get('robustness_score', 1.0 - model_data.get('raw_impact', 0))) }} {{ "%.4f"|format(model_data.get('raw_impact', 0)) }}
    {% if report_data.metrics_details %}
      {% for metric_name in report_data.metrics_details|sort %}
        {% if model_data.metrics_details and metric_name in model_data.metrics_details %}
          {{ "%.4f"|format(model_data.metrics_details[metric_name]|default(0)) }}
        {% elif model_data.metrics and metric_name in model_data.metrics %}
          {{ "%.4f"|format(model_data.metrics[metric_name]|default(0)) }}
        {% else %}
          -
        {% endif %}
      {% endfor %}
    {% else %}
      {% set primary_metrics = report_data.metrics %}
      {% if primary_metrics %}
        {% for metric_name, metric_value in primary_metrics.items() %}
          {% if metric_name != "base_score" and metric_name != "robustness_score" %}
            {% if model_data.metrics and metric_name in model_data.metrics %}
              {{ "%.4f"|format(model_data.metrics[metric_name]|default(0)) }}
            {% else %}
              -
            {% endif %}
          {% endif %}
        {% endfor %}
      {% endif %}
    {% endif %}
  {% endfor %}
{% endif %}

Overview

Robustness Score

{{ "%.4f"|format(robustness_score) }}
Higher is better

Base {{ metric|capitalize }}

{{ "%.4f"|format(base_score) }}
Without perturbation

Average Impact

{{ "%.4f"|format(raw_impact) }}
Lower is better
{% if report_data and report_data.alternative_models %}

Models Compared

{{ report_data.alternative_models|length + 1 }}
Including primary model
{% endif %}

Performance by Perturbation Level

Shows the model's average performance at different perturbation levels. The red dotted line represents the base score without perturbations.

{% if charts.overview_chart %}
Model performance by perturbation level
{% else %}

No perturbation data available for visualization.

{% endif %} {% if charts.feature_subset_chart %}

Feature Subset Performance

Compares the impact of perturbing all features with the impact of perturbing only the critical feature subset.

Feature subset performance by perturbation level
{% endif %}

Worst Performance by Perturbation Level

Visualizes the worst-case performance at each perturbation level, showing the most adverse scenarios for the model.

{% if charts.worst_performance_chart %}
Worst model performance by perturbation level
{% else %}

No worst performance data available for visualization.

{% endif %}
{% if report_data and report_data.alternative_models %}

Model Comparison

Compares model performance across perturbation levels to help identify which model is the most robust.

{% if charts.comparison_chart %}
Model comparison across perturbation levels
{% else %}

No comparison data available for visualization.

{% endif %}
{% endif %}
{% if has_model_feature_importance or feature_importance %}

Feature Importance

{% if has_model_feature_importance %}

Feature Importance Comparison

Compares the feature importance declared by the model with the importance based on robustness analysis, highlighting discrepancies.

{% if charts.feature_comparison_chart %}
Comparison of model-defined vs. robustness-based feature importance
{% else %}

No feature comparison data available for visualization.

{% endif %}
{% endif %} {% if feature_importance %}

Feature Importance Details

Feature Robustness Impact{% if has_model_feature_importance %} Model Importance Difference{% endif %}
{% for feature, importance in feature_importance|dictsort(by='value', reverse=true) %}
  {{ feature }} {{ "%.4f"|format(importance) }}
  {% if has_model_feature_importance %}
    {% if model_feature_importance.get(feature) is not none %}
      {% set diff = (importance|float - model_feature_importance.get(feature, 0)|float) %}
      {{ "%.4f"|format(model_feature_importance.get(feature, 0)|float) }} {{ "%.4f"|format((diff|abs_value)) }}
    {% else %}
      N/A
    {% endif %}
  {% endif %}
{% endfor %}
{% endif %}
{% endif %}

Performance Distribution

Score Distribution

Shows the complete distribution of performance scores using violin plots (density), boxplots (quartiles), and individual points. The red diamond indicates the base score.

{% if charts.boxplot_chart %}
Distribution visualization of model performance scores
{% else %}

No distribution data available for visualization. Run tests with multiple iterations to generate this chart.

{% endif %}
{% if report_data and report_data.raw and report_data.raw.by_level %}

Performance by Perturbation Level

Perturbation Level Average {{ metric|capitalize }} Worst {{ metric|capitalize }} Impact
{% for level, level_data in report_data.raw.by_level|dictsort %}
  {% if level_data.overall_result and level_data.overall_result.all_features %}
    {{ level }} {{ "%.4f"|format(level_data.overall_result.all_features.mean_score) }} {{ "%.4f"|format(level_data.overall_result.all_features.worst_score) }} {{ "%.4f"|format(base_score - level_data.overall_result.all_features.mean_score) }}
  {% endif %}
{% endfor %}
{% endif %}
{% if report_data and report_data.alternative_models %}

Alternative Models

Model Comparison

Model Base {{ metric|capitalize }} Robustness Score Average Impact Model Type
{{ model_name }} {{ "%.4f"|format(base_score) }} {{ "%.4f"|format(robustness_score) }} {{ "%.4f"|format(raw_impact) }} {{ model_type }}
{% for model_name, model_data in report_data.alternative_models|dictsort %}
  {{ model_name }} {{ "%.4f"|format(model_data.base_score) }} {{ "%.4f"|format(model_data.get('robustness_score', 1.0 - model_data.get('raw_impact', 0))) }} {{ "%.4f"|format(model_data.get('raw_impact', 0)) }} {{ model_data.get('model_type', 'Unknown') }}
{% endfor %}
{% endif %}