{% from 'macros.html' import visualize_check_score, visualize_score, visualize_series_name %}

Data quality report for {{ block_evaluation.flow_evaluation.series_set_name }} ({{ block_evaluation.flow_evaluation.group.flow.name }})

{{ now|dateformat }}

Introduction

This document reports on the data quality of the time series set "{{ block_evaluation.flow_evaluation.series_set_name }}" in the flow {{ block_evaluation.flow_evaluation.group.flow.name }}.

It first summarizes the quality score for the complete flow, then reports on the quality KPIs that determine that score and lists the worst scoring time series. Finally, it presents an overview of all time series mentioned in the report.

Summary

The data quality of a time series source is scored against the KPIs defined in the KPI set. The score for each quality KPI is determined by the scores of all time series in the source on the checks that make up that KPI.
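{#
Template note: the paragraph above says a KPI score is derived from the scores of all series on the checks in that KPI, but it does not state the aggregation rule. The sketch below assumes an unweighted mean, purely for illustration. The function name `kpi_score` and the input shape (check name mapped to a list of per-series scores) are hypothetical and not taken from the application.

```python
from statistics import mean


def kpi_score(check_scores):
    """Combine per-series check scores (0-100) into one KPI score.

    check_scores maps a check name to the list of scores that the
    individual time series obtained for that check. An unweighted mean
    over all of them is an assumption, not the documented rule.
    Returns None when no checks have been performed.
    """
    all_scores = [
        score
        for series_scores in check_scores.values()
        for score in series_scores
    ]
    if not all_scores:
        return None
    return round(mean(all_scores))
```

Returning None for an empty KPI mirrors the "-" placeholder shown in the summary when a KPI has no score.
#}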

The aggregated quality score of "{{ block_evaluation.flow_evaluation.series_set_name }}" is {{ source_score }}%.

KPI Score
{% for kpi in kpis %}
{{ kpi.name }} {% if kpi.name in kpi_scores %}{{ visualize_score(kpi_scores[kpi.name]) }}{% else %}-{% endif %}
{% endfor %}

Compared to the median scores of all time series sources in Timeseer, "{{ block_evaluation.flow_evaluation.series_set_name }}" scores {% if kpi_comparison.better|count > 0 %} better for {% for kpi, score in kpi_comparison.better %} {{ kpi }} ({{score}}%) {%- if not loop.revindex0 < 2 -%} , {% elif not loop.revindex0 < 1 %} and {%- endif -%} {%- endfor -%} {%- if kpi_comparison.better|count > 0 and kpi_comparison.worse|count > 0 -%} , {% endif %} {% endif %} {% if kpi_comparison.worse|count > 0 %} worse for {% for kpi, score in kpi_comparison.worse %} {{ kpi }} ({{score}}%) {%- if not loop.revindex0 < 2 -%} , {% elif not loop.revindex0 < 1 %} and {%- endif -%} {%- endfor -%} {%- endif -%} {% if kpi_comparison.better|count + kpi_comparison.worse|count > 0 and kpi_comparison.equal|count > 0 %} and {% endif %} {% if kpi_comparison.equal|count > 0 %} equal for {% for kpi, score in kpi_comparison.equal %} {{ kpi }} {%- if not loop.revindex0 < 2 -%} , {%- elif not loop.revindex0 < 1 %} and {%- endif -%} {%- endfor -%} {%- endif -%} .
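{#
Template note: the comparison sentence above joins its items inline using `loop.revindex0`. If that logic ever moves out of the template, it could be expressed as a small helper along the lines below. This is a sketch only. The names `join_phrases` and `comparison_sentence` are hypothetical, and unlike the template it always places "and" before the last group.

```python
def join_phrases(items):
    """Join items as 'a', 'a and b' or 'a, b and c' (no Oxford comma)."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def comparison_sentence(better, worse, equal):
    """Build the 'scores better for ... worse for ...' clause.

    Each argument is a list of (kpi_name, score) pairs, matching how the
    template iterates over kpi_comparison. Scores of equal KPIs are not
    printed, as in the template.
    """
    parts = []
    if better:
        parts.append(
            "better for " + join_phrases(f"{k} ({s}%)" for k, s in better)
        )
    if worse:
        parts.append(
            "worse for " + join_phrases(f"{k} ({s}%)" for k, s in worse)
        )
    if equal:
        parts.append("equal for " + join_phrases(k for k, _ in equal))
    return "scores " + join_phrases(parts) + "."
```

Passing the finished string to the template would keep the whitespace control markers out of the sentence, at the cost of moving wording into Python.
#}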

{% for kpi in kpis %}

{{ kpi.name }} KPI

{% if kpi.short_help_text != "" %}

{{ kpi.short_help_text }}

{{ kpi.html_help_text|default('')|inlineimages|safe }} {% endif %} {% if kpi_score_scores[kpi.name]|count == 0 %}

No {{ kpi.name.lower() }} checks have been performed.

{% endif %}
{% if kpi_score_scores[kpi.name]|count > 0 %}
{% if kpi.name == 'Metadata' %}
Metadata check Score (higher is better)
{% for score in kpi_score_scores[kpi.name]|sort(attribute='metadata.name')|sort(attribute='score') %}
{{ score.metadata.name|capitalize }} {{ visualize_score(score.score) }}
{% endfor %}
{% else %}
Check Score (higher is better)
{% for score in kpi_score_scores[kpi.name]|sort(attribute='metadata.name')|sort(attribute='score') %}
{{ score.metadata.name|capitalize }} {{ visualize_score(score.score) }}
{% endfor %}
{% endif %}
{% endif %}
{% for score in kpi_score_scores[kpi.name]|sort(attribute='name') %}

{{ score.metadata.name|capitalize }} check

{% if score.metadata.short_help_text %}

{{ score.metadata.short_help_text }}

{{ score.metadata.html_help_text|default('')|inlineimages|safe }} {% endif %} {% if score_distribution[score.metadata.name]|map('last')|sum > 0 %}

Evaluated over {{ score_distribution[score.metadata.name]|map('last')|sum }} series:

{% if score.metadata.data_type == 'bool' %}
{% for result, count in score_distribution[score.metadata.name] %}{% if result in [0, 90] %}
{{ result }} {{ count }}
{% endif %}{% endfor %}
{% else %}
{{ score.metadata.name|capitalize }} score Bin start Bin end Number of time series
{% for result, count in score_distribution[score.metadata.name] %}
{{ result }} - {{ result + 10 }} {{ result }} {{ result + 10 }} {{ count }}
{% endfor %}
{% endif %} {% endif %} {% if score.metadata.name in csv_exports and score.score != 100 %}

Export the list of time series as CSV. The CSV contains suggested values where available.

{% endif %} {% endfor %}
{% endfor %}
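{#
Template note: the per-check tables above expect `score_distribution` to hold (bin start, count) pairs with bins that are 10 points wide. A minimal sketch of how such pairs could be computed is shown below. The function and its `bin_width` parameter are hypothetical, and it assumes scores range from 0 to 100 with a score of 100 counted in the 90-100 bin.

```python
from collections import Counter


def score_distribution(scores, bin_width=10):
    """Count scores (0-100) per bin and return sorted (bin_start, count) pairs."""
    counts = Counter()
    for score in scores:
        # Clamp a perfect score into the last bin so the bins run
        # 0-10, 10-20, ..., 90-100.
        start = min(int(score // bin_width) * bin_width, 100 - bin_width)
        counts[start] += 1
    return sorted(counts.items())
```

Under these assumptions, a boolean check that only yields 0 or 100 fills just the bins starting at 0 and 90, which is presumably why the boolean branch filters on `[0, 90]`. Summing the counts gives the "Evaluated over N series" total.
#}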

Worst scoring time series

Series name Score (higher is better)
{% for series, score in series_scores.items() %}
{{ visualize_series_name(series.name) }} {{ visualize_score(score) }}
{% endfor %}
{% if kpi_scores|count > 0 %} {% endif %}