{% for group, issues in issues_by_group.items() %}
We found some data slices in your dataset on which your model performance is lower than average. Performance bias may happen for different reasons:
Your model seems to be sensitive to small perturbations in the input data. These perturbations can include adding typos, changing word order, or turning text into uppercase or lowercase. This happens when:
We found some data slices in your dataset containing significant number of overconfident predictions. Overconfident predictions are rows that are incorrect but are predicted with high probabilities or confidence scores. This happens when:
We found some data slices in your dataset containing significant number of underconfident predictions. Underconfident predictions refer to situations where the predicted label has a probability that is very close to the probability of the next highest probability label. This happens when:
Your model seems to be sensitive to gender, ethnic, or religion based perturbations in the input data. These perturbations can include switching some words from feminine to masculine, countries or nationalities. This happens when:
Your model seems to present some data leakage. The model provides different results depending on whether it is computing on a single data point or the entire dataset. This happens when:
Your model seems to present some stochastic behaviour. The model provides different results at each execution. This may happen when some stochastic training process is included in the prediction pipeline.
{% elif issues[0].__class__.__name__ == "LLMToxicityIssue" %}Your model seems to exhibit offensive behaviour when we use adversarial “Do Anything Now” (DAN) prompts.
{% else %}Found issues for {{ issues[0].group }}
{% endif %}We found no issues in your model. Good job!