Evaluating Detectors
In scikit-clean, a Detector only identifies the mislabelled samples. It is not a complete classifier, only a part of one, so detectors are evaluated differently from classifiers.
We can view a noise detector as a binary classifier: its job is to provide a probability denoting whether a sample is “mislabelled” or “clean”. We can therefore use binary classification metrics that work on continuous output: Brier score, log loss, area under the ROC curve, etc.
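As a minimal sketch of this idea, assume a hypothetical array `scores` holding a detector's estimated probability that each sample is clean, and a hypothetical array `is_clean` holding the ground truth (1 for a correct label, 0 for a mislabelled one). The values are made up for illustration; only the scikit-learn metric functions are real.

```python
import numpy as np
from sklearn.metrics import brier_score_loss, log_loss, roc_auc_score

# Hypothetical ground truth: 1 means the sample's label is correct, 0 means mislabelled.
is_clean = np.array([1, 1, 1, 0, 0, 1])
# Hypothetical detector output: estimated probability that each sample is clean.
scores = np.array([0.9, 0.8, 0.7, 0.2, 0.4, 0.6])

brier = brier_score_loss(is_clean, scores)  # mean squared error of the probabilities
ll = log_loss(is_clean, scores)             # cross-entropy of the probabilities
auc = roc_auc_score(is_clean, scores)       # how well the scores rank clean above mislabelled

print(brier, ll, auc)
```

Brier score and log loss depend on the actual probability values, while ROC AUC depends only on how the scores order the samples, so a detector can rank samples well and still be poorly calibrated.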
[1]:
# Suppress warnings; you should remove this before modifying this notebook
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss, log_loss, roc_auc_score
from skclean.tests.common_stuff import NOISE_DETECTORS # All noise detectors in skclean
from skclean.utils import load_data
from skclean.detectors.base import BaseDetector
from skclean.simulate_noise import flip_labels_uniform
[2]:
class DummyDetector(BaseDetector):
    # Baseline: gives every sample a random score, ignoring the data
    def detect(self, X, y):
        return np.random.uniform(size=y.shape)
from skclean.detectors import KDN, RkDN
class WkDN:
    # Equal-weight average of the KDN and RkDN scores
    def detect(self, X, y):
        return .5 * KDN().detect(X, y) + .5 * RkDN().detect(X, y)
ALL_DETECTOTS = [DummyDetector(), WkDN()] + NOISE_DETECTORS
[3]:
X, y = make_classification(1800, 10)
#X, y = load_data('breast_cancer')
yn = flip_labels_uniform(y, .3) # 30% label noise
clean_idx = (y==yn) # Boolean mask: True for correctly labelled samples
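To make the noise-injection step concrete, here is a rough sketch of uniform label flipping written with plain NumPy. The function `flip_uniform_sketch` and its `seed` parameter are hypothetical and exist only for illustration; this is not the implementation of `flip_labels_uniform` in scikit-clean, which may select the samples to flip differently.

```python
import numpy as np

def flip_uniform_sketch(y, noise_level, seed=0):
    """Hypothetical stand-in: flip a fixed fraction of labels to a different class."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    classes = np.unique(y)
    y_noisy = y.copy()
    # Pick which samples receive a wrong label, without repetition.
    n_flip = int(round(noise_level * len(y)))
    flip_idx = rng.choice(len(y), size=n_flip, replace=False)
    for i in flip_idx:
        # Choose uniformly among the classes other than the true one.
        others = classes[classes != y[i]]
        y_noisy[i] = rng.choice(others)
    return y_noisy

y_demo = np.array([0, 1] * 50)                # 100 samples, two classes
y_demo_noisy = flip_uniform_sketch(y_demo, 0.3)
is_clean_demo = (y_demo == y_demo_noisy)      # True where the label survived
print(is_clean_demo.mean())                   # fraction of labels left unchanged
```

Comparing the original and noisy labels in this way yields the ground truth that a detector's scores are evaluated against.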
[4]:
df = pd.DataFrame()
for d in ALL_DETECTOTS:
    # Higher conf_score means the detector considers the sample more likely clean
    conf_score = d.detect(X, yn)
    for name, loss_func in zip(['log', 'brier', 'roc'],
                               [log_loss, brier_score_loss, roc_auc_score]):
        loss = loss_func(clean_idx, conf_score)
        df.at[d.__class__.__name__, name] = np.round(loss, 3)
df
[4]:
Detector | log | brier | roc |
---|---|---|---|
DummyDetector | 0.999 | 0.333 | 0.501 |
WkDN | 0.664 | 0.183 | 0.811 |
ForestKDN | 1.099 | 0.131 | 0.858 |
InstanceHardness | 0.448 | 0.141 | 0.902 |
KDN | 0.830 | 0.173 | 0.818 |
RkDN | 3.371 | 0.227 | 0.749 |
MCS | 0.294 | 0.071 | 0.955 |
PartitioningDetector | 0.942 | 0.072 | 0.950 |
RandomForestDetector | 0.464 | 0.145 | 0.908 |
Note that for roc_auc_score, higher is better, whereas for log loss and Brier score, lower is better.
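Because the metrics point in different directions, comparing detectors across all three takes some care. One possible approach, not part of the original notebook, is to rank the detectors per metric with the appropriate direction and then average the ranks. The sketch below copies the scores from the table above; they come from a single run on random data and will differ between runs.

```python
import pandas as pd

# Scores copied from the results table (one run; values vary between runs).
results = pd.DataFrame(
    {
        "log":   [0.999, 0.664, 1.099, 0.448, 0.830, 3.371, 0.294, 0.942, 0.464],
        "brier": [0.333, 0.183, 0.131, 0.141, 0.173, 0.227, 0.071, 0.072, 0.145],
        "roc":   [0.501, 0.811, 0.858, 0.902, 0.818, 0.749, 0.955, 0.950, 0.908],
    },
    index=[
        "DummyDetector", "WkDN", "ForestKDN", "InstanceHardness", "KDN",
        "RkDN", "MCS", "PartitioningDetector", "RandomForestDetector",
    ],
)

# Rank 1 is best: ascending for the two losses, descending for ROC AUC.
ranks = pd.DataFrame({
    "log": results["log"].rank(ascending=True),
    "brier": results["brier"].rank(ascending=True),
    "roc": results["roc"].rank(ascending=False),
})
results["mean_rank"] = ranks.mean(axis=1)

print(results.sort_values("mean_rank"))
```

Averaging ranks weights the three metrics equally, which is a simplifying assumption; a detector intended only for ranking samples might reasonably be judged on ROC AUC alone.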