skclean.handlers.IPF

class skclean.handlers.IPF(classifier, detector, n_estimator=5, max_iter=3, n_jobs=1, random_state=None)

Iteratively detects and filters out mislabelled samples until a stopping criterion is met. See [KR07] for details/usage.

Differs slightly from CLNI in how the stopping criterion is implemented.
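The detect-then-filter loop described above can be sketched in plain Python. This is an illustrative sketch only, not skclean's implementation: `iterative_filter`, `neighbour_agreement`, `min_fraction`, `patience` and `max_rounds` are hypothetical names, the toy detector stands in for a real `BaseDetector`, and the stopping rule (stop after several consecutive rounds that remove only a small fraction of the original data) is an assumption about how such a criterion could look.

```python
from typing import Callable, List, Sequence, Tuple


def neighbour_agreement(X: Sequence[float], y: Sequence[int], k: int = 3) -> List[float]:
    """Toy detector (hypothetical): conf_score of a sample is the fraction
    of its k nearest neighbours (1-D features) that share its label."""
    scores = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        others = [(abs(xi - xj), yj) for j, (xj, yj) in enumerate(zip(X, y)) if j != i]
        nearest = sorted(others)[:k]
        agree = sum(1 for _, yj in nearest if yj == yi)
        scores.append(agree / len(nearest) if nearest else 1.0)
    return scores


def iterative_filter(
    X: Sequence[float],
    y: Sequence[int],
    detect: Callable[[Sequence[float], Sequence[int]], List[float]],
    threshold: float = 0.4,     # samples scoring below this are dropped
    min_fraction: float = 0.1,  # a round is "quiet" if it removes fewer than this share
    patience: int = 2,          # stop after this many consecutive quiet rounds
    max_rounds: int = 50,       # hard cap so the loop always terminates
) -> Tuple[List[float], List[int]]:
    X, y = list(X), list(y)
    original_size = len(X)
    quiet_rounds = 0
    for _ in range(max_rounds):
        if quiet_rounds >= patience or not X:
            break
        scores = detect(X, y)                    # detection step
        keep = [s >= threshold for s in scores]  # filtering step
        n_removed = keep.count(False)
        X = [x for x, k_ in zip(X, keep) if k_]
        y = [t for t, k_ in zip(y, keep) if k_]
        # Count consecutive rounds that removed only a few samples.
        quiet_rounds = quiet_rounds + 1 if n_removed < min_fraction * original_size else 0
    return X, y


# Two well-separated clusters with one flipped label in each.
X_noisy = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
y_noisy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
X_clean, y_clean = iterative_filter(X_noisy, y_noisy, neighbour_agreement)
```

In this toy run the two flipped samples (at 0.2 and 5.2) score 0 and are removed in the first round; the following rounds remove nothing, so the loop stops once `patience` quiet rounds have accumulated.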

Parameters
  • classifier (object) – A classifier instance supporting the scikit-learn API.

  • detector (BaseDetector) – Used to compute conf_score. All iterative handlers require this.

  • threshold (float, default=.4) – Samples with a conf_score below this value are filtered out.

  • eps (float, default=.99) – Stopping criterion for the main detection-and-cleaning loop: the ratio of the total number of mislabelled samples identified in two successive iterations.

  • n_jobs (int, default=1) – Number of parallel CPU cores to use.

  • random_state (int, default=None) – Set this value for reproducible results.

Methods

__init__(classifier, detector[, …])

Initialize self.

clean(X, y)

fit(X, y[, conf_score])

get_params([deep])

Get parameters for this estimator.

predict(X)

predict_proba(X)

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.

set_params(**params)

Set the parameters of this estimator.

Attributes

iterative