skclean.handlers.Filter

class skclean.handlers.Filter(classifier, detector=None, threshold: float = 0.5, frac_to_filter: float = None, n_jobs=1, random_state=None)

Removes the samples most likely to be noisy from the dataset. Samples to remove can be selected in two ways: either a specified fraction of the samples with the lowest conf_score, or all samples whose conf_score is lower than a specified threshold.
Parameters
classifier (object) – A classifier instance supporting sklearn API.
detector (BaseDetector or None, default=None) – Detector used to compute conf_score. Set it to None only if conf_score will be passed to fit() (e.g. when used inside a Pipeline with a BaseDetector preceding it). Otherwise a detector must be supplied at instantiation.
threshold (float, default=.5) – Samples with a conf_score higher than this value are kept; the rest are filtered out. A value of .5 implies majority voting, whereas .99 (i.e. a value close to, but less than, 1.0) implies consensus voting.
frac_to_filter (float, default=None) – Fraction of samples to filter out. Exactly one of threshold or frac_to_filter must be set.
n_jobs (int, default=1) – Number of parallel CPU cores to use.
random_state (int, default=None) – Set this value for reproducibility
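
The two selection modes described by threshold and frac_to_filter can be sketched in plain Python. This is only an illustration of the idea, not skclean's actual implementation: the helper name kept_indices is hypothetical, and the strict comparison against the threshold, the rounding down of the number of samples to drop, and the tie handling are all assumptions.

```python
def kept_indices(conf_score, threshold=0.5, frac_to_filter=None):
    """Return the indices of samples to keep.

    Hypothetical helper for illustration only; not part of skclean.
    """
    n = len(conf_score)
    if frac_to_filter is None:
        # Threshold mode: keep samples scoring strictly above the threshold
        # (assumption: the comparison is strict).
        return [i for i, s in enumerate(conf_score) if s > threshold]
    # Fraction mode: drop the n_drop samples with the lowest scores
    # (assumption: the count is rounded down).
    n_drop = int(n * frac_to_filter)
    order = sorted(range(n), key=lambda i: conf_score[i])  # ascending, stable
    dropped = set(order[:n_drop])
    return [i for i in range(n) if i not in dropped]


# Made-up confidence scores for five samples.
scores = [0.9, 0.2, 0.6, 1.0, 0.4]

majority = kept_indices(scores, threshold=0.5)          # "majority voting"
consensus = kept_indices(scores, threshold=0.99)        # "consensus voting"
by_fraction = kept_indices(scores, frac_to_filter=0.2)  # drop lowest 20%
```

With these scores, the .5 threshold keeps the three samples scoring above one half, the .99 threshold keeps only the sample scoring 1.0, and frac_to_filter=0.2 drops just the single lowest-scoring sample regardless of its absolute score.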
Methods
__init__(classifier[, detector, threshold, …]) – Initialize self.
fit(X, y[, conf_score])
get_params([deep]) – Get parameters for this estimator.
predict(X)
predict_proba(X)
score(X, y[, sample_weight]) – Return the mean accuracy on the given test data and labels.
set_params(**params) – Set the parameters of this estimator.
Attributes
iterative