skclean.handlers.WeightedBagging

class skclean.handlers.WeightedBagging(classifier=None, detector=None, n_estimators=100, replacement=True, sampling_ratio=1.0, n_jobs=1, random_state=None, verbose=0)

Similar to regular bagging, except that cleaner samples are chosen more often. That is, a sample’s probability of being selected during bootstrapping is directly proportional to its conf_score. See [WCO+18] for details.
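The proportionality described above can be illustrated with a minimal sketch. It is not the library's implementation: the helper name selection_probabilities and the example scores are hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter


def selection_probabilities(conf_score):
    """Normalize confidence scores into selection probabilities (hypothetical helper)."""
    total = sum(conf_score)
    if total <= 0:
        raise ValueError("conf_score must contain at least one positive value")
    return [c / total for c in conf_score]


# Illustrative scores: samples 0 and 1 look clean, sample 2 looks noisy.
conf_score = [0.9, 0.9, 0.1, 0.5]
probs = selection_probabilities(conf_score)

# Draw a large bootstrap sample (with replacement) using those probabilities.
rng = random.Random(0)
draws = rng.choices(range(len(conf_score)), weights=probs, k=10_000)
counts = Counter(draws)
```

In this sketch a sample with conf_score 0.9 is drawn roughly nine times as often as one with conf_score 0.1, which is the sense in which cleaner samples dominate each bootstrap.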

Parameters
  • classifier (object) – A classifier instance supporting the scikit-learn API. Same as base_estimator of scikit-learn’s BaggingClassifier.

  • detector (BaseDetector or None, default=None) – Used to compute conf_score. Set it to None only if conf_score will be passed to fit() (e.g. when used inside a Pipeline with a BaseDetector preceding it). Otherwise a detector must be supplied during instantiation.

  • n_estimators (int, default=100) – The number of base classifiers in the ensemble.

  • replacement (bool, default=True) – Whether to sample instances with replacement (True) or without replacement (False) for each base classifier.

  • sampling_ratio (float, 0.0 to 1.0, default=1.0) – The number of samples drawn for each base classifier equals len(X) * sampling_ratio.

  • n_jobs (int, default=1) – The number of parallel CPU cores to use.

  • random_state (int, default=None) – Set this value for reproducibility

  • verbose (int, default=0) – Controls the verbosity when fitting and predicting
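As a rough sketch of how sampling_ratio and replacement might interact when drawing the training subset for one base classifier: the function draw_indices below is hypothetical, uses only the standard library, and truncating len(X) * sampling_ratio to an integer is an assumption rather than documented behaviour.

```python
import random


def draw_indices(conf_score, sampling_ratio=1.0, replacement=True, seed=None):
    """Draw sample indices for one base classifier (hypothetical helper)."""
    rng = random.Random(seed)
    n_samples = len(conf_score)
    # Assumption: the product is truncated to an integer number of draws.
    n_draws = int(n_samples * sampling_ratio)

    if replacement:
        # Indices may repeat; each draw is weighted by conf_score.
        return rng.choices(range(n_samples), weights=conf_score, k=n_draws)

    # Without replacement: draw one index at a time and remove it from the pool.
    # Note: random.choices raises ValueError if all remaining weights are zero.
    remaining = list(range(n_samples))
    weights = list(conf_score)
    chosen = []
    for _ in range(n_draws):
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return chosen


conf_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
with_repl = draw_indices(conf_score, sampling_ratio=1.0, replacement=True, seed=0)
without_repl = draw_indices(conf_score, sampling_ratio=0.5, replacement=False, seed=0)
```

With replacement=False every index appears at most once, so a sampling_ratio below 1.0 is what leaves room for the weights to influence which samples are kept.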

Methods

__init__([classifier, detector, …])

Initialize self.

decision_function(X)

Average of the decision functions of the base classifiers.

fit(X, y[, conf_score])

Build a Bagging ensemble of estimators from the training set (X, y).

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict class for X.

predict_log_proba(X)

Predict class log-probabilities for X.

predict_proba(X)

Predict class probabilities for X.

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.

set_params(**params)

Set the parameters of this estimator.

Attributes

classifier

estimators_samples_

The subset of drawn samples for each base estimator.

iterative