skclean.models.RobustForest
- class skclean.models.RobustForest(method='simple', K=5, n_estimators=100, max_leaf_nodes=128, random_state=None, n_jobs=None)
Uses a random forest to compute pairwise similarity/distance between samples, then applies a simple K-nearest-neighbor classifier to that similarity matrix. For a pair of samples, the similarity value is proportional to how frequently they fall in the same leaf. See [LM17] for details.
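The leaf co-occurrence idea behind the 'simple' similarity can be sketched in a few lines. This is an illustration only, not the library's implementation: the helper name `simple_similarity` and the leaf assignments are hypothetical, and in practice the leaf ids would come from a fitted forest (for instance, scikit-learn forests expose an `apply` method that returns the leaf index per sample and tree).

```python
from itertools import combinations


def simple_similarity(leaves):
    """Fraction of trees in which each pair of samples shares a leaf.

    leaves[i][t] is the id of the leaf that sample i reaches in tree t.
    Returns an n x n symmetric matrix as a list of lists.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    # Every sample shares all of its leaves with itself.
    sim = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        shared = sum(a == b for a, b in zip(leaves[i], leaves[j]))
        sim[i][j] = sim[j][i] = shared / n_trees
    return sim


# Hypothetical leaf assignments: 4 samples (rows) across 4 trees (columns).
leaves = [
    [0, 2, 1, 3],
    [0, 2, 1, 0],
    [1, 2, 0, 0],
    [1, 0, 0, 2],
]
sim = simple_similarity(leaves)
print(sim[0][1])  # samples 0 and 1 share a leaf in 3 of 4 trees
```

A distance can then be derived from the similarity, for example as `1 - sim[i][j]`; whether the library uses exactly this conversion is an assumption here.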
- Parameters
method (string, default='simple') – How the similarity matrix is computed. With the ‘simple’ method, the similarity value is the percentage of trees in which two samples fall in the same leaf. The ‘weighted’ method also takes the size of those leaves into account; it exactly matches the algorithm in the paper cited above, but it is computationally slow.
K (int, default=5) – Number of nearest neighbors to consider for the final classification.
n_estimators (int, default=100) – Number of trees in the random forest.
max_leaf_nodes (int, default=128) – Maximum number of leaves in each tree.
n_jobs (int, default=None) – Number of parallel CPU cores to use.
random_state (int, default=None) – Set this value for reproducibility
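To show how K enters the final classification, here is a minimal sketch of a nearest-neighbor vote over precomputed similarities. It assumes an unweighted majority vote among the K most similar training samples; the function name `knn_predict` and the sample values are hypothetical, and the library's actual voting rule may differ.

```python
from collections import Counter


def knn_predict(sim_to_train, train_labels, k=5):
    """Majority label among the k training samples most similar to a query.

    sim_to_train[j] is the similarity between the query sample and
    training sample j; higher means closer.
    """
    # Rank training indices from most to least similar.
    ranked = sorted(
        range(len(train_labels)),
        key=lambda j: sim_to_train[j],
        reverse=True,
    )
    votes = Counter(train_labels[j] for j in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical similarities of one query sample to five training samples.
sim_to_train = [0.9, 0.1, 0.8, 0.7, 0.2]
train_labels = ["a", "b", "a", "b", "b"]

pred_k3 = knn_predict(sim_to_train, train_labels, k=3)
pred_k5 = knn_predict(sim_to_train, train_labels, k=5)
print(pred_k3, pred_k5)
```

With these values, a small K follows the few closest neighbors while a larger K lets more distant samples outvote them, which is the trade-off the K parameter controls.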
Methods
__init__([method, K, n_estimators, …]) – Initialize self.
fit(X, y)
get_params([deep]) – Get parameters for this estimator.
pairwise_distance(train_X, test_X)
predict(X)
score(X, y[, sample_weight]) – Return the mean accuracy on the given test data and labels.
set_params(**params) – Set the parameters of this estimator.