Version 0.2
Changelog
Bug fixes
- Fixed a bug in under_sampling.NearMiss, which was not picking the right samples during under-sampling with method 3. By `Guillaume Lemaitre`_.
- Fixed a bug in ensemble.EasyEnsemble: corrected the random_state generation. By `Guillaume Lemaitre`_ and `Christos Aridas`_.
- Fixed a bug in under_sampling.RepeatedEditedNearestNeighbours: added an additional stopping criterion to prevent the minority class from becoming a majority class or a class from disappearing. By `Guillaume Lemaitre`_.
- Fixed a bug in under_sampling.AllKNN: added stopping criteria to prevent the minority class from becoming a majority class or a class from disappearing. By `Guillaume Lemaitre`_.
- Fixed a bug in under_sampling.CondensedNearestNeighbour: corrected the list of indices returned. By `Guillaume Lemaitre`_.
- Fixed a bug in ensemble.BalanceCascade: fixed the issue with obtaining a single array when desired. By `Guillaume Lemaitre`_.
- Fixed a bug in pipeline.Pipeline: a Pipeline can now be embedded in another Pipeline. #231 by `Christos Aridas`_.
- Fixed a bug in pipeline.Pipeline: fixed the issue with putting two samplers in the same Pipeline. #188 by `Christos Aridas`_.
- Fixed a bug in under_sampling.CondensedNearestNeighbour: corrected the shape of sel_x when only one sample is selected. By `Aliaksei Halachkin`_.
- Fixed a bug in under_sampling.NeighbourhoodCleaningRule, selecting neighbours instead of minority class misclassified samples. #230 by `Aleksandr Loskutov`_.
- Fixed a bug in over_sampling.ADASYN: corrected the creation of a new sample so that it lies between the minority sample and the nearest neighbour. #235 by `Rafael Wampfler`_.
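The ADASYN fix in the last item concerns where a synthetic sample is placed. As a minimal sketch of the stated property only (not the library's actual code), a new point can be generated by linear interpolation between a minority sample and one of its nearest neighbours. The helper name interpolate_sample and the use of plain Python lists are assumptions made for illustration.

```python
import random


def interpolate_sample(x, neighbour, rng):
    """Return a point on the segment between x and neighbour.

    Hypothetical helper for illustration; not part of imbalanced-learn.
    """
    # gap is drawn from [0, 1): 0 reproduces x, values near 1 approach
    # the neighbour, so the result never leaves the segment.
    gap = rng.random()
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbour)]


rng = random.Random(0)
minority_sample = [1.0, 2.0]
nearest_neighbour = [3.0, 1.0]
new_sample = interpolate_sample(minority_sample, nearest_neighbour, rng)

# Every coordinate stays within the range spanned by the two endpoints.
for xi, ni, si in zip(minority_sample, nearest_neighbour, new_sample):
    assert min(xi, ni) <= si <= max(xi, ni)
```

A gap outside the unit interval would push the synthetic point beyond one of the two endpoints, which is the kind of placement the fix rules out.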
New features
- Added the AllKNN under-sampling technique. By `Dayvid Oliveira`_.
- Added a metrics module implementing scoring functions specific to the class-imbalance problem. #204 by `Guillaume Lemaitre`_ and `Christos Aridas`_.
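The metrics entry does not list the functions it adds. As an illustration of the kind of score that suits imbalanced problems, the sketch below computes sensitivity, specificity, and their geometric mean by hand for a binary problem. The function name and the toy labels are assumptions, and this is not the module's implementation.

```python
import math


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary problem.

    Hypothetical helper for illustration; not imbalanced-learn's code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Toy data: 8 majority samples (0) and 2 minority samples (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A classifier that always predicts the majority class.
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
sens, spec = sensitivity_specificity(y_true, y_pred)
g_mean = math.sqrt(sens * spec)
```

On this toy data, plain accuracy is 0.8 even though no minority sample is detected, while the geometric mean of sensitivity and specificity is 0.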
Enhancement
- Added support for bumpversion. By `Guillaume Lemaitre`_.
- Validate the type of target in binary samplers. A warning is raised for the moment. By `Guillaume Lemaitre`_ and `Christos Aridas`_.
- Switched from the cross_validation module to the model_selection module, following the sklearn deprecation cycle. By `Dayvid Oliveira`_ and `Christos Aridas`_.
API changes summary
- size_ngh has been deprecated in combine.SMOTEENN. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- size_ngh has been deprecated in under_sampling.EditedNearestNeighbors. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- size_ngh has been deprecated in under_sampling.CondensedNearestNeighbour. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- size_ngh has been deprecated in under_sampling.OneSidedSelection. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- size_ngh has been deprecated in under_sampling.NeighbourhoodCleaningRule. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- size_ngh has been deprecated in under_sampling.RepeatedEditedNearestNeighbours. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- size_ngh has been deprecated in under_sampling.AllKNN. Use n_neighbors instead. By `Guillaume Lemaitre`_, `Christos Aridas`_, and `Dayvid Oliveira`_.
- Two base classes, BaseBinaryclassSampler and BaseMulticlassSampler, have been created to handle the target type and raise a warning in case of abnormality. By `Guillaume Lemaitre`_ and `Christos Aridas`_.
- random_state is now assigned in the SamplerMixin initialization. By `Guillaume Lemaitre`_.
- combine.SMOTEENN and combine.SMOTETomek now take estimators instead of parameters. The corresponding list of parameters has therefore been deprecated. By `Guillaume Lemaitre`_ and `Christos Aridas`_.
- k has been deprecated in over_sampling.ADASYN. Use n_neighbors instead. #183 by `Guillaume Lemaitre`_.
- k and m have been deprecated in over_sampling.SMOTE. Use k_neighbors and m_neighbors instead. #182 by `Guillaume Lemaitre`_.
- n_neighbors accepts a KNeighborsMixin-based object for under_sampling.EditedNearestNeighbors, under_sampling.CondensedNearestNeighbour, under_sampling.NeighbourhoodCleaningRule, under_sampling.RepeatedEditedNearestNeighbours, and under_sampling.AllKNN. #109 by `Guillaume Lemaitre`_.
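Several entries in this summary rename a constructor parameter while keeping the old name available during a deprecation period. One common way to do this in Python is sketched below. The class ToySampler and its default value are hypothetical and do not reproduce imbalanced-learn's implementation.

```python
import warnings


class ToySampler:
    """Hypothetical sampler showing a size_ngh -> n_neighbors rename."""

    def __init__(self, size_ngh=None, n_neighbors=3):
        if size_ngh is not None:
            # The old name still works, but warns and forwards its value.
            warnings.warn(
                "'size_ngh' is deprecated; use 'n_neighbors' instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            n_neighbors = size_ngh
        self.n_neighbors = n_neighbors


# New-style call: no warning is emitted.
sampler = ToySampler(n_neighbors=5)

# Old-style call: same configuration, plus a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = ToySampler(size_ngh=5)
```

Code that still passes the old keyword keeps working for now, and the warning tells callers which name to switch to before the old one is removed.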
Documentation changes
- Replaced some remaining UnbalancedDataset occurrences. By `Francois Magimel`_.
- Added doctests to the documentation. By `Guillaume Lemaitre`_.