forpy
2
Namespaces | |
threading | |
Classes | |
class | ClassificationError |
Computes the classification error as \(1-\max(p_i)\). More... | |
class | ClassificationForest |
class | ClassificationLeaf |
Stores the probability distributions for n_classes at a leaf. More... | |
class | ClassificationOpt |
Optimize split thresholds to optimize classification results. More... | |
class | ClassificationTree |
struct | DeciderDesk |
Desk for decider training. More... | |
struct | Desk |
Main thread desk object. More... | |
struct | Empty |
A struct to represent an empty variant. More... | |
class | EmptyException |
class | EntropyGain |
Calculates the gain as difference of current entropy and the weighted sum of subgroup entropies. More... | |
class | FastClassOpt |
Optimize split thresholds to optimize classification results. More... | |
class | FastDecider |
A classifier manager for weak classifiers with a filter function, a feature calculation function and a thresholding. More... | |
class | FastDProv |
Use the provided data plain throughout the training. More... | |
class | Forest |
class | ForpyException |
struct | get_core |
Get the core datatype with removed pointer, reference and const modifiers. More... | |
class | IDataProvider |
A data provider for the training of one tree. More... | |
class | IDecider |
Interface for the decider. It does the optimization of the deciding classifier for each node and stores the parameters. More... | |
class | IEntropyFunction |
Interface for an entropy calculation functor. More... | |
class | IGainCalculator |
Interface for a gain calculator class. More... | |
class | ILeaf |
Stores and returns leaf values, and combines them to forest results. More... | |
class | InducedEntropy |
Computes the induced p entropy. More... | |
class | IThreshOpt |
Find an optimal threshold. More... | |
struct | LeafDesk |
Desk for leaf manager training. More... | |
struct | MatEqVis |
Comparison visitor. More... | |
struct | Name |
Struct for translating primitive types to a short name. More... | |
struct | Name< double > |
struct | Name< float > |
struct | Name< int > |
struct | Name< int16_t > |
struct | Name< uint > |
struct | Name< uint8_t > |
struct | ptr_variant |
struct | RandomDesk |
Desk for coordinating the random engines. More... | |
class | RegressionForest |
class | RegressionLeaf |
Manages the leaf nodes of regression trees. More... | |
class | RegressionOpt |
Optimize split thresholds to optimize regression results (MSE). More... | |
class | RegressionTree |
class | RenyiEntropy |
Computes the Renyi entropy. More... | |
class | SamplingWithoutReplacement |
A lazy evaluation sampling without replacement. More... | |
class | ShannonEntropy |
Computes the classical Shannon entropy. More... | |
struct | SplitOptRes |
class | ThreadControl |
struct | TodoMark |
Stores the parameters for one marked tree node. More... | |
class | Tree |
The main tree class for the forpy framework. More... | |
struct | TreeDesk |
Desk for tree training. More... | |
class | TsallisEntropy |
Computes the Tsallis entropy. More... | |
struct | vector_hasher |
A simple vector<size_t> hasher. More... | |
struct | VReset |
Call the reset operation on a pointer variant. More... | |
Typedefs | |
template<typename DT > | |
using | Mat = Eigen::Matrix< DT, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor > |
Parameterized Matrix type (row major). More... | |
template<typename DT > | |
using | MatCM = Eigen::Matrix< DT, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > |
Parameterized column major matrix type. More... | |
template<typename DT > | |
using | MatCRef = Eigen::Ref< const Mat< DT > > |
Parameterized const matrix ref type. More... | |
template<typename DT > | |
using | MatCMCRef = Eigen::Ref< const MatCM< DT > > |
Parameterized const column major matrix ref type. More... | |
template<typename DT > | |
using | MatRef = Eigen::Ref< Mat< DT > > |
Parameterized standard non-const matrix ref type. More... | |
template<typename DT > | |
using | Vec = Eigen::Matrix< DT, Eigen::Dynamic, 1, Eigen::ColMajor > |
template<typename DT > | |
using | VecRM = Eigen::Matrix< DT, 1, Eigen::Dynamic, Eigen::RowMajor > |
template<typename DT > | |
using | VecRef = Eigen::Ref< Vec< DT > > |
template<typename DT > | |
using | VecRMRef = Eigen::Ref< VecRM< DT > > |
template<typename DT > | |
using | VecCRef = Eigen::Ref< const Vec< DT > > |
template<typename DT > | |
using | VecCMap = Eigen::Map< const Eigen::Matrix< DT, 1, Eigen::Dynamic, Eigen::RowMajor >, Eigen::Unaligned, Eigen::InnerStride<> > |
typedef size_t | id_t |
Element id type. More... | |
typedef std::function< id_t(const Data< MatCRef > &, const id_t &, const std::function< void(void *)> &)> | node_predf |
typedef unsigned int | uint |
Convenience typedef for unsigned int. More... | |
typedef mu::variant< SplitOptRes< float >, SplitOptRes< double >, SplitOptRes< uint >, SplitOptRes< uint8_t > > | OptSplitV |
typedef std::pair< id_t, id_t > | interv_t |
typedef std::pair< ptrdiff_t, ptrdiff_t > | regint_t |
typedef std::vector< std::pair< std::shared_ptr< std::vector< size_t > >, std::shared_ptr< std::vector< float > const > > > | usage_map_t |
Describes how each sample is used for each tree. More... | |
typedef std::pair< std::shared_ptr< std::vector< id_t > >, id_t > | include_pair_t |
A pair containing information about newly included samples. More... | |
typedef std::unordered_set< std::vector< size_t >, vector_hasher > | proposal_set_t |
The type of a set of dimension selections. More... | |
template<template< typename > class STOT> | |
using | DataStore = typename mu::variant< std::shared_ptr< const STOT< float > >, std::shared_ptr< const STOT< double > >, std::shared_ptr< const STOT< uint > >, std::shared_ptr< const STOT< uint8_t > >> |
Variant for storing shared_ptrs to the stored data matrix type. More... | |
template<template< typename > class STOT> | |
using | Data = typename mu::variant< Empty, STOT< float >, STOT< double >, STOT< uint >, STOT< uint8_t > > |
Storing a variant of the provided data container type. More... | |
using | DataV = typename mu::variant< std::vector< float >, std::vector< double >, std::vector< uint >, std::vector< uint8_t > > |
Enumerations | |
enum | ECompletionLevel { ECompletionLevel::Node, ECompletionLevel::Level, ECompletionLevel::Complete } |
Specifies the completion level for one training step. More... | |
enum | EThresholdSelection { EThresholdSelection::LessEqOnly, EThresholdSelection::GreaterOnly, EThresholdSelection::Both } |
Specifies which thresholds should be used for a decision. More... | |
enum | ESearchType { ESearchType::DFS, ESearchType::BFS } |
Functions | |
void | init () |
const MatCRef< float > | FORPY_ZERO_MATR (Mat< float >::Zero(0, 1)) |
template<typename T > | |
static std::vector< size_t > | argsort (const T *v, const size_t n) |
Highly efficient argsort realized with few STL commands. More... | |
template<typename T > | |
static std::vector< size_t > | argsort (const std::vector< T > &v) |
Highly efficient argsort realized with few STL commands. More... | |
template<typename T > | |
static bool | safe_pos_sum_lessoe_than (const std::vector< T > &vec, const T &limit) |
Tests whether the sum of all elements in vec is less than limit. More... | |
template<typename T > | |
static bool | safe_pos_sum_lessoe_than (const std::vector< T > &vec1, const std::vector< T > &vec2, const T &limit) |
Tests whether the sum of all elements in vec1 and vec2 is less than limit. More... | |
template<typename T > | |
static bool | safe_pos_sum_lessoe_than (const std::vector< T > &vec) |
Tests whether the sum of all elements in vec is less than the numeric limit of its type. More... | |
template<typename T > | |
static bool | safe_pos_sum_lessoe_than (const std::vector< T > &vec1, const std::vector< T > &vec2) |
Tests whether the sum of all elements in vec1 and vec2 is less than the numeric limit of their type. More... | |
static bool | check_elem_ids_ok (const size_t &n_samples, const std::vector< size_t > &elem_ids) |
Tests whether all element ids are valid. More... | |
int | ipow (int base, unsigned int exp) |
Computes an int power by an unsigned int. More... | |
float | fpowi (float base, unsigned int exp) |
Computes a float power by an unsigned int. More... | |
static size_t | hash_fnv_1a (const unsigned char *key, const size_t &len) |
Quick and easy implementation of 64-bit FNV 1a hash. More... | |
static int64_t | ibinom (const int &n, int k) |
Integer binomial with overflow detection. More... | |
template<typename T > | |
static std::vector< T > | unique_indices (T num, T min, const T &max, std::mt19937 *random_engine, bool return_sorted=false) |
Sampling without replacement. More... | |
template<typename V , class... VarArgs> | |
V | GetWithDefVar (const std::unordered_map< std::string, mu::variant< VarArgs... >> &m, std::string const &key, const V &defval) |
Variables | |
const int | DLOG_FD_V = 100 |
const size_t | LOG_FD_NID = 12043 |
const bool | LOG_FD_ALLN = true |
static const bool | SKLEARN_COMPAT |
static const float | ENTROPY_EPS = 1E-7 |
const float | CLASSOPT_EPS = 1E-7f |
Classification epsilon. No differences less than this are considered existent. This is relevant for: More... | |
const float | REGOPT_EPS = 1E-7f |
Regression epsilon. No differences less than this are considered existent. This is relevant for: More... | |
const double | GAIN_EPS = 1E-7 |
static const double | D_PI = 4. * atan(1.) |
static const float | TWO_PI = static_cast<float>(2. * D_PI) |
static const float | TWO_PI_E = TWO_PI * expf(1.f) |
const int | DLOG_COPT_V = 100 |
Variables to control debugging and log output for the forpy::ClassificationOpt. More... | |
const size_t | LOG_COPT_NID = 0 |
Variables to control debugging and log output for the forpy::ClassificationOpt. More... | |
const bool | LOG_COPT_ALLN = false |
Variables to control debugging and log output for the forpy::ClassificationOpt. More... | |
const int | DLOG_FCOPT_V = 100 |
Variables to control debugging and log output for the forpy::FastClassOpt. More... | |
const size_t | LOG_FCOPT_NID = 3 |
Variables to control debugging and log output for the forpy::FastClassOpt. More... | |
const bool | LOG_FCOPT_ALLN = true |
Variables to control debugging and log output for the forpy::FastClassOpt. More... | |
const int | DLOG_ROPT_V = 1 |
Variables to control debugging and log output for the forpy::RegressionOpt. More... | |
const size_t | LOG_ROPT_NID = 12043 |
Variables to control debugging and log output for the forpy::RegressionOpt. More... | |
const bool | LOG_ROPT_ALLN = false |
Variables to control debugging and log output for the forpy::RegressionOpt. More... | |
using forpy::Data = typename mu::variant<Empty, STOT<float>, STOT<double>, STOT<uint>, STOT<uint8_t> > |
using forpy::DataStore = typename mu::variant< std::shared_ptr<const STOT<float> >, std::shared_ptr<const STOT<double> >, std::shared_ptr<const STOT<uint> >, std::shared_ptr<const STOT<uint8_t> >> |
using forpy::DataV = typename mu::variant<std::vector<float>, std::vector<double>, std::vector<uint>, std::vector<uint8_t> > |
typedef size_t forpy::id_t |
typedef std::pair<std::shared_ptr<std::vector<id_t> >, id_t> forpy::include_pair_t |
typedef std::pair<id_t, id_t> forpy::interv_t |
using forpy::Mat = Eigen::Matrix<DT, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> |
using forpy::MatCM = Eigen::Matrix<DT, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor> |
using forpy::MatCMCRef = Eigen::Ref<const MatCM<DT> > |
using forpy::MatCRef = Eigen::Ref<const Mat<DT> > |
using forpy::MatRef = Eigen::Ref<Mat<DT> > |
typedef std::function<id_t(const Data<MatCRef> &, const id_t &, const std::function<void(void *)> &)> forpy::node_predf |
typedef mu::variant<SplitOptRes<float>, SplitOptRes<double>, SplitOptRes<uint>, SplitOptRes<uint8_t> > forpy::OptSplitV |
typedef std::unordered_set<std::vector<size_t>, vector_hasher> forpy::proposal_set_t |
typedef std::pair<ptrdiff_t, ptrdiff_t> forpy::regint_t |
typedef unsigned int forpy::uint |
typedef std::vector<std::pair<std::shared_ptr<std::vector<size_t> >, std::shared_ptr<std::vector<float> const> > > forpy::usage_map_t |
using forpy::Vec = Eigen::Matrix<DT, Eigen::Dynamic, 1, Eigen::ColMajor> |
using forpy::VecCMap = Eigen::Map<const Eigen::Matrix<DT, 1, Eigen::Dynamic, Eigen::RowMajor>, Eigen::Unaligned, Eigen::InnerStride<> > |
using forpy::VecCRef = Eigen::Ref<const Vec<DT> > |
using forpy::VecRef = Eigen::Ref<Vec<DT> > |
using forpy::VecRM = Eigen::Matrix<DT, 1, Eigen::Dynamic, Eigen::RowMajor> |
using forpy::VecRMRef = Eigen::Ref<VecRM<DT> > |
static std::vector< size_t > forpy::argsort (const T *v, const size_t n) |
Highly efficient argsort realized with few STL commands.
Inspired by http://stackoverflow.com/questions/1577475/c-sorting-and-keeping-track-of-indexes.
v | Vector to sort. |
n | Number of vector elements. |
static std::vector< size_t > forpy::argsort (const std::vector< T > &v) |
Highly efficient argsort realized with few STL commands.
Inspired by http://stackoverflow.com/questions/1577475/c-sorting-and-keeping-track-of-indexes.
v | Vector to sort. |
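The idea behind both argsort overloads can be illustrated with a few STL calls. The following is a minimal sketch, not forpy's actual implementation: the name argsort_sketch is hypothetical, and the use of std::stable_sort (so that equal elements keep their original order) is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not forpy's code): returns the indices that would
// sort the first n elements of v in ascending order.
template <typename T>
std::vector<std::size_t> argsort_sketch(const T *v, const std::size_t n) {
  assert(v != nullptr || n == 0);
  std::vector<std::size_t> idx(n);
  std::iota(idx.begin(), idx.end(), std::size_t{0});  // 0, 1, ..., n-1
  // Sort the indices by the values they point to; stable_sort keeps the
  // original relative order of equal elements.
  std::stable_sort(idx.begin(), idx.end(),
                   [v](std::size_t a, std::size_t b) { return v[a] < v[b]; });
  return idx;
}

// Convenience overload for vectors, forwarding to the pointer version.
template <typename T>
std::vector<std::size_t> argsort_sketch(const std::vector<T> &v) {
  return argsort_sketch(v.data(), v.size());
}
```

For the input {3.0, 1.0, 2.0} the sketch returns the indices {1, 2, 0}.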
inline float forpy::fpowi (float base, unsigned int exp) |
Computes a float power by an unsigned int.
Fast implementation similar to the cryptographic fast integer power ipow. For exp values up to and including 5, the calculation is explicitly hard-coded.
Definition at line 56 of file exponentials.h.
V forpy::GetWithDefVar (const std::unordered_map< std::string, mu::variant< VarArgs... >> &m, std::string const &key, const V &defval) |
static size_t forpy::hash_fnv_1a (const unsigned char *key, const size_t &len) |
Quick and easy implementation of 64-bit FNV 1a hash.
FNV 1a is easy to implement and still has good enough characteristics to be used for this application.
See http://www.isthe.com/chongo/tech/comp/fnv/index.html and for comparisons and more information http://eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx and http://burtleburtle.net/bob.
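As an illustration of the hash described above, here is a minimal sketch of 64-bit FNV-1a using the published offset basis and prime. The name fnv1a64_sketch and the std::uint64_t return type are choices of this example; the documented function returns size_t and may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of 64-bit FNV-1a (not forpy's code).
std::uint64_t fnv1a64_sketch(const unsigned char *key, std::size_t len) {
  assert(key != nullptr || len == 0);
  std::uint64_t hash = 14695981039346656037ULL;  // 64-bit FNV offset basis
  for (std::size_t i = 0; i < len; ++i) {
    hash ^= key[i];            // "1a" order: XOR the byte in first ...
    hash *= 1099511628211ULL;  // ... then multiply by the FNV prime (mod 2^64)
  }
  return hash;
}
```

Hashing an empty key returns the offset basis unchanged, which is a quick sanity check for any implementation.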
static int64_t forpy::ibinom (const int &n, int k) |
Integer binomial with overflow detection.
The code is based on the following short article: http://etceterology.com/fast-binomial-coefficients. The article also introduces a lookup-table variant; it is not used here, since it is not particularly useful for this use case. The code has been thoroughly reviewed and tested.
Definition at line 30 of file sampling.h.
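One way to compute a binomial coefficient with overflow detection is the multiplicative formula with a check before each multiplication. The sketch below is an assumption-laden illustration, not forpy's code: the name ibinom_sketch, the return value 0 for invalid arguments, and the -1 overflow marker are all hypothetical. The check is conservative, so it may report overflow for an intermediate product even when the final result would fit.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical sketch (not forpy's code): returns C(n, k), 0 for invalid
// arguments, or -1 if a product would not fit into int64_t.
std::int64_t ibinom_sketch(int n, int k) {
  if (n < 0 || k < 0 || k > n) return 0;
  if (k > n - k) k = n - k;  // use the symmetry C(n, k) == C(n, n - k)
  std::int64_t result = 1;
  for (int i = 1; i <= k; ++i) {
    const std::int64_t factor = n - k + i;
    // Conservative overflow check before multiplying.
    if (result > std::numeric_limits<std::int64_t>::max() / factor) return -1;
    // Before this step result == C(n-k+i-1, i-1), so the division is exact
    // and afterwards result == C(n-k+i, i).
    result = result * factor / i;
  }
  assert(result >= 1);  // every valid binomial coefficient is at least 1
  return result;
}
```

For example, ibinom_sketch(52, 5) yields 2598960, while ibinom_sketch(200, 100) reports overflow.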
inline int forpy::ipow (int base, unsigned int exp) |
Computes an int power by an unsigned int.
Fast implementation using in-place multiplication and bit shifts only. The original version can be found at http://stackoverflow.com/questions/101439/the-most-efficient-way-to-implement-an-integer-based-power-function-powint-int/101613#101613. The signature has been adjusted to take an unsigned int exp, which avoids the problems with negative exponents mentioned there.
Definition at line 35 of file exponentials.h.
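The scheme described for ipow is commonly known as exponentiation by squaring. The following minimal sketch shows the technique under stated assumptions: ipow_sketch and mul_fits are hypothetical names, and the debug-only overflow assertions are an addition of this example that the documented function is not known to have.

```cpp
#include <cassert>
#include <limits>

// Debug helper for this sketch: true if a * b is representable as int.
inline bool mul_fits(int a, int b) {
  const long long p = static_cast<long long>(a) * b;
  return p >= std::numeric_limits<int>::min() &&
         p <= std::numeric_limits<int>::max();
}

// Hypothetical sketch of exponentiation by squaring (not forpy's code).
int ipow_sketch(int base, unsigned int exp) {
  int result = 1;
  while (exp != 0) {
    if (exp & 1u) {  // lowest exponent bit set: multiply this power in
      assert(mul_fits(result, base));
      result *= base;
    }
    exp >>= 1;       // move on to the next exponent bit
    if (exp != 0) {  // square only if another round follows
      assert(mul_fits(base, base));
      base *= base;
    }
  }
  return result;
}
```

The loop runs once per bit of exp, so ipow_sketch(2, 10) needs four iterations instead of ten multiplications.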
static std::vector< T > forpy::unique_indices (T num, T min, const T &max, std::mt19937 *random_engine, bool return_sorted=false) |
Sampling without replacement.
Returns a set of num unique numbers in the range [min, max]. T must be an integral datatype.
This implementation does not need to be stateful, since the algorithm completes in a single pass. It is inspired by various algorithms from the sources below and, according to the author, surpasses them in efficiency and in the distribution of the values. The algorithm it was mainly inspired by iterates over the sample range once and picks the next number by a random distribution; in that original version, the random distribution is badly designed.
See: http://codegolf.stackexchange.com/questions/4772/random-sampling-without-replacement, http://www.cplusplus.com/reference/cstdlib/rand/ and http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement
num | Number of examples to be selected from the range. |
min | Minimum of range (inclusive). |
max | Maximum of range (inclusive). |
random_engine | The random engine to use for random number generation. |
return_sorted | If true, returns the numbers sorted (no overhead), otherwise they will be shuffled (overhead). |
Definition at line 156 of file sampling.h.
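A single-pass scheme of the kind described above can be sketched with Knuth's selection sampling (Algorithm S), which naturally produces sorted output and needs a shuffle otherwise. This is a stand-in technique chosen for illustration; it is not necessarily the algorithm forpy uses. The name unique_indices_sketch is hypothetical, and the sketch assumes that num is non-negative, that max - min + 1 does not overflow T, and that T is a type accepted by std::uniform_int_distribution.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Hypothetical sketch using Knuth's selection sampling (not forpy's code).
template <typename T>
std::vector<T> unique_indices_sketch(T num, T min, T max, std::mt19937 *rng,
                                     bool return_sorted = false) {
  assert(rng != nullptr && min <= max);
  T remaining = max - min + 1;  // candidates not yet visited
  assert(num <= remaining);
  std::vector<T> picked;
  T needed = num;  // values still to be selected
  for (T offset = 0; needed > 0; ++offset, --remaining) {
    // Select the current candidate with probability needed / remaining.
    std::uniform_int_distribution<T> dist(0, remaining - 1);
    if (dist(*rng) < needed) {
      picked.push_back(static_cast<T>(min + offset));
      --needed;
    }
  }
  // The pass visits candidates in ascending order, so the result is already
  // sorted; shuffling is the extra step when sorted output is not requested.
  if (!return_sorted) std::shuffle(picked.begin(), picked.end(), *rng);
  return picked;
}
```

Calling unique_indices_sketch<int>(5, 10, 20, &engine, true) with a seeded std::mt19937 engine returns five distinct, ascending values from [10, 20].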
const float forpy::CLASSOPT_EPS = 1E-7f |
Classification epsilon. No differences less than this are considered existent. This is relevant for:
Definition at line 38 of file classification_opt.h.
static const double forpy::D_PI = 4. * atan(1.) |
Portable double pi value.
Definition at line 12 of file exponentials.h.
const int forpy::DLOG_COPT_V = 100 |
Variables to control debugging and log output for the forpy::ClassificationOpt.
Definition at line 21 of file classification_opt.h.
const int forpy::DLOG_FCOPT_V = 100 |
Variables to control debugging and log output for the forpy::FastClassOpt.
Definition at line 22 of file fastclassopt.h.
const int forpy::DLOG_FD_V = 100 |
Definition at line 24 of file fastdecider.h.
const int forpy::DLOG_ROPT_V = 1 |
Variables to control debugging and log output for the forpy::RegressionOpt.
Definition at line 19 of file regression_opt.h.
static const float forpy::ENTROPY_EPS = 1E-7 |
Definition at line 18 of file ientropyfunction.h.
const bool forpy::LOG_COPT_ALLN = false |
Variables to control debugging and log output for the forpy::ClassificationOpt.
Definition at line 23 of file classification_opt.h.
const size_t forpy::LOG_COPT_NID = 0 |
Variables to control debugging and log output for the forpy::ClassificationOpt.
Definition at line 22 of file classification_opt.h.
const bool forpy::LOG_FCOPT_ALLN = true |
Variables to control debugging and log output for the forpy::FastClassOpt.
Definition at line 24 of file fastclassopt.h.
const size_t forpy::LOG_FCOPT_NID = 3 |
Variables to control debugging and log output for the forpy::FastClassOpt.
Definition at line 23 of file fastclassopt.h.
const bool forpy::LOG_FD_ALLN = true |
Definition at line 26 of file fastdecider.h.
const size_t forpy::LOG_FD_NID = 12043 |
Definition at line 25 of file fastdecider.h.
const bool forpy::LOG_ROPT_ALLN = false |
Variables to control debugging and log output for the forpy::RegressionOpt.
Definition at line 21 of file regression_opt.h.
const size_t forpy::LOG_ROPT_NID = 12043 |
Variables to control debugging and log output for the forpy::RegressionOpt.
Definition at line 20 of file regression_opt.h.
const float forpy::REGOPT_EPS = 1E-7f |
Regression epsilon. No differences less than this are considered existent. This is relevant for:
Definition at line 36 of file regression_opt.h.
static const float forpy::TWO_PI = static_cast<float>(2. * D_PI) |
Precomputed value for the computation of the differential induced entropy.
Definition at line 16 of file exponentials.h.
static const float forpy::TWO_PI_E = TWO_PI * expf(1.f) |
Precomputed value for the computation of the differential Shannon entropy.
Definition at line 22 of file exponentials.h.