pyliblinear package
Submodules
Module contents
- Copyright
Copyright 2015 - 2022 André Malo or his licensors, as applicable
- License
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
pyliblinear - a liblinear python API
- class pyliblinear.FeatureMatrix
Feature matrix to be used for training or prediction.
- static __new__(cls, iterable, assign_labels=None)
Create FeatureMatrix instance from a single iterable. If assign_labels is omitted or None, the iterable is expected to provide 2-tuples, containing the label and the accompanying feature vector. If assign_labels is passed and not None, the iterable should only provide the feature vectors. All labels are then assigned to the value of assign_labels.
- Parameters
iterable (iterable) – Iterable providing the feature vectors and/or tuples of label and feature vector. See description.
assign_labels (int) – Value to be assigned to all labels. In this case the iterable is expected to provide only the feature vectors.
- Returns
New feature matrix instance
- Return type
pyliblinear.FeatureMatrix
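A minimal sketch of the two construction modes described above. It assumes that feature vectors can be passed as dicts mapping a feature index to a value (the form that features() is documented to return). The sample data and variable names are made up for illustration, and the import is guarded only so that the snippet also runs where pyliblinear is not installed.

```python
try:
    from pyliblinear import FeatureMatrix
except ImportError:  # lets the sketch run without the package installed
    FeatureMatrix = None

# Hypothetical sparse feature vectors: {feature index: value}
positives = [{1: 1.0, 3: 0.5}, {1: 0.8, 2: 0.1}]
negatives = [{2: 1.0}, {2: 0.7, 3: 0.9}]

# Mode 1: the iterable provides (label, vector) 2-tuples.
labelled = [(1, vec) for vec in positives] + [(2, vec) for vec in negatives]

if FeatureMatrix is not None:
    matrix = FeatureMatrix(labelled)
    # Mode 2: vectors only; every row receives the value of assign_labels.
    all_positive = FeatureMatrix(positives, assign_labels=1)
    print(matrix.height, all_positive.height)
```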
- features(self)
Return the features as iterator of dicts.
- Returns
The feature vectors
- Return type
iterable
- from_iterables(cls, labels, features)
Create FeatureMatrix instance from two separate iterables: labels and features.
- Parameters
labels (iterable) – Iterable providing the labels per feature vector (assigned by order)
features (iterable) – Iterable providing the feature vector per label (assigned by order)
- Returns
New feature matrix instance
- Return type
pyliblinear.FeatureMatrix
- Raises
ValueError – The lengths of the iterables differ
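As a rough illustration, labels and feature vectors that live in separate sequences can be handed over without zipping them first. The data below is hypothetical, dict-shaped feature vectors are an assumption, and the guarded import only keeps the snippet runnable where pyliblinear is missing.

```python
try:
    from pyliblinear import FeatureMatrix
except ImportError:  # lets the sketch run without the package installed
    FeatureMatrix = None

# Hypothetical data: the n-th label belongs to the n-th feature vector.
labels = [1, 1, 2]
features = [{1: 1.0}, {1: 0.9, 2: 0.1}, {2: 1.0}]

if FeatureMatrix is not None:
    matrix = FeatureMatrix.from_iterables(labels, features)
    # Both accessors return iterators, so materialize them for printing.
    print(list(matrix.labels()))
    print(list(matrix.features()))
```

Because the documentation states that differing lengths raise ValueError, it may be worth checking the lengths up front when both inputs are sized sequences.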
- height
The matrix height (number of labels and vectors).
- Type
int
- labels(self)
Return the labels as iterator.
- Returns
The labels
- Return type
iterable
- load(cls, file)
Create FeatureMatrix instance from a file.
Each line of the file contains the label and the accompanying sparse feature vector, separated by a space/tab sequence. The feature vector consists of index/value pairs. The index and the value are separated by a colon (:). The pairs are separated by space/tab sequences. Accepted line endings are \r, \n and \r\n.
All numbers are represented as strings parsable either as ints (for indexes) or doubles (for values and labels).
Note that the exact I/O exceptions depend on the stream passed in.
- Parameters
file (file or str) – Either a readable stream or a filename. If the passed object provides a read attribute/method, it is treated as a readable file stream, otherwise as a filename. If it is a stream, the stream is read from the current position and remains open after hitting EOF. In case of a filename, the accompanying file is opened in text mode, read from the beginning and closed afterwards.
- Returns
New feature matrix instance
- Return type
pyliblinear.FeatureMatrix
- Raises
IOError – Error reading the file
ValueError – Error parsing the file
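To make the line format concrete, here is a small pure-Python reading of it. This is an illustrative sketch, not pyliblinear's own parser: the function names are hypothetical, and the handling of blank lines is an assumption that the description above does not cover.

```python
def parse_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value})."""
    fields = line.split()  # splits on runs of spaces and tabs
    label = float(fields[0])  # labels are parsable as doubles
    vector = {}
    for pair in fields[1:]:
        index, value = pair.split(":")
        vector[int(index)] = float(value)  # int index, double value
    return label, vector


def parse_text(text):
    """Parse a whole document; blank lines are skipped (an assumption)."""
    # str.splitlines() accepts \r, \n and \r\n (and a few rarer separators).
    return [parse_line(line) for line in text.splitlines() if line.strip()]


rows = parse_text("1 1:0.5 3:2\r\n2\t2:1.5\n")
print(rows)
```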
- save(self, file)
Save FeatureMatrix instance to a file.
Each line of the file contains the label and the accompanying sparse feature vector, separated by a space. The feature vector consists of index/value pairs. The index and the value are separated by a colon (:). The pairs are separated by a space again. The line ending is \n.
All numbers are represented as strings parsable either as ints (for indexes) or doubles (for values and labels).
Note that the exact I/O exceptions depend on the stream passed in.
- Parameters
file (file or str) – Either a writeable stream or a filename. If the passed object provides a write attribute/method, it is treated as a writeable stream, otherwise as a filename. If it is a stream, the stream is written to the current position and remains open when done. In case of a filename, the accompanying file is opened in text mode, truncated, written from the beginning and closed afterwards.
- Raises
IOError – Error writing the file
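The output layout can be sketched in pure Python as follows. This is not the library's writer: the helper name is hypothetical, and both the number formatting (Python's repr) and the ascending index order are assumptions. Only the separators and the \n line ending are taken from the description above.

```python
def format_line(label, vector):
    """Render (label, {index: value}) as 'label index:value ...' plus newline."""
    parts = [repr(label)]
    # Assumption: pairs are emitted in ascending index order.
    parts.extend(
        f"{index}:{value!r}" for index, value in sorted(vector.items())
    )
    return " ".join(parts) + "\n"


lines = [
    format_line(1, {3: 2.0, 1: 0.5}),
    format_line(2, {2: 1.5}),
]
print("".join(lines), end="")
```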
- width
The matrix width (number of features).
- Type
int
- class pyliblinear.Model
Classification model. Use its Model.load or Model.train methods to construct a new instance.
- bias
Bias used to create the model. None if no bias was applied or applicable.
- Type
double
- is_oneclass
Is model a oneclass SVM model?
- Type
bool
- is_probability
Is model a probability model?
- Type
bool
- is_regression
Is model a regression model?
- Type
bool
- load(cls, file, mmap=False)
Create Model instance from a file (previously created by Model.save())
Note that the exact I/O exceptions depend on the stream passed in.
- Parameters
file (file or str) – Either a readable stream or a filename. If the passed object provides a read attribute/method, it is treated as a readable file stream, otherwise as a filename. If it is a stream, the stream is read from the current position and remains open after hitting EOF. In case of a filename, the accompanying file is opened in text mode, read from the beginning and closed afterwards.
mmap (bool) – Load the model into a file-backed memory area? Default: False
- Returns
New model instance
- Return type
pyliblinear.Model
- Raises
IOError – Error reading the file
ValueError – Error parsing the file
- predict(self, matrix, label_only=True, probability=False)
Run the model on matrix and predict labels.
- Parameters
matrix (pyliblinear.FeatureMatrix or iterable) – Either a feature matrix or a simple iterator over feature vectors to inspect and predict upon.
label_only (bool) – Return the label only? If false, the decision dict for all labels is returned as well.
probability (bool) – Use probability estimates?
- Returns
Result iterator, either over labels or over (label, decision dict) tuples.
- Return type
iterable
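A short sketch of both result shapes, using a model trained on a tiny made-up data set via Model.train. Dict-shaped feature vectors are an assumption, the data and names are hypothetical, and no particular predictions are implied. The guarded import only keeps the snippet runnable without pyliblinear.

```python
try:
    from pyliblinear import FeatureMatrix, Model
except ImportError:  # lets the sketch run without the package installed
    FeatureMatrix = Model = None

# Hypothetical training rows: (label, {feature index: value})
training = [
    (1, {1: 1.0, 2: 0.1}),
    (1, {1: 0.9, 2: 0.2}),
    (2, {1: 0.1, 2: 1.0}),
    (2, {1: 0.2, 2: 0.8}),
]
# Plain iterable of feature vectors to predict upon
unseen = [{1: 0.95, 2: 0.05}, {1: 0.05, 2: 0.9}]

if Model is not None:
    model = Model.train(FeatureMatrix(training))

    # Default: the iterator yields labels only.
    for label in model.predict(unseen):
        print(label)

    # label_only=False: the iterator yields (label, decision dict) tuples.
    for label, decisions in model.predict(unseen, label_only=False):
        print(label, decisions)
```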
- rho
Rho value of the model. None if not applicable.
- Type
double
- save(self, file)
Save Model instance to a file.
After some basic information about solver type, dimensions and labels, the model matrix is stored as a sequence of doubles per line. The matrix is transposed, so the height is the number of features (including the bias feature) and the width is the number of classes.
All numbers are represented as strings parsable either as ints (for dimensions and labels) or doubles (other values).
Note that the exact I/O exceptions depend on the stream passed in.
- Parameters
file (file or str) – Either a writeable stream or a filename. If the passed object provides a write attribute/method, it is treated as a writeable stream, otherwise as a filename. If it is a stream, the stream is written to the current position and remains open when done. In case of a filename, the accompanying file is opened in text mode, truncated, written from the beginning and closed afterwards.
- Raises
IOError – Error writing the file
- solver_type
Solver type used to create the model
- Type
str
- train(cls, matrix, solver=None, bias=None)
Create model instance from a training run
- Parameters
matrix (pyliblinear.FeatureMatrix) – Feature matrix to use for training
solver (pyliblinear.Solver) – Solver instance. If omitted or None, a default solver is picked.
bias (float) – Bias to the hyperplane. If omitted or None, no bias is applied. bias >= 0.
- Returns
New model instance
- Return type
pyliblinear.Model
- class pyliblinear.Solver
Solver container
- C
The configured C parameter.
- Type
float
- static __new__(cls, type=None, C=None, eps=None, p=None, nu=None, weights=None)
Construct new solver instance.
- Parameters
type (str or int) – The solver type. One of the keys or values of the SOLVER_TYPES dict. If omitted or None, the default solver type is applied (L2R_L2LOSS_SVC_DUAL == 1).
C (float) – Cost parameter. If omitted or None, it defaults to 1. C > 0.
eps (float) – Tolerance of termination criterion. If omitted or None, a default is applied, depending on the solver type. eps > 0.
p (float) – Epsilon in loss function of epsilon-SVR. If omitted or None, it defaults to 0.1. p >= 0.
nu (float) – Approximates the fraction of data as outliers (only for ONECLASS_SVM solver). If omitted or None, it defaults to 0.5.
weights (mapping) – Iterator over label weights. This is either a dict, mapping labels to weights ({int: float, ...}), or an iterable of 2-tuples doing the same ([(int, float), ...]). If omitted or None, no weight is applied.
- Returns
New Solver instance
- Return type
pyliblinear.Solver
- Raises
ValueError – Some invalid parameter
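A minimal sketch of configuring a solver with the parameters listed above. The concrete values are arbitrary examples rather than recommendations, and the guarded import only keeps the snippet runnable without pyliblinear.

```python
try:
    from pyliblinear import Solver
except ImportError:  # lets the sketch run without the package installed
    Solver = None

# Arbitrary example settings; the labels 1 and 2 are hypothetical.
settings = {
    "type": "L2R_L2LOSS_SVC_DUAL",  # solver name as documented above
    "C": 0.5,                       # must satisfy C > 0
    "weights": {1: 2.0, 2: 1.0},    # label -> weight
}

if Solver is not None:
    solver = Solver(**settings)
    print(solver.type, solver.C, solver.weights())
```

The resulting instance can then be passed as the solver argument of Model.train.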
- eps
The configured eps parameter.
- Type
float
- nu
The configured nu parameter.
- Type
float
- p
The configured p parameter.
- Type
float
- type
The configured solver type.
- Type
str
- weights(self)
Return the configured weights as a dict (label -> weight).
- Returns
The weights (may be empty)
- Return type
dict