vbvarsel
Submodules
Attributes
Classes
Class representing the hyperparameters for the simulation. |
|
Class representing the simulation parameters. |
|
Functions
|
The main entry point to the package. |
Package Contents
- vbvarsel.main(hyperparameters: vbvarsel.global_parameters.Hyperparameters, simulation_parameters: vbvarsel.global_parameters.SimulationParameters = SimulationParameters(), Ctrick: bool = True, user_data: str | os.PathLike = None, user_labels: str | list[str] = None, cols_to_skip: list[str] = None, annealing_type: str = 'fixed', save_output: bool = False) _Results [source]
The main entry point to the package.
Params
- hyperparameters: Hyperparameters (Required)
An object of hyperparamters to apply to the simulation.
- simulation_parameters: SimulationParameters (Optional) (Default: SimulationParameters())
An object of simulation paramaters to apply to the simulation. Note: This is a required parameter if a user does not supply their own data.
- Ctrick: bool (Optional) (Default: True)
Flag to determine whether or not to apply replica trick to the simulation
- user_data: str or os.PathLike (Optional) (Default: None)
A location of a csv document for data a user whishes to test.
- user_labels: str | list[str] (Optional) (Default: None)
A string or list of strings to identify labels. A string value will try to extract a column of the same name from the supplied data.
- cols_to_skip: list[str] (Optional) (Default: None)
An optional list of columns to drop from the dataframe. This should be used to remove any non-numeric data from the dataframe. If a column shares the same name as a label column, the labels will be extracted before the column is dropped.
Hint: an unnamed column can be passed by using “Unnamed: [index]”, eg “Unnamed: 0” to drop a blank name first column.
- annealing_type: str (Optional) (Default: “fixed”)
Optional type of annealing to apply to the simulation, can be one of “geometric”, “harmonic” or “fixed”, the latter of which does not apply any annealing.
- save_output: bool (Optional) (Default: False)
Optional flag for users to save their output to a csv file. Data is saved in the current working directory with the file naming format “results-timestamp.csv”.
Returns
- results: dataclass
An object of results stored in a series of arrays from the clustering algorithm. Some arrays may be populated by nan values. This is the case if a user supplies their own data but does not have corresponding labels. Additionally, some fields are only captured during entirely simulated runs, as such will be nan-ed if a user provides their own dataset.
- class vbvarsel.SimulationParameters[source]
Class representing the simulation parameters.
These parameters are the “settings” for the simulation experiment. For more information regarding the simulation, see [INSERT PAPER HERE]. Mixture proportions must be numbers between 0 and 1. The n_relevants array must be all numbers that are less than the n_variables value, as it is not possible to have a higher number of relevant variables than total variables.