skvai.tasks package
Submodules
skvai.tasks.classification module
skvai.tasks.clustering module
Clustering task module for skvai.
- skvai.tasks.clustering.cluster(data: CSVData, model: str = 'KMeans', n_clusters: int = 3, output: list = ['labels'], save_path: str = 'clusterer.pkl', labels_csv: str = 'labels.csv', random_state: int = 42)
Fit and apply a clustering model on CSVData.
- Parameters:
data (CSVData) – Loaded dataset exposing the attributes X and df.
model (str) – Clustering algorithm to use: ‘KMeans’ or ‘DBSCAN’.
n_clusters (int) – Number of clusters for KMeans.
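output (list) – Which outputs to produce: any of ‘metrics’, ‘plot’, ‘csv’, ‘save’. Note that the default in the signature, ['labels'], is not one of these listed options.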
save_path (str) – Path to save the clusterer (.pkl).
labels_csv (str) – Path to save full-data labels (.csv).
random_state (int) – Random seed for reproducibility.
- Returns:
Dictionary with the keys ‘labels’ and ‘model’.
- Return type:
dict
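As a rough illustration of consuming the returned dictionary, the sketch below summarizes cluster sizes from a ‘labels’ entry. The result dict here is a hand-written stand-in, not real output from cluster(), and the helper name summarize_labels is hypothetical. Treating -1 as a noise label is an assumption borrowed from scikit-learn's DBSCAN convention; this page does not say how skvai encodes noise.

```python
from collections import Counter

# Hypothetical stand-in for the dict returned by cluster(); real labels
# would come from the fitted model, and 'model' would hold the estimator.
result = {"labels": [0, 1, 1, 2, 0, -1, 1], "model": None}


def summarize_labels(labels):
    """Return ({cluster_id: size}, noise_count) for a sequence of labels."""
    counts = Counter(int(label) for label in labels)
    # Assumption: -1 marks noise points, as in scikit-learn's DBSCAN.
    noise = counts.pop(-1, 0)
    return dict(sorted(counts.items())), noise


sizes, noise = summarize_labels(result["labels"])
print(sizes)  # {0: 2, 1: 3, 2: 1}
print(noise)  # 1
```

With ‘KMeans’ every row is assigned to a cluster, so the noise count would be expected to stay at zero.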
skvai.tasks.regression module
- skvai.tasks.regression.regress(data: CSVData, model: str = 'LinearRegression', output: list = ['metrics'], save_path: str = 'regressor.pkl', prediction_csv: str = 'predictions.csv', test_size: float = 0.2, random_state: int = 42)
Train and evaluate a regression model on CSVData.
- Parameters:
data (CSVData) – Loaded dataset exposing the attributes X, y, and df.
model (str) – Regression model to use: ‘LinearRegression’ or ‘RandomForestRegressor’.
output (list) – Which outputs to produce: any of ‘metrics’, ‘plot’, ‘csv’, ‘save’.
save_path (str) – Path to save the trained model (.pkl).
prediction_csv (str) – Path to save the predictions (.csv).
test_size (float) – Fraction of the data held out for testing.
random_state (int) – Random seed for reproducibility.
- Returns:
Dictionary with the keys ‘mse’, ‘r2’, ‘predictions’, and ‘model’.
- Return type:
dict
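The ‘mse’ and ‘r2’ entries are presumably the usual mean squared error and coefficient of determination, evaluated on the held-out test split. The sketch below computes both in plain Python, independently of skvai; the function names and sample values are made up for illustration and say nothing about how regress() is implemented internally.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined when all targets are equal (SS_tot would be zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, standing in for a test split and its predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(mse)          # 0.125
print(f"{r2:.3f}")  # 0.975
```

A lower ‘mse’ and an ‘r2’ closer to 1 both indicate a better fit on the held-out rows.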
Module contents
- class skvai.tasks.Task
Bases: object
- load_data(path)
- set_target(col_name)
Call this before load_data() to override the default target column.
- train_and_output(format='graph')
- skvai.tasks.cluster(data: CSVData, model: str = 'KMeans', n_clusters: int = 3, output: list = ['labels'], save_path: str = 'clusterer.pkl', labels_csv: str = 'labels.csv', random_state: int = 42)
Fit and apply a clustering model on CSVData.
- Parameters:
data (CSVData) – Loaded dataset exposing the attributes X and df.
model (str) – Clustering algorithm to use: ‘KMeans’ or ‘DBSCAN’.
n_clusters (int) – Number of clusters for KMeans.
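output (list) – Which outputs to produce: any of ‘metrics’, ‘plot’, ‘csv’, ‘save’. Note that the default in the signature, ['labels'], is not one of these listed options.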
save_path (str) – Path to save the clusterer (.pkl).
labels_csv (str) – Path to save full-data labels (.csv).
random_state (int) – Random seed for reproducibility.
- Returns:
Dictionary with the keys ‘labels’ and ‘model’.
- Return type:
dict
- skvai.tasks.regress(data: CSVData, model: str = 'LinearRegression', output: list = ['metrics'], save_path: str = 'regressor.pkl', prediction_csv: str = 'predictions.csv', test_size: float = 0.2, random_state: int = 42)
Train and evaluate a regression model on CSVData.
- Parameters:
data (CSVData) – Loaded dataset exposing the attributes X, y, and df.
model (str) – Regression model to use: ‘LinearRegression’ or ‘RandomForestRegressor’.
output (list) – Which outputs to produce: any of ‘metrics’, ‘plot’, ‘csv’, ‘save’.
save_path (str) – Path to save the trained model (.pkl).
prediction_csv (str) – Path to save the predictions (.csv).
test_size (float) – Fraction of the data held out for testing.
random_state (int) – Random seed for reproducibility.
- Returns:
Dictionary with the keys ‘mse’, ‘r2’, ‘predictions’, and ‘model’.
- Return type:
dict