Report generated on {{ time_signature }}
Below is an overview of the dataset used in this evaluation:
Note: The feature summary covers only the first 10 features.
{% endif %}Below you will see a list of preprocessing steps performed on the dataset before any models were applied. Preprocessing steps are grouped by the task they are responsible for. The selection of steps can be configured in detail through parameters passed to the Mamut classifier, and some steps are selected dynamically based on dataset characteristics (e.g. PowerTransformer for skewed features).
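As an illustration of such dynamic selection, the following sketch shows how a power transform might be chosen for skewed numeric features. This is a minimal example of the general idea, not Mamut's internal code; the skewness threshold of 1.0 and the function name are assumptions.

```python
# Illustrative sketch of skew-based transformer selection; not Mamut's
# actual implementation. The 1.0 skewness threshold is an assumption.
import pandas as pd
from sklearn.preprocessing import PowerTransformer, StandardScaler

def select_numeric_transformer(X: pd.DataFrame, threshold: float = 1.0):
    # Apply a Yeo-Johnson power transform only when at least one
    # feature is strongly skewed; otherwise plain standardization.
    if X.skew().abs().max() > threshold:
        return PowerTransformer(method="yeo-johnson", standardize=True)
    return StandardScaler()
```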
Pipeline:
Below you will see the results of the Principal Component Analysis (PCA) performed on the dataset. PCA is a dimensionality reduction technique used to reduce the number of features in the dataset. The results below show the contribution (loading) of each feature to the principal components.
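For reference, these loadings correspond to what scikit-learn exposes as the components_ attribute. A minimal sketch, assuming X is the preprocessed feature matrix as a pandas DataFrame and two components are kept:

```python
# Minimal sketch of extracting PCA loadings with scikit-learn.
# `X` (a pandas DataFrame) and the component count are assumptions.
import pandas as pd
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
pca.fit(X)

# Each row of components_ is a principal component; each column holds
# the contribution (loading) of one original feature to that component.
loadings = pd.DataFrame(pca.components_, columns=X.columns, index=["PC1", "PC2"])
print(loadings)
print(pca.explained_variance_ratio_)  # share of variance per component
```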
Below you will see the results of the model evaluation. The list of models tested during this training session includes:
Each model has been tuned for optimal performance using hyperparameter tuning with Optuna.
The optimizer used for tuning was: {{ optimizer }}. Optimization was performed with
respect to the metric: {{ metric }} for {{ n_trials }} iterations. You can access any model's
hyperparameters by retrieving the models from the mamut.raw_fitted_models_ field and calling
.get_params() on the model of interest.
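For example, the tuned hyperparameters of one model can be inspected as follows. This assumes raw_fitted_models_ is a mapping from model names to fitted scikit-learn estimators; the "XGBClassifier" key is illustrative, not guaranteed.

```python
# Hedged sketch: inspect tuned hyperparameters after fitting.
# `mamut` is an already-fitted Mamut classifier; we assume
# raw_fitted_models_ maps model names to fitted estimators.
fitted = mamut.raw_fitted_models_
print(list(fitted))  # names of the models trained in this session

# "XGBClassifier" is an illustrative key, not a guaranteed one.
params = fitted["XGBClassifier"].get_params()
for name, value in sorted(params.items()):
    print(f"{name} = {value}")
```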
After tuning, all models were evaluated on the test set that was split from the original dataset.
The best model was {{ best_model }}.
Below you will see the evaluation of the ensemble model. The ensemble model was created by combining individual models using the {{ ensemble_method }} method. {% if ensemble_method == 'Stacking' %} The meta-learner used was RandomForestClassifier. {% endif %} The models for the ensemble were selected using a greedy approach with respect to the metric: {{ metric }}, and the best ensemble found this way was kept.
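Greedy selection here follows the familiar pattern of repeatedly adding whichever candidate most improves the ensemble's validation score. The sketch below illustrates that general idea (Caruana-style selection), not Mamut's exact implementation; predictions, y_val, and score are assumed inputs.

```python
# Rough sketch of greedy ensemble selection: start empty and repeatedly
# add the model whose inclusion most improves the validation score of
# the averaged ensemble. Not Mamut's exact implementation.
# `predictions` maps model names to validation predictions (e.g. class
# probabilities); `score(y_val, avg)` is an assumed metric helper.
import numpy as np

def greedy_select(predictions: dict, y_val, score, max_size: int = 5):
    selected, best_score = [], -np.inf
    for _ in range(max_size):
        best_name = None
        for name in predictions:
            avg = np.mean([predictions[m] for m in selected + [name]], axis=0)
            s = score(y_val, avg)
            if s > best_score:
                best_score, best_name = s, name
        if best_name is None:  # no candidate improves the score
            break
        selected.append(best_name)
    return selected, best_score
```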
Ensemble Stacking Model:
The results of the greedy ensemble created during the experiment, measured on the test set, are available in the table below.
{{ ensemble_summary | safe }}Below you will see the feature importances in the dataset. Feature importance is calculated using the {{ feature_importance_method }} method, which depends on the model type: for example, tree-based models use Gini importance, while linear models use their coefficients. Feature importance is computed on the training set and is used to identify the features most relevant to the model.
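How the importances are read off an estimator differs between those two cases; a short sketch, assuming model is a fitted scikit-learn estimator and feature_names lists its input columns:

```python
# Sketch of model-dependent importance extraction, mirroring the two
# cases above. `model` and `feature_names` are assumed available.
import numpy as np

if hasattr(model, "feature_importances_"):
    # Tree-based models: impurity (Gini) importance.
    importances = model.feature_importances_
elif hasattr(model, "coef_"):
    # Linear models: coefficient magnitude, averaged over classes.
    importances = np.abs(np.atleast_2d(model.coef_)).mean(axis=0)
else:
    raise ValueError("Model exposes no built-in importance measure")

for name, imp in sorted(zip(feature_names, importances), key=lambda t: -t[1]):
    print(f"{name}: {imp:.4f}")
```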
Below you will see the SHAP values for the best model. SHAP values provide a way to interpret the impact of each feature on the model's output. Each point on the summary plot represents one feature value for one observation. The position on the y-axis is determined by the feature's importance and the position on the x-axis by the Shapley value. The color represents the value of the feature, from low to high. Overlapping points are jittered along the y-axis, giving a sense of the distribution of Shapley values per feature. A sketch of how such a plot can be reproduced follows the list below.
1. X-Axis Spread: A wider spread signifies varying importance levels of that feature across the dataset.
2. Relative Impact: Features with points shifted to the right (higher SHAP values) indicate more substantial positive contributions, while those shifted to the left (lower SHAP values) represent negative contributions.
3. Overall Importance: Features on the Y-Axis are ordered by importance, with the most important features at the top.
4. Comparative Importance: Features with more spread-out or consistently shifted points might hold higher significance in the model’s predictions.
{% if not binary %}5. Multiclass Classification: SHAP values can only be displayed for each class separately, so we display them for the class with label 0.
{% endif %}
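As referenced above, a plot like this can be reproduced with the shap library. A minimal sketch, assuming model is the fitted best model (tree-based, hence TreeExplainer) and X_test holds the test features; shap.Explainer is a model-agnostic fallback:

```python
# Minimal sketch of a SHAP summary plot; `model` and `X_test` are
# assumed available. TreeExplainer presumes a tree-based model.
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# For multiclass models, older shap versions return one array per
# class; as noted above, we plot the class with label 0.
if isinstance(shap_values, list):
    shap.summary_plot(shap_values[0], X_test)
else:
    shap.summary_plot(shap_values, X_test)
```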