Cluster Overlap Control
repliclust.overlap.centers
This module implements a ClusterCenterSampler that places cluster centers so as to achieve the desired degree of pairwise overlap between clusters. It does so by minimizing an objective function.
- class repliclust.overlap.centers.ConstrainedOverlapCenters(max_overlap=0.1, min_overlap=0.09, packing=0.1, **optimization_args)
Bases: ClusterCenterSampler
This class optimizes the locations of cluster centers so that pairs of clusters achieve the desired degree of overlap.
- Parameters
max_overlap (float between 0 and 1) – The maximum allowed overlap between any two clusters, measured as a fraction of cluster mass.
min_overlap (float) – The minimum overlap each cluster must have with at least one other cluster, which prevents it from being isolated. The overlap is measured as a fraction of cluster mass.
packing (float) – Sets the ratio of total cluster volume to the sampling volume. Used when choosing random cluster centers for initializing the optimization.
learning_rate (float) – The rate at which cluster centers are adjusted during the optimization. Lower this number if numerical instabilities appear.
max_epoch (int) – The maximum number of optimization epochs to run. Increasing this number may lengthen the running time of the optimization.
tol (float) – Numerical tolerance for achieving the desired overlap between pairs of clusters.
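To make the meaning of packing concrete, here is a minimal sketch of how a packing ratio could translate into a sampling region for the initial random centers. It assumes spherical clusters and a ball-shaped sampling region; the helper names ball_volume, sampling_radius, and random_centers are hypothetical and not part of repliclust, whose actual initialization may differ.

```python
import math
import random


def ball_volume(radius, dim):
    """Volume of a dim-dimensional ball with the given radius."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


def sampling_radius(cluster_radii, dim, packing):
    """Radius of a ball whose volume is (total cluster volume) / packing.

    Assumption: packing = total cluster volume / sampling volume,
    with each cluster treated as a ball of the given radius.
    """
    total_cluster_volume = sum(ball_volume(r, dim) for r in cluster_radii)
    sampling_volume = total_cluster_volume / packing
    return (sampling_volume / ball_volume(1.0, dim)) ** (1.0 / dim)


def random_centers(cluster_radii, dim, packing, seed=0):
    """Draw one center per cluster, uniformly inside the sampling ball."""
    rng = random.Random(seed)
    big_r = sampling_radius(cluster_radii, dim, packing)
    centers = []
    for _ in cluster_radii:
        # Uniform point in a ball: random direction, radius ~ R * U^(1/dim).
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in direction))
        scale = big_r * rng.random() ** (1.0 / dim) / norm
        centers.append([x * scale for x in direction])
    return centers
```

Under these assumptions, a smaller packing value spreads the initial centers over a larger region, so the optimization starts from clusters that are farther apart.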
- sample_cluster_centers(archetype, print_progress=False, quiet=False)
Sample cluster centers at random and iteratively adjust them until the desired degrees of overlap between clusters are satisfied.
- Parameters
archetype (Archetype) – Archetype conveying the desired number of clusters and other attributes.
print_progress (bool) – If true, print step-by-step progress updates during the optimization. Even if print_progress=False, will still print a summary of the optimization status unless quiet=True.
quiet (bool) – If true, suppress all print output.
- Returns
centers – The optimized cluster centers.
- Return type
ndarray
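The iterative adjustment described for sample_cluster_centers can be illustrated with a simplified stand-in. The sketch below is not the repliclust objective or optimizer: it assumes spherical Gaussian clusters with a common standard deviation sigma, measures overlap as the combined mass of two clusters lying beyond their midpoint, and uses a plain push/pull heuristic instead of minimizing an objective function. The functions overlap and adjust_centers are hypothetical; only the parameter names max_overlap, min_overlap, learning_rate, max_epoch, and tol mirror the documentation above.

```python
import math


def overlap(c1, c2, sigma=1.0):
    """Assumed overlap measure: combined mass fraction of two equal-sigma
    spherical Gaussians that lies beyond the midpoint between them."""
    d = math.dist(c1, c2)
    return math.erfc(d / (2.0 * math.sqrt(2.0) * sigma))


def _shift(move, src, dst, step):
    """Add to `move` a step of the given length in the direction src -> dst."""
    d = math.dist(src, dst)
    if d == 0.0:
        return  # coincident centers: no direction defined, skip
    for a in range(len(move)):
        move[a] += step * (dst[a] - src[a]) / d


def adjust_centers(centers, max_overlap=0.1, min_overlap=0.09,
                   learning_rate=1.0, max_epoch=500, tol=1e-3, sigma=1.0):
    """Push apart pairs that overlap too much and pull isolated clusters
    toward their most-overlapping neighbor. Returns (centers, epochs_used)."""
    centers = [list(c) for c in centers]
    k, dim = len(centers), len(centers[0])
    for epoch in range(max_epoch):
        moves = [[0.0] * dim for _ in range(k)]
        worst = 0.0  # largest constraint violation seen this epoch
        for i in range(k):
            best_j, best_ov = None, -1.0
            for j in range(k):
                if j == i:
                    continue
                ov = overlap(centers[i], centers[j], sigma)
                if ov > best_ov:
                    best_j, best_ov = j, ov
                excess = ov - max_overlap
                if excess > 0:  # too much overlap: move i away from j
                    worst = max(worst, excess)
                    _shift(moves[i], centers[j], centers[i],
                           learning_rate * excess)
            deficit = min_overlap - best_ov
            if deficit > 0:  # isolated: move i toward its closest neighbor
                worst = max(worst, deficit)
                _shift(moves[i], centers[i], centers[best_j],
                       learning_rate * deficit)
        if worst <= tol:
            return centers, epoch
        for c, m in zip(centers, moves):
            for a in range(dim):
                c[a] += m[a]
    return centers, max_epoch
```

In this sketch, tol bounds the largest remaining violation of either overlap constraint, and max_epoch caps the number of passes if that bound is never reached. With many clusters, the heuristic is not guaranteed to converge.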