Creating dendrogram objects
idendrogram.idendrogram
__init__(link_factory=lambda x: ClusterLink(**x), node_factory=lambda x: ClusterNode(**x), axis_label_factory=lambda x: AxisLabel(**x))
Initializes the idendrogram object, optionally with different formatting defaults.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
link_factory | Callable[[Dict], ClusterLink] | idendrogram.ClusterLink factory that can be used to override link formatting defaults. | lambda x: ClusterLink(**x) |
node_factory | Callable[[Dict], ClusterNode] | idendrogram.ClusterNode factory that can be used to override node formatting defaults. | lambda x: ClusterNode(**x) |
axis_label_factory | Callable[[Dict], AxisLabel] | idendrogram.AxisLabel factory that can be used to override axis label formatting defaults. | lambda x: AxisLabel(**x) |
Example
Customizing the Dendrogram to show smaller nodes and dashed link lines:
# imports needed by the example
from dataclasses import dataclass, field
from typing import List

import idendrogram
from idendrogram import ClusterLink, ClusterNode

# define a subclass of `ClusterNode` and redefine radius and text label sizes
@dataclass
class SmallClusterNode(ClusterNode):
    radius: float = 3
    labelsize: float = 3

# define a subclass of `ClusterLink` and redefine stroke dash pattern
@dataclass
class DashedLink(ClusterLink):
    strokedash: List = field(default_factory=lambda: [1, 5, 5, 1])

# instantiate the idendrogram object with the factories for links and nodes
dd = idendrogram.idendrogram(
    link_factory=lambda x: DashedLink(**x),
    node_factory=lambda x: SmallClusterNode(**x),
)

# proceed as usual; `model` is the linkage matrix from your clustering workflow
cdata = idendrogram.ClusteringData(
    linkage_matrix=model,
    cluster_assignments=cluster_assignments,
    threshold=threshold,
)
dd.set_cluster_info(cdata)
dendrogram = dd.create_dendrogram().to_altair()
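The factory mechanism itself can be illustrated without the library. The following is a minimal sketch, assuming a hypothetical stand-in class (`StandInLink`) and a hypothetical helper (`build_links`); neither is part of idendrogram, and the real `ClusterLink` has its own fields. It only shows why overriding a dataclass default in a subclass changes the formatting of every object built through the factory.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-in for idendrogram.ClusterLink (assumption: the real
# class has different fields; only the pattern is illustrated here).
@dataclass
class StandInLink:
    x: List[float]
    y: List[float]
    fillcolor: str = "#1f77b4"
    strokewidth: float = 1.0


@dataclass
class ThickLink(StandInLink):
    # Overriding a field default affects every link built via the factory.
    strokewidth: float = 3.0


def build_links(
    raw_links: List[Dict], link_factory: Callable[[Dict], StandInLink]
) -> List[StandInLink]:
    """Hypothetical helper: turn raw dictionaries into link objects."""
    return [link_factory(raw) for raw in raw_links]


raw = [{"x": [5, 5, 15, 15], "y": [0, 1, 1, 0]}]
default_links = build_links(raw, lambda d: StandInLink(**d))
thick_links = build_links(raw, lambda d: ThickLink(**d))
print(default_links[0].strokewidth, thick_links[0].strokewidth)  # 1.0 3.0
```

Because the factory receives a plain dictionary, it can also modify or add keys before constructing the object, which is more flexible than subclassing alone.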
convert_scipy_dendrogram(R, compute_nodes=True, node_label_func=callbacks.cluster_id_if_cluster, node_hover_func=callbacks.default_hover, link_color_func=callbacks.link_painter())
Converts a dictionary representing a dendrogram generated by SciPy to an idendrogram.Dendrogram object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
R | ScipyDendrogram | Dictionary as returned by scipy.cluster.hierarchy.dendrogram. | required |
compute_nodes | bool | Whether to compute nodes (requires idendrogram.ClusteringData to be set via idendrogram.idendrogram.set_cluster_info and can be computationally expensive on large datasets). | True |
node_label_func | Callable[[], str] | Callback function to generate dendrogram node labels. See idendrogram.idendrogram.create_dendrogram for usage details. | callbacks.cluster_id_if_cluster |
node_hover_func | Callable[[], Union[Dict, str]] | Callback function to generate dendrogram hover text labels. See idendrogram.idendrogram.create_dendrogram for usage details. | callbacks.default_hover |
Returns:
Name | Type | Description |
---|---|---|
Dendrogram | Dendrogram | idendrogram.Dendrogram object |
Example
# your clustering workflow ("..." stands for your own arguments)
import scipy.cluster.hierarchy
import idendrogram

Z = scipy.cluster.hierarchy.linkage(...)
threshold = 42
# fcluster takes the threshold as `t`
cluster_assignments = scipy.cluster.hierarchy.fcluster(Z, t=threshold, ...)
R = scipy.cluster.hierarchy.dendrogram(Z, no_plot=True, get_leaves=True, ...)

# render SciPy's dendrogram in plotly without any additional modifications
dd = idendrogram.idendrogram()
dendrogram = dd.convert_scipy_dendrogram(R, compute_nodes=False)
dendrogram.to_plotly()
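To make the input of the conversion more concrete, the sketch below uses a hand-written dictionary shaped like SciPy's dendrogram output for three leaves (the values are illustrative, not produced by SciPy here). Each entry of `icoord` and `dcoord` holds the four x and y coordinates of one U-shaped link. The helper `link_paths` is hypothetical and does not reflect how idendrogram performs the conversion internally.

```python
from typing import Dict, List, Tuple

# Hand-written dictionary mimicking the shape of the output of
# scipy.cluster.hierarchy.dendrogram(Z, no_plot=True) for three leaves.
R: Dict[str, list] = {
    "icoord": [[5.0, 5.0, 15.0, 15.0], [10.0, 10.0, 25.0, 25.0]],
    "dcoord": [[0.0, 1.0, 1.0, 0.0], [1.0, 2.0, 2.0, 0.0]],
    "ivl": ["0", "1", "2"],
    "leaves": [0, 1, 2],
    "color_list": ["C1", "C0"],
}


def link_paths(scipy_dendrogram: Dict[str, list]) -> List[List[Tuple[float, float]]]:
    """Hypothetical helper: pair x and y coordinates into one path per link."""
    return [
        list(zip(xs, ys))
        for xs, ys in zip(scipy_dendrogram["icoord"], scipy_dendrogram["dcoord"])
    ]


paths = link_paths(R)
# The top of each U is the distance at which the two children were merged.
merge_heights = [max(ys) for ys in R["dcoord"]]
print(paths[0])        # [(5.0, 0.0), (5.0, 1.0), (15.0, 1.0), (15.0, 0.0)]
print(merge_heights)   # [1.0, 2.0]
```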
create_dendrogram(truncate_mode='level', p=4, sort_criteria='distance', sort_descending=False, link_color_func=callbacks.link_painter(), leaf_label_func=callbacks.counts, compute_nodes=True, node_label_func=callbacks.cluster_id_if_cluster, node_hover_func=callbacks.default_hover)
Creates an idendrogram dendrogram object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
truncate_mode | 'level', 'lastp' or None | Truncation mode used to condense the dendrogram. See scipy's dendrogram() for details. | 'level' |
p | int | truncate_mode parameter. See scipy's dendrogram() for details. | 4 |
sort_criteria | 'count' or 'distance' | Node order criteria. | 'distance' |
sort_descending | bool | Accompanying parameter to sort_criteria to indicate whether sorting should be descending. | False |
link_color_func | Callable[[ClusteringData, int], str] | A callable function that determines colors of nodes and links. See below for details. | callbacks.link_painter() |
leaf_label_func | Callable[[ClusteringData, int], str] | A callable function that determines leaf node labels. See below for details. | callbacks.counts |
compute_nodes | bool | Whether nodes should be computed (can be computationally expensive on large datasets). | True |
node_label_func | Callable[[ClusteringData, int], str] | A callable function that determines node text labels. See below for details. | callbacks.cluster_id_if_cluster |
node_hover_func | Callable[[ClusteringData, int], Dict[str, str]] | A callable function that determines node hover text. See below for details. | callbacks.default_hover |
Returns:
Name | Type | Description |
---|---|---|
Dendrogram | Dendrogram | idendrogram.Dendrogram object |
Usage notes
For how-to examples, see the How-to Guide.
SciPy's dendrogram parameters
idendrogram uses SciPy to generate the initial dendrogram structure and passes a few parameters directly to scipy.cluster.hierarchy.dendrogram:
- truncate_mode and p are passed without modification.
- sort_criteria and sort_descending map to count_sort and distance_sort.
- link_color_func and leaf_label_func are passed on with an additional wrapper that enables access to the linkage matrix (see below for details).
To fully understand these parameters, it is easiest to consult SciPy's documentation directly.
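One plausible way the sorting parameters could translate to SciPy's arguments is sketched below. The function `to_scipy_sort_args` is hypothetical and is not taken from idendrogram's source; it only relies on the fact that SciPy's `count_sort` and `distance_sort` each accept `False`, `'ascending'` or `'descending'`, and that both cannot be enabled at once.

```python
from typing import Dict, Union

SortArg = Union[bool, str]


def to_scipy_sort_args(sort_criteria: str, sort_descending: bool) -> Dict[str, SortArg]:
    """Hypothetical mapping from idendrogram-style sort options to
    keyword arguments of scipy.cluster.hierarchy.dendrogram."""
    if sort_criteria not in ("count", "distance"):
        raise ValueError(f"unknown sort_criteria: {sort_criteria!r}")
    direction = "descending" if sort_descending else "ascending"
    # Only one of the two SciPy options is switched on at a time.
    return {
        "count_sort": direction if sort_criteria == "count" else False,
        "distance_sort": direction if sort_criteria == "distance" else False,
    }


print(to_scipy_sort_args("distance", False))
# {'count_sort': False, 'distance_sort': 'ascending'}
print(to_scipy_sort_args("count", True))
# {'count_sort': 'descending', 'distance_sort': False}
```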
Callback functions
idendrogram uses callbacks to allow customizing link/node colors (link_color_func), leaf axis labels (leaf_label_func), node labels (node_label_func) and tooltips (node_hover_func). All callback functions will be called with 2 parameters:
- an instance of idendrogram.ClusteringData object that provides access to linkage matrix and other clustering information.
- linkage ID (integer)
The return types should be:
- link_color_func should return the color for the link/node represented by the linkage ID.
- leaf_label_func should return the text label to be used for the axis label of the leaf node represented by the linkage ID.
- node_label_func should return the text label to be used for the node represented by the linkage ID.
- node_hover_func should return a dictionary of key:value pairs that will be displayed in a tooltip of the node represented by the linkage ID.
This setup allows extensive customization; examples are provided in the How-to Guide.
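As a rough illustration of the callback contract, the sketch below defines a leaf label callback and a hover callback that both take a clustering data object and a linkage ID. `StandInClusteringData` is a hypothetical stand-in whose attribute names simply mirror the constructor arguments shown in this document; the real idendrogram.ClusteringData may expose its data differently. The sketch relies on SciPy's convention that, for n original observations, IDs below n are leaves and ID n + i refers to row i of the linkage matrix.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Hypothetical stand-in for idendrogram.ClusteringData (assumption: the
# real class may provide accessor methods instead of plain attributes).
@dataclass
class StandInClusteringData:
    linkage_matrix: List[List[float]]  # rows: [left_id, right_id, distance, count]
    cluster_assignments: List[int]
    threshold: float


def merge_row(data: StandInClusteringData, linkage_id: int) -> Optional[List[float]]:
    """Return the linkage-matrix row for a merged cluster, or None for a leaf."""
    n_leaves = len(data.linkage_matrix) + 1
    if linkage_id < n_leaves:
        return None
    return data.linkage_matrix[linkage_id - n_leaves]


def my_leaf_label(data: StandInClusteringData, linkage_id: int) -> str:
    """Axis label: observation ID for leaves, item count for merged clusters."""
    row = merge_row(data, linkage_id)
    if row is None:
        return f"obs {linkage_id}"
    return f"{int(row[3])} items"


def my_hover(data: StandInClusteringData, linkage_id: int) -> Dict[str, str]:
    """Tooltip content as key:value pairs."""
    row = merge_row(data, linkage_id)
    if row is None:
        return {"id": str(linkage_id), "type": "leaf"}
    return {"id": str(linkage_id), "distance": f"{row[2]:.2f}", "items": str(int(row[3]))}


data = StandInClusteringData(
    linkage_matrix=[[0, 1, 1.0, 2], [2, 3, 2.0, 3]],
    cluster_assignments=[1, 1, 2],
    threshold=1.5,
)
print(my_leaf_label(data, 0))  # obs 0
print(my_leaf_label(data, 4))  # 3 items
print(my_hover(data, 3))       # {'id': '3', 'distance': '1.00', 'items': '2'}
```

With the real library, functions of this shape would be passed as the leaf_label_func and node_hover_func arguments of create_dendrogram, after adapting the data access to the actual ClusteringData interface.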
set_cluster_info(cluster_data)
Sets the clustering data (linkage matrix and other parameters) that are required for some of the dendrogram generation features.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cluster_data | ClusteringData | instance of idendrogram.ClusteringData | required |
Example
# your clustering workflow ("..." stands for your own arguments)
import scipy.cluster.hierarchy
import idendrogram

Z = scipy.cluster.hierarchy.linkage(...)
threshold = 42
# fcluster takes the threshold as `t`
cluster_assignments = scipy.cluster.hierarchy.fcluster(Z, t=threshold, ...)

# dendrogram creation
dd = idendrogram.idendrogram()
cdata = idendrogram.ClusteringData(
    linkage_matrix=Z,
    cluster_assignments=cluster_assignments,
    threshold=threshold,
)
dd.set_cluster_info(cdata)