Compression Algorithm Glossary
- Louvain Clustering
The Louvain community detection algorithm (Blondel et al. [1]) compresses large-scale gene networks by identifying densely connected subgroups of nodes. It is a heuristic method based on modularity optimization.
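Modularity, the quantity this family of methods optimizes, can be computed directly for any given partition. The following is a minimal sketch for an undirected, unweighted graph given as an edge list; the function name and the toy graph are illustrative assumptions, not part of the original pipeline.

```python
from collections import defaultdict

def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    community_of = {node: idx for idx, group in enumerate(communities) for node in group}
    internal = defaultdict(int)     # edges with both endpoints inside the community
    degree_sum = defaultdict(int)   # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
print(modularity(edges, [{0, 1, 2}, {3, 4, 5}]))   # the natural two-community split
print(modularity(edges, [{0, 1, 2, 3, 4, 5}]))     # everything in one community
```

A partition that keeps the two triangles apart scores higher than the trivial single-community partition, which is the signal a modularity optimizer follows.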
- Greedy Algorithm
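This method uses Clauset-Newman-Moore greedy modularity maximization (Clauset et al. [2]) to find a community partition with high modularity. Starting from singleton communities, it repeatedly merges the pair of communities that yields the largest increase in modularity and stops when no merge improves it. Collapsing each resulting community reduces the graph size while preserving its structure.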
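The greedy merging idea can be sketched in a few lines. This is a naive illustration, not the efficient data structure from the cited paper: it recomputes the modularity gain of every community pair on each pass and assumes an undirected, unweighted graph without self-loops and with sortable node ids. The function name is hypothetical.

```python
from itertools import combinations

def greedy_modularity(edges):
    """Naive agglomerative modularity maximisation (for illustration only)."""
    m = len(edges)
    nodes = sorted({n for e in edges for n in e})
    communities = {n: {n} for n in nodes}   # community id -> member set
    degree = {n: 0 for n in nodes}          # community id -> total degree
    between = {}                            # frozenset({a, b}) -> edges between a and b
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        key = frozenset((u, v))
        between[key] = between.get(key, 0) + 1

    while True:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(sorted(communities), 2):
            e_ab = between.get(frozenset((a, b)), 0)
            # Modularity change if communities a and b were merged.
            gain = e_ab / m - degree[a] * degree[b] / (2 * m * m)
            if gain > best_gain + 1e-12:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break                           # no merge increases modularity
        a, b = best_pair
        communities[a] |= communities.pop(b)
        degree[a] += degree.pop(b)
        # Re-point b's inter-community edge counts to the merged community a.
        for key in [k for k in between if b in k]:
            count = between.pop(key)
            other = next(iter(key - {b}))
            if other != a:
                merged = frozenset((a, other))
                between[merged] = between.get(merged, 0) + count
    return list(communities.values())

# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
print(greedy_modularity(toy_edges))
```

Each returned community would then be collapsed into a single node to obtain the compressed graph.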
- Label Propagation
Label Propagation (Traag and Šubelj [10]) compresses networks by propagating labels across the network to identify and merge similar node communities. The algorithm is probabilistic, so the communities found may vary between executions.
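A minimal sketch of asynchronous label propagation follows. It assumes the graph is given as a dictionary mapping each node to its set of neighbours and that node ids are sortable; the function name and parameters are illustrative. The random tie-break is what makes results differ between runs unless a seed is fixed.

```python
import random
from collections import Counter

def label_propagation(adjacency, seed=None, max_sweeps=100):
    """Asynchronous label propagation on {node: set_of_neighbours}."""
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}   # every node starts alone
    nodes = list(adjacency)
    for _ in range(max_sweeps):
        changed = False
        rng.shuffle(nodes)                        # random visiting order
        for node in nodes:
            if not adjacency[node]:
                continue                          # isolated nodes keep their label
            counts = Counter(labels[nb] for nb in adjacency[node])
            top = max(counts.values())
            best = sorted(lab for lab, c in counts.items() if c == top)
            if labels[node] not in best:
                labels[node] = rng.choice(best)   # random tie-break
                changed = True
        if not changed:
            break                                 # every label is locally dominant
    return labels

# Hypothetical toy graph: two 4-cliques joined by one bridge edge (3-4).
toy = {i: set() for i in range(8)}
for group in (range(0, 4), range(4, 8)):
    for u in group:
        toy[u] |= {v for v in group if v != u}
toy[3].add(4)
toy[4].add(3)
print(label_propagation(toy, seed=0))
```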
- Asynchronous Fluid Communities
The asynchronous fluid communities algorithm (Parés et al. [8]) is based on the simple idea of fluids interacting in an environment, expanding and pushing each other. Its initialization is random, so the communities found may vary between executions.
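The fluid metaphor can be made concrete as follows: each of k communities has a density of 1 divided by its size, and every node joins the community with the largest summed density over itself and its neighbours. The sketch below is a simplified reading of that rule under stated assumptions (connected graph, adjacency dictionary, sortable community indices); the function name and parameters are hypothetical.

```python
import random

def fluid_communities(adjacency, k, seed=None, max_sweeps=100):
    """Sketch of asynchronous fluid communities on a connected graph."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    community = {}                     # node -> community index
    size = [1] * k                     # community index -> member count
    for idx, node in enumerate(rng.sample(nodes, k)):
        community[node] = idx          # k random seed vertices
    for _ in range(max_sweeps):
        changed = False
        rng.shuffle(nodes)
        for node in nodes:
            # Sum each community's density over the node and its neighbours.
            weight = {}
            for other in [node, *adjacency[node]]:
                c = community.get(other)
                if c is not None:
                    weight[c] = weight.get(c, 0.0) + 1.0 / size[c]
            if not weight:
                continue               # no fluid has reached this node yet
            top = max(weight.values())
            best = sorted(c for c, w in weight.items() if abs(w - top) < 1e-9)
            current = community.get(node)
            if current not in best:
                if current is not None:
                    size[current] -= 1
                community[node] = rng.choice(best)
                size[community[node]] += 1
                changed = True
        if not changed:
            break
    groups = [set() for _ in range(k)]
    for node, c in community.items():
        groups[c].add(node)
    return groups

# Hypothetical toy graph: two 4-cliques joined by one bridge edge (3-4).
toy_graph = {i: set() for i in range(8)}
for group in (range(0, 4), range(4, 8)):
    for u in group:
        toy_graph[u] |= {v for v in group if v != u}
toy_graph[3].add(4)
toy_graph[4].add(3)
print(fluid_communities(toy_graph, k=2, seed=1))
```

A community that shrinks to a single node has density 1, which no competitor can exceed, so the number of communities stays at k throughout.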
- Spectral Clustering
Spectral clustering (von Luxburg [11]), applied to community detection by using the adjacency matrix as the affinity matrix, uses the eigenstructure of the graph Laplacian to partition the graph into clusters. This method is effective for identifying complex, non-convex cluster shapes, such as nested circles on a 2D plane. By specifying affinity='precomputed' and providing the adjacency matrix as input, it is possible to identify densely connected subgraphs (communities) within the network.
- Hierarchical Clustering
Agglomerative clustering, applied to community detection by using the adjacency matrix as the pairwise metric, iteratively merges the most similar communities. This method is effective for identifying hierarchical structures within networks, where nodes are progressively merged based on their pairwise distances.
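The progressive merging can be illustrated with a small average-linkage sketch. How the adjacency matrix is turned into a distance is not specified above, so the code assumes distance = 1 - adjacency (adjacent nodes are at distance 0, non-adjacent ones at distance 1). That conversion, the linkage choice, and the function name are all assumptions.

```python
from itertools import combinations

def agglomerative_communities(adjacency_matrix, n_clusters):
    """Average-linkage agglomerative clustering; distance = 1 - adjacency (assumed)."""
    clusters = [[i] for i in range(len(adjacency_matrix))]

    def average_distance(a, b):
        total = sum(1.0 - adjacency_matrix[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Pick the closest pair of clusters (first one found on ties) and merge it.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i].extend(clusters.pop(j))   # i < j, so index i stays valid
    return [sorted(c) for c in clusters]

# Hypothetical toy graph: two 4-cliques joined by one bridge edge (3-4).
n = 8
A = [[0] * n for _ in range(n)]
for group in (range(0, 4), range(4, 8)):
    for u, v in combinations(group, 2):
        A[u][v] = A[v][u] = 1
A[3][4] = A[4][3] = 1
print(agglomerative_communities(A, n_clusters=2))
```

Recording the sequence of merges instead of stopping at a fixed cluster count would give the full hierarchy.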
- Node2Vec
Node2Vec (Grover and Leskovec [5]), applied to community detection tasks, embeds nodes through biased random walks to capture complex relationships within the network. These embeddings can then be clustered to identify communities.
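The biased random walk at the core of this approach can be sketched as below for an unweighted graph. The return parameter p and in-out parameter q follow the usual node2vec convention; the function name and the adjacency-dictionary input are illustrative assumptions. The embedding step (training a skip-gram model on the walks) and the final clustering are not shown.

```python
import random

def node2vec_walk(adjacency, start, length, p=1.0, q=1.0, rng=None):
    """One second-order biased random walk in the style of node2vec."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = sorted(adjacency[current])
        if not neighbours:
            break                                  # dead end
        if len(walk) == 1:
            walk.append(rng.choice(neighbours))    # first step is uniform
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbours:
            if candidate == previous:
                weights.append(1.0 / p)            # return to the previous node
            elif candidate in adjacency[previous]:
                weights.append(1.0)                # stay close to the previous node
            else:
                weights.append(1.0 / q)            # move outward
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk

# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3.
toy_adjacency = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(node2vec_walk(toy_adjacency, start=0, length=10, p=0.5, q=2.0, rng=random.Random(0)))
```

With p = q = 1 every neighbour is equally likely, so the walk reduces to a uniform random walk.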
- DeepWalk
DeepWalk (Perozzi et al. [9]), applied to community detection tasks, learns node representations via truncated random walks. These representations are effective for large networks, preserving both local and global network structures.
- Clique Percolation Method (CPM)
Clique Percolation Method (CPM) (Palla et al. [7]) is a community detection algorithm designed to identify overlapping communities by finding k-cliques that share (k-1) nodes. This method is particularly useful for detecting functionally significant modules within networks.
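The percolation rule can be sketched with brute-force clique enumeration, which is only practical for small graphs. The function name and the adjacency-dictionary input are illustrative assumptions.

```python
from itertools import combinations

def k_clique_communities(adjacency, k):
    """Clique percolation: union k-cliques that share k-1 nodes."""
    nodes = sorted(adjacency)
    # Enumerate every k-clique by checking all k-node subsets.
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(v in adjacency[u] for u, v in combinations(c, 2))
    ]
    parent = list(range(len(cliques)))   # union-find over cliques

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)    # adjacent cliques belong together

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return list(communities.values())

# Hypothetical toy graph: two 4-cliques that share only node 3.
overlap_graph = {n: set() for n in range(7)}
for group in ([0, 1, 2, 3], [3, 4, 5, 6]):
    for u, v in combinations(group, 2):
        overlap_graph[u].add(v)
        overlap_graph[v].add(u)
print(k_clique_communities(overlap_graph, k=3))
```

In this example node 3 belongs to both communities, which is the overlapping behaviour that partition-based methods cannot express.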
- Non-negative Matrix Factorization (NMF)
Non-negative Matrix Factorization (NMF) (Lin [6]) is a dimensionality reduction technique that factorizes the adjacency matrix of a graph to detect communities. This method effectively preserves community structure in a reduced space, making it useful for identifying clusters within networks.
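As an illustration, the adjacency matrix A can be approximated by a product W H of two non-negative factors, after which each node is assigned to the column of W where it has the largest weight. The sketch below uses the classic multiplicative update rules, which is an assumption: the text above does not say which solver is used. All function names are hypothetical.

```python
import random

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frobenius_error(A, W, H):
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5

def nmf(A, rank, iterations=200, seed=0, eps=1e-9):
    """Factorise A (n x m) into non-negative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    W = [[rng.random() for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T A) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[h * x / (y + eps) for h, x, y in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[w * x / (y + eps) for w, x, y in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

# Hypothetical toy graph: two triangles joined by one bridge edge (2-3).
A = [[0] * 6 for _ in range(6)]
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u][v] = A[v][u] = 1
W, H = nmf(A, rank=2)
labels = [max(range(2), key=lambda c: W[i][c]) for i in range(6)]
print(labels)
```

Because the factors start from random values, different seeds can converge to different local optima and therefore to different community assignments.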