-This example demonstrates the variability of stochastic community detection methods by analyzing the consistency of multiple partitions using similarity measures normalized mutual information (NMI), variation of information (VI), rand index (RI) on both random and structured graphs.
+This example demonstrates the use of stochastic community detection methods to check whether a network possesses a strong community structure, and whether the partitionings we obtain are meaningful. Many community detection algorithms are randomized and return somewhat different results on each run, depending on the random seed that was set. When there is a robust community structure, we expect these results to be similar to each other. When the community structure is weak or non-existent, the results may be noisy and highly variable. We will employ several partition similarity measures to analyse the consistency of the results, including the normalized mutual information (NMI), the variation of information (VI), and the Rand index (RI).

 """
 # %%
-# Import libraries
 import igraph as ig
 import matplotlib.pyplot as plt
 import itertools
+import random

 # %%
-# First, we generate a graph.
-# Load the karate club network
+# .. note::
+#     We set a random seed to ensure that the results look exactly the same in
+#     the gallery. You don't need to do this when exploring randomness.
+random.seed(42)
+
+# %%
+# We will use Zachary's karate club dataset [1]_, a classic example of a network
+# with a strong community structure:
 karate = ig.Graph.Famous("Zachary")

 # %%
-#For the random graph, we use an Erdős-Rényi :math:`G(n, m)` model, where 'n' is the number of nodes
-#and 'm' is the number of edges. We set 'm' to match the edge count of the empirical (Karate Club)
-#network to ensure structural similarity in terms of connectivity, making comparisons meaningful.
-n_nodes = karate.vcount()
-n_edges = karate.ecount()
-#Generate an Erdős-Rényi graph with the same number of nodes and edges
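As a rough illustration of what the :math:`G(n, m)` model in the comment above means, the sketch below draws `m` distinct edges uniformly at random from all possible vertex pairs. It uses only the standard library and is not how igraph generates such graphs; the helper name `sample_gnm_edges` is hypothetical, and 34 and 78 are the vertex and edge counts of the karate club network.

```python
import itertools
import random


def sample_gnm_edges(n, m, rng):
    """Return m distinct undirected edges chosen uniformly among n vertices."""
    all_pairs = list(itertools.combinations(range(n), 2))
    if m > len(all_pairs):
        raise ValueError("m exceeds the number of possible edges")
    # Sampling without replacement guarantees no duplicate edges.
    return rng.sample(all_pairs, m)


# Same size as Zachary's karate club network: 34 vertices, 78 edges.
edges = sample_gnm_edges(34, 78, random.Random(42))
```

Matching both `n` and `m` keeps the density of the random graph equal to that of the empirical network, so differences in partition stability are not simply an effect of size.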
-# We have used, stochastic community detection using the Louvain method, iteratively generating partitions and computing similarity metrics to assess stability.
-# The Louvain method is a modularity maximization approach for community detection.
-# Since exact modularity maximization is NP-hard, the algorithm employs a greedy heuristic that processes vertices in a random order.
-# This randomness leads to variations in the detected communities across different runs, which is why results may differ each time the method is applied.
-axes[i][1].set_title(f"Probability Density of {measure} - Erdős-Rényi Graph")
-axes[i][1].set_xlabel(f"{measure} Score")
-axes[i][1].set_xlim(lower, upper)  # Set axis limits explicitly
+axes[1][i].set_title(f"{measure} - Random network")
+axes[1][i].set_xlabel(f"{measure} score")
+axes[0][i].set_ylabel("PDF")

 plt.tight_layout()
 plt.show()

 # %%
-# We have compared the probability density of NMI, VI, and RI for the Karate Club network (structured) and an Erdős-Rényi random graph.
+# We have compared the pairwise similarities using the NMI, VI, and RI measures
+# between partitionings obtained for the karate club network (strong community
+# structure) and a comparable random graph (which lacks communities).
 #
-# **NMI (Normalized Mutual Information):**
-#
-# - Karate Club Network: The distribution is concentrated near 1, indicating high similarity across multiple runs, suggesting stable community detection.
-# - Erdős-Rényi Graph: The values are more spread out, with lower NMI scores, showing inconsistent partitions due to the lack of clear community structures.
+# The Normalized Mutual Information (NMI) and Rand Index (RI) both quantify
+# similarity, and take values from :math:`[0,1]`. Higher values indicate more
+# similar partitionings, with a value of 1 attained when the partitionings are
+# identical.
 #
-# **VI (Variation of Information):**
+# The Variation of Information (VI) is a distance measure. It takes values from
+# :math:`[0,\infty)`, with lower values indicating higher similarity. Identical
+# partitionings have a distance of zero.
 #
-# - Karate Club Network: The values are low and clustered, indicating stable partitioning with minor variations across runs.
-# - Erdős-Rényi Graph: The distribution is broader and shifted toward higher VI values, meaning higher partition variability and less consistency.
-#
-# **RI (Rand Index):**
-#
-# - Karate Club Network: The RI values are high and concentrated near 1, suggesting consistent clustering results across multiple iterations.
-# - Erdős-Rényi Graph: The distribution is more spread out, but with lower RI values, confirming unstable community detection.
-#
-# **Conclusion**
-#
-# The Karate Club Network exhibits strong, well-defined community structures, leading to consistent results across runs.
-# The Erdős-Rényi Graph, being random, lacks clear communities, causing high variability in detected partitions.
+# For the karate club network, NMI and RI values are concentrated near 1, while
+# VI is concentrated near 0, suggesting a robust community structure. In contrast,
+# the values obtained for the random network are much more spread out, showing
+# inconsistent partitionings due to the lack of a clear community structure.
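To make the three measures discussed above concrete, here is a small standard-library sketch that computes them from two membership lists. It is an illustration under stated assumptions, not igraph's implementation: the function names are hypothetical, natural logarithms are used, NMI is normalized as :math:`2I / (H_a + H_b)`, and two trivial single-cluster partitions are treated as identical by convention.

```python
import math
from collections import Counter
from itertools import combinations


def rand_index(a, b):
    """Fraction of vertex pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def _entropy(counts, n):
    """Shannon entropy (natural log) of cluster sizes summing to n."""
    return -sum(c / n * math.log(c / n) for c in counts)


def vi_and_nmi(a, b):
    """Return (VI, NMI) computed from two membership lists."""
    n = len(a)
    h_a = _entropy(Counter(a).values(), n)
    h_b = _entropy(Counter(b).values(), n)
    h_ab = _entropy(Counter(zip(a, b)).values(), n)
    mutual_info = h_a + h_b - h_ab
    vi = h_a + h_b - 2 * mutual_info
    # Assumed convention: two trivial partitions count as identical.
    nmi = 1.0 if h_a + h_b == 0 else 2 * mutual_info / (h_a + h_b)
    return vi, nmi


p = [0, 0, 0, 1, 1, 1]
q = [1, 1, 1, 0, 0, 0]  # same grouping as p, different labels
r = [0, 0, 1, 1, 2, 2]  # a genuinely different grouping
```

Comparing `p` with `q` gives the "identical" end of each scale, while comparing `p` with `r` gives intermediate values, which is the kind of spread the histograms show for the random network.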
+
+# %%
+# .. [1] W. Zachary: "An Information Flow Model for Conflict and Fission in Small Groups". Journal of Anthropological Research 33, no. 4 (1977): 452–73. https://www.jstor.org/stable/3629752