Evaluation and Benchmarking

The evaluation of Community Discovery algorithms is not an easy task. cdlib implements two families of evaluation strategies:

  • Internal evaluation through fitness scores;

  • External evaluation through partition comparison.

Moreover, cdlib integrates both standard synthetic network benchmarks and real networks with annotated ground truths, thus allowing for testing identified communities against ground truths.

Finally, cdlib also provides a way to generate rank clustering results algorithms over a given input graph.

Note

The following lists are aligned to CD evaluation methods available in the GitHub main branch of cdlib.

Internal Evaluation: Fitness scores

Fitness functions allow to summarize the characteristics of a computed set of communities. cdlib implements the following quality scores:

avg_distance(graph, communities, **kwargs)

Average distance.

avg_embeddedness(graph, communities, **kwargs)

Average embeddedness of nodes within the community.

average_internal_degree(graph, community[, ...])

The average internal degree of the community set.

avg_transitivity(graph, communities, **kwargs)

Average transitivity.

conductance(graph, community[, summary])

Fraction of total edge volume that points outside the community.

cut_ratio(graph, community[, summary])

Fraction of existing edges (out of all possible edges) leaving the community.

edges_inside(graph, community[, summary])

Number of edges internal to the community.

expansion(graph, community[, summary])

Number of edges per community node that point outside the cluster.

fraction_over_median_degree(graph, community)

Fraction of community nodes of having internal degree higher than the median degree value.

hub_dominance(graph, communities, **kwargs)

Hub dominance.

internal_edge_density(graph, community[, ...])

The internal density of the community set.

normalized_cut(graph, community[, summary])

Normalized variant of the Cut-Ratio

max_odf(graph, community[, summary])

Maximum fraction of edges of a node of a community that point outside the community itself.

avg_odf(graph, community[, summary])

Average fraction of edges of a node of a community that point outside the community itself.

flake_odf(graph, community[, summary])

Fraction of nodes in S that have fewer edges pointing inside than to the outside of the community.

scaled_density(graph, communities, **kwargs)

Scaled density.

significance(graph, communities, **kwargs)

Significance estimates how likely a partition of dense communities appear in a random graph.

size(graph, communities, **kwargs)

Size is the number of nodes in the community

surprise(graph, communities, **kwargs)

Surprise is statistical approach proposes a quality metric assuming that edges between vertices emerge randomly according to a hyper-geometric distribution.

triangle_participation_ratio(graph, community)

Fraction of community nodes that belong to a triad.

purity(communities)

Purity is the product of the frequencies of the most frequent labels carried by the nodes within the communities

Among the fitness function, a well-defined family of measures is the Modularity-based one:

erdos_renyi_modularity(graph, communities, ...)

Erdos-Renyi modularity is a variation of the Newman-Girvan one.

link_modularity(graph, communities, **kwargs)

Quality function designed for directed graphs with overlapping communities.

modularity_density(graph, communities[, lmbd])

The modularity density is one of several propositions that envisioned to palliate the resolution limit issue of modularity based measures.

modularity_overlap(graph, communities[, weight])

Determines the Overlapping Modularity of a partition C on a graph G.

newman_girvan_modularity(graph, communities, ...)

Difference the fraction of intra community edges of a partition with the expected number of such edges if distributed according to a null model.

z_modularity(graph, communities, **kwargs)

Z-modularity is another variant of the standard modularity proposed to avoid the resolution limit.

Some measures will return an instance of FitnessResult that takes together min/max/mean/std values of the computed index.

FitnessResult(min, max, score, std)

External Evaluation: Partition Comparisons

It is often useful to compare different graph partitions to assess their resemblance. cdlib implements the following partition comparisons scores:

adjusted_mutual_information(first_partition, ...)

Adjusted Mutual Information between two clusterings.

mi(first_partition, second_partition)

This function calculates the Mutual Information (MI) between two clusterings.

rmi(first_partition, second_partition[, ...])

This function calculates the Reduced Mutual Information (RMI) between two clusterings.

normalized_mutual_information(...)

Normalized Mutual Information between two clusterings.

overlapping_normalized_mutual_information_LFK(...)

Overlapping Normalized Mutual Information between two clusterings.

overlapping_normalized_mutual_information_MGH(...)

Overlapping Normalized Mutual Information between two clusterings.

variation_of_information(first_partition, ...)

Variation of Information among two nodes partitions.

rand_index(first_partition, second_partition)

This function calculates the Rand index between two clusterings.

adjusted_rand_index(first_partition, ...)

Rand index adjusted for chance.

omega(first_partition, second_partition)

Index of resemblance for overlapping, complete coverage, network clusterings.

f1(first_partition, second_partition)

Compute the average F1 score of the optimal algorithms matches among the partitions in input.

nf1(first_partition, second_partition)

Compute the Normalized F1 score of the optimal algorithms matches among the partitions in input.

southwood_index(first_partition, ...)

This function calculates the Southwood index between two clusterings.

rogers_tanimoto_index(first_partition, ...)

This function calculates the Rogers and Tanimoto index between two clusterings.

sorensen_index(first_partition, second_partition)

This function calculates the Sorensen between two clusterings.

dice_index(first_partition, second_partition)

This function calculates the Czekanowski between two clusterings.

czekanowski_index(first_partition, ...)

This function calculates the Czekanowski between two clusterings.

fowlkes_mallows_index(first_partition, ...)

This function calculates the Fowlkes and Mallows index between two clusterings

jaccard_index(first_partition, second_partition)

This function calculates the Jaccard index between two clusterings.

sample_expected_sim(first_partition, ...[, ...])

This function calculates the expected Similarity for all pair-wise comparisons between Clusterings drawn from one of six random models.

overlap_quality(first_partition, ...)

This function calculates the overlap quality between two (overlapping) clusterings.

geometric_accuracy(first_partition, ...)

This function calculates the geometric accuracy between two (overlapping) clusterings.

classification_error(first_partition, ...)

This function calculates the Jaccard index between two clusterings.

ecs(first_partition, second_partition[, ...])

The element-centric clustering similarity.

Some measures will return an instance of MatchingResult that takes together the computed index’s mean and standard deviation values.

MatchingResult(score, std)

Synthetic Benchmarks

External evaluation scores can be fruitfully used to compare alternative clusterings of the same network and to assess to what extent an identified node clustering matches a known ground truth partition.

To facilitate such a standard evaluation task, cdlib exposes a set of standard synthetic network generators providing topological community ground truth annotations.

In particular, cdlib make available benchmarks for:

  • static community discovery;

  • dynamic community discovery;

  • feature-rich (i.e., node-attributed) community discovery.

All details can be found on the dedicated page.

Networks With Annotated Communities

Although evaluating a topological partition against an annotated “semantic” one is not among the safest paths to follow [Peel17], cdlib natively integrates well-known medium-size network datasets with ground-truth communities.

Due to the non-negligible sizes of such datasets, we designed a simple API to gather them transparently from a dedicated remote repository.

All details on remote datasets can be found on the dedicated page.

[Peel17]

Peel, Leto, Daniel B. Larremore, and Aaron Clauset. “The ground truth about metadata and community detection in networks.” Science Advances 3.5 (2017): e1602548.