Robust graph-Theoretic clustering approaches using node-based resilience measures


This paper examines a schema for graph-Theoretic clustering using node-based resilience measures. Node-based resilience measures optimize an objective based on a critical set of nodes whose removal causes some severity of disconnection in the network. Beyond presenting a general framework for the usage of node based resilience measures for variations of clustering problems, we emphasize the unique potential of such methods to accomplish the following properties: (i) clustering a graph in one step without knowing the number of clusters a priori, and (ii) removing noise from noisy data. We first present results of clustering experiments using a β-parametrized generalization of vertex attack tolerance, showing high clustering accuracy for both real datasets and equal density synthetic data sets, as well as successful removal of noise nodes. It is shown that arbitrarily increasing β increases the number of noise nodes removed in some cases, and that internal validation measures can be used to determine the correct number of clusters in a class of datasets. Further results are presented using five different resilience measures with a general node-based resilience clustering technique. In a subset of cases a resilience measure, such as integrity, is able to cluster to high accuracy in one step, giving the correct clustering while also determining the correct number of clusters. Integrity is also shown to be promising with respect to noise removal, removing up to 80% of noise on some datasets.


Engineering Program

Document Type

Conference Proceeding



Publication Date


Journal Title

Proceedings - IEEE International Conference on Data Mining, ICDM