College of Natural and Applied Sciences

Statistical Comparative Analysis and Evaluation of Validation Indices for Clustering Optimization

Thy Nguyen, MSU Undergraduate
Jason Viehman
Dacosta Yeboah, MSU Graduate Student
Gayla R. Olbricht
Tayo Obafemi-Ajayi, Missouri State University

Abstract

Clustering is a relevant exploratory tool for a broad range of machine learning applications, as it aids identification of meaningful subgroups. For a given clustering algorithm, multiple partitions can be obtained on the same data set by varying algorithmic parameters. Because no prior labeled data are available, internal validation indices provide a means to objectively evaluate how well the groupings obtained from a clustering configuration partition the data. This work presents a rigorous statistical evaluation framework that analyzes the performance of internal validation indices based on their correlation with external indices. A synthetic data generator that captures a wide range of complexity is proposed. Evaluation is conducted on a varied set of synthetic data types and real data sets to investigate the performance of the indices.
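The idea of judging an internal index by its correlation with an external index can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy one-dimensional data, the threshold split standing in for a clustering algorithm with a varied parameter, the mean silhouette width as the internal index, the Rand index as the external index, and Pearson correlation as the agreement measure.

```python
from itertools import combinations
from math import sqrt

# Toy 1-D data with known ground-truth groups (hypothetical values).
points = [0.0, 0.5, 1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 9.5, 10.0]
truth = [0] * 5 + [1] * 5

def threshold_partition(xs, t):
    """Toy stand-in for a clustering run: split the points at threshold t."""
    return [int(x > t) for x in xs]

def silhouette(xs, labels):
    """Mean silhouette width of a two-cluster partition (internal index)."""
    scores = []
    for i, (x, lab) in enumerate(zip(xs, labels)):
        own = [abs(x - y) for j, (y, l) in enumerate(zip(xs, labels))
               if l == lab and j != i]
        other = [abs(x - y) for y, l in zip(xs, labels) if l != lab]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(own) / len(own)      # mean distance within own cluster
        b = sum(other) / len(other)  # mean distance to the other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (external index)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / sqrt(sum((a - mu) ** 2 for a in u)
                      * sum((b - mv) ** 2 for b in v))

# Varying the parameter yields several partitions of the same data set.
thresholds = [0.25, 0.75, 1.75, 5.0, 8.25, 9.25]
partitions = [threshold_partition(points, t) for t in thresholds]

internal = [silhouette(points, p) for p in partitions]  # needs no labels
external = [rand_index(p, truth) for p in partitions]   # needs true labels

# An internal index that tracks the external one correlates positively with it.
r = pearson(internal, external)
print(f"correlation between internal and external index: {r:.3f}")
```

In this sketch the external index is available only because the toy data carry known labels; the point of the comparison is to see whether the label-free internal index would have ranked the candidate partitions in a similar order.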

Department(s)

Engineering Program

Document Type

Conference Proceeding

DOI

https://doi.org/10.1109/SSCI47803.2020.9308412

Keywords

clustering, statistical analysis, validation indices

Publication Date

12-1-2020

Recommended Citation

Nguyen, Thy, Jason Viehman, Dacosta Yeboah, Gayla R. Olbricht, and Tayo Obafemi-Ajayi. "Statistical Comparative Analysis and Evaluation of Validation Indices for Clustering Optimization." In 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 3081-3090. IEEE, 2020.

Journal Title

2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020

