Ensemble validation paradigm for intelligent data analysis in autism spectrum disorders


Cluster analysis is an important exploratory tool in a broad range of applications, including the analysis of biomedical datasets to uncover meaningful subgroups, as in autism spectrum disorder (ASD). For a given clustering algorithm, multiple results can be obtained on the same dataset by varying the algorithm's parameters. In biomedical applications, the goal is to discover meaningful subgroups, not merely the optimal number of clusters. It is therefore imperative to develop quality measures capable of identifying optimal partitions for a given dataset. In this paper, we apply a variety of clustering methods to an ASD simplex sample, grouping it on relevant phenotype features that may uncover meaningful subtypes. We present a detailed cluster validation analysis using an ensemble validation paradigm and visualization techniques, followed by a rigorous clinical/behavioral analysis of the top-ranked results. The evaluation demonstrated that both configurations yielded similar results in terms of clinical significance: a 2-subgroup configuration with distinct clinical profiles.
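The idea of ranking candidate partitions with an ensemble of validity measures can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three internal indices (silhouette, Calinski-Harabasz, Davies-Bouldin), the mean-rank aggregation, the two clustering algorithms, the hypothetical helper `rank_partitions`, and the synthetic two-blob data standing in for the phenotype features.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def rank_partitions(X, candidates):
    """Order candidate partitions by their mean rank across three validity indices.

    Hypothetical helper: `candidates` maps a configuration name to a label array.
    """
    names = list(candidates)
    # Orient every index so that larger is better (Davies-Bouldin is negated).
    scores = np.array([
        [
            silhouette_score(X, candidates[name]),
            calinski_harabasz_score(X, candidates[name]),
            -davies_bouldin_score(X, candidates[name]),
        ]
        for name in names
    ])
    # Per-index rank of each partition, 0 = best (argsort of argsort yields ranks).
    ranks = np.argsort(np.argsort(-scores, axis=0), axis=0)
    mean_rank = ranks.mean(axis=1)
    order = np.argsort(mean_rank, kind="stable")
    return [(names[i], float(mean_rank[i])) for i in order]


# Synthetic stand-in data (assumption): two well-separated Gaussian blobs.
X, _ = make_blobs(
    n_samples=300, centers=[[-5.0, -5.0], [5.0, 5.0]], cluster_std=1.0, random_state=0
)

# Multiple results on the same dataset, obtained by varying algorithm and cluster count.
candidates = {}
for k in range(2, 6):
    candidates[f"kmeans_k{k}"] = KMeans(
        n_clusters=k, n_init=10, random_state=0
    ).fit_predict(X)
    candidates[f"ward_k{k}"] = AgglomerativeClustering(
        n_clusters=k, linkage="ward"
    ).fit_predict(X)

ranking = rank_partitions(X, candidates)
for name, mean_rank in ranking:
    print(f"{name}: mean rank {mean_rank:.2f}")
```

Aggregating ranks rather than raw scores sidesteps the different scales of the individual indices. As the abstract suggests, such a ranking only narrows the candidates; the top-ranked partitions still need a clinical or behavioral review before any subgroup is treated as meaningful.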


Engineering Program

Document Type

Conference Proceeding



Publication Date


Journal Title

2018 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2018