We present a framework for an explainable and statistically validated ensemble clustering model applied to Traumatic Brain Injury (TBI). The objective of our analysis is to identify patient subgroups by injury severity, and the key phenotypes that delineate those subgroups, using varied clinical and computed tomography data. Explainable and statistically validated models are essential because data-driven identification of subgroups is an inherently multidisciplinary undertaking. In our case, this procedure yielded six patient subgroups that are distinct with respect to mechanism of injury, severity of presentation, anatomy, psychometric measures, and functional outcome. The framework integrates statistical methods at several stages of the ensemble cluster analysis to improve both the quality and the explainability of the results. The methodology is applicable to other clinical data sets that exhibit significant heterogeneity, and to other data science applications in biomedicine and elsewhere.


Engineering Program

Rights Information

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/


Keywords

canonical discriminant analysis, clustering, ensemble learning, explainable AI, hybrid human-machine systems, mixed models, multicollinearity, precision medicine

Journal Title

IEEE Access