Genotype Combinations Linked to Phenotype Subgroups in Autism Spectrum Disorders
This paper investigates a computational model that uses machine learning techniques to systematically compare phenotype data with genotype data (single nucleotide polymorphisms, SNPs) and identify discriminant genotype markers associated with phenotypic subgroups. The proposed discriminant SNP identifier model is empirically evaluated on an Autism Spectrum Disorder (ASD) simplex sample. Six phenotype markers were selected to cluster the sample in a hexagonal lattice format, yielding five multidimensional subgroups based on extremities of the phenotype markers. The SNP selection model combines random subspace selection of SNPs with feature selection algorithms to determine which set of SNPs was discriminant among these five subgroups. This yielded a set of SNPs that attained a mean ROC performance of 95% using a Support Vector Machine prediction model. Biological analysis of these SNPs and their associated genes across the subgroups is presented to examine their potential clinical significance.
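The random subspace step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature selection algorithms, so a one-way ANOVA F statistic stands in for them here, and the data are synthetic. The names `f_score` and `random_subspace_select`, the parameter values, and the 0/1/2 genotype encoding are all assumptions made for illustration. The Support Vector Machine evaluation stage is omitted.

```python
import random
from collections import Counter


def f_score(values, labels):
    """One-way ANOVA F statistic for one SNP column across subgroup labels.

    Stand-in scoring function (an assumption, not the paper's choice).
    Assumes at least two groups and more samples than groups.
    """
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    between = sum(len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2
                  for xs in groups.values())
    within = sum((x - sum(xs) / len(xs)) ** 2
                 for xs in groups.values() for x in xs)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def random_subspace_select(genotypes, labels, n_rounds=200,
                           subspace_size=10, top_k=2, seed=0):
    """Count how often each SNP ranks in the top_k of a random SNP subset."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    votes = Counter()
    for _ in range(n_rounds):
        # Draw a random subspace of SNP indices, then rank only those SNPs.
        subspace = rng.sample(range(n_snps), subspace_size)
        ranked = sorted(
            subspace,
            key=lambda j: f_score([row[j] for row in genotypes], labels),
            reverse=True,
        )
        votes.update(ranked[:top_k])
    return votes


# Synthetic data (hypothetical): 5 subgroups x 20 samples, 30 SNPs coded 0/1/2.
# Only SNPs 0 and 1 depend on the subgroup; 10% of their calls are re-drawn
# at random to act as noise.
data_rng = random.Random(42)
labels = [g for g in range(5) for _ in range(20)]


def noisy(value):
    return value if data_rng.random() > 0.1 else data_rng.choice([0, 1, 2])


genotypes = []
for g in labels:
    row = [data_rng.choice([0, 1, 2]) for _ in range(30)]
    row[0] = noisy(2 if g < 2 else 0)
    row[1] = noisy(g % 3)
    genotypes.append(row)

votes = random_subspace_select(genotypes, labels)
top_snps = [j for j, _ in votes.most_common(2)]
print("Most frequently selected SNP indices:", top_snps)
```

Counting selections across many random subsets, rather than ranking all SNPs once, keeps each scoring pass small and favors SNPs that stay discriminant regardless of which other SNPs they are compared against.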
autism spectrum disorder, clustering, feature selection, SNP analysis
Zhao, Junya, Thy Nguyen, Jonathan Kopel, Perry B. Koob, Donald A. Adieroh, and Tayo Obafemi-Ajayi. "Genotype Combinations Linked to Phenotype Subgroups in Autism Spectrum Disorders." In 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), pp. 1-8. IEEE, 2019.