Random Subspace Projection for Predicting Biogeographical Ancestry

Abstract

Estimating human biogeographical ancestry from genomic information is an important problem with applications in population stratification, admixture mapping, forensic ancestry inference, and healthcare. Various studies have proposed panels of ancestry-informative single nucleotide polymorphisms (SNPs) for distinguishing between widely separated continental populations. There has been limited investigation into identifying SNP panels for sub-continental ancestry prediction, where the challenge is to find SNP markers that distinguish closely related sub-populations, for instance within a single continent. In this study, we propose an ancestry-informative SNP selection algorithm that exploits random subspace projection with supervised learning. The proposed approach identifies small panels of useful SNPs for sub-continental ancestry classification. We show sub-continental classification results for all five continents in our dataset.
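The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of random-subspace SNP selection, not the authors' method. It repeatedly draws a random subset of SNP columns, trains a simple classifier on that subspace, and credits each SNP in the subset with the held-out accuracy. The nearest-centroid classifier, the averaging score, the single train/test split, the synthetic genotypes, and all function and parameter names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def make_toy_genotypes(n_per_pop, n_snps, informative, seed=0):
    """Hypothetical synthetic data: two populations, genotypes coded 0/1/2."""
    rng = random.Random(seed)
    X, y = [], []
    for pop in (0, 1):
        for _ in range(n_per_pop):
            row = []
            for s in range(n_snps):
                # Informative SNPs have very different allele frequencies
                # in the two populations; the rest are identical noise.
                if s in informative:
                    p = 0.9 if pop == 0 else 0.1
                else:
                    p = 0.5
                # Genotype = number of copies of the allele (two draws).
                row.append((rng.random() < p) + (rng.random() < p))
            X.append(row)
            y.append(pop)
    return X, y


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Held-out accuracy of a nearest-centroid classifier on columns `cols`."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for j, c in enumerate(cols):
            acc[j] += row[c]
        counts[label] += 1
    centroids = {lab: [v / counts[lab] for v in s] for lab, s in sums.items()}

    correct = 0
    for row, label in zip(X_test, y_test):
        def dist(lab):
            return sum((row[c] - centroids[lab][j]) ** 2
                       for j, c in enumerate(cols))
        if min(centroids, key=dist) == label:
            correct += 1
    return correct / len(y_test)


def random_subspace_snp_ranking(X, y, subspace_size, n_rounds, seed=0):
    """Rank SNPs by the mean accuracy of the random subspaces they appear in."""
    rng = random.Random(seed)
    n_samples, n_snps = len(X), len(X[0])

    # One fixed random train/test split (an assumption; cross-validation
    # would be a natural alternative).
    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = n_samples // 2
    X_tr = [X[i] for i in idx[:cut]]
    y_tr = [y[i] for i in idx[:cut]]
    X_te = [X[i] for i in idx[cut:]]
    y_te = [y[i] for i in idx[cut:]]

    totals = [0.0] * n_snps
    hits = [0] * n_snps
    for _ in range(n_rounds):
        cols = rng.sample(range(n_snps), subspace_size)  # random subspace
        acc = nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te, cols)
        for c in cols:
            totals[c] += acc
            hits[c] += 1

    scores = [totals[c] / hits[c] if hits[c] else 0.0 for c in range(n_snps)]
    ranking = sorted(range(n_snps), key=lambda c: scores[c], reverse=True)
    return ranking, scores


if __name__ == "__main__":
    X, y = make_toy_genotypes(n_per_pop=60, n_snps=20, informative={3, 7}, seed=1)
    ranking, scores = random_subspace_snp_ranking(
        X, y, subspace_size=4, n_rounds=400, seed=1)
    print("Top-ranked SNP indices:", ranking[:5])
```

A small panel would then be formed by keeping the top-ranked indices. On real data the number of SNPs is far larger than in this toy example, so the subspace size and the number of rounds would need to be chosen so that every SNP is sampled often enough for its score to be stable.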

Department(s)

Engineering Program

Document Type

Conference Proceeding

DOI

https://doi.org/10.1109/BIBM.2018.8621222

Keywords

Ancestry Classification, DNA, Random Subspace Projection, Single Chromosome, SNP, SNP Selection

Publication Date

1-21-2019
