Set of approaches based on 3D structure and position specific-scoring matrix for predicting DNA-binding proteins


Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Motivation: Because DNA-binding proteins (DNA-BPs) play a vital role in all aspects of genetic activity, the development of reliable and efficient systems for automatic DNA-BP classification is becoming a crucial proteomic technology. Key to this technology is the discovery of powerful protein representations and feature extraction methods. The goal of this article is to experimentally develop a system for automatic DNA-BP classification by comparing and combining descriptors taken from different types of protein representations.

Results: The descriptors we evaluate include those derived from the position-specific scoring matrix (PSSM) of proteins, those derived from the amino-acid sequence (AAS), various matrix representations of proteins, and features taken from the three-dimensional tertiary structure of proteins. We also introduce some new variants of protein descriptors. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by sum rule. Our final system obtains state-of-the-art results on three benchmark DNA-BP datasets.

Availability and implementation: The MATLAB code for replicating the experiments presented in this paper is available at https://github.com/LorisNanni.
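
The sum-rule combination described in the Results can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function names, the optional z-score normalization step, the decision threshold of 0, and the example scores are all assumptions made for illustration. It assumes each descriptor-specific SVM has already produced one decision score per protein, with larger scores favoring the DNA-binding class.

```python
from statistics import mean, pstdev


def zscore(scores):
    """Standardize one classifier's scores to zero mean and unit variance.

    Assumption: normalization is applied so that no single classifier
    dominates the sum; the source abstract does not specify this step.
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def sum_rule(score_sets, normalize=True):
    """Fuse per-classifier decision scores by summing them sample by sample."""
    if not score_sets:
        raise ValueError("need at least one classifier's scores")
    n_samples = len(score_sets[0])
    if any(len(s) != n_samples for s in score_sets):
        raise ValueError("all classifiers must score the same samples")
    if normalize:
        score_sets = [zscore(s) for s in score_sets]
    # zip(*...) groups the scores that belong to the same protein.
    return [sum(per_sample) for per_sample in zip(*score_sets)]


def predict(fused_scores, threshold=0.0):
    """Label a protein as DNA-binding (1) when its fused score exceeds the threshold."""
    return [1 if s > threshold else 0 for s in fused_scores]


# Hypothetical decision values from three descriptor-specific SVMs
# (PSSM-based, sequence-based, structure-based) for four proteins.
pssm_scores = [1.2, -0.4, 0.3, -1.1]
aas_scores = [0.8, 0.1, -0.2, -0.9]
structure_scores = [0.5, -0.6, 0.4, -0.3]

fused = sum_rule([pssm_scores, aas_scores, structure_scores], normalize=False)
labels = predict(fused)
```

Summing scores rather than voting on hard labels lets a confident classifier outweigh two weakly opposed ones, which is the usual motivation for sum-rule fusion over majority voting.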


Information Technology and Cybersecurity

Rights Information

© 2018 The Author(s).

Journal Title