Abstract

In gene expression data analysis, biclustering has proven to be an effective method for finding local patterns among subsets of genes and conditions. Evaluating the quality of a bicluster is challenging when the ground truth is not known. In this analysis, we empirically evaluate and compare the performance of eight popular biclustering algorithms across 119 synthetic datasets that span a wide range of possible bicluster structures and patterns. We also present a method, based on four internal validation measures, for enhancing the performance (relevance score) of the biclustering algorithms and thereby increasing confidence in the significance of the biclusters returned. The experimental results demonstrate that the Average Spearman’s Rho evaluation measure is the most effective criterion for improving bicluster relevance with the proposed performance enhancement method, while maintaining a relatively low loss in recovery scores.

Department(s)

Engineering Program

Document Type

Conference Proceeding

DOI

https://doi.org/10.5220/0006662502020213

Rights Information

This paper is distributed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.

Keywords

Biclustering, Evaluation, Gene Expression Pattern Recognition, Validation Measures

Publication Date

1-1-2018

Journal Title

ICPRAM 2018 - Proceedings of the 7th International Conference on Pattern Recognition Applications and Methods
