Bicluster Sampled Coherence Metric (BSCM) provides an accurate environmental context for phenotype predictions

Samuel A. Danziger, David J. Reiss, Alexander V. Ratushny, Jennifer J. Smith, Christopher L. Plaisier, John D. Aitchison, Nitin S. Baliga

Research output: Contribution to journal, Article, peer-review

Abstract

Background: Biclustering is a popular method for identifying the experimental conditions under which biological signatures are co-expressed. However, the general biclustering problem is NP-hard, which leaves room to focus algorithms on specific biological tasks. We hypothesize that conditional co-regulation of genes is a key factor in determining cell phenotype and that accurately segregating conditions in biclusters will improve such predictions. Thus, we developed a bicluster sampled coherence metric (BSCM) for determining which conditions and signals should be included in a bicluster.

Results: Our BSCM calculates condition- and cluster-size-specific p-values, and we incorporated these into the popular integrated biclustering algorithm cMonkey. We demonstrate that incorporation of our new algorithm significantly improves bicluster co-regulation scores (p-value = 0.009) and GO annotation scores (p-value = 0.004). Additionally, we used a bicluster-based signal to predict whether a given experimental condition will result in yeast peroxisome induction. Using the new algorithm, the classifier accuracy improves from 41.9% to 76.1% correct.

Conclusions: We demonstrate that the proposed BSCM helps determine which signals ought to be co-clustered, resulting in more accurately assigned bicluster membership. Furthermore, we show that BSCM can be extended to more accurately detect the experimental conditions under which the genes are co-clustered. Features derived from this more accurate analysis of conditional regulation result in a dramatic improvement in the ability to predict a cellular phenotype in yeast. The latest cMonkey is available for download at https://github.com/baliga-lab/cmonkey2. The experimental data and source code featured in this paper are available at http://AitchisonLab.com/BSCM. BSCM has been incorporated in the official cMonkey release.
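The idea of a condition- and cluster-size-specific p-value can be illustrated with a small sampling sketch. The code below is not the authors' implementation and does not use the cMonkey API. It assumes, purely for illustration, that coherence in a condition is measured by the variance of expression values across the cluster's genes, and that the background distribution is built from random gene sets of the same size. The function name `sampled_coherence_pvalue`, the data layout, and the parameters are all hypothetical.

```python
import random
import statistics


def sampled_coherence_pvalue(expr, cluster_genes, condition,
                             n_samples=1000, seed=0):
    """Empirical p-value for how coherent a gene set is in one condition.

    expr: dict mapping gene -> dict mapping condition -> expression value
          (hypothetical layout chosen for this sketch).
    cluster_genes: genes currently assigned to the bicluster.
    condition: the experimental condition being tested.
    """
    rng = random.Random(seed)
    all_genes = list(expr)
    k = len(cluster_genes)

    # Assumed coherence score: lower variance means tighter co-expression.
    observed = statistics.pvariance(
        [expr[g][condition] for g in cluster_genes]
    )

    # Background: random gene sets of the SAME size in the SAME condition,
    # which makes the p-value specific to both condition and cluster size.
    at_least_as_tight = 0
    for _ in range(n_samples):
        sample = rng.sample(all_genes, k)
        sampled = statistics.pvariance([expr[g][condition] for g in sample])
        if sampled <= observed:
            at_least_as_tight += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_tight + 1) / (n_samples + 1)
```

Under these assumptions, a small p-value for a condition suggests that the cluster's genes are unusually coherent there, so the condition would be a candidate for inclusion in the bicluster, while a large p-value suggests the opposite.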

Original language: English (US)
Article number: S1
Journal: BMC Systems Biology
Volume: 9
Issue number: 2
State: Published - Apr 15 2015
Externally published: Yes

ASJC Scopus subject areas

  • Structural Biology
  • Modeling and Simulation
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
