TY - JOUR
T1 - cMonkey2
T2 - Automated, systematic, integrated detection of co-regulated gene modules for any organism
AU - Reiss, David J.
AU - Plaisier, Christopher L.
AU - Wu, Wei Ju
AU - Baliga, Nitin S.
N1 - Publisher Copyright: © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2015/3/26
Y1 - 2015/3/26
AB - The cMonkey integrated biclustering algorithm identifies conditionally co-regulated modules of genes (biclusters). cMonkey integrates various orthogonal pieces of information which support evidence of gene co-regulation, and optimizes biclusters to be supported simultaneously by one or more of these prior constraints. The algorithm served as the cornerstone for constructing the first global, predictive Environmental Gene Regulatory Influence Network (EGRIN) model for a free-living cell, and has now been applied to many more organisms. However, due to its computational inefficiencies, long run-time and complexity of various input data types, cMonkey was not readily usable by the wider community. To address these primary concerns, we have significantly updated the cMonkey algorithm and refactored its implementation, improving its usability and extendibility. These improvements provide a fully functioning and user-friendly platform for building co-regulated gene modules and the tools necessary for their exploration and interpretation. We show, via three separate analyses of data for E. coli, M. tuberculosis and H. sapiens, that the updated algorithm and inclusion of novel scoring functions for new data types (e.g. ChIP-seq and transcription factor over-expression [TFOE]) improve discovery of biologically informative co-regulated modules. The complete cMonkey2 software package, including source code, is available at https://github.com/baliga-lab/cmonkey2.
UR - http://www.scopus.com/inward/record.url?scp=84939604164&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84939604164&partnerID=8YFLogxK
U2 - 10.1093/nar/gkv300
DO - 10.1093/nar/gkv300
M3 - Review article
C2 - 25873626
AN - SCOPUS:84939604164
SN - 0305-1048
VL - 43
SP - e87
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 13
ER -