### Abstract

Motivation: In the maximum parsimony (MP) method, the tree requiring the minimum number of changes (discrepancy) to explain the given set of DNA or amino acid sequences is chosen to represent their evolutionary relationships. To find the MP tree, the branch-and-bound algorithm is normally used. For a partial phylogenetic tree (one that contains a subset of the organisms), the traditional algorithm assigns a cost equal to the discrepancy of the partial tree. We propose a single column discrepancy heuristic, which increases this cost by predicting a minimum additional discrepancy needed to attach the sequences yet to be added to the partial tree. A dynamic Max-mini order of sequence addition is also proposed to quickly terminate branch-and-bound search paths that are guaranteed to lead to suboptimal solutions.

Results: We studied the running time of 47 problems generated from 17 data sets. The use of the single column discrepancy heuristic sped up the computation 2.4-fold for the static search order and 18.2-fold for the dynamic search order. The improvement appeared to increase exponentially with the number of sequences. The proposed strategies are also likely to be useful in speeding up the MP tree search in heuristic searches that are based on branch-and-bound-like algorithms.

Contact: s.kumar@asu.edu
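The abstract does not spell out either heuristic, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes the single column bound counts, per alignment column, the character states that occur only among the sequences not yet placed (each such state needs at least one further change), and it assumes Max-mini picks the unplaced sequence whose cheapest attachment cost is largest. The function names, the `insertion_costs` mapping, and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Sequence


def single_column_lower_bound(placed: Sequence[str], remaining: Sequence[str]) -> int:
    """Assumed lower bound on extra changes needed to attach `remaining`.

    For each column, every state seen only among the unplaced sequences
    is counted as requiring at least one additional change.
    """
    if not placed or not remaining:
        return 0
    bound = 0
    for col in range(len(placed[0])):
        seen = {seq[col] for seq in placed}
        unseen = {seq[col] for seq in remaining} - seen
        bound += len(unseen)
    return bound


def can_prune(partial_cost: int, placed: Sequence[str],
              remaining: Sequence[str], best_cost: int) -> bool:
    """Prune a search path whose bounded cost cannot beat the best known tree."""
    return partial_cost + single_column_lower_bound(placed, remaining) >= best_cost


def max_mini_choice(insertion_costs: Dict[str, List[int]]) -> str:
    """Assumed Max-mini rule: pick the sequence whose minimum
    attachment cost (over all candidate branches) is the largest."""
    return max(insertion_costs, key=lambda name: min(insertion_costs[name]))


# Toy data (hypothetical): three placed sequences, two still to be added.
placed = ["AAC", "AAG", "ATG"]
remaining = ["CTG", "ATT"]
extra = single_column_lower_bound(placed, remaining)
next_seq = max_mini_choice({"s4": [3, 5, 4], "s5": [6, 7, 6]})
```

In this toy case the first column contributes one unseen state (`C`) and the third contributes one (`T`), so a partial tree of cost 3 would be pruned against a best-known cost of 5. Adding the sequence with the costliest best placement first is meant to raise partial costs early, so that hopeless paths hit the bound sooner.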

Original language | English (US) |
---|---|
Pages (from-to) | 140-151 |
Number of pages | 12 |
Journal | Bioinformatics |
Volume | 16 |
Issue number | 2 |
State | Published - Feb 2000 |

### ASJC Scopus subject areas

- Clinical Biochemistry
- Computer Science Applications
- Computational Theory and Mathematics

### Cite this

Purdom, P. W., Bradford, P. G., Tamura, K., & Kumar, S. (2000). Single column discrepancy and dynamic max-mini optimizations for quickly finding the most parsimonious evolutionary trees. *Bioinformatics*, *16*(2), 140-151.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Single column discrepancy and dynamic max-mini optimizations for quickly finding the most parsimonious evolutionary trees

AU - Purdom, P. W.

AU - Bradford, P. G.

AU - Tamura, K.

AU - Kumar, S.

PY - 2000/2

Y1 - 2000/2

AB - Motivation: In the maximum parsimony (MP) method, the tree requiring the minimum number of changes (discrepancy) to explain the given set of DNA or amino acid sequences is chosen to represent their evolutionary relationships. To find the MP tree, the branch-and-bound algorithm is normally used. For a partial phylogenetic tree (one that contains a subset of the organisms), the traditional algorithm assigns a cost equal to the discrepancy of the partial tree. We propose a single column discrepancy heuristic, which increases this cost by predicting a minimum additional discrepancy needed to attach the sequences yet to be added to the partial tree. A dynamic Max-mini order of sequence addition is also proposed to quickly terminate branch-and-bound search paths that are guaranteed to lead to suboptimal solutions. Results: We studied the running time of 47 problems generated from 17 data sets. The use of the single column discrepancy heuristic sped up the computation 2.4-fold for the static search order and 18.2-fold for the dynamic search order. The improvement appeared to increase exponentially with the number of sequences. The proposed strategies are also likely to be useful in speeding up the MP tree search in heuristic searches that are based on branch-and-bound-like algorithms. Contact: s.kumar@asu.edu.

UR - http://www.scopus.com/inward/record.url?scp=0342424769&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0342424769&partnerID=8YFLogxK

M3 - Article

C2 - 10842736

AN - SCOPUS:0342424769

VL - 16

SP - 140

EP - 151

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 2

ER -