Combinatorics of distance-based tree inference.
MedLine Citation:
PMID:  23012403     Owner:  NLM     Status:  MEDLINE    
Several popular methods for phylogenetic inference (or hierarchical clustering) are based on a matrix of pairwise distances between taxa (or any kind of objects): The objective is to construct a tree with branch lengths so that the distances between the leaves in that tree are as close as possible to the input distances. If we hold the structure (topology) of the tree fixed, in some relevant cases (e.g., ordinary least squares) the optimal values for the branch lengths can be expressed using simple combinatorial formulae. Here we define a general form for these formulae and show that they all have two desirable properties: First, the common tree reconstruction approaches (least squares, minimum evolution), when used in combination with these formulae, are guaranteed to infer the correct tree when given enough data (consistency); second, the branch lengths of all the simple (nearest neighbor interchange) rearrangements of a tree can be calculated, optimally, in quadratic time in the size of the tree, thus allowing the efficient application of hill climbing heuristics. The study presented here is a continuation of that by Mihaescu and Pachter on branch length estimation [Mihaescu R, Pachter L (2008) Proc Natl Acad Sci USA 105:13206-13211]. The focus here is on the inference of the tree itself and on providing a basis for novel algorithms to reconstruct trees from distances.
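To make the idea of "simple combinatorial formulae" concrete, the following is a minimal sketch for the smallest interesting case, a four-taxon tree. It is not taken from the paper: it assumes the standard ordinary least squares closed forms for a quartet, and the function names (`dist`, `ols_quartet_lengths`, `minimum_evolution_quartet`) and the example distances are hypothetical. It computes the branch lengths for a fixed topology, then applies the minimum evolution criterion by picking the topology whose estimated tree length is smallest.

```python
def dist(d, x, y):
    """Look up a symmetric distance stored under either key order."""
    return d[(x, y)] if (x, y) in d else d[(y, x)]


def ols_quartet_lengths(d, p, q, r, s):
    """Branch lengths for the fixed quartet topology pq|rs.

    Assumption: these are the usual OLS closed forms for four taxa,
    used here for illustration only.
    """
    def leaf(x, sibling, far1, far2):
        # Half the distance to the sibling, corrected by how much farther
        # x is from the two far leaves than its sibling is.
        return 0.5 * dist(d, x, sibling) + 0.25 * (
            dist(d, x, far1) + dist(d, x, far2)
            - dist(d, sibling, far1) - dist(d, sibling, far2)
        )

    # Sum of the four distances that cross the internal branch.
    cross = (dist(d, p, r) + dist(d, p, s)
             + dist(d, q, r) + dist(d, q, s))
    internal = 0.25 * cross - 0.5 * (dist(d, p, q) + dist(d, r, s))
    return {
        p: leaf(p, q, r, s),
        q: leaf(q, p, r, s),
        r: leaf(r, s, p, q),
        s: leaf(s, r, p, q),
        "internal": internal,
    }


def minimum_evolution_quartet(d, taxa):
    """Return the split of four taxa whose estimated tree length is smallest."""
    w, x, y, z = taxa
    splits = [((w, x), (y, z)), ((w, y), (x, z)), ((w, z), (x, y))]

    def tree_length(split):
        (p, q), (r, s) = split
        return sum(ols_quartet_lengths(d, p, q, r, s).values())

    return min(splits, key=tree_length)


# Hypothetical additive distances generated by the tree AB|CD with leaf
# branches 1, 2, 3, 4 and an internal branch of 0.5.
example = {
    ("A", "B"): 3.0, ("A", "C"): 4.5, ("A", "D"): 5.5,
    ("B", "C"): 5.5, ("B", "D"): 6.5, ("C", "D"): 7.0,
}
lengths = ols_quartet_lengths(example, "A", "B", "C", "D")
best_split = minimum_evolution_quartet(example, ("A", "B", "C", "D"))
```

On these exactly tree-like distances the formulae recover the generating branch lengths, and the shortest tree is the generating topology, which is a toy instance of the consistency property described in the abstract. Larger trees and the quadratic-time handling of nearest neighbor interchanges are outside the scope of this sketch.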
Fabio Pardi; Olivier Gascuel
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2012-09-25
Journal Detail:
Title:  Proceedings of the National Academy of Sciences of the United States of America     Volume:  109     ISSN:  1091-6490     ISO Abbreviation:  Proc. Natl. Acad. Sci. U.S.A.     Publication Date:  2012 Oct 
Date Detail:
Created Date:  2012-10-10     Completed Date:  2013-01-08     Revised Date:  2013-07-11    
Medline Journal Info:
Nlm Unique ID:  7505876     Medline TA:  Proc Natl Acad Sci U S A     Country:  United States    
Other Details:
Languages:  eng     Pagination:  16443-8     Citation Subset:  IM    
Institut de Biologie Computationnelle, Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, Centre National de la Recherche Scientifique, Université Montpellier II, Montpellier, France.
MeSH Terms
Cluster Analysis
Computational Biology / methods*
Computer Simulation
Evolution, Molecular
Models, Genetic*
Reproducibility of Results

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine