MCNet: a method for the construction of phylogenetic networks based on the MonteCarlo method.  
Jump to Full Text  
MedLine Citation:

PMID: 20727135 Owner: NLM Status: MEDLINE 
Abstract/OtherAbstract:

BACKGROUND: A phylogenetic network is a generalization of phylogenetic trees that allows the representation of conflicting signals or alternative evolutionary histories in a single diagram. There are several methods for constructing these networks. Some of these methods are based on distances among taxa. In practice, the methods which are based on distance perform faster in comparison with other methods. The NeighborNet (NNet) is a distancebased method. The NNet produces a circular ordering from a distance matrix, then constructs a collection of weighted splits using circular ordering. The SplitsTree which is a program using these weighted splits makes a phylogenetic network. In general, finding an optimal circular ordering is an NPhard problem. The NNet is a heuristic algorithm to find the optimal circular ordering which is based on neighborjoining algorithm. RESULTS: In this paper, we present a heuristic algorithm to find an optimal circular ordering based on the MonteCarlo method, called MCNet algorithm. In order to show that MCNet performs better than NNet, we apply both algorithms on different data sets. Then we draw phylogenetic networks corresponding to outputs of these algorithms using SplitsTree and compare the results. CONCLUSIONS: We find that the circular ordering produced by the MCNet is closer to optimal circular ordering than the NNet. Furthermore, the networks corresponding to outputs of MCNet made by SplitsTree are simpler than NNet. 
Authors:

Changiz Eslahchi; Mahnaz Habibi; Reza Hassanzadeh; Ehsan Mottaghi 
Related Documents
:

16701355  Do early branching lineages signify ancestral traits? 19640885  Velvet worm development links myriapods with chelicerates. 15019625  Phylogeny and evolution of basils and allies (ocimeae, labiatae) based on three plastid... 19582725  The new framework for understanding placental mammal evolution. 19382535  Creation of functional micro/nano systems through topdown and bottomup approaches. 22934035  The importance of modeling epileptic seizure dynamics as spatiotemporal patterns. 
Publication Detail:

Type: Journal Article; Research Support, NonU.S. Gov't Date: 20100820 
Journal Detail:

Title: BMC evolutionary biology Volume: 10 ISSN: 14712148 ISO Abbreviation: BMC Evol. Biol. Publication Date: 2010 
Date Detail:

Created Date: 20100914 Completed Date: 20101222 Revised Date: 20130528 
Medline Journal Info:

Nlm Unique ID: 100966975 Medline TA: BMC Evol Biol Country: England 
Other Details:

Languages: eng Pagination: 254 Citation Subset: IM 
Affiliation:

Faculty of Mathematics, Shahid Beheshti University, GC, Tehran, Iran. cheslahchi@sbu.ac.ir 
Export Citation:

APA/MLA Format Download EndNote Download BibTex 
MeSH Terms  
Descriptor/Qualifier:

Algorithms Computational Biology / methods Models, Theoretical Monte Carlo Method* Phylogeny* 
Comments/Corrections 
Full Text  
Journal Information Journal ID (nlmta): BMC Evol Biol ISSN: 14712148 Publisher: BioMed Central 
Article Information Copyright ©2010 Eslahchi et al; licensee BioMed Central Ltd. openaccess: Received Day: 9 Month: 9 Year: 2009 Accepted Day: 20 Month: 8 Year: 2010 collection publication date: Year: 2010 Electronic publication date: Day: 20 Month: 8 Year: 2010 Volume: 10First Page: 254 Last Page: 254 Publisher Id: 1471214810254 PubMed Id: 20727135 DOI: 10.1186/1471214810254 
MCNet: a method for the construction of phylogenetic networks based on the MonteCarlo method  
Changiz Eslahchi1  Email: cheslahchi@sbu.ac.ir 
Mahnaz Habibi1  Email: mhabibi@ipm.ir 
Reza Hassanzadeh12  Email: re.hassanzadeh@mail.sbu.ac.ir 
Ehsan Mottaghi1  Email: mottaghi.ehsan@mail.sbu.ac.ir 
1Faculty of Mathematics, Shahid Beheshti University, G.C., Tehran, Iran 

2School of Computer Science, Institute for Studies in Theoretical Physics and Mathematics (IPM), Tehran, Iran 
Phylogenetics is concerned with the construction and analysis of phylogenetic trees or networks to understand the evolution of species, populations, and individuals. Evolutionary processes such as hybridization between species, lateral transfer of genes, recombination within a population, and convergent evolution can all lead to evolutionary histories that are distinctly nontreelike. Moreover, even when the underlying evolution is treelike, the presence of conflicting or ambiguous signals can make a single tree representation inappropriate. In these situations, phylogenetic network methods can be particularly useful.
Phylogenetic network is a generalization of phylogenetic trees that can represent several trees simultaneously. For any network construction method, the conflicting signals should be represented in the network but it is vital that the network does not depict more conflict than is found in the data. At the same time, when the data fits well to a tree, the method should return a network that is close to a tree. Recently, in addition to biology, the phylogenetic networks methods are widely used for classifying different types of data such as those finding in linguistics, music, etc. There are many different methods to construct phylogenetic trees or networks which are based on distance matrix such as ME (minimum evolution) [^{1}], LS (least squares) [^{2},^{3}], NJ (neighborjoining) [^{4}], AddTree [^{5}], NNet (neighbornet) [^{6}] and QNet [^{7}]. All these methods are called distancebased methods.
ME is one of the most wellknown methods. It was first introduced by Kidd and SgamarellaZonta [^{1}]. Given a distance matrix, the ME principle consists of selecting the tree whose length (sum of its branch lengths) is minimal among all tree topologies for taxa. Comparative studies of treebuilding methods show that ME generally is an accurate criterion for selecting a true tree. Nei and Rzhetsky have shown that ME principle is statistically consistent when branch lengths are assigned by ordinary leastsquares (OLS) fitting [^{8}]. In the OLS framework, we simply minimize
∑i,j∈X(dij −δi,j)2, 
where δ_{ij }is an estimation of input d_{ij }and X is the set of taxa. In fact, the main goal is to find a tree whose induced metric is closer to d_{ij}. The LS was first introduced in [^{2}] and [^{3}].
Nearly 20 years have passed by since the landmark paper in Molecular Biology and Evolution introducing NJ [^{4}]. The method has become the most widely used method for building phylogenetic trees from distances. Steel and Gascuel showed that NJ is a greedy algorithm for ME principle [^{9}]. The NNet is a hybrid of NJ and split decomposition [^{10}]. It is applicable to data sets containing hundreds of taxa. The NNet is an algorithm for constructing phylogenetic networks.
Split decomposition, implemented in SplitsTree [^{11}], decomposes the distance matrix into simple components based on weighted splits. These splits are then represented using a special type of phylogenetic network called split network. The NNet works in a similar way: it first produces a circular ordering from distance matrix and then constructs a collection of weighted splits. Dan Levy and Lior Patcher showed that the NNet is a greedy algorithm for the traveling salesman problem that minimizes the balanced length of the split system at every step and it is optimal for circular distance matrices [^{12}]. Balanced minimum evolution (BME) is designed under the ME principle [^{13}]. The BME is a special version of the ME principle where tree length is estimated by the weighted least squares [^{13}].
In this work, we introduce MCNet algorithm (MonteCarlo Network algorithm) which works in a similar way: First, it finds a circular ordering for taxa, based on MonteCarlo with simulated annealing, it then extracts splits from the circular ordering and uses nonnegative least squares for weighting splits. We compare the results of the NNet and the MCNet for several data sets.
A split of a given set X of taxa is a bipartition of the set X into two nonempty subsets of X. A split is called trivial if one of the two subsets contains only one taxon. Let T be a nonempty tree. Let the leaves of the T are labeled by the set of taxa, X ={x_{1},...,x_{n}}. Every edge e of T defines a split S = AB, where A and B are two sets of taxa contained in the two components of T  e. For example, Figure 1 shows an eightleaf tree. Removing the edge e from the tree produces two sets of leaves
A={t3,t4,t5} and B={t1,t2,t6,t7,t8}. 
In an edgeweighted tree, the weight of each edge is assigned to its corresponding split. The Phyletic distance between any two taxa x and y in an edgeweighted tree is the sum of the weights of the edges along the path from x to y. Hence, the phyletic distance between x and y equals the sum of split weights for all those splits in which x and y belong to separate components.
Two different splits S_{1 }= A_{1}B_{1}, and S_{2 }= A_{2}B_{2}, are compatible, if one of the following conditions holds:
A1⊆A2,A1⊆B2,B1⊆A2 or B1⊆B2. 
A collection of splits is called compatible, if all possible pairing of splits are compatible. A compatible collection of splits is represented by a phylogenetic tree [^{14},^{15}]. Dress and Huson introduced SplitsTree to display more complex evolutionary patterns [^{16}]. For a set of incompatible splits, SplitsTree outputs the split network using bands of parallel edges.
Circular collection of splits is a mathematical generalization of compatible collections of splits. Formally, a collection of splits of X is circular if there exists an ordering x_{1},⋯,x_{n }of X such that every split is of the form {x_{i}, x_{i+1},⋯,x_{j}}X  {x_{i},⋯,x_{j}} for some i and j, 1 ≤ i ≤ j ≤ n. A Compatible collection of splits are always circular [^{10}]. On the other hand, the class of circular collection of splits contains the class of the collection of splits corresponding to a tree. Andreas Dress and Daniel Huson proved that circular collections of splits always have a planar splits graph representation [^{16}]. A distance matrix is circular (also called Kalmanson) if it is the phyletic distances for a circular collection of splits with positive weights. Because compatible splits are circular, treelike distances are circular too [^{6}].
As mentioned above, the ME principle consists of selecting a tree whose length is minimal. In fact, the ME principle is equivalent to finding a circular ordering σ = {x_{σ(1)},...,x_{σ(n)}} in order to find the minimum of the function η
(1)
η:∑→ℜη(σ)=d(xσ(1),xσ(n))+∑k=1n−1d(xσ(k),xσ(k+1)) 
Where Σ is the set of all circular orderings of taxa x_{1},...,x_{n}. We call function η the energy function, and any circular ordering that minimizes η is called the optimal circular ordering.
There are a number of different methods for constructing various kinds of phylogenetic networks. A phylogenetic network can be constructed from a collection of weighted splits. NNet uses circular ordering to construct a collection of weighted splits. Since finding an optimal circular ordering is an NPhard problem, so we introduce a heuristic algorithm based on the MonteCarlo method to find optimal circular ordering. The MCNet seeks to find an optimal circular ordering from the distance matrix and then extracts a collection of weighted splits based on that ordering.
In this section, a new algorithm called the MCNet, is presented to construct a set of weighted splits for taxa set X = {x_{1},...,x_{n}}with a given distance matrix. The MCNet consists of two steps. In the first step, we find a circular ordering. In the second step, the splits which are obtained from the circular ordering are weighted. The core of the first step contains two procedures, namely, INITIAL and the MonteCarlo. The INITIAL is a greedy algorithm to obtain a circular ordering, namely, the initial circular ordering. The INITIAL works in the following way:
Suppose x_{σ(1)},...,x_{σ(k) }are ordered and let x¯ be an element of S = X  {x_{σ(1)},...,x_{σ(k)}} such that
d(x¯,r)=min{d(x,r)x∈S,r∈{xσ(1),xσ(k)}}. 
If r = x_{σ(1)}, we consider the new ordering x¯,xσ(1),…,xσ(k). Otherwise the ordering xσ(1),…,xσ(k),x¯ is considered. This process stops when all taxa are ordered.
The second procedure, or the MonteCarlo procedure, relies on random iteration to find the optimal circular ordering. The MonteCarlo algorithm starts its movement from the initial circular ordering, σ_{0 }For each circular ordering σ, we define the neighborhood of σ, N (σ), by:
N(σ)={σ~∈∑∃k; 2≤k≤n−1; σ~={xσ(1),…,xσ(k−1),xσ(k+1),…,xσ(n),xσ(k)}}, 
where Σ is the set of all circular orderings.
We choose σ_{1 }∈ N (σ_{0}) randomly. if η (σ_{1}) ≤ η (σ_{0}), then the system moves into ordering σ_{1}. However we allow nongreedy movements for the system in order to avoid having the system trapped in local minima. In other words, if η (σ_{1}) >η (σ_{0}), then the system moves into ordering σ_{1 }with a small probability e−η(σ1)+η(σ0)T, where T is a constant temperature. For each temperature, these movements are carried out t times. To compute the minimum energy we allow this process to continue until the temperature drops to zero (see the appendix for more details). Pseudo code of the MonteCarlo algorithm is shown in Table 1. It is notticeable that the second procedure can start from any circular ordering other than the one obtained by the INITIAL procedure.
In the final step, we use the least squares algorithm to weight the splits of obtained circular ordering. Let A be the matrix with rows indexed by pairs of taxa and columns indexed by splits. Then for each pair of texa i and j and for each split k, A_{ij,k }is defined by:
Aij,k={1if i and j are on opposite of split k;0otherwise. 
The matrix A = [A_{ij,k}] is full rank [^{17}].
Let d = (d_{12}, d_{13},...,d_{(n1)n}) be an n(n  1)/2 dimensional vector corresponding to rows of A where d_{ij }is obtained by distance matrix. Let b be the weight vector of splits, then the phyletic distance vector is p = Ab.
Now, the ordinary least squares(OLS) is used to estimate b by the following standard formula
(2)
b=(A′A)−1A′d. 
If we discard splits with negative weights and leave the remaining splits unchanged, the weight of the remaining splits are often grossly overestimated. Similar to the NNet algorithm, we compute the optimal least square estimates with a nonnegative constraint. In this paper, we use the FNNLS algorithm [^{18}].
In this section, we compare the results of the MCNet and the NNet on some data sets. We use SplitsTree4 program [^{19}] for drawing phylogenetic networks. Due to the limitation of space, we insert only six figures in this article.
One of the data sets, a collection of 110 Salmonella MLST Data, was obtained from authors of the NNet. The other data sets presented as the examples in SplitsTree4 program (version 4.10): Its(46 taxa), Jsa (46 taxa), Mammals (30 taxa), Primates (12 taxa), Rubber (23 taxa), Dolphins (36 taxa) and Myosin (143 taxa).
There are two parameters, T_{low }and cooling coefficient, in the MonteCarlo procedure. We first adjust T_{low }between 10^{5 }and 0.2 to obtain the best cooling coefficient. The value of energy function and running time of algorithm for each T_{low }for JSA data are given in Figure 2 (for the other data sets, the figures are the same as JSA). According to Figure 2, when cooling coefficient is 0.95, running time of the algorithm compared to other coefficients increases considerably. On the other hand, the value of energy function for 0.95 or 0.9 as a cooling coefficient is significantly better than the other cooling coefficients. Hence, we conclude that the best value of energy function with respect to running time of the algorithm is achieved when cooling coefficient is 0.9 and T_{low }< 10^{3}.
The initial test for performance of our method is done by calculating the value of energy function for circular orderings obtained by the MCNet and the NNet (Table 2). The first two rows of Table 2 show that in all data sets except Salmonella, the value of energy function for the MCNet is less than those obtained from the NNet. The interesting feature of the MCNet algorithm is in finding different circular orderings by changing initial ordering. So, the MCNet algorithm could take the circular ordering obtained by the NNet as initial ordering. The third row of Table 2 shows the values of energy function for circular orderings achieved by the MCNet with the circular ordering obtained by the NNet as an initial ordering. For four data sets, Its, Rubber, Salmonella, Myosin, the third row indicates better results than the first row. But for the other data sets, the conclusions mentioned above are the vice versa.
Another test for the performance of our method is comparing the number of splits obtained by both the algorithms. In Table 3, the number of splits of circular orderings obtained by the MCNet and the NNet on different data sets are shown. In all data sets the number of splits obtained by the MCNet is less than the NNet except Primates. In this case, these two numbers are equal.
Let d be the input distance vector and P and P' are the phyletic distance vector of weighted splits obtained by the MCNet and the NNet, respectively. In Table 4, the value of norm of P  d and P'  d for each data set are shown. The norm of P  d is less than P'  d in all data sets even in Primates. It means that the results of the MCNet algorithm give better approximation for input distance vector.
To illustrate difference between two algorithms, we present some examples of networks obtained by both the MCNet and the NNet using SplitsTree4 (Figures 3,4,5,6, 7 and 8). It is obvious that both algorithms give the same classification of taxa and exhibit the same major splits. For example, in Figures 5 and 6, we highlight some edges such that by removing the samecolored edges, the same clustering of taxa is obtained. But according to what we see in Tables 3 and 4, split networks obtained by the MCNet are less complicated than split networks obtained by the NNet. It means that the networks obtained by the MCNet have less noise than the networks obtained by the NNet. According to Corollary 1 (see Appendix), when t approaches to 1, the MCNet finds optimal circular ordering with the probability 1. We examined our algorithm on several treelike distance matrices and it returned corresponding trees quickly. The MCNet has been implemented in Matlab and is available for download at http://bioinf.cs.ipm.ac.ir/softwares/mc.net.
In this work, we propose an algorithm, MCNet, which is a distance based method for constructing phylogenetic networks. The MCNet scales well and can quickly produce detailed and informative networks for large number of taxa. We compare the performance of the MCNet with the NNet on eight different data sets. We have shown (Tables 2, 3 and 4) that the MCNet performs better than the NNet for almost test cases and the networks obtained by the MCNet are simpler than the NNet with the same major splits. The NNet is a part of SplitsTree program. So, the results of the MCNet could be used in SplitsTree program too.
CE, RH and EM performed initial studies. MH designed the algorithm. RH and EM analysis the data sets. All authors participated in the writing of the manuscript. All authors read and approved the final manuscript.
Let S = {E_{1},...,E_{s}} be a finite set of states, and consider a physical process having these discrete states at time t. A Markov chain is a stochastic model of this system, such that the state of system at time t + 1 depends only on the state of system at time t.
Consider X_{0}, X_{1},..., be a collection of Markov random variables, such that X_{n }is the state of the system at time n. Let p_{ij }be the probability that the system enters into the state E_{j }from the state E_{i}, where i, j ∈ {1,...,s} The matrix P = (p_{ij})_{1≤i, j≤s }is called transition matrix. A probability distribution q = (q_{1},...,q_{s}) such that q_{i }is the probability that system starts its movement from the state E_{i }, is called initial probability distribution. A Markov chain is a stochastic model X_{0}, X_{1},..., such that X_{t }is the state of the system at time t. For each i and j in {1,2,...,s};
prob(X0=Ei)=qi,prob(Xt+1=EjXt=Ei)=pij. 
The Markov chain is irreducible, if for all i, j ∈ {1,...,s} there exists n > 0 such that pij(n)>0, where
∀α pij(n)=prob(Xn+α=EjXα=Ei). 
In other words, the Markov chain is irreducible, if there exist n such that the probability that the system enters into the state E_{j }from the state E_{i }after n times is positive. The irreducible Markov chain is called aperiodic, if for some n ≥ 0 and some state E_{j},
prob(Xn=EjX0=Ej)>0&prob(Xn+1=EjX0=Ej)>0. 
If the Markov chain is irreducible and aperiodic then
limt→∞prob(Xt=Ej)=πj j=1,…,s 
such that π = (π_{1},...,π_{s}) is a unique probability distribution and πj=∑i=1sπipij.
The probability distribution is π is called stationary probability of the Markov chain.
It means that if P is the transition matrix and P^{(t) }is the t^{th }power of P, when t → ∞ the j^{th }column of transition matrix is approximately equal to π_{j}. In the MonteCarlo algorithm, a special kind of Markov chain is used. Let Σ be the finite set of states and q=(1Σ,…,1Σ) is the initial probability distribution. For each state i the neighborhood of i, N(i), is defined as the set of all the states that are reachable from i by one movement. In this system the set of neighborhoods have to satisfy the following properties:
1. i, ∉ N(i).
2. i ∈ N(j) ⇔ j ∈ N(i)
3. if i ≠ j, then there exit i_{1},i_{2},...,i_{1 }∈ Σ such that
i∈N(i1),i1∈N(i2),…,il∈N(j). 
The matrix PT=(pijT)i,j∈Σ is defined as the transition matrix by
pijT={1N(i)if j∈N(i) and η(j)≤η(i),e−(η(j)−η(i))/TN(i)if j∈N(i) and η(j)>η(i),1−∑k∈Σ,k≠ipikTif i=j,0otherwise, 
where T is a positive constant number (constant temperature). The third property of the neighborhood shows that this Markov chain is irreducible. Also, if PiiT>0 and P^{T }contains nonnegative entries then (PT)ii(t)>0 for all t ≥ 0. So, it is a finite, aperiodic and irreducible Markov chain. The theorem 1 shows that for each constant temperature T and i ∈ Σ, there exists a stationary probability distribution πiT such that:
limt→∞prob(Xt=i)=πiT, 
Where πiT=e−η(i)T∑j∈∑e−η(j)T (see page 45 in [^{20}]).
Proposition 1. Let (πiT)i∈Σ be a probability distribution such that:
πiT=e−η(i)T∑j∈∑e−η(j)T 
and suppose that m_{0 }= min{η(i)  i ∈ Σ} and, η_{0 }= {i ∈ Σ  η(i) = m_{0}} then for each i∈Σ,limT→0+πiT=πi0, where
πi0={1η0if i∈η0;0otherwise. 
Proof: The proof is presented in [^{20}] (claim 2.8 and claim 2.9).
Corollary 1. Let Σ be the finite set of states, then for each i ∈ Σ we have
limT→0+limt→∞prob(Xt=i)=πi0. 
The corollary 1 illustrates that by cooling temperature (T → 0^{+}), system enters into one of the states of η_{0 }with the probability 1 after t (t → ∞) time. In this article, we define the set of all circular orderings of taxa as the finite set of states. Our definition of neighborhood in the MCNet satisfies in three properties of neighborhood and every elements of η_{0 }is an optimal circular ordering. Therefore, the MCNet yields a circular ordering with approximately minimal energy function.
We are grateful to the faculty of mathematics of Shahid Beheshti University. This work is supported in part by IPM(cs138502). The authors would like to thank Prof. Hamid Pezeshk for many useful comments.
References
Kidd KK,SgamarellaZonta LA,Phylogenetic analysis: concepts and methodsAm J Human GeneticsYear: 197123235252  
CavalliSforza LL,Edwards AWF,Phylogenetic analysis: models and estimating proceduresAm J Hum GenetYear: 196719233257 (1967). 6026583  
Fitch WM,Margoliash E,Construction of phylogenetic treesScienceYear: 196715527928410.1126/science.155.3760.2795334057  
Saitou N,Nei N,The neighbor joining method: a new method for reconstructing phylogenetic treesMolecular Biology and EvolutionYear: 198744064253447015  
Sattath S,Tversky A,Phylogenetic similarity treesPsychometrikaYear: 19774231934510.1007/BF02293654  
Bryant D,Moulton V,NeighborNet: An agglomerative method for the construction of planar phylogenetic networksMolecular Biology And EvolutionYear: 20042125526510.1093/molbev/msh01814660700  
Grunewald S,Forslund K,Dress A,Moulton V,QNet: An agglomerative method for the construction of phylogenetic networks from weighted quartetsMolecular Biology and EvolutionYear: 20072453253810.1093/molbev/msl18017119010  
Rzhetsky A,Nei M,Theoretical foundation of the minimumevolution method of phylogenetic inferenceMol Biol EvolYear: 199310107310958412650  
Gascuel O,Steel M,Neighborjoining revealedMolecular Biology and EvolutionYear: 2006231997200010.1093/molbev/msl07216877499  
Bandelt HJ,Dress AWM,Split decomposition: A new and useful approach to phylogenetic analysis of distance dataMol Phyl EvolYear: 1992124225210.1016/10557903(92)900218  
Huson DH,SplitsTree: A program for analyzing and visualizing evolutionary dataBioinformaticsYear: 19981410687310.1093/bioinformatics/14.1.689520503  
Levy D,Patcher L,The NeighborNet AlgorithmAdvances in Applied Mathematics in press . 20161590  
Desper R,Gascuel O,Gascuel OThe MinimumEvolution Distance Based Approach to Phylogenetic InferenceMath Evolution and PhylogenyYear: 2005Oxford Univ. Press  
Semple C,Steel M,Cyclic permutations and evolutionary treesAdv Appl MathYear: 200432466968010.1016/S01968858(03)000988  
Semple C,Steel M,PhylogeneticsYear: 2003Oxford, UK: Oxford University Press  
Dress A,Huson DH,Constructing splits graphsIEEE/ACM Transactions in Computational Biology and BioinformaticsYear: 2004110911510.1109/TCBB.2004.27  
Bandelt HJ,Dress A,A canonical decomposition theory for metrics on a finite setAdv MathYear: 1992924710510.1016/00018708(92)90061O  
Bro R,Jong SD,A Fast Nonnegativityconstrained Least Squares AlgorithmJournal of ChemometricsYear: 199711539340110.1002/(SICI)1099128X(199709/10)11:5<393::AIDCEM483>3.0.CO;2L  
Huson D,Bryant D,Application of phylogenetic networks in evolutionary studiesMolecular Biology and EvolutionYear: 20052325426710.1093/molbev/msj03016221896  
Clote P,Backofen R,Computational molecular biologyYear: 2000New York, WILEY 
Figures
[Figure ID: F1] 
Figure 1
The split S = AB is obtained by removing the edge e of T. 
[Figure ID: F2] 
Figure 2
The value of energy function (b) and running time of algorithm (a) for each T_{low }for JSA data. 
[Figure ID: F3] 
Figure 3
The NNet network for the Rubber data set. 
[Figure ID: F4] 
Figure 4
The MCNet network for the Rubber data set. 
[Figure ID: F5] 
Figure 5
The NNet network for the Mammal data set. 
[Figure ID: F6] 
Figure 6
The MCNet network for the Mammal data set. 
[Figure ID: F7] 
Figure 7
The NNet network for the Salmonella data set. Group A includes the isolates Sty54, Sty54*, Sty2, She9, Sty87, Snp40*, Sty13, Snp41*, Sen5, Sha160, Sha141, Sty20*, Sha58, Sse18, Sha71, Sty31. Group B includes the isolates Sty61, Sha148, Smb17, Sag75, Sha124. Group C includes the isolates UND3, Sha150, Sha173, Sen23*, Sha153, Sha140, San96, Sen30*, Sen24*, Sha138, Sha176, Sha130, Sha164, Sha157, Sen29*, Sca93, Sha122, Sht20, Sha186. Group D includes the isolates She3, Sha50, Sse95, Sha56, Sen24, Sen34, Sha177, Sty13*, Swo44, Sty86, Ste41, Sha77, UND80. Group E includes the isolates Ssc40, Sse28, Sty89, Sty15*, Ske69, UND110, Sha49, Sen4, Sha48, Sha165, Sty92, Snp33*, Sty52, UND109, Sha131, Sha102, Sty6, Sha175. 
[Figure ID: F8] 
Figure 8
The MCNet network for the Salmonella data set. Group A includes the isolates Sty54, Sty54*, Sty2, She9, Sty87, Snp40*, Sty13, Snp41*, Sen5, Sha160, Sha141, Sty20*, Sha58, Sse18, Sha71, Sty31. Group B includes the isolates Sty61, Sha148, Smb17, Sag75, Sha124. Group C includes the isolates UND3, Sha150, Sha173, Sen23*, Sha153, Sha140, San96, Sen30*, Sen24*, Sha138, Sha176, Sha130, Sha164, Sha157, Sen29*, Sca93, Sha122, Sht20, Sha186. Group D includes the isolates She3, Sha50, Sse95, Sha56, Sen24, Sen34, Sha177, Sty13*, Swo44, Sty86, Ste41, Sha77, UND80. Group E includes the isolates Ssc40, Sse28, Sty89, Sty15*, Ske69, UND110, Sha49, Sen4, Sha48, Sha165, Sty92, Snp33*, Sty52, UND109, Sha131, Sha102, Sty6, Sha175. 
Tables
Pseudo code of the MonteCarlo algorithm with simulated annealing
Input: T initial temperature 
σ_{0 }initial ordering 
T_{low }low temperature 
t constant number 

σ =σ_{0} 
While T > T_{low} 
Repeat t time 
choose random σ~∈N(σ) 
If η(σ~)≤η(σ) 
σ=σ~ 
Else 
x = random(0, 1) 
If x<e−η(σ~)+(σ)T 
σ=σ~ 
T = T * 0.9 
Return σ and η(σ) 
Values of energy function: the values of energy function for circular orderings obtained by the NNet, the MCNet and the MCNet with initial ordering of the NNet.
Data set  Its  Jsa  Mammals  Primates 

NNet  0.4096  0.2808  4.4275  2.1465 
MCNet  0.4079  0.2728  4.4172  2.1410 
start NNet  0.3979  0.2767  4.4202  2.1410 




Data set  Rubber  Dolphins  Salmonella  Myosin 


NNet  0.7723  2.2  0.2546  43.8199 
MCNet  0.7596  2.1667  0.2575  43.8019 
start NNet  0.7547  2.2  0.2515  43.6935 
The number of splits obtained by the MCNet and the NNet for all data sets.
Data set  Its  Jsa  Mammals  Primates 

NNet  110  83  103  34 
MCNet  105  78  99  34 




Data set  Rubber  Dolphins  Salmonella  Myosin 


NNet  55  67  107  520 
MCNet  53  62  90  507 
The value of norm for all data sets.
Data set  Its  Jsa  Mammals  Primates 

NNet  0.0444  0.0329  0.0717  0.0385 
MCNet  0.0358  0.0292  0.0648  0.0358 




Data set  Rubber  Dolphins  Salmonella  Myosin 


NNet  0.0362  0.1068  0.0487  0.0291 
MCNet  0.0316  0.1019  0.0405  0.0207 
Article Categories:

Previous Document: Kafka, paranoic doubles and the brain: hypnagogic vs. hyperreflexive models of disrupted self in ne...
Next Document: Risk factors and correlates for anemia in HIV treatmentnaïve infected patients: a crosssectional ...