********************************************************************************
********************************************************************************
*****     IQPNNI: Moving Fast Through Tree Space and Stopping in Time      *****
********************************************************************************
********************************************************************************

Copyright (C) 2004, John von Neumann Institute for Computing (NIC),
Forschungszentrum Juelich, Germany and Bioinformatics Institute,
Heinrich Heine University Duesseldorf, Germany.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation.

================================================================================

The method is described in detail in the following article:

Le Sy Vinh and Arndt von Haeseler, IQPNNI: Moving fast through tree space and
stopping in time, Mol. Biol. Evol. 21(8):1565-1571, 2004.
http://dx.doi.org/10.1093/molbev/msh176

Main Contributors

Le Sy Vinh
NIC-Forschungszentrum Juelich, Germany
vinh(AT)cs.uni-duesseldorf.de

Arndt von Haeseler
NIC-Forschungszentrum Juelich, Germany and Bioinformatics Institute,
Heinrich Heine University Duesseldorf, Germany
haeseler(AT)cs.uni-duesseldorf.de

Heiko A. Schmidt (technical contributions and testing)
NIC-Forschungszentrum Juelich, Germany
hschmidt(AT)cs.uni-duesseldorf.de

================================================================================

NEW FEATURES: IQPNNI version 2.0 includes three new features:

1. General Time Reversible model of evolution
2. Site-specific substitution rates
3. Checkpointing: if the program crashes or is stopped by the user, it can
   continue from the last checkpoint.

IQPNNI is a computer program to reconstruct the evolutionary relationships
among contemporary species.
It is a menu-driven program that allows users to specify the parameter values
or let the program estimate them from the input data (a nucleotide or amino
acid alignment in PHYLIP format). The options are classified into four main
groups: general options, IQP options, substitution process options, and rate
heterogeneity options.

GENERAL OPTIONS
 o  Display as outgroup?                  T0
 n  Number of iterations?                 44
 s  Stopping rule (if applicable)?        Yes
IQP OPTIONS
 p  Probability of deleting a sequence?   0.5
 k  Number representatives?               4
SUBSTITUTION PROCESS
 d  Type of sequence input data?          Nucleotides
 m  Model of substitution?                HKY85 (Hasegawa et al. 1985)
 t  Ts/Tv ratio (0.5 for JC69)?           Estimate from data
 f  Base frequencies?                     Estimate from data
RATE HETEROGENEITY
 w  Model of rate heterogeneity?          Uniform rate

GENERAL OPTIONS

The option 'o': Users can specify a sequence as the outgroup sequence. The
final tree with the highest likelihood will be rooted with respect to the
outgroup sequence.

The option 'n': Users can specify the number of iterations or use the default
value.

The option 's': Users can choose one of four possibilities to stop the program.

1. The first possibility is "s Stopping rule (if applicable)? Yes".
   The program stops and outputs the optimal tree with 95% confidence if at
   least three better trees were found during the search; otherwise it stops
   after 'n' iterations.
2. The second possibility is
   "s Stopping rule (if applicable)? Yes, but at least 'n' iterations".
   It is similar to the first possibility, but the program runs at least 'n'
   iterations.
3. The third possibility is
   "s Stopping rule (if applicable)? Yes, but at most 'n' iterations".
   It is similar to the first possibility, but the program runs at most 'n'
   iterations.
4. The last possibility is "s Stopping rule? No".
   The program stops after 'n' iterations.

IQP OPTIONS

The option 'p': Users can specify the probability of deleting a sequence or let
the program estimate it from the input data.
Note that when the sequences are very long, users should increase the value of
p and try several runs with different choices of p.

The option 'k': Users can specify the number of representative leaves for a
rooted tree. However, we strongly recommend using the default value.

THE SUBSTITUTION PROCESS

If the input data are nucleotides, the program can work with the Jukes-Cantor
69, Kimura 80, Felsenstein 81, HKY85, TN93, and General Time Reversible models
of evolution. For amino acids, the following models are available:
Dayhoff 1978, Def. JTT 1992, VT 2000, mtREV 1996, BLOSUM62 1992, WAG 2000.

The option 'd': Users must specify the type of sequence input data:
1. Nucleotides or
2. Amino acids.

The option 'f': Users can specify the base frequencies or let the program
estimate them from the input data.

The option 't': If the HKY85 or TN93 model is chosen, users can specify the
transition/transversion ratio or let the program estimate it from the input
data. If 't' is estimated, the ts/tv value is searched between 0.2 and 32.0.

The option 'u': If the TN93 model is chosen, users can specify the
pyrimidine/purine ratio or let the program estimate it from the input data.
If 'u' is estimated, the py/pu value is searched between 0.2 and 32.0.

The option 'g': If the General Time Reversible model is chosen, users can
specify six different rate parameters:
1. Transversion rate from A to C,
2. Transition rate from A to G,
3. Transversion rate from A to T,
4. Transversion rate from C to G,
5. Transition rate from C to T,
6. Transversion rate from G to T,
or let the program estimate them from the input data.

RATE HETEROGENEITY

The program can also work under a rate heterogeneity assumption. Users can
choose a uniform rate over all sites (rate homogeneity), site-specific
substitution rates (Sonja Meyer and Arndt von Haeseler, Identifying
Site-Specific Substitution Rates, Mol. Biol. Evol. 20(2), 2003), or Gamma
distributed rates.
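
To illustrate what Gamma distributed rates with a fixed number of categories
mean in practice, here is a minimal Python sketch. It is not taken from the
IQPNNI source code. It assumes the widely used discrete Gamma approximation in
its median variant: a Gamma distribution with shape alpha and mean 1 is cut
into equally probable categories, each category is represented by its median,
and the rates are rescaled so that their average is 1. The function names
(gamma_cdf, gamma_quantile, discrete_gamma_rates) are hypothetical, and the
2 to 32 category limit simply mirrors the range stated in this document.

```python
import math


def gamma_cdf(shape, x):
    """Regularized lower incomplete gamma function P(shape, x), power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= x / (shape + n)  # next term: x^n / (a (a+1) ... (a+n))
        total += term
        n += 1
    log_prefactor = shape * math.log(x) - x - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))


def gamma_quantile(shape, p):
    """Rate r such that a Gamma(shape, mean 1) variable is <= r with prob. p."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(shape, shape * hi) < p:   # grow the bracket until it holds p
        hi *= 2.0
    for _ in range(200):                      # plain bisection
        mid = 0.5 * (lo + hi)
        if gamma_cdf(shape, shape * mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrete_gamma_rates(alpha, categories):
    """Median-based discrete Gamma rates, rescaled to have mean 1."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if not 2 <= categories <= 32:             # range quoted in this README
        raise ValueError("number of categories must be between 2 and 32")
    medians = [
        gamma_quantile(alpha, (2 * k + 1) / (2.0 * categories))
        for k in range(categories)
    ]
    mean = sum(medians) / categories
    return [m / mean for m in medians]


if __name__ == "__main__":
    for rate in discrete_gamma_rates(0.5, 4):
        print(f"{rate:.4f}")
```

A small alpha gives strongly unequal rates (most sites evolve slowly and a few
evolve fast), while a large alpha gives rates close to 1, which approaches the
uniform rate model.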
The option 'a': If Gamma distributed rates are chosen, users can specify the
Gamma distribution parameter alpha or let the program estimate it from the
input data. If alpha is estimated, its value is searched between 0.1 and 100.0.

The option 'c': If Gamma distributed rates are chosen, users must specify the
number of Gamma rate categories, between 2 and 32.

********************************************************************************
* This program is distributed in the hope that it will be useful, but          *
* WITHOUT ANY WARRANTY; without even the implied warranty of                   *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU             *
* General Public License for more details.                                     *
********************************************************************************