
PDA - Phylogenetic Diversity Algorithm version 0.1
--------------------------------------------------

Copyright (C) 2006 Bui Quang Minh, Steffen Klaere, Arndt von Haeseler 

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or   
(at your option) any later version. 

===========================================================================

DESCRIPTION:
------------

This software implements two efficient algorithms to solve the 
Phylogenetic Diversity (PD) problem as formulated by Faith (1992) as 
follows: Given a phylogenetic tree of n species (either bifurcating
or multifurcating), find a subset of k species whose subtree has maximal 
sum of branch lengths (PD score). Our algorithms, gPDA and pPDA, use a
greedy and a pruning strategy to solve this problem in time 
O(n log k) and O(n + (n-k) log(n-k)), respectively. 
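
To illustrate the problem statement above, the following Python sketch
computes the PD score of a taxon subset on a small unrooted tree and then
selects k taxa with a naive greedy strategy. It is not the gPDA
implementation and does not reach the O(n log k) running time. The toy tree
EDGES and the function names build_adjacency, pd_score and greedy_pd are
hypothetical and exist only for this example. The sketch assumes that
2 <= k <= number of leaves.

```python
from itertools import combinations

# Hypothetical toy tree as an undirected edge list: (node, node, branch length).
EDGES = [
    ("A", "x", 1.0),
    ("B", "x", 2.0),
    ("x", "y", 1.5),
    ("C", "y", 3.0),
    ("D", "y", 0.5),
]


def build_adjacency(edges):
    """Turn the edge list into a dict: node -> list of (neighbour, length)."""
    adj = {}
    for u, v, length in edges:
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))
    return adj


def pd_score(adj, taxa):
    """Sum of branch lengths of the minimal subtree connecting `taxa`."""
    taxa = set(taxa)
    if len(taxa) < 2:
        return 0.0
    start = next(iter(taxa))
    total = 0.0

    def visit(node, parent):
        # Returns True if the part of the tree behind `node` (seen from
        # `parent`) contains a selected taxon; such edges are counted.
        nonlocal total
        found = node in taxa
        for neighbour, length in adj[node]:
            if neighbour == parent:
                continue
            if visit(neighbour, node):
                total += length
                found = True
        return found

    visit(start, None)
    return total


def greedy_pd(adj, k):
    """Naive greedy selection: start with the most distant pair of leaves,
    then repeatedly add the leaf that increases the PD score the most."""
    leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
    chosen = set(max(combinations(leaves, 2),
                     key=lambda pair: pd_score(adj, pair)))
    while len(chosen) < k:
        best = max((leaf for leaf in leaves if leaf not in chosen),
                   key=lambda leaf: pd_score(adj, chosen | {leaf}))
        chosen.add(best)
    return chosen, pd_score(adj, chosen)


adj = build_adjacency(EDGES)
chosen, score = greedy_pd(adj, 3)
print(sorted(chosen), score)  # prints ['A', 'B', 'C'] 7.5
```

On the toy tree, the most distant pair is B and C (path length 6.5); adding A
contributes a further branch of length 1.0, whereas adding D would only
contribute 0.5.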

===========================================================================

METHOD:
-------

The method is described in detail in the following paper:

Minh B.Q., Klaere S. and von Haeseler A. (2006): Phylogenetic Diversity
within Seconds. Syst. Biol., in press.

===========================================================================

COMMAND-LINE OPTIONS:
---------------------

Usage: ./pda <user_tree> [OPTIONS]
OPTIONS:
  -h                Display the help screen.
  <user_tree>       User tree file in NEWICK format.
  -k <num_leaves>   Number of leaves to be preserved.
  -r <num_leaves>   Create a random tree under the Yule-Harding model and
                    write it to <user_tree>.
  -ru <num_leaves>  Create a random tree under the Uniform model and
                    write it to <user_tree>.

  -g, --greedy      Only run the greedy algorithm.
  -p, --pruning     Only run the pruning algorithm.
  -b, --both        Run both algorithms.
    NOTE that by default, the program automatically chooses the better of
    the two algorithms.
  -e <file>         File containing weights of taxa.
  -i <file>         File containing taxa to be included into PD-tree.

Example usages:
---------------
./pda test.tree -k 10
Infer the maximal PD-tree of 10 taxa from the tree in test.tree (in NEWICK
format). The program chooses between the gPDA and pPDA algorithms
automatically. The resulting tree is written to test.tree.10.pdtree.

./pda test.tree -k 10 -g
Same as above, but only apply the gPDA algorithm.

./pda test.tree -k 10 -b
Run both algorithms. Resulting trees will be written into 
test.tree.10.greedy and test.tree.10.pruning.

./pda test.tree -k 10 -e test.pam
Read the weight information from the file test.pam (more detail below) and
integrate it into the tree in test.tree. Then run the program as in
the first example command.

./pda test.tree -k 10 -i test.taxa
Include the "favourite" taxa listed in test.taxa (more detail below) into
the final PD-set. 

./pda test.tree -k 10 -e test.pam -i test.taxa
Combine the features of the two preceding example commands.

./pda 1000.tree -r 1000
Generate a random tree with 1000 taxa under the Yule-Harding model and
write it to the file 1000.tree in NEWICK format.

<user_tree> option:
-------------------
The <user_tree> file is the input tree file if you run the algorithm
with -k <num_leaves>. If you set -r[u] <num_leaves> instead, the program
generates a random tree and writes it to the <user_tree> file.

More information on NEWICK tree format can be found at
http://evolution.genetics.washington.edu/phylip/newicktree.html.

-e <file> option:
-----------------
The <file> containing the weights of taxa must be in the following
format:
	1. The first line is a coefficient by which every branch length is
	multiplied. 
	2. Each remaining line contains a taxon name and its weight, the
	"importance" of that taxon.
Any taxa that are not listed in the parameter file are assigned a
weight of ZERO. If you prefer some taxa, you can give them a positive
weight. Assign a very high weight to your "favourite" taxa if you want
to include them in your final optimal PD set.

Please note that the additional parameters will be incorporated into the 
resulting PD tree, i.e., the final tree will also reflect the coefficient
and weights in its branch lengths.

More information on those additional parameters can be found in Steel (2005).
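
As an illustration of the weight-file format described in this section, the
following Python sketch reads such a file from a string. It is not part of
PDA. It assumes that the taxon name and its weight are separated by
whitespace and that blank lines can be ignored; neither detail is specified
above. The sample content EXAMPLE_WEIGHTS and the function names
parse_weight_file and taxon_weight are hypothetical.

```python
# Hypothetical sample content of a weight file such as test.pam.
EXAMPLE_WEIGHTS = """\
2.0
Human 10
Mouse 0.5
"""


def parse_weight_file(text):
    """Return (coefficient, {taxon name: weight}) from weight-file text.

    Assumption: name and weight are whitespace-separated; blank lines
    are skipped.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    coefficient = float(lines[0])      # first line: branch-length coefficient
    weights = {}
    for line in lines[1:]:             # remaining lines: "<name> <weight>"
        name, weight = line.split()
        weights[name] = float(weight)
    return coefficient, weights


def taxon_weight(weights, name):
    """Taxa that are not listed get a weight of zero, as stated above."""
    return weights.get(name, 0.0)


coefficient, weights = parse_weight_file(EXAMPLE_WEIGHTS)
print(coefficient, weights, taxon_weight(weights, "Rat"))
```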

-i <file> option:
-----------------
This file lists the names of all taxa that you want to include in your
final PD-set. The names are simply separated by blank(s) or new lines.
NOTE that every name must correspond to a taxon in the user tree file;
otherwise an error will be displayed.
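
The format of the taxa file can be illustrated with a short Python sketch
that splits the file content on blanks and new lines and checks every name
against the leaf names of the tree. It is not part of PDA and assumes that
taxon names themselves contain no whitespace. The function names
parse_taxa_file and check_taxa and the sample names are hypothetical.

```python
def parse_taxa_file(text):
    """Split the file content on any run of blanks or new lines."""
    return text.split()


def check_taxa(names, tree_leaves):
    """Raise an error if a requested taxon is not a leaf of the user tree."""
    missing = [name for name in names if name not in tree_leaves]
    if missing:
        raise ValueError("taxa not found in tree: " + ", ".join(missing))
    return names


# Hypothetical content of a file such as test.taxa and hypothetical leaves.
included = parse_taxa_file("Human Mouse\nRat\n")
check_taxa(included, {"Human", "Mouse", "Rat", "Chicken"})
print(included)  # prints ['Human', 'Mouse', 'Rat']
```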

===========================================================================

OUTPUTS:
--------

Resulting trees are written into:

* If you specify -b or --both:
    <user_tree>.<k>.greedy for greedy algorithm, and
    <user_tree>.<k>.pruning for pruning algorithm.
* Otherwise:
    <user_tree>.<k>.pdtree.
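
The naming scheme above can be written as a small Python helper, for
example to locate the result files from a script. The function name
output_files is hypothetical and not part of PDA.

```python
def output_files(user_tree, k, both=False):
    """Return the names of the result files following the scheme above."""
    if both:
        return [f"{user_tree}.{k}.greedy", f"{user_tree}.{k}.pruning"]
    return [f"{user_tree}.{k}.pdtree"]


print(output_files("test.tree", 10))             # prints ['test.tree.10.pdtree']
print(output_files("test.tree", 10, both=True))  # prints ['test.tree.10.greedy', 'test.tree.10.pruning']
```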


If you choose the option to generate a random tree, the tree is written to
the <user_tree> file.

===========================================================================

INSTALLATION:
-------------

    To build PDA from the sources you need a working C++ compiler
    (one is usually installed on UNIX/Linux systems; on Windows you might
    want to obtain Cygwin, and on Mac OS X Xcode). 
    Then you can follow the procedure below:

    1) Download the current version of the software (pda-XXX.tar.gz where 
       XXX is the current version number) from its 
       web page <http://www.cibiv.at/software/pda>.
    2) Extract the files (e.g., with 'tar xvzf pda-XXX.tar.gz' under Unix).
       This should create a directory pda-XXX.
    3) Change into this directory.
    4) To compile the program, type the following:

         ./configure

       This should configure the package for the build. You might also 
       want to refer to the INSTALL file for more (general) details.

         make

       This compiles and builds the executable 'pda'
       (or 'pda.exe' on Windows systems), which can be found in the 'src'
       directory. You can copy this executable to a directory in your
       system's search path so that your system finds it, or install it
       to the default destination (e.g., /usr/local/bin on UNIX/Linux) using 

         make install

    If you encounter problems, please ask your local administrator for help.


*****************************************************************************
*    This program is distributed in the hope that it will be useful, but    *
*    WITHOUT ANY WARRANTY; without even the implied warranty of             *
*    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU       *
*    General Public License for more details.                               *
*****************************************************************************
