The CIBIV wants to understand the
            processes that have shaped the genomes of contemporary species.
            To this end we apply methods from statistics, computer science,
            mathematics, and computational statistics to develop models that
            mimic the process of evolution.
            These methods are further investigated in close collaboration with
            "wet" biologists to address real biological questions.
            
            
            
Currently we are working (in collaboration with various colleagues) on
            the following aspects of molecular evolution:
            
            
            
            
                
                - Alignments
                    
 Statistics of sequence alignment (e.g. mcmcalgn).
                    Recently we have extended this approach to reconstruct an alignment and
                    a phylogenetic tree simultaneously.
                - Sequence evolution
                    
 To understand sequence evolution it is necessary to model the substitution
                    process. We are working on models of sequence evolution that allow dependencies among
                    sequence sites (Markov fields seem to be an appropriate tool).
                    We are developing test statistics to select the "best" model and to detect
                    groups of sequences that evolve differently from the rest of, say, a gene family.
                    We have developed a test to detect change points (branches where the
                    substitution model changes) in a phylogenetic tree.
                    Currently we are working on methods to detect the dependency structure
                    among sequence positions in an alignment.
                - Gene trees
                    
 We develop efficient heuristic algorithms to reconstruct trees based
                    on sequence data (e.g. TREE-PUZZLE).
                    To this end we have developed a parallel version of the TREE-PUZZLE program.
                    Moreover, we are currently developing a variant of TREE-PUZZLE
                    that computes (maximum) likelihood trees for up to 1,000 sequences
                    in reasonable time. We are also working on super tree methods to
                    merge different gene trees to form one species tree.
                    Quartet-based tree reconstruction methods appear to be a versatile tool
                    to study super trees from a new perspective.
                - Population genetics
                    
 Gene trees also arise naturally in populations. Here, however,
                    the gene tree of a sample of sequences drawn from the population
                    is a random variable. We are interested in the development and
                    application of coalescence-based methods to infer the demographic history
                    of populations. In the future we plan to work on coalescence processes with
                    complex interaction patterns. In this context we have constructed the
                    so-called hvrbase, where most
                    of the hypervariable regions of the mitochondrial genome of primates
                    are currently collected in a multiple sequence alignment. This user-friendly database
                    is currently being extended to store other genomic regions.
                - Complex patterns of evolution
                    
 To reconstruct evolutionary history it is necessary to take more
                    complex events such as lateral gene transfer (between species),
                    gene duplication, and gene loss into account. A combination of these
                    events may disturb the relation between species trees and gene trees.
                    Recently, we have developed a maximum-likelihood-based method to estimate
                    the amount of gene flow among prokaryotes by analyzing the COG database.
                    This full-genome analysis poses a collection of new computational problems
                    as well as modeling problems. Our "Jukes-Cantor" type of model of gene
                    transfer needs refinement. Moreover, we have to take gene
                    duplications and losses into account. This will be done in the near future.
                - Species tree
                
 The topics outlined above will eventually be employed to reconstruct
                    one gigantic species tree utilizing all the sequence data available for
                    the different species. Models of sequence evolution are necessary to
                    detect differently evolving regions in complete genomes. Tree
                    reconstruction methods for a large number of sequences allow the
                    reconstruction of gene trees with several hundred sequences, and finally
                    the patchiness of the available sequence data for different species
                    makes it necessary to apply super tree methods. A better understanding 
                    of complex evolutionary patterns will also reveal instances where the gene 
                    trees are different from the species tree. Once this is well understood 
                    it seems reasonable to construct a sequence-based tree of life.
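
As a small illustration of the substitution models referred to above, the sketch below implements the textbook Jukes-Cantor (JC69) model for nucleotide sequences, in which every substitution is equally likely. It is not the group's gene transfer model or any of the programs named above; the function names are hypothetical and chosen only for this example. It assumes two already aligned DNA sequences and ignores sites with gaps or ambiguous characters.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of comparable aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where both sequences carry an unambiguous nucleotide.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p):
    """Expected substitutions per site under JC69, given observed difference p."""
    if p >= 0.75:
        # The correction is undefined at saturation; report an infinite distance.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_transition_probability(d, same):
    """Probability that a site shows the same (or one specific other)
    nucleotide after d expected substitutions per site under JC69."""
    decay = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    d = jukes_cantor_distance(p)
    print(f"observed difference: {p:.3f}, JC69 distance: {d:.4f}")
```

The corrected distance is always at least as large as the observed difference, because it accounts for multiple substitutions at the same site. More realistic models relax the assumption of equal rates and independent sites.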