Up to Research, Theory group at Nada, KTH.
Computational biology
This project focuses on developing new and biologically relevant
algorithms in the following areas: genome evolution, phylogeny,
and identification of regulatory sequences.
One approach to reveal the function of genes is to correlate phenotype
evolution with genome evolution. Genome evolution also provides an
opportunity to establish the correspondence between genes in different
genomes (orthology analysis), which can be used to translate knowledge
of gene function in model organisms into the corresponding knowledge for
humans.
In a genome, the genes evolve through nucleotide substitutions. The
evolution of the genome is also shaped by a multitude of other
evolutionary events acting at different organizational levels. Larger
genome segments are affected by processes such as duplication, lateral
transfer (where a segment of one organism's genome is transferred to the
genome of another organism), inversion, transposition, deletion and
insertion. Being able to identify genes that have been laterally
transferred and to count the number of lateral transfer events is crucial
for resolving whether a tree of life exists. Finally, the
whole genome is influenced by speciation and by hybridization of organism
lineages (where a new species is created by the fusion of two
organisms' genomes). The complexity of genome evolution poses a
serious challenge in developing mathematical models and algorithms.
A classical problem in computational biology is that of inferring the
evolutionary history of a set of species. The evolutionary history is
represented by a phylogenetic tree. Due to duplications and lateral
transfers, gene trees (i.e. phylogenetic trees for gene families) and
the corresponding species tree may disagree. We have studied the
algorithmic problem: given a set of disagreeing gene trees, find
the species tree that explains the disagreement using a minimum number
of duplications. We have also given a mathematically rigorous and
biologically sound model for lateral transfers and a fast algorithm
for the problem: given a gene tree and a species tree, find the
minimum number of lateral transfers that explains the difference
between the given trees.
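
To make the duplication count in the first problem concrete, here is a minimal Python sketch. It is not the group's algorithm; it only evaluates the cost of a given species tree using the standard least common ancestor (LCA) mapping, in which a gene-tree node is counted as a duplication when it maps to the same species-tree node as one of its children. The nested-tuple tree encoding, the gene naming convention "species_copy", and all function names are assumptions made for this illustration.

```python
def leaves(tree):
    """Return the set of leaf labels of a binary tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def lca_cluster(species_tree, labels):
    """Leaf set of the lowest species-tree node whose subtree contains all labels."""
    node = species_tree
    while not isinstance(node, str):
        left, right = node
        if labels <= leaves(left):
            node = left
        elif labels <= leaves(right):
            node = right
        else:
            break  # labels are split between both children: node is the LCA
    return leaves(node)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that the LCA mapping marks as duplications."""
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            # Assumed naming convention: "A_1" is copy 1 of a gene in species A.
            species = node.split("_")[0]
            return lca_cluster(species_tree, frozenset([species]))
        left, right = node
        m_left, m_right = visit(left), visit(right)
        m_node = lca_cluster(species_tree, m_left | m_right)
        if m_node == m_left or m_node == m_right:
            duplications += 1
        return m_node

    visit(gene_tree)
    return duplications


def total_duplications(gene_trees, species_tree):
    """Sum the duplication counts of several gene trees against one species tree."""
    return sum(count_duplications(g, species_tree) for g in gene_trees)


# Two small gene trees that disagree with each other.
gene_trees = [
    (("A_1", "C_1"), "B_1"),
    (("A_1", "A_2"), "B_1"),
]
# Two hand-picked candidate species trees (no search over tree space is done here).
candidates = [(("A", "B"), "C"), (("A", "C"), "B")]
best = min(candidates, key=lambda s: total_duplications(gene_trees, s))
print(best)  # (('A', 'C'), 'B')
```

The sketch only compares candidate species trees that are supplied by hand. Finding the best species tree over all possible trees is the actual algorithmic problem described above, and the sketch does not attempt it.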
Recently, we have started developing algorithms for identification of
regulatory sequences. There are basically two algorithmic approaches
to this problem. In the first, promoter regions of co-regulated genes
are searched for similar substrings. In the second, promoter regions
of a gene family are searched for similar substrings. The approaches
yield different algorithmic questions, since in the latter case the
notion of similarity can be defined relative to a species tree.
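
As a rough illustration of the first approach, the Python sketch below searches a set of promoter sequences for substrings of a fixed length that occur in every sequence, allowing a bounded number of mismatches. It is a brute-force example, not the algorithms developed in the project. The function names, the fixed motif length k, and the use of Hamming distance as the similarity measure are assumptions. It does not cover the second approach, where similarity would be defined relative to a species tree.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_motifs(promoters, k, max_mismatches=0):
    """Return k-mers of the first promoter that occur, within max_mismatches
    substitutions, in every other promoter."""
    if not promoters:
        return set()
    shared = set()
    for candidate in kmers(promoters[0], k):
        if all(
            any(hamming(candidate, w) <= max_mismatches for w in kmers(p, k))
            for p in promoters[1:]
        ):
            shared.add(candidate)
    return shared


# Toy promoter regions of three hypothetical co-regulated genes.
promoters = ["ACGTTGACA", "TTGACAGG", "CCTTGACA"]
print(shared_motifs(promoters, k=6))  # {'TTGACA'}
```

Only substrings that appear literally in the first sequence are considered as candidates, which keeps the example short but can miss a motif whose consensus appears exactly in none of the sequences.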
Responsible for this page: Jens Lagergren <jensl@nada.kth.se>
Latest change October 17, 2002