In the tradition of earlier Bertinoro Computational Biology meetings, invited speakers will present new results in an environment that promotes informal, interdisciplinary discussion. The schedule is unhurried with lots of coffee breaks.


Note the time gaps. Presentations are 40 minutes long or, if marked with *, 20 minutes long. However, the latter may end slightly later to accommodate a full 20-minute talk.

23 24 25 26 27 28 29

Sun Mon Tue Wed Thu Fri Sat
08.00–09.00 Arrivals Breakfast Departures
09.20–10.00 Tomas Vinar David Fernández-Baca Adam Siepel
David Bryant
10.00–10.30 Coffee
10.30–11.10 Vincent Daubin Webb Miller Jens Lagergren Oliver Eulenstein Daniel Huson
11.20–12.00 Luay Nakhleh Dannie Durand Thi Minh Anh Nguyen* Tamir Tuller Maureen Stolzer*
Joel Sjöstrand*
12.30–13.30 Lunch
14.20–15.00 Mukul Bansal Craig Nelson Sightseeing and dinner Todd Vision Departures
15.00–15.30 Coffee Coffee
15.30–16.10 Fredrik Ronquist Matthew Rasmussen* Ilan Wapinski
Mayo Röttger*
16.20–17.00 Jakub Kovac* Vincent Berry* Ovidiu Popa*

Preliminary abstracts

Conversion events in mammalian gene clusters
Webb Miller, Pennsylvania State University

A conversion event acts on two highly similar DNA regions (say, at least 95% nucleotide identity), overwriting an interval in one region with the contents of the corresponding interval of the other region. It is important to understand and identify these events because they can confuse evolutionary analyses, including attempts to assign orthologs and/or transfer functional annotations from one genome to another. We have developed an efficient computational method to identify where these events have occurred, and applied it to the entire human and mouse genomes. We also are applying the method as part of a collaborative effort to accurately sequence and analyze biomedically important primate gene clusters.
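The definition of a conversion event above can be illustrated with a minimal sketch. It assumes two already-aligned, equal-length, gap-free sequences, and the function names `percent_identity` and `apply_conversion` are hypothetical. It shows only what a conversion does to the sequences; it is not the detection method described in the abstract.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two aligned sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def apply_conversion(donor: str, recipient: str, start: int, end: int) -> str:
    """Overwrite recipient[start:end] with the corresponding donor interval."""
    return recipient[:start] + donor[start:end] + recipient[end:]


# Toy example (made-up sequences): the two regions differ at two positions.
donor = "ACGTACGTAC"
recipient = "ACGTTCGTAA"
converted = apply_conversion(donor, recipient, 3, 6)
print(percent_identity(donor, recipient))  # identity before the event
print(percent_identity(donor, converted))  # identity after the event
```

After the event, the converted interval is identical in both regions, which is why such events can make paralogous copies look more closely related than their true duplication history implies.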

Genome networks root the tree of life between prokaryotic domains
Mayo Röttger, Heinrich-Heine University in Düsseldorf

Eukaryotes arose from prokaryotes, hence the root in the tree of life resides among the prokaryotic domains. The position of the root is still debated, though pinpointing it would aid our understanding of the early evolution of life. Because prokaryote evolution was long viewed as a tree-like process of lineage bifurcations, efforts to identify the most ancient microbial lineage split have traditionally focused on positioning a root on a phylogenetic tree constructed from one or several genes. Such studies have delivered widely conflicting results on the position of the root, mainly because of methodological problems inherent to deep gene phylogeny and the workings of lateral gene transfer among prokaryotes over evolutionary time. Here we report the position of the root determined with whole genome data using network-based procedures that take into account both gene presence or absence and the level of sequence similarity among all individual gene families that are shared across genomes. On the basis of 562,321 protein coding gene families distributed across 191 genomes, we find that the deepest divide in the prokaryotic world is interdomain, that is, separating the archaebacteria from the eubacteria. This result resonates with some older views but conflicts with the results of most studies over the last decade that have addressed the issue. In particular, several studies have suggested that the molecular distinctness of archaebacteria is not evidence for their antiquity relative to eubacteria, but instead stems from some kind of inherently elevated rate of archaebacterial sequence change. Here we specifically test for such a rate elevation across all prokaryotic lineages through the analysis of all possible quartets among eight genes duplicated in all prokaryotes, and hence in their last common ancestor. The results show that neither the archaebacteria as a group nor the eubacteria as a group harbor evidence for elevated evolutionary rates, either in the recent evolutionary past or in their common ancestor. The interdomain prokaryotic position of the root is thus not attributable to lineage specific rate variation.

Reconstructing Ancestral Genomic Sequences by Co-Evolution: Efficient Algorithms, Computational Issues, and Biological Examples
Tamir Tuller, Weizmann Institute of Science

The inference of ancestral genomes is a fundamental problem in molecular evolution. Due to the statistical nature of this problem, the most likely or the most parsimonious ancestral genomes usually include considerable error rates. In general, these errors cannot be abolished by utilizing more exhaustive computational approaches, by using longer genomic sequences, or by analyzing more taxa.

This talk will describe a new approach for inferring ancestral genomic sequences, the ancestral co-evolver (ACE), which utilizes co-evolutionary information to improve the accuracy of such reconstructions over previous approaches. The talk will include 1) computational/algorithmic aspects of this approach and 2) a discussion about the biological soundness of the ancestral genomes inferred by the ACE.

Detecting Highways of Horizontal Gene Transfer
Mukul S. Bansal, Tel-Aviv University

Horizontal gene transfer (HGT) is an evolutionary process in which genes are transferred between two organisms that do not share an ancestor-descendant relationship. HGT plays an important role in bacterial evolution by allowing bacteria to transfer genes across species. Thus, inferring the HGT events that occurred during the evolution of a set of species is an important biological problem. Each HGT event is associated with a pair of (possibly ancestral) species, between which genes have been transferred. It has been observed that between certain pairs of species many different genes have been transferred. Such a pair is called a highway of gene sharing. To date, no systematic methods exist for detecting such highways.

We present a method for inferring such highways for a given set of species. Our method is based on the fact that the evolutionary histories of horizontally transferred genes tend to disagree with the corresponding species phylogeny. Specifically, given a set of gene trees and a trusted rooted species tree, our method decomposes each gene tree into its constituent quartet trees and combines the quartet trees from all the gene trees to obtain a single weighted set of quartets. This set is then analyzed against the given species tree in order to infer the highways of gene sharing. We give an efficient algorithm that reduces the time complexity of finding each highway from O(n^6) to O(n^4), where 'n' is the number of species considered in the analysis. This makes it possible to apply our method to large-scale datasets. An application of our method to a dataset of 1128 genes from 11 cyanobacterial species illustrates the efficacy of our method in detecting highways of gene sharing.
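The quartet decomposition step can be sketched as follows. This is a naive illustration, not the authors' O(n^4) algorithm: it assumes rooted trees given as nested Python tuples of leaf names, and the function names (`clades`, `quartets`, `weighted_quartets`, `conflicting`) are hypothetical. It enumerates all four-leaf subsets of each gene tree, pools the induced quartet topologies into one weighted set, and reports those that disagree with the species tree. The actual inference of highways from the conflicting quartets is not shown.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf set of tree, leaf sets of every subtree including tree)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def quartets(tree):
    """Set of resolved quartet topologies ab|cd displayed by the tree."""
    leaves, all_clades = clades(tree)
    result = set()
    for four in combinations(sorted(leaves), 4):
        q = frozenset(four)
        for clade in all_clades:
            side = clade & q
            if len(side) == 2:  # this clade separates two leaves from the other two
                result.add(frozenset([side, q - side]))
                break
    return result


def weighted_quartets(gene_trees):
    """Pool the quartets of all gene trees; weight = number of supporting trees."""
    weights = Counter()
    for tree in gene_trees:
        weights.update(quartets(tree))
    return weights


def conflicting(weights, species_tree):
    """Weighted quartets that are not displayed by the species tree."""
    agree = quartets(species_tree)
    return {q: w for q, w in weights.items() if q not in agree}


# Toy example with made-up trees on five species.
species = ((("A", "B"), "C"), ("D", "E"))
transferred = ((("A", "D"), "C"), ("B", "E"))
weights = weighted_quartets([species, transferred, transferred])
for quartet, weight in conflicting(weights, species).items():
    print(sorted(sorted(side) for side in quartet), weight)
```

Enumerating every four-leaf subset costs time proportional to n^4 per tree before any further analysis, which gives a sense of why an efficient algorithm matters for large-scale datasets.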

A directed network of recent lateral gene transfer within prokaryotes reveals trends and barriers in gene acquisition
Ovidiu Popa, Heinrich-Heine University in Düsseldorf

Lateral gene transfer (LGT) is an important mechanism of natural variation among prokaryotes, but the extent of genomic exchange among different species and possible barriers to it are still debated. Here we report the use of directed phylogenetic networks that capture both vertical inheritance and recent lateral gene transfer among 657 prokaryotes. Most of the detectable gene transfers occur between species within the same taxonomical group. Gene acquisition occurs more frequently from donors having similar genome sequence or similar proteome to that of the recipient. This indicates that donor-recipient genome similarity is a barrier for lateral gene transfer in nature. However, species having the proteins that enable non-homologous end-joining (NHEJ) acquire genes more frequently from dissimilar donors than species lacking that mechanism. This suggests that NHEJ has a possible role in gene acquisition within prokaryotes supplying a bypass for the donor-recipient genome similarity barrier.

Reconstructing duplication histories of primate gene clusters
Tomas Vinar, Comenius University in Bratislava

Clusters of genes that have evolved by repeated segmental duplication present difficult challenges throughout genomic analysis, from sequence assembly to functional analysis. These clusters are one of the major sources of evolutionary innovation, and they are linked to multiple diseases, including HIV and a variety of cancers. Understanding their evolutionary histories is a key to the application of comparative genomics methods in these regions of the genome.

We will present our efforts in reconstructing duplication histories of several complex gene clusters on a phylogeny of primate genomes. We will describe both algorithmic advances, based on MCMC sampling, and advances in our data gathering efforts in this area. This is joint work with Adam Siepel (Cornell University), Webb Miller (Penn State University), and Eric Green (NHGRI).

Estimation of ancestral human demography from individual genome sequences
Adam Siepel, Cornell University

Complete genome sequences are now available for individuals representing several distinct human populations. Interest in these sequences so far has focused on the technical feasibility of individual genome sequencing, the identification of single nucleotide and structural variations, and implications for personalized medicine. However, these data also represent a rich source of information about human evolution. Here I will describe an effort currently in progress to estimate evolutionary parameters such as the times at which major population groups diverged and the effective sizes of ancestral human populations, based on sequence data for seven human individuals, including West Africans, East Asians, individuals of European descent, and a Khoisan hunter-gatherer from the Kalahari Desert in Southern Africa. This work involves an interesting mixture of traditional phylogenetics and population genetics, and also requires a number of challenging technical issues to be addressed. I will describe our statistical model, which is derived from the MCMCCOAL model by Rannala and Yang, and our Markov chain Monte Carlo methods for inference. I will also present preliminary results from our analysis and discuss their relationship with what is currently known about early human evolution.