Reconstructing the evolutionary history of a set of species is a
fundamental problem in biology and methods for solving this problem
are gaged based on two characteristics: accuracy and
efficiency. Neighbor Joining (NJ) is a so-called distance-based method
that, thanks to its good accuracy and speed, has been embraced by the
phylogeny community. It takes the distances between n taxa and
produces in Θ(n3) time a phylogenetic tree, i.e., a tree which
aims to describe the evolutionary history of the taxa. In addition to
performing well in practice, the NJ algorithm has optimal
reconstruction radius.
The contribution of this paper is twofold: (1) we present an algorithm
called Fast Neighbor Joining (FNJ) with optimal reconstruction radius
and optimal run time complexity O(n2) and (2) we present a greatly
simplified proof for the correctness of NJ. Initial experiments show
that FNJ in practice has almost the same accuracy as NJ, indicating
that the property of optimal reconstruction radius has great
importance to their good performance. Moreover, we show how improved
running time can be achieved for computing the so-called correction
formulas.
Joint work with Jens Lagergren.
Probabilistically Checkable Proofs or
more succinctly PCPs have played a
significant role in theoretical computer
science lately.
Not only are they interesting in their own
right but they also lead to strong inapproximability
results for interesting optimization problems.
As a concrete example take satisfiability of
Boolean formulas. A classical NP-proof that a
formula is satisfiable is given by an assignment that
satisfies the formula and this is verified
by reading the entire proof and checking that
indeed the assignment satisfies the formula.
In PCP a probabilistic verifier also tries to verify
a written proof but is only allowed to do
a few random spot checks and may only read a very
small portion of the proof.
The PCP-theorem says that for satisfiability
and hence for any NP-statement,
there is a PCP that allows proofs of
polynomial size and such that the verifier reads
a constant number of bits, always accepts a
correct proof and rejects and proof of a false
NP-statement with probability at least 1/2.
In the application to inapproximability it is
important to optimize some of the parameters of
the PCP and in particular we will be interested
in proofs where the verifier only reads three bits.
The plan is to have an informal and hopefully
interactive seminar discussing these issues.
We will explain, but not prove the PCP-theorem.
We will then see how to change the proof
to decrease the number of bits read by the
verifier. In the process we will discuss
concepts such as Raz parallel repetition theorem and the
long code, the longest binary code defined
by Bellare, Goldreich and Sudan.
The seminar will be given at the board and depending
on audience participation the total duration can
be anywhere from one to two double lectures.
If all present understand Swedish the lecture will
be given in Swedish and otherwise in English.