School of
Electrical Engineering
and Computer Science

LAPD: Estimating Protein Distances

Description

This program estimates pairwise distances for a set of protein sequences. The output can be used in conjunction with other phylogeny programs, such as Phylip.

A special feature is that specialized rate matrices estimated by modelestimator can be used as input to the program.
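To make "pairwise distance" concrete, here is a minimal Python sketch of the simplest kind of estimate: the fraction of differing residues between two aligned sequences, followed by the textbook Jukes-Cantor formula for 20 states and Kimura's empirical correction. This is not lapd's code. The function names are hypothetical, skipping gapped positions pair by pair is an assumption made for the example, and whether lapd uses exactly these formulas for its -jc and -jck options is also an assumption.

```python
import math

def fraction_differing(a, b, gap="-"):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption for this sketch: gapped positions are skipped pair by pair.
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor_distance(p, states=20):
    """Textbook Jukes-Cantor distance for an alphabet of `states` symbols."""
    b = (states - 1) / states          # 19/20 for amino acids
    if p >= b:
        return math.inf                # saturated: the formula is undefined
    return -b * math.log(1.0 - p / b)

def kimura_distance(p):
    """Kimura's empirical correction for proteins: -ln(1 - p - 0.2 p^2)."""
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0.0:
        return math.inf
    return -math.log(arg)

p = fraction_differing("ACDE-F", "ACDFGF")   # one mismatch in five ungapped columns
d_jc = jukes_cantor_distance(p)
d_kimura = kimura_distance(p)
```

Both corrections return a distance larger than the raw fraction of differences, because they account for multiple substitutions at the same site.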

Availability

lapd is distributed under the GNU General Public License and is available for immediate download. The software is a Perl script that reads the input, generates script files, and pipes them to Octave, which does the actual computations. A fairly recent version of Octave is needed, as the 'list' function must be supported.

Join the lapd mailing list for notices about updates, communicating with other users, etc. The list covers both lapd and modelestimator.

Please acknowledge the use of LAPD in your research, preferably by citing this web site.

Bootstrapping

lapd is not a direct replacement for protdist (from Phylip). In particular, it cannot handle the multiple datasets needed for Phylip-style bootstrapping. For that purpose an accompanying program is available: bootstrap_lapd. Like lapd, it is written in Perl. It repeatedly calls lapd on the datasets in its infile.
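The idea of calling lapd once per dataset can be illustrated with a short Python sketch that splits a concatenated multi-dataset file into its individual datasets. It is not how bootstrap_lapd is implemented. The function name is hypothetical, and the sketch assumes that every dataset starts with a PHYLIP-style header line containing only two integers (number of sequences and number of columns).

```python
import re

# Assumption: a dataset header is a line holding exactly two integers.
HEADER = re.compile(r"^\s*\d+\s+\d+\s*$")

def split_phylip_datasets(text):
    """Split concatenated PHYLIP-style datasets into a list of strings."""
    datasets = []
    current = []
    for line in text.splitlines():
        # A new header closes the dataset collected so far.
        if HEADER.match(line) and current:
            datasets.append("\n".join(current) + "\n")
            current = []
        # Skip blank lines that precede the first header.
        if line.strip() or current:
            current.append(line)
    if current:
        datasets.append("\n".join(current) + "\n")
    return datasets

example = """ 2 4
seqA      ACDE
seqB      ACDF
 2 4
seqA      ACEE
seqB      ACFF
"""
parts = split_phylip_datasets(example)
```

Each element of `parts` could then be written to a temporary file and passed to lapd as its infile.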

Usage

Usage: lapd [<options>] <infile> 

The infile should be in either FASTA, STOCKHOLM, or PHYLIP format.
Output is a matrix of expected distances and, if possible, estimates
of standard deviation.

Options:
   -indels        Remove gap columns. A gap is denoted by '-'.

   -ml            Compute a Maximum Likelihood estimate instead. This option
                  implies -sd.

   -sd            Do not output a matrix with standard deviations after the
                  distance matrix.

   -id            Output percent identity.

   -jc            Use a simplistic Jukes-Cantor model.
   -jck           Use -jc, but use Kimura's correction.
   -jcss          Like -jck, but using Storm-Sonnhammer's correction.

   -wag           Default. Use the WAG matrix (see Whelan and Goldman, 2001).
   -jtt           Use the JTT matrix (see Jones, Taylor, Thornton, 1992).
   -day           Use the Dayhoff matrix (see Dayhoff et al., 1978).
   -arve          Use the Arvestad matrix.
   -mv            Use the Muller-Vingron matrix (2000).

   -f <file>      Read matrix and equilibrium distribution from file.

   -pfam          Use a normal distribution as distance prior, estimated
                  from Pfam 7.2.

   -s <int>       "Speed". High speed results in low precision.
                  Default is 5. Valid range is [1, 10].

   -v             Verbose. Show progress info on STDERR.
   -octave <path> Point to the Octave binary to run. 'octave -q' by default.
   -d             Debug option. Output Octave commands to STDOUT.

   -u, -h         This help text.
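
As an illustration of how the options above combine, the following Python sketch builds a lapd command line and optionally runs it. The helper names are hypothetical. The sketch assumes that lapd is on the PATH and that the distance matrix is written to standard output. It uses only flags listed in the help text above.

```python
import subprocess

# Model flags taken from the help text; the dictionary itself is illustrative.
MODEL_FLAGS = {
    "wag": "-wag", "jtt": "-jtt", "day": "-day", "arve": "-arve",
    "mv": "-mv", "jc": "-jc", "jck": "-jck", "jcss": "-jcss",
}

def lapd_command(infile, model="wag", speed=5, ml=False,
                 remove_gaps=False, program="lapd"):
    """Build an argument list for lapd from a few of its documented options."""
    if model not in MODEL_FLAGS:
        raise ValueError("unknown model: %s" % model)
    if not 1 <= speed <= 10:
        raise ValueError("speed must be in the range [1, 10]")
    cmd = [program, MODEL_FLAGS[model], "-s", str(speed)]
    if ml:
        cmd.append("-ml")        # maximum likelihood estimate; implies -sd
    if remove_gaps:
        cmd.append("-indels")    # remove gap columns
    cmd.append(infile)
    return cmd

def run_lapd(infile, **options):
    """Run lapd and return its standard output (assumed to hold the matrix)."""
    result = subprocess.run(lapd_command(infile, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout

default_cmd = lapd_command("seqs.fasta")
ml_cmd = lapd_command("seqs.fasta", model="jtt", speed=2, ml=True)
```

Building the argument list separately from running it makes it easy to inspect the command before anything is executed.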


Published by: Lars Arvestad <arve@csc.kth.se>
Updated 2014-09-24