GitHub software repository
The latest (possibly unstable) versions of all my projects are available on GitHub HERE
FRCurve (Feature Response Curve)
FRCurve is a straightforward method to evaluate de novo assemblies from the read layouts, even when no reference exists. The new tool extends the FRCurve approach (introduced by Narzisi) to cases where layout information may have been obscured, as is true in many de Bruijn graph-based algorithms. As a by-product, FRCurve now applies to a much wider class of assemblers, making it possible to identify the higher-quality members of this group, their inter-relations, and their sensitivity to carefully selected features, with or without the support of a reference sequence or a layout for the reads.
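As a rough illustration of the idea (not the FRCurve implementation), a feature response curve can be sketched as follows: contigs are sorted from longest to shortest, and for a feature threshold phi only the longest contigs whose cumulative feature count stays within phi contribute to the genome coverage. The input format (a length and a feature count per contig), the function names, and the use of an estimated genome size are all assumptions made for this sketch.

```python
from typing import List, Tuple

# Hypothetical input format: (contig length in bp, number of suspicious features).
Contig = Tuple[int, int]


def frc_points(contigs: List[Contig], genome_size: int) -> List[Tuple[int, float]]:
    """Return (cumulative features, cumulative coverage) pairs, longest contig first."""
    ordered = sorted(contigs, key=lambda c: c[0], reverse=True)
    points = []
    total_len = 0
    total_feat = 0
    for length, feats in ordered:
        total_len += length
        total_feat += feats
        points.append((total_feat, total_len / genome_size))
    return points


def coverage_at(points: List[Tuple[int, float]], phi: int) -> float:
    """Coverage reached by the longest contigs whose total features stay <= phi."""
    best = 0.0
    for feat, cov in points:
        if feat > phi:
            break  # feature counts are non-negative, so the running total never decreases
        best = cov
    return best


if __name__ == "__main__":
    pts = frc_points([(500, 2), (1000, 1), (200, 0)], genome_size=2000)
    print(coverage_at(pts, 1))  # 0.5: only the 1000 bp contig fits within one feature
```

Under this reading, an assembly whose curve rises faster (more coverage for the same number of tolerated features) would be considered the better one.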
A paper about FRCurve is currently under revision. A preliminary version is available on arXiv.
Software is available for download HERE. Check my GitHub for the latest version.
GAM-NGS (Genome Assemblies Merger for Next Generation Sequencing)
The inability to obtain correct and reliable assemblies using a single assembler is driving the introduction of new algorithms able to enhance de novo assemblies. With more than 20 available assemblers, it is hard to select the best tool. In this context we propose a tool that improves assemblies (and, as a by-product, perhaps even assemblers) by merging them and selecting the sequence that is most likely to be correct. GAM-NGS merges two or more assemblies and returns an improved assembly (more contiguous and more correct). GAM-NGS shows its full potential with multi-library Illumina-based projects.
A paper about GAM-NGS is currently under revision. GAM-NGS is currently being used to assemble the spruce genome, whose estimated size is 20 Gbp.
Riccardo Vicedomini (PhD student at Udine University) is the main developer of GAM-NGS. The latest stable GAM-NGS is available HERE. The latest version is available at Riccardo's GitHub.
ERNE: Extended Randomized Numerical AlignEr
ERNE (Extended Randomized Numerical AlignEr) is the successor of rNA (randomized Numerical Aligner). ERNE is primarily software designed to align the huge amount of data produced by Next Generation Sequencers, in particular by Illumina sequencers. The main ERNE feature is that it achieves an accuracy greater than the majority of other tools (e.g., BWA) in a feasible amount of time. ERNE works with single as well as paired-end reads, and it allows indels and delta-search (an alignment option for dealing with repetitive sequences) for better accuracy. A graphical user interface is provided for users who are not comfortable with the command line. ERNE can take advantage of multi-node clusters thanks to an MPI-based version.
ERNE is composed of several modules:
- ERNE-CREATE: preprocesses a reference file to be used with ERNE-MAP and ERNE-BS5;
- ERNE-MAP: aligns short DNA or RNA reads against a genome;
- ERNE-DMAP: designed to tackle the main computational bottleneck of all classical parallel implementations of aligners: references longer than 4 Gbp. ERNE-DMAP is able to spread the computation over the nodes of a cluster;
- ERNE-BS5: aligns short reads treated with bisulfite against a genome;
- ERNE-FILTER: a quality trimming and contamination filter for de novo assembly;
- ERNE-VISUAL: provides an easy-to-use graphical interface for ERNE-FILTER and (coming soon) other ERNE mapping tools.
Software and more information are available HERE
GapFiller: closing the gap within paired reads
GapFiller is not a standard de novo assembler. It aims "only" at closing the gap between pairs of reads, as a first step for a large number of downstream analyses. See HERE
eRGA: enhanced Reference Guided Assembly
De novo assembly is probably one of the most difficult tasks in today's genomics. eRGA aims at producing an improved assembly in the presence of a sequence belonging to a closely related organism. The Perl source code is freely available HERE, together with a small EXAMPLE