GLEAN is an unsupervised learning system to integrate disparate sources of gene structure evidence (gene model predictions, EST/protein genomic sequence alignments, SAGE/peptide tags, etc) to produce a consensus gene prediction, without prior...
Gblocks eliminates poorly aligned positions and divergent regions of a DNA or protein alignment so that it becomes more suitable for phylogenetic analysis. This server implements the most important features of the Gblocks program to make its...
There are many tools to perform gap filling using Illumina short reads, for example "GapFiller: a de novo assembly approach to fill the gap within paired reads" or "Toward almost closed genomes with GapFiller". There are also some tools like...
Ranbow is a haplotype assembler for polyploid genomes. It has been developed for the haplotype assembly of the hexaploid sweet potato genome, which is highly heterozygous. Ranbow can also be applied to other polyploid genomes. After a first phasing,...
BFC is a standalone high-performance tool for correcting sequencing errors from Illumina sequencing data. It is specifically designed for high-coverage whole-genome human data, though also performs well for small genomes.
The BFC algorithm is a...
CrossMap is a program for convenient conversion of genome coordinates (or annotation files) between different assemblies (such as Human hg18 (NCBI36) <> hg19 (GRCh37), Mouse mm9 (MGSCv37) <> mm10 (GRCm38)).
It supports most commonly...
SEX-DETector is a probabilistic method that relies on RNAseq data from a cross (parents and progeny of each sex) to infer autosomal and sex-linked genes (genes located on the non recombining part of sex chromosomes).
How does SEX-DETector...
Here’s the quick and dirty of what was done:
1 Run minimap:
This uses a pre-built set of defaults (the ava-pb in the code below) for analyzing PacBio data. Minimap only accepts two FASTQ files and you need to map your FASTQ file...
The k-mer Weighted Inner Product.
This software implements a de novo, alignment free measure of sample genetic dissimilarity which operates upon raw sequencing reads. It is able to calculate the genetic dissimilarity between samples without any...
Porechop is a tool for finding and removing adapters from Oxford Nanopore reads. Adapters on the ends of reads are trimmed off, and when a read has an adapter in its middle, it is treated as chimeric and chopped into separate reads. Porechop...
WhatsHap is a software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but works also well with short reads.
Features
Very accurate...
Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. Manta discovers, assembles...
GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. It can be used to align against multiple genomes simulanteously or against a single...
pbalign aligns PacBio reads to reference sequences, filters aligned reads according to user-specific filtering criteria, and converts the output to either the SAM format or PacBio Compare HDF5 (e.g., .cmp.h5) format. The output Compare HDF5 file...
Minialign is a little bit fast and moderately accurate nucleotide sequence alignment tool designed for PacBio and Nanopore long reads. It is built on three key algorithms, minimizer-based index of the minimap overlapper, array-based seed chaining,...
%MM calculates whether a given gene sequence encodes amino acids using the most common codons possible, the least common codons possible, or (most typically) some combination of these extremes. See our PLoS ONE paper for more details on how the...