software.broadinstitute.org - Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.Genome STRiP...
When you have both Illumina and Nanopore data, then SPAdes remains a good option for hybrid assembly - SPAdes was used to produce the B fragilis assembly by Mick Watson’s group.
Again, running spades.py will show you the...
In graph theory, a string graph is an intersection graph of curves in the plane; each curve is called a "string". String graphs were first proposed by E. W. Myers in a 2005 publication.
cran.r-project.org - Most variant calling pipelines result in files containing large quantities of variant information. The variant call format (vcf) is an increasingly popular format for this data. The format of these files and their content is discussed in...
github.com - A de novo genome assembly can be summarised b
y a number of metrics, including:
Overall assembly length
Number of scaffolds/contigs
Length of longest scaffold/contig
Scaffold/contig N50 and N90Assembly base composition, in...
http://assemblytics.com/ - Download and install MUMmer
Align your assembly to a reference genome using nucmer (from MUMmer package)
$ nucmer -maxmatch -l 100 -c 500 REFERENCE.fa ASSEMBLY.fa -prefix OUT
Consult the MUMmer manual if you encounter problems
Optional: Gzip...
www.sanger.ac.uk - ACT is a Java application for displaying pairwise comparisons between two or more DNA sequences. It can be used to identify and analyse regions of similarity and difference between genomes and to explore conservation of synteny, in the context of...
github.com - HM2 can process any diploid assemblies, but it is especially suitable for diploid assemblies with high heterozygosity (≥3%), which can be difficult for other tools. This pipeline also implements flexible and sensitive assembly error detection, a...
sybil.sourceforge.net - The Sybil software package provides a primarily web-based front-end to comparative genome datasets warehoused in a chado relational database. It was developed by the bioinformatics department at The Institute for Genomic Research (TIGR) and...
gwct.github.io - Modern genome sequencing technologies provide a succint measure of quality at each position in every read, however all of this information is lost in the assembly process. Referee summarizes the quality information from the reads that map to a site...