www.darwintreeoflife.org - The specimens were collected by the Oxford Wytham Woods and Edinburgh Lohse lab teams. DNA extraction and sequencing was carried out by the Sanger Institute Scientific Operations teams. Assemblies were carried out by the Tree of Life team (Shane...
github.com - HiCanu, a significant modification of the Canu assembler designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering.
More...
github.com - SqueezeMeta is a full automatic pipeline for metagenomics/metatranscriptomics, covering all steps of the analysis. SqueezeMeta includes multi-metagenome support allowing the co-assembly of related metagenomes and the retrieval of individual genomes...
github.com - AlignGraph2 is the second version of AlignGraph for PacBio long reads. It extends and refines contigs assembled from the long reads with a published genome similar to the sequencing genome.
More...
If we only had Illumina reads, we could also assemble these using the tool Spades.
You can try this here, or try it later on your own data.
Get data
We will use the same Illumina data as we used above:
illumina_R1.fastq.gz: the Illumina...
github.com - SqueezeMeta is a full automatic pipeline for metagenomics/metatranscriptomics, covering all steps of the analysis. SqueezeMeta includes multi-metagenome support allowing the co-assembly of related metagenomes and the retrieval of individual genomes...
bioinf.spbau.ru - SPAdes – St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies. This manual will help you to install and run SPAdes. SPAdes version 3.7.1 was released under GPLv2 on March 8,...
github.com - Reads simulator
Wgsim is a small tool for simulating sequence reads from a reference genome. It is able to simulate diploid genomes with SNPs and insertion/deletion (INDEL) polymorphisms, and simulate reads with uniform substitution sequencing...
bioinformatics.oxfordjournals.org - Summary: Mate pair library sequencing is an effective and economical method for detecting genomic structural variants and chromosomal abnormalities. Unfortunately, the mapping and alignment of mate pair read pairs to a reference genome is a...
code.google.com - splitbam splits a BAM by chromosomes.
Using the reference sequence dictionary (*.dict), it also creates some empty BAM files if no sam record was found for a chromosome. A pair of 'mock' SAM-Records can also be added to those empty BAMs to...