github.com - HiCanu, a significant modification of the Canu assembler designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering.
More...
If we only had Illumina reads, we could also assemble these using the tool Spades.
You can try this here, or try it later on your own data.
Get data
We will use the same Illumina data as we used above:
illumina_R1.fastq.gz: the Illumina...
bioen-compbio.bioen.illinois.edu - Rreference-Assisted Chromosome Assembly (RACA), an algorithm to reliably order and orient sequence scaffolds generated by NGS and assemblers into longer chromosomal fragments using comparative genome information and paired-end...
code.google.com - splitbam splits a BAM by chromosomes.
Using the reference sequence dictionary (*.dict), it also creates some empty BAM files if no sam record was found for a chromosome. A pair of 'mock' SAM-Records can also be added to those empty BAMs to...
github.com - Flye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies. The algorithm uses an A-Bruijn graph to find the overlaps between reads and does not require them to be error-corrected. After...
shendurelab.github.io - LACHESIS is method that exploits contact probability map data (e.g. from Hi-C) for chromosome-scale de novo genome assembly.
Further information about LACHESIS, including source code, documentation and a user's guide are available...
github.com - Pollux: General-purpose error corrector that corrects errors introduced by Illumina, Ion Torrent, and Roche 454 sequencing technologies and can be applied to single- or mixed-genome data. In addition to correcting substitution errors, we locate and...
github.com - Tool for detecting and cleaning PacBio / Nanopore long reads after whole genome amplification. Check the poster from the Revolutionizing Next-Generation Sequencing (2nd edition) conference in the source...
gite.lirmm.fr - An error correction method that uses long reads only. The method consists of two phases: first, we use an iterative alignment-free correction method based on de Bruijn graphs with increasing length of k-mers, and second, the corrected reads are...
journals.plos.org - Illumina Sequencing data can provide high coverage of a genome by relatively short (most often 100 bp to 150 bp) reads at a low cost. Even with low (advertised 1%) error rate, 100 × coverage Illumina data on average has an error in some read...