github.com - Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative contigs that represent highly heterozygous regions. If primary...
github.com - HapSolo, that identifies secondary contigs and defines a primary assembly based on multiple pairwise contig alignment metrics. HapSolo evaluates candidate primary assemblies using BUSCO scores and then distinguishes among candidate assemblies using...
github.com - Merfin, a k-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected k-mer multiplicity in the reads, independently of the quality of the...
ratt.sourceforge.net - RATT is software to transfer annotation from a reference (annotated) genome to an unannotated query genome.
It was first developed to transfer annotations between different genome assembly versions. However, it can also transfer annotations between...
broadinstitute.github.io - Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the Hts-specs repository. See especially the SAM specification and the VCF...
sc932.github.io - Assembly Likelihood Evaluation (ALE) framework that overcomes these limitations, systematically evaluating the accuracy of an assembly in a reference-independent manner using rigorous statistical methods. This framework is comprehensive, and...
www.ncbi.nlm.nih.gov - Background. Next-generation sequencing technologies are now producing multiple times the genome size in total reads from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome...
github.com - SGA is a de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.
More at
https://github.com/jts/sga
SGA...
www.topcoder.com - Learning greedy algo for biologist.
https://www.topcoder.com/community/data-science/data-science-tutorials/greedy-is-good/
This webpage is also useful for the...