github.com - The pipeline can use information from scaffolded assemblies (for example from HiC or 10X Genomics), or even from diverged (~65-100 Mya) reference genomes for ordering the contigs and thus support the assembly process. This typically results in...
genomebiology.biomedcentral.com - REAPR is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads, without the use of a reference genome for comparison. It can be used in any stage of an assembly pipeline to automatically break incorrect scaffolds and...
www.sanger.ac.uk - PAGIT addresses the need for software to generate high quality draft genomes. It is based on a series of programs that we developed:
ABACAS, that is able to contiguate contigs from a de novo assembly against a closely related reference.
IMAGE, an...
arthropods.eugenes.org - EvidentialGene is a genome informatics project, "Evidence Directed Gene Construction for Eukaryotes", to construct high quality, accurate gene sets for animals and plants, developed by Don Gilbert at Indiana University,...
You will have some previous experience with genome bioinformatics or other large scale scientific data analysis, or a newly qualified graduate student with data science skills interested in DNA sequence data. While desirable, previous experience...
nematodes.org - Blobsplorer is a tool for interactive visualization of assembled DNA sequence data ("contigs") derived from (often unintentionally) mixed-species pools. It allows the simultaneous display of GC content, coverage, and taxonomic annotation for...
github.com - Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by...
github.com - lordFAST is a sensitive tool for mapping long reads with high error rates. lordFAST is specially designed for aligning reads from PacBio sequencing technology but provides the user the ability to change alignment parameters depending on the reads...
github.com - MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, MacOS, and (as beta version, via cygwin) Windows. The...