To decide which strategy should be our “preferred” genome assembly approach based on data rather than on my gut feeling about the “best assembly”, I decided to do some testing with a known “true” reference: E. coli K-12 MG1655.
github.com - Genome simulation across a population with zeta-distributed allele frequency, SNPs, insertions, deletions, and multi-nucleotide polymorphisms
More at https://github.com/ekg/mutatrix
./mutatrix -S sample -P test/ -p 2 -n 10 reference.fasta
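To make the idea of a simulated "known truth" concrete, here is a minimal Python sketch that plants random SNPs in a reference sequence and records where they went. It is not how mutatrix works: it ignores indels, multi-nucleotide polymorphisms, ploidy, and zeta-distributed allele frequencies. The function name `simulate_snps`, its parameters, and the toy reference string are all hypothetical.

```python
import random

def simulate_snps(reference, snp_rate, seed=0):
    """Return (mutated_sequence, variants) with random point substitutions.

    Illustrative only: each A/C/G/T base is substituted independently
    with probability snp_rate. This is an assumed toy model, not the
    model implemented by mutatrix.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    bases = "ACGT"
    mutated = list(reference)
    variants = []  # (1-based position, reference base, alternate base)
    for i, ref_base in enumerate(reference):
        if ref_base in bases and rng.random() < snp_rate:
            alt = rng.choice([b for b in bases if b != ref_base])
            mutated[i] = alt
            variants.append((i + 1, ref_base, alt))
    return "".join(mutated), variants

# Toy reference; a real test would read the E. coli FASTA instead.
reference = "ACGTTGCAAGCTTAGC" * 4
mutant, variants = simulate_snps(reference, snp_rate=0.1, seed=42)
for pos, ref, alt in variants:
    print(pos, ref, alt)
```

Because the planted variants are recorded, an assembly or variant call made from the mutated sequence can be checked against them.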
ftp.ncbi.nih.gov - Nowadays there are many genomics databases available around the world. This bookmark was created to provide all the links in one place ...
ftp://ftp.ncbi.nih.gov/genomes/
https://hgdownload.soe.ucsc.edu/downloads.html
github.com - odgi provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.
Careful encoding of graph entities allows odgi to efficiently...
www.genomicus.bio.ens.psl.eu - Genomicus is a genome browser that enables users to navigate in genomes in several dimensions: linearly along chromosome axes, transversally across different species, and chronologically along evolutionary time.
Once a query gene has been entered, it...
With the emergence of NGS technologies and sequencing data, most bioinformaticians munge and wrangle massive amounts of genomics text. There are several "standardized" file formats (FASTQ, SAM, VCF, etc.) and some tools for manipulating...
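As a small illustration of wrangling one of these text formats, here is a minimal Python sketch that reads FASTQ records and computes a mean Phred quality. It assumes strict four-line records (no wrapped sequences, no trailing blank lines) and Phred+33 quality encoding. The helper names `read_fastq` and `mean_phred` are hypothetical and do not come from any of the tools listed here.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:  # end of input
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, content ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError("malformed FASTQ record")
        yield header[1:].strip(), seq, qual

def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)

# Two toy reads held in memory; a real file would be opened with open(path).
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(read_fastq(example))
for name, seq, qual in records:
    print(name, seq, mean_phred(qual))
```

For real data, an established parser is usually the safer choice, since actual FASTQ files can break the assumptions made here.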
www.bx.psu.edu - LASTZ is a program for aligning DNA sequences, a pairwise aligner. Originally designed to handle sequences the size of human chromosomes and from different species, it is also useful for sequences produced by NGS sequencing technologies such as...
www.cs.utoronto.ca - The relative ease and low cost of current-generation sequencing technologies has led to a dramatic increase in the number of sequenced genomes for species across the tree of life. This increasing volume of data requires tools that can quickly...
bioinfo.lifl.fr - YASS is a genomic similarity search tool, for nucleic (DNA/RNA) sequences in fasta or plain text format (it produces local pairwise alignments). Like most of the heuristic pairwise local alignment tools for DNA sequences (FASTA, BLAST,...
github.com - methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods such as Agilent...