www.ncbi.nlm.nih.gov - We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia...
bitbucket.org - TAndem REpeat ANalyzer (TAREAN) is a computational pipeline for unsupervised identification of satellite repeats from unassembled sequence reads. The pipeline uses low-pass whole genome sequence reads and performs their...
github.com - Second generation sequencing technologies paved the way to an exceptional increase in the number of sequenced genomes, both prokaryotic and eukaryotic. However, short reads are difficult to assemble and often lead to highly fragmented assemblies....
If we only had Illumina reads, we could also assemble these using the assembler SPAdes.
You can try this here, or try it later on your own data.
Get data
We will use the same Illumina data as we used above:
illumina_R1.fastq.gz: the Illumina...
github.com - Variation graphs provide a succinct encoding of the sequences of many genomes. A variation graph (in particular as implemented in vg) is composed of:
nodes, which are labeled by sequences and ids
edges, which connect two nodes via either of...
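The node-and-edge structure described in this excerpt can be illustrated with a minimal Python sketch. It is not vg's implementation or API: the class name `VariationGraph`, its methods, and the example sequences are hypothetical, and edges here only join nodes in the forward direction, which is a simplification since the excerpt is cut off before the edge description finishes.

```python
from dataclasses import dataclass, field


@dataclass
class VariationGraph:
    """Toy variation graph: hypothetical names, forward-strand edges only."""
    nodes: dict = field(default_factory=dict)  # node id -> sequence label
    edges: set = field(default_factory=set)    # (from_id, to_id) pairs

    def add_node(self, node_id, sequence):
        self.nodes[node_id] = sequence

    def add_edge(self, from_id, to_id):
        # Both endpoints must already exist as nodes.
        if from_id not in self.nodes or to_id not in self.nodes:
            raise KeyError("both nodes must be added before the edge")
        self.edges.add((from_id, to_id))

    def spell(self, walk):
        """Concatenate node sequences along a walk, checking each step is an edge."""
        for a, b in zip(walk, walk[1:]):
            if (a, b) not in self.edges:
                raise ValueError(f"no edge from {a} to {b}")
        return "".join(self.nodes[n] for n in walk)


# A single-base variant (T or C) between two shared flanking nodes.
g = VariationGraph()
for node_id, seq in [(1, "GAT"), (2, "T"), (3, "C"), (4, "ACA")]:
    g.add_node(node_id, seq)
for edge in [(1, 2), (1, 3), (2, 4), (3, 4)]:
    g.add_edge(*edge)

print(g.spell([1, 2, 4]))  # one genome's sequence through the graph
print(g.spell([1, 3, 4]))  # the alternative allele
```

Each walk through the graph spells one genome's sequence, so the shared flanks are stored once while the two alleles remain distinct nodes.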
www.ncbi.nlm.nih.gov - Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster...
www.cebitec.uni-bielefeld.de - TACOA is software that can accurately predict the taxonomic origin of genomic fragments from metagenomic data sets by combining the advantages of the k-NN approach with a smoothing kernel function.
TACOA can be easily installed and run on a...
www.pango.network - In the vast majority of instances it is expected that Pango lineage names and designations will conform to the following rules. These rules also act as guidelines for the decisions made by the Lineage Designation...
Types of SSRs (simple sequence repeats): SSRs are short DNA sequences consisting of tandem repeats of a few nucleotides, typically 2-6 nucleotides in length. There are different types of SSRs based on the length and pattern of the repeated...
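As a rough illustration of the definition above, a regular expression can pick out perfect tandem repeats with a 2-6 nucleotide motif. This is a sketch under stated assumptions, not a method from the linked page: the function name `find_ssrs` is hypothetical, the minimum of three copies is an arbitrary threshold chosen here, and imperfect or compound repeats are not handled.

```python
import re

# A motif of 2-6 bases (shortest tried first), followed by at least two more copies.
# The three-copy minimum is an assumption of this sketch.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{2,}")


def find_ssrs(sequence):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        # Skip single-base runs such as AAAAAA, which would otherwise
        # be reported with a motif like "AA".
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


print(find_ssrs("TTGACACACACAGG"))  # a dinucleotide (AC) repeat
print(find_ssrs("GGATGATGATGCC"))   # a trinucleotide (GAT) repeat
```

Start positions are 0-based. Because the motif group is lazy, a run such as ACACACAC is reported with the motif AC rather than ACAC.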