github.com - Long-read sequencing technologies have become increasingly popular in genome projects due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding...
The genome of 130 mammals was sequenced by a large international consortium and the data was analyzed together with 110 existing genomes to allow scientists to identify the important positions in the DNA.
In today’s era of big biology, we’re generating more data than ever before—genomes, transcriptomes, proteomes, metabolomes, microbiomes… you name it. But raw biological data doesn’t speak for itself. Making sense of it requires more than traditional...
www.zbh.uni-hamburg.de - Tallymer is based on enhanced suffix arrays. This gives a much larger flexibility concerning the choice of the k-mer size. Tallymer can process large data sizes of several billion bases. We used it in a variety of applications to study the...
github.com - Welcome to kevlar, software for predicting de novo genetic variants without mapping reads to a reference genome! kevlar's k-mer abundance based method calls single nucleotide variants (SNVs), multinucleotide variants (MNVs),...
Ruby was created by Yukihiro Matsumoto, who wished to create a new language that balanced functional programming with imperative programming
Ruby is a dynamic, reflective, general purpose object-oriented programming language that combines syntax...
github.com - MCE spawns a pool of workers and therefore does not fork a new process per each element of data. Instead, MCE follows a bank queuing model. Imagine the line being the data and bank-tellers the parallel workers. MCE enhances that model by adding the...
Here is a small tutorial on how to make best use of multiple processors for bioinformatics analysis. One best way is using perl threads and forks. Knowing how these threads and forks work is very important before implementing them. Getting to know...