http://ga4gh.org/#/ - GA4GH Data Working Group
Led by David Haussler (UCSC) and Richard Durbin (Sanger Institute), the Data Working Group (DWG) of the Global Alliance brings together the leading Genome Institutes and Centers with IT industry leaders to create global...
github.com - Long-read sequencing technologies have become increasingly popular in genome projects due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding...
The genomes of 130 mammals were sequenced by a large international consortium, and the data were analyzed together with 110 existing genomes to allow scientists to identify the important positions in the DNA.
In today’s era of big biology, we’re generating more data than ever before—genomes, transcriptomes, proteomes, metabolomes, microbiomes… you name it. But raw biological data doesn’t speak for itself. Making sense of it requires more than traditional...
github.com - dna2bit: an ultra-fast and accurate genomic distance estimation software
dna2bit is a software tool developed in C++11, leveraging the capabilities of OpenMP for parallel computing and the popcount technique for efficient bit manipulation.
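As a rough illustration of the popcount idea mentioned above, the sketch below packs DNA into 2 bits per base and counts mismatching positions with XOR plus a bit count. It is not taken from dna2bit: the encoding table and the function names `pack` and `mismatches` are hypothetical, the code is serial (no OpenMP), and the tool's actual distance estimate may be computed quite differently.

```python
# Hypothetical 2-bit encoding; dna2bit's real encoding may differ.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(seq: str) -> int:
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    # XOR leaves a non-zero 2-bit slot wherever the bases differ.
    diff = pack(a) ^ pack(b)
    # Mask with a 1 in the low bit of every 2-bit slot: 0b0101...01
    low_bits = int("01" * len(a), 2) if a else 0
    # Fold each slot's high bit onto its low bit, then keep low bits only,
    # so every differing base contributes exactly one set bit.
    per_base = (diff | (diff >> 1)) & low_bits
    # Popcount: number of set bits equals number of mismatching bases.
    return bin(per_base).count("1")
```

For example, `mismatches("ACGT", "ACGA")` compares four bases with a single XOR and one bit count instead of a per-character loop; compiled tools typically get the same effect on fixed-width machine words with a hardware popcount instruction.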
github.com - SRBreak is a read-depth and split-read package written in R for identifying copy-number variants in next-generation sequencing datasets.
Note: SRBreak was designed to work for multiple samples. It can work for >= 2 samples, but we suggest that...
http://mojolicious.org/ - Back in the early days of the web, many people learned Perl because of a wonderful Perl library called CGI. It was simple enough to get started without knowing much about the language and powerful enough to keep you going, learning by doing was...
github.com - Using the ChIP-seq simulation pipeline ChIPulate, we assess the impact of various biological and experimental sources of variation on several outcomes of a ChIP-seq experiment, viz., the recoverability of the TF binding motif, accuracy of TF-DNA binding...
github.com - UniAligner (formerly, TandemAligner) is the first parameter-free algorithm for sequence alignment that introduces a sequence-dependent alignment scoring that automatically changes for any pair of compared sequences. Classical alignment approaches,...