samtools.sourceforge.net - In the current genomic era, our day-to-day work involves handling huge genome sequences, expression data, and several other datasets. This link provides a comprehensive list of commonly used software/tools.
broadinstitute.github.io - Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the Hts-specs repository. See especially the SAM specification and the VCF...
Bioinformaticians often need to parse binary files such as BAM and SFF. The advantage of binary files is that they take up less storage space while retaining maximum information content.
A link for those looking for the structure of BAM and SFF...
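To make the idea of parsing a binary format concrete, here is a minimal sketch that reads only the start of a BAM header. It relies on two facts from the SAM/BAM specification: a BAM file is BGZF-compressed (readable as gzip), and the decompressed stream begins with the magic bytes `BAM\1`, a 32-bit little-endian header length, the header text, and a 32-bit reference count. The function names are hypothetical, and this is not a full parser; real work would normally go through an established library.

```python
import gzip
import struct


def read_bam_header(stream):
    """Read the header text and reference count from a decompressed BAM stream.

    `stream` is any binary file-like object that yields the decompressed bytes.
    """
    magic = stream.read(4)
    if magic != b"BAM\x01":
        raise ValueError("not a BAM stream: bad magic bytes")
    # l_text: length of the plain-text SAM header, little-endian int32
    (l_text,) = struct.unpack("<i", stream.read(4))
    # The header text may be NUL-padded, so strip trailing NULs
    text = stream.read(l_text).rstrip(b"\x00").decode("ascii", errors="replace")
    # n_ref: number of reference sequences, little-endian int32
    (n_ref,) = struct.unpack("<i", stream.read(4))
    return text, n_ref


def read_bam_header_from_file(path):
    """Convenience wrapper (hypothetical helper): open a BAM file via gzip."""
    with gzip.open(path, "rb") as fh:
        return read_bam_header(fh)
```

Calling `read_bam_header_from_file("example.bam")` (a placeholder path) would return the SAM header text and the number of reference sequences.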
userweb.eng.gla.ac.uk - This webpage lists some of the one-liners that we frequently use in metagenomic analyses. You can click on the following links to browse through different topics. You can copy/paste the commands as they are in your terminal screen, provided you...
bedtools.readthedocs.io - Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks. The most widely-used tools enable genome arithmetic: that is, set theory on the genome. For...
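As an illustration of what "set theory on the genome" means, here is a small sketch of interval intersection on a single chromosome. It is not bedtools' implementation; the function name is hypothetical, and it assumes each input is a sorted list of non-overlapping, half-open `(start, end)` intervals, as in BED coordinates.

```python
def intersect(a, b):
    """Return the overlapping segments of two interval lists.

    Assumes `a` and `b` are each sorted by start, contain no internal
    overlaps, and use half-open (start, end) coordinates.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        # The overlap, if any, runs from the later start to the earlier end
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


print(intersect([(0, 10), (20, 30)], [(5, 25)]))  # [(5, 10), (20, 25)]
```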
github.com - This will help us reduce the amount of drive space we take up and decrease data transfer times.
Quip compresses next-generation sequencing data with extreme prejudice. It supports input and output in...
gear.embl.de - The easiest way to get Alfred is to download a statically linked binary from the Alfred GitHub release page. Alternatively, you can build Alfred from source. Alfred dependencies are included as submodules so you need to do a recursive...
github.com - mosdepth can output:
per-base depth about 2x as fast as samtools depth--about 25 minutes of CPU time for a 30X genome.
mean per-window depth given a window size--as would be used for CNV calling.
the mean per-region given a BED file of regions.
a...
The key to finding a solution is to notice that most genomic sequences differ by very little. It may well be that the number of complete genome sequences being stored is increasing rapidly, but the actual amount of new data is very small. In...
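One way to exploit this observation is to store each genome as a list of differences from a shared reference. The sketch below is only an illustration and not necessarily the method the linked article goes on to describe. The function names are hypothetical, and it assumes the sample and reference have equal length and differ only by substitutions (no insertions or deletions).

```python
def encode(reference, sample):
    """Return the (position, base) pairs where `sample` differs from `reference`."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def decode(reference, diffs):
    """Rebuild the sample sequence from the reference and its stored differences."""
    seq = list(reference)
    for pos, base in diffs:
        seq[pos] = base
    return "".join(seq)


reference = "ACGTACGT"
sample = "ACGTTCGT"
diffs = encode(reference, sample)
print(diffs)  # [(4, 'T')]
assert decode(reference, diffs) == sample
```

The stored difference list grows with the number of variants rather than with genome length, which is why nearly identical genomes compress so well against a reference.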