The purpose of this cheat sheet is to introduce biologists and bioinformaticians to frequently used tools for NGS analysis and to give them practice writing one-liners.
File System ls — list items in current directory ls...
Genome browsers are useful not only for showing final results but also for improving analysis protocols, testing data quality, and generating result drafts. Their integration into analysis pipelines allows the optimization of parameters, which...
Studying biological pathways is key to understanding the different processes inside a cell: proteins exert their function not in isolation but in a tightly controlled network of interactions and reactions. Activation of a pathway typically leads...
Network analysis is any structured technique used to mathematically analyze a circuit (a “network” of interconnected components). Network analysis provides the ability to quantify associations between individuals, which...
To convert a SAM file to BAM, use the following commands:
Code:
$ samtools view -b -S file.sam > file.bam
Then you will need to use
Code:
$ samtools sort -o file-sorted.bam file.bam
followed by
Code:
$ samtools index...
There are several ways you can convert fastq to fasta sequences. Some methods are listed below.
Using SED
sed can be used to selectively print the desired lines from a file, so if you print the first and second lines of every four, you get the...
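The idea can be sketched as follows. This is a minimal example, not necessarily the exact command the post goes on to give. It assumes GNU sed (the `first~step` address form is a GNU extension) and a FASTQ file with strictly four lines per record. The file names `sample.fastq` and `sample.fasta` are placeholders.

```shell
# Build a tiny two-record FASTQ file (hypothetical sample data).
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nJJJJ\n' > sample.fastq

# GNU sed only: on lines 1, 5, 9, ... replace the leading '@' with '>' and print;
# on lines 2, 6, 10, ... print the sequence line unchanged.
# Assumes every record is exactly four lines (no wrapped sequences).
sed -n '1~4s/^@/>/p;2~4p' sample.fastq > sample.fasta

cat sample.fasta
```

Header lines are matched by position, not by content, so a quality string that happens to start with `@` is not mistaken for a header.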
BBSplit internally uses BBMap to map reads to multiple genomes at once and determine which genome they match best. This differs from ordinary mapping. If a genome (say, human) contains an exact repeat somewhere, reads mapping to it will...
Perl's second wave of adoption came from the growth of the World Wide Web. Dynamic web pages, the precursor to modern web applications, were easy to create with Perl and CGI. Thanks to Perl's ubiquity as a language for system...
Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format. Different from mainstream...