github.com - MEC, to identify and correct misassemblies in contigs. Firstly, MEC takes fragment coverage as the feature to detect the candidate misassemblies. Then, it can distinguish a large number of false positives from the candidate misassemblies based on...
www.bx.psu.edu - We describe a new method for predicting the ancestral order and orientation of those intervals from their observed adjacencies in modern species. We combine the results from this method with data from chromosome painting experiments to produce a map...
github.com - SvABA is a method for detecting structural variants in sequencing data using genome-wide local assembly. Under the hood, SvABA uses a custom implementation of SGA (String Graph Assembler) by Jared Simpson, and BWA-MEM by Heng Li....
github.com - RefKA, a reference-based approach for long read genome assembly. This approach relies on breaking up a closely related reference genome into bins, aligning k-mers unique to each bin with PacBio reads, and then assembling each bin in parallel...
The School of Biotechnology offers a curriculum that reflects the multidisciplinary nature of Biotechnology, integrating theoretical and applied science in undergraduate and graduate courses. The school has six departments with about 300 employees,...
The purpose of this cheat sheet is to introduce biologists and bioinformaticians to frequently used tools for NGS analysis and to give them experience in writing one-liners.
File System
ls - list items in current directory
ls...
github.com - HapSolo, that identifies secondary contigs and defines a primary assembly based on multiple pairwise contig alignment metrics. HapSolo evaluates candidate primary assemblies using BUSCO scores and then distinguishes among candidate assemblies using...
To remove all line ends (\n) from a Unix text file:
sed ':a;N;$!ba;s/\n//g' filename.txt > newfilename_oneline.txt
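The sed one-liner above reads the whole file into the pattern space with a label loop, and some sed implementations may not accept a label followed by `;` on one line. As a sketch of an alternative, `tr -d '\n'` deletes every newline while streaming the input. The function name `strip_newlines` is a hypothetical helper, not part of the original cheat sheet.

```shell
# Hypothetical helper (not from the cheat sheet): print a file with all
# newline characters removed. tr reads from stdin, so redirect the file in.
strip_newlines() {
    tr -d '\n' < "$1"
}

# Usage, mirroring the sed example:
#   strip_newlines filename.txt > newfilename_oneline.txt
```

Like the sed version, this also removes the final newline, so the output file ends without one.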
To get average for a column of numbers (here the second column $2):
awk '{ sum += $2; n++ } END { if (n > 0) print sum / n;...
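The same averaging idea can be wrapped in a small function that takes the column number as a parameter. This is a minimal sketch under assumptions: the name `column_mean` and the `col` variable are hypothetical, input is whitespace-delimited, and every line is counted, including a header line if one is present.

```shell
# Hypothetical helper (not from the cheat sheet): mean of one column.
#   $1 = column number (1-based), $2 = input file
column_mean() {
    # -v passes the column number into awk; $col then selects that field.
    # The n > 0 guard avoids dividing by zero on an empty file.
    awk -v col="$1" '{ sum += $col; n++ } END { if (n > 0) print sum / n }' "$2"
}

# Usage: mean of the second column
#   column_mean 2 filename.txt
```

Non-numeric fields are treated as 0 by awk, so a header row would lower the mean; skip it first (for example with `tail -n +2`) if the file has one.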