cosmos.hms.harvard.edu - COSMOS is our Python-based management system for implementing large-scale parallel workflows, focusing on, but not restricted to, large-scale short-read "NGS" sequencing data. It is published open access via Advance Access in Bioinformatics (Gafni et al....
My research group consists primarily of computer science graduate students and postdocs with expertise in algorithms, statistical inference, and machine learning, who share a passion for understanding fundamental biological problems.
We work in...
The goal of our research is to better understand the biology of microbial organisms of significant ecological, veterinary and medical importance.
To achieve this goal, our team combines the power of next-generation DNA sequencing and...
ratt.sourceforge.net - RATT is software to transfer annotation from a reference (annotated) genome to an unannotated query genome.
It was first developed to transfer annotations between different genome assembly versions. However, it can also transfer annotations between...
github.com - Pilon is a software tool which can be used to:
Automatically improve draft assemblies
Find variation among strains, including large event detection
Pilon requires as input a FASTA file of the genome along with one or more BAM files of reads...
bitbucket.org - The RCircos package provides a simple and flexible way to make Circos 2D track plots with R, and it can be easily integrated into other R data processing and graphic manipulation pipelines for presenting large-scale multi-sample genomic research data. It...
github.com - In a nutshell
Anvi’o is an analysis and visualization platform for ‘omics data.
Please find the methods paper here: https://peerj.com/articles/1319/
Anvi’o would not have been possible without the help of many people who...
journals.plos.org - Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for...
drive5.com - USEARCH: Extreme high-throughput sequence analysis. Orders of magnitude faster than BLAST.
MUSCLE: Multiple sequence alignment. Faster and more accurate than CLUSTALW.
UPARSE: OTU clustering for 16S and other marker genes....
cran.r-project.org - Most variant calling pipelines result in files containing large quantities of variant information. The variant call format (VCF) is an increasingly popular format for this data. The format of these files and their content is discussed in...