The Roth Lab seeks insight into biological systems through genome- and proteome-scale experimentation and analysis.
Current computational interests:
Systematic analysis of genetic epistasis to identify redundant or compensatory systems and to...
With the emergence of NGS technologies, and sequencing data most of the bioinformaticians mung and wrangle around massive amounts of genomics text. There are several "standardized" file formats (FASTQ, SAM, VCF, etc.) and some tools for manipulating...
lh3.github.io - Heng Li posted several issues with the human reference genomes given in these resources and suggests the following compressed FASTA file to be used as hg38/GRCh38 human reference genome.
if you map reads to GRCh38 or hg38, use the...
Integrated solutions CLCbio Genomics Workbench - de novo and reference assembly of Sanger, Roche FLX, Illumina, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection,...
github.com - JBrowse is a fast, embeddable genome browser built completely with JavaScript and HTML5, with optional run-once data formatting tools written in Perl.
Headline Features:
Fast, smooth scrolling and zooming. Explore your genome with unparalleled...
www.nature.com - Validated a widely accessible approach that can be used to establish functional causality for noncoding sequence variants identified by GWASs.
https://www.nature.com/articles/nm.3975
github.com - Cogent is a tool that identifies gene families and reconstructs the coding genome using high-quality transcriptome data without a reference genome, and can be used to check assemblies for the presence of these known coding...
Complete, accurate replication of the genome is essential for life. All chromosomes in eukaryotic cells must be duplicated and then segregated to daughter cells to ensure genetic integrity and produce the large number of cells that make up a...
github.com - With advances in Cancer Genomics, Mutation Annotation Format (MAF) is being widely accepted and used to store somatic variants detected. The Cancer Genome Atlas Project has sequenced over 30 different cancers with sample size of each cancer type...