Our research group focuses on the analysis of whole-genome sequence data to identify genetic variation (primarily structural variation) and examine its potential functional impact on disease phenotypes. We are particularly interested...
Raphael Lab research is focused on Bioinformatics and Computational Biology.
Current research interests include next-generation DNA sequencing, structural variation, genome rearrangements in cancer and evolution, and network analysis of somatic...
This research group works on problems from the fields of Bioinformatics, Biotechnology, Data Mining, and Information Retrieval. The group's research projects include Comparative Genomics of Bacterial genomes, Metagenomics, Genomic databases,...
www.ncbi.nlm.nih.gov - The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny,...
http://busco.ezlab.org/ - Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs
crossmap.sourceforge.net - CrossMap is a program for convenient conversion of genome coordinates (or annotation files) between different assemblies (such as Human hg18 (NCBI36) <> hg19 (GRCh37), Mouse mm9 (MGSCv37) <> mm10 (GRCm38)).
It supports most commonly...
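The idea behind converting coordinates between assemblies can be sketched as follows. This is a minimal illustration, not CrossMap's implementation: CrossMap itself works from UCSC chain files, which this sketch does not parse. The function name `lift_position`, the block tuple layout, and the example coordinates are all hypothetical, and the sketch assumes same-strand, gap-free blocks with 0-based half-open coordinates.

```python
def lift_position(chrom, pos, blocks):
    """Map a 0-based position from a source assembly to a target assembly.

    blocks is a list of hypothetical aligned segments, each given as
    (src_chrom, src_start, src_end, dst_chrom, dst_start), half-open,
    same strand. Returns (dst_chrom, dst_pos), or None when the position
    falls outside every block (i.e. it has no counterpart in the target).
    """
    for src_chrom, src_start, src_end, dst_chrom, dst_start in blocks:
        if chrom == src_chrom and src_start <= pos < src_end:
            # Keep the same offset from the start of the aligned block.
            return dst_chrom, dst_start + (pos - src_start)
    return None


# Made-up blocks for illustration only; real offsets come from a chain file.
example_blocks = [
    ("chr1", 100, 200, "chr1", 150),
    ("chr1", 300, 400, "chr1", 340),
]

mapped = lift_position("chr1", 120, example_blocks)
unmapped = lift_position("chr1", 250, example_blocks)
```

Positions that fall between blocks return `None` here; a real conversion tool has to decide how to report such unmappable records.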
broadinstitute.github.io - Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the Hts-specs repository. See especially the SAM specification and the VCF...
mira-assembler.sourceforge.net - MIRA is a multi-pass DNA sequence data assembler/mapper for whole genome and EST/RNASeq projects. MIRA assembles/maps reads gained by:
electrophoresis sequencing (aka Sanger sequencing)
454 pyro-sequencing (GS20, FLX or Titanium)
Ion...
www.bioinformatics.babraham.ac.uk - Understanding the following table and graphs:
Duplication level
kmer profile
per base GC content
per base N content
per base quality
per base sequence content
per sequence GC content
per sequence quality
sequence length distribution
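A few of the metrics listed above can be sketched in plain Python to show what each one measures. This is a rough illustration, not the quality-control tool's own implementation: the function names are hypothetical, the reads are made up, and computing GC content over called (non-N) bases only is an assumption of this sketch.

```python
from collections import Counter


def gc_fraction(seq):
    """Per-sequence GC content: fraction of G/C among called (non-N) bases."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def per_base_n_content(reads):
    """Per-base N content: at each position, the fraction of reads that
    cover the position and have an N there."""
    max_len = max((len(read) for read in reads), default=0)
    n_counts = [0] * max_len
    coverage = [0] * max_len
    for read in reads:
        for i, base in enumerate(read.upper()):
            coverage[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [n / c for n, c in zip(n_counts, coverage)]


def length_distribution(reads):
    """Sequence length distribution: number of reads at each length."""
    return Counter(len(read) for read in reads)


# Made-up reads for illustration only.
reads = ["ACGT", "GGCC", "ANNT", "AT"]
gc_values = [gc_fraction(read) for read in reads]
n_profile = per_base_n_content(reads)
lengths = length_distribution(reads)
```

Each function maps to one row of the list: `gc_fraction` to per sequence GC content, `per_base_n_content` to per base N content, and `length_distribution` to sequence length distribution.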
github.com - Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION). The software is currently alpha level; feel free to use it and report any issues encountered.
Canu is...