offers many different tools including alignment, RNA-Seq, DNA-Seq, ChIP-Seq, Small RNA-Seq, Genome Browser, visualizations, Biological Interpretation, etc. Supports workflows “one can import the sample data in FASTA, FASTQ or tag-count format. In addition, prealigned data in SAM, BAM or Illumina-specific ELAND format can be directly imported for analysis.”
Alignment feature: Supports alignment from Illumina, Ion Torrent, 454 (Roche), and Pac Bio
(QIAGEN). Features include: resequencing, workflow, read mapping, de novo assembly, variant detection, RNA-Seq, ChIP-Seq, Genome Browser, etc (entire list on website); Main Workbench offers database search (Genbank, Blast, Pubmed); 2000 organizations have invested in CLC
Accepts VCF files from 1000 Genomes Project
Accepts downloaded tracks from dbSNP
Also accepts: FASTA, GFF/GTF/GVF, BED, Wiggle, COSMIC, UCSC variant database, Complete Genomics masterVar file
Read mapping: “In addition to Sanger sequence data, reads from these high-throughput sequencing machines are supported: The 454 FLX System and the 454 GS Junior System from Roche, Illumina Genome Analyzer, Illumina HiSeq, Illumina HiScan, and Illumina MiSeq sequencing systems, SOLiD system from Life Technologies, Ion Torrent system from Life Technologies, Helicos from Helicos BioSciences”
De novo assembly: “In addition to Sanger sequence data, reads from these high-throughput sequencing machines are supported The 454 FLX System and the 454 GS Junior System from Roche, Illumina Genome Analyzer, Illumina HiSeq, Illumina HiScan, and Illumina MiSeq sequencing systems, SOLiD system from Life Technologies, Ion Torrent system from Life Technologies”
Private cloud repository; formerly a redistributor of SRA and other NCBI resources. Usable from the command line or via the web; can fetch data from a URL and build custom pipelines/workflows. Has the sra.dnanexus.com site: data downloads come directly from NCBI
(QIAGEN) allows for variant identification and analysis; uses the NCI-60 data set for cancer. Supported third-party information: Entrez Gene, RefSeq, ClinVar; gives contextual details of results instead of just an A-to-B relationship
Has own database-- “knowledge base” based on COSMIC, OMIM, and TCGA databases
Comprehensive NGS software pipeline for assembly, alignment, variant calling and analysis of NGS data
Supported workflows include: reference-guided and de novo genome and transcriptome assembly and analysis, metagenomics sample assembly, targeted resequencing, exome alignment, gene panels with validation control, variant analysis, and RNA-Seq, ChIP-Seq and miRNA alignment and analysis.
#1 in accuracy: fewer false negatives and better sensitivity compared to results obtained from other aligners
Aligns exome data and performs variant calling an average of 3 times faster than alternative pipelines
Annotates genomic data with allele and genotype frequency, functional impact predictions, evolutionary conservation scores and pathogenicity
Supports all major NGS technologies (Illumina, Ion Torrent, Pac Bio and Roche 454) and project types
Available on Windows, Mac OS X, Linux, and the Amazon Cloud
“perfect analytical partner for the analysis of desktop sequencing data produced by the ION PGM™, Roche Junior, Illumina MiSeq as well as high throughput systems as the Ion Torrent Proton, Roche FLX, Applied BioSystems SOLiD™ and Illumina® platforms.” runs on Windows, free-standing multi-application package-- SNP/Indel analysis, CNV prediction and disease discovery, whole genome alignment, etc.
Cited in over 3,500 peer-reviewed scientific publications
Workflows for microarray and PCR data include: Gene expression including alternative splicing, miRNA expression, Genome Wide Association Studies, Mother-Father-Child Trio analysis, DNA Copy number including allele specific copy number and Loss of Heterozygosity (LOH), and ChIP, and methylation. Next Generation Sequencing (NGS) workflows include: RNA-Seq, miRNA-Seq, ChIP-Seq, DNA-Seq, and Methylation
Powerful statistics and interactive, publication ready visualizations
Supports all commercial next generation sequencing and microarray file format as well as text files
Open source platform (SaaS), analysis and genome sequencing tools, integrates over 400 genomic analysis open source tools and pipelines, have a private and public cloud version. Features: genomic data visualization, drag and drop interface, accelerated analysis, real-time collaboration
They have a couple modules to do so, and have enabled parts of the sra toolkit
Description: phases observed genotypes and imputes missing genotypes; uses reference panels to provide all available haplotypes; does not use population labels or genome-wide measures; designed to represent variation in one population; Fairly popular
Input:
Reference Haplotypes: Links to 1000 Genomes and HapMap downloads
Description: algorithm searches graphs produced by de novo assembler Cortex; c++ source code for SNP detection “2kplus2.cpp is a c++ source code for the detection and the classification of single nucleotide polymorphisms in transformed De Bruijn graphs using Cortex assembler.”
Description: identifies SNPs and INDELs from pooled high-throughput NGS, not used for analysis of single samples; implemented in C and uses SAMtools API; latest version should work with diploid genomes
Description: (Wellcome Trust Sanger) calls small indels from short-read sequences; can only handle Illumina data; cannot test candidate indels; written in C++; used on Linux-based and Mac computers (not tested on Windows)
Description: family-based sequencing studies- provides probability of an individual carrying variant based on family’s raw measurements; accommodates de novo mutations, can perform variant calling at chrX;
Description: uses likelihood-based model for variant calling, starts from genotype likelihoods that have been computed from other tools (ex. Samtools BAQ), the likelihoods combine with individual-based prior p(genotype) to generate posterior probabilities
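The step of combining genotype likelihoods with a prior can be illustrated with Bayes' rule. This is a minimal sketch and not the tool's own implementation; the function name genotype_posteriors and all numeric values are hypothetical and chosen only for illustration.

```python
def genotype_posteriors(likelihoods, priors):
    """Combine per-genotype likelihoods with priors using Bayes' rule.

    Both arguments are dicts keyed by genotype label (e.g. "0/1").
    Returns a dict of posterior probabilities that sum to 1.
    """
    unnormalized = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all genotype likelihood x prior products are zero")
    return {g: value / total for g, value in unnormalized.items()}


# Hypothetical likelihoods p(reads | genotype), e.g. computed by another tool.
likelihoods = {"0/0": 0.001, "0/1": 0.2, "1/1": 0.01}
# Hypothetical individual-based prior p(genotype).
priors = {"0/0": 0.98, "0/1": 0.019, "1/1": 0.001}

posteriors = genotype_posteriors(likelihoods, priors)
best_genotype = max(posteriors, key=posteriors.get)
```

With these made-up numbers the heterozygous genotype has the largest posterior, even though its prior is small, because its likelihood is much higher than the others.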
Description: (Broad Institute): does not perform realignment; relies on the alignments in the BAM files (reads must be aligned before the BAM files are passed to Indelocator); recommended to run GATK first;
Input: 2 BAM files(tumor & normal), annotated as germline or somatic; also has single sample mode
Output: “Output of Indelocator is a high-sensitivity list of putative indel events containing large numbers of false positives. The statistics reported for each event have to be used to custom-filter the list in order to lower false positive rate”
Description: SNV caller, Python language, standalone program, uncovers cell-population heterogeneity from high-throughput sequencing datasets; calls variants found in <.05% of the population
Input: BAM file input→ suggest running through GATK
Description: SNV caller, specifically tailored to Oxford Nanopore reads, written in Python; Package comes with 3 programs: marginAlign, marginCaller (calls SNVs), marginStats (computes QC stats on SAM files)
Description: Last release March 2014; for analyzing sequencing data in family studies of inherited diseases; variant calls for a family in VCF file; still in alpha-testing on github, example data uses 1000 genomes dataset
Description: UCSC Nanopore group (group at UCSC studying using ion channels for analysis of single RNA/DNA structures) software pipeline; tailored to Oxford Nanopore Reads; command line program
Input: FASTQ
Reference files: FASTA
Output: “For each possible pair of read file, reference genome and mapping algorithm an experiment directory will be created in the nanopore/output directory.”
Description: Package program, written in C, Python, Cython; Can identify SNPs, MNPs, short indels, and larger variants; has been tested on very large datasets (1000 genomes)
Input: BAM
Reference Genome: FASTA (files must be indexed using Samtools or a similar program)
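To make the indexing requirement concrete: a samtools faidx index (.fai) stores, for each sequence, its name, length, byte offset of the first base, bases per line, and bytes per line. The sketch below computes those five values in pure Python for an in-memory FASTA. It is an illustration only and assumes well-formed input with uniform line lengths; the function name build_fai_entries and the example sequences are hypothetical. In practice the index should be produced with samtools itself.

```python
def build_fai_entries(fasta_bytes):
    """Return a list of (name, length, offset, linebases, linewidth) tuples."""
    entries = []
    current = None
    offset = 0  # byte position of the start of the current line
    for line in fasta_bytes.splitlines(keepends=True):
        if line.startswith(b">"):
            if current is not None:
                entries.append(tuple(current))
            name = line[1:].split()[0].decode()
            # The sequence starts right after the header line.
            current = [name, 0, offset + len(line), 0, 0]
        elif current is not None:
            bases = len(line.rstrip(b"\r\n"))
            if current[3] == 0:
                current[3] = bases      # bases on the first sequence line
                current[4] = len(line)  # bytes on that line, incl. newline
            current[1] += bases
        offset += len(line)
    if current is not None:
        entries.append(tuple(current))
    return entries


# Hypothetical two-sequence reference.
example_fasta = b">chr1 test\nACGT\nAC\n>chr2\nGG\n"
fai_entries = build_fai_entries(example_fasta)
```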
Description: detection of SNPs; “can be used as a standalone application with graphical user interface as part of pipeline system”; does not require fully sequenced reference genome; haplotype strategy
Description: calls common and rare variants in pool or individual NGS data, reports overall p-value, operating system independent statistical tool, identifies SNPs and INDELs, written in Java, no dependencies, straightforward command-line
(SNVerGUI=GUI version) --SNVerGUI: desktop tool for variant detection
Description: standalone program, “pipeline for the integrated analysis of somatic variants in cancer genomes”; integrates four algorithms; written in Perl; required tools: samtools, tabix, vcftools, VarScan2, bambino, cmake, somaticsniper (User guide; workflow page)
Input: tumor and normal reads in BAM files, run through variant calling programs to generate intermediate VCF
Description: Improved Bayesian inversion somatic caller; unlike other software packages, treats effects fully probabilistically instead of using ad-hoc modeling; effects are integrated at the atomic level and standard probability theory integrates read tallies to the sample level and to the tumor-normal pair level; "pending public release"
Description: standalone program; “accurate detection of copy number alteration and loss of heterozygosity in impure and aneuploid tumor samples using whole genome sequencing data”
Input: depth file generated by DFExtract and a config file
Output: .results file, .Gtype, LOG.txt, also generates visualization
Description: Empirical Bayesian Mutation Calling; standalone program; uses tumor/normal paired reads and non-paired normal reference samples; dependent on samtools, R and the VGAM package for R
Input: BAM
Output: not sure what exact type of file- “The format of the result is suitable for adding annotation by annovar.”
Description: command-line program; calls SNVs from NGS data from multiple samples from the same patient; dependent on R, Git, cmake, Boost and compile libraries
Description: RNA and DNA Integrated Analysis for Somatic Mutation Detection; DNA Only Method (tumor/normal pair, ignores RNA) or Triple BAM Method (uses all three datasets from the same patient); dependent upon Python, samtools, pysam API, BLAT, SnpEff
Input: BAM
Reference Genome: FASTA indexed with SAMtools faidx
Description: finds somatic mutations using integration of DNA and RNA seq data-- boosts sensitivity for low purity tumors and rare mutations;
Input: “can accept a variety of sequencing inputs and configurations”
Output: “table of somatically mutated sites and associated information. These somatic mutations can be annotated with predicted transcript and protein effects using third party tools, such as Annovar”
Description: Virtual Microdissection for SNP calling; Java based; for disease-control matched samples; uncovers SNPs with low allele frequency by considering alpha contamination
Input: BAM (must be sorted and indexed- samtools sort)
Description: software package for genotype calling, phasing, imputation of ungenotyped markers, and identity-by-descent segment detection (unsure if this one is in the right category)
Description: software package, SNV calling from normal-tumor pair and two parent genomes; quantifies descent-by-modification relationships; Written in Java
Description: RAre REference VAriant annotaTOR; command line; “identification and annotation of germline and somatic variants in rare reference allele loci from second generation sequencing data”; Bayesian genotype likelihood model
Input: BED or VCF files from GATK
Output: two VCF files (one for SNVs, one for Indels)
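As an illustration of what separating SNVs from indels in VCF output involves, the sketch below partitions VCF text lines by comparing REF and ALT allele lengths. It is not the tool's own code; the function name split_vcf_records and the example records are hypothetical, and the classification is deliberately simplified.

```python
def split_vcf_records(vcf_lines):
    """Partition VCF lines into (header, snvs, indels).

    A record is treated as an SNV when REF and every ALT allele are a
    single base. Everything else lands in the indel list; in this
    simplified sketch that also includes MNPs and symbolic alleles.
    """
    header, snvs, indels = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            header.append(line)
            continue
        fields = line.split("\t")
        ref = fields[3]
        alts = fields[4].split(",")
        if len(ref) == 1 and all(len(alt) == 1 for alt in alts):
            snvs.append(line)
        else:
            indels.append(line)
    return header, snvs, indels


# Hypothetical records for illustration.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "1\t200\t.\tAT\tA\t50\tPASS\t.",
    "2\t300\t.\tC\tCGG\t50\tPASS\t.",
]
vcf_header, snv_records, indel_records = split_vcf_records(example_vcf)
```

Each list can then be written to its own file, with the header lines repeated at the top of both.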
Description: Used for detecting indels in a reference genome; performs localized micro-assembly of specific regions of interest; can do single, de novo, somatic reads; requires that raw reads are aligned with BWA
Description: Analysis Tool for Heritable and Environmental Network Associations; software package, combines machine learning model with biology and statistics to predict non-linear interactions
Input: Configuration file, Data file, Map file (includes rsID)
Output: Summary file, Best model file, dot file, individual score file, cross-validation file
Description: Genome Wide Complex Trait Analysis; package program, command line interface; estimates variance by all SNPs; 5 main functions: “data management, estimation of the genetic relationships from SNPs, mixed linear model analysis of variance explained by the SNPs, estimation of the linkage disequilibrium structure, and GWAS simulation”
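The "estimation of the genetic relationships from SNPs" step can be sketched with a commonly used standardized estimator: for individuals j and k, average over SNPs the product of their centered allele counts divided by the expected variance 2p(1-p). Whether this matches GCTA's exact computation (particularly for the diagonal) is an assumption; the function name and genotype matrix below are hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Estimate pairwise genetic relationships from allele counts.

    genotypes: list of rows, one per individual, each a list of
    allele counts (0, 1 or 2) per SNP. Monomorphic SNPs are skipped
    because their expected variance is zero.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    freqs = [sum(row[i] for row in genotypes) / (2 * n_ind) for i in range(n_snp)]
    usable = [i for i, p in enumerate(freqs) if 0 < p < 1]
    if not usable:
        raise ValueError("no polymorphic SNPs available")
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(j, n_ind):
            total = 0.0
            for i in usable:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[j][k] = grm[k][j] = total / len(usable)
    return grm


# Hypothetical data: 3 individuals x 3 SNPs.
example_genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
]
grm = genetic_relationship_matrix(example_genotypes)
```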
Description: package for analysis of complete genome data; annotation using public data or custom tracks, automated primer design for Sanger or Sequenom validation; “The cg process_illumina command can be used to generate annotated multisample data starting from fastq files, using tools such as bwa for alignment and GATK and samtools for variant calling. Sequencing data can also be imported from Complete Genomics (cg_process_sample command), Real Time Genomics (cg_process_rtgsample command) and VariantCallFormat (VCF) variant files (vcf2sft command).”
Input: Sequencing data from Complete Genomics, Illumina, SOLiD and VCF;
Output: standard file format used is a simple tab delimited file (.sft, .tsv)
Description: animal gene mapping; “genomic prediction and variance component estimation of additive and dominance effects”; standalone program, command line interface, written in C++ and Java
Description: Analyzes heterogeneity with respect to single marker loci or known maps of markers; Carries out homogeneity test for alternative hypothesis “Two family types, one with linkage between a trait to a marker or map of markers, the other without linkage”
Description: GWIA for case-control SNP and quantitative traits; selected for joint analysis using a priori information; Provides linear regression framework, Pathway Association Analysis, Genome-wide Haplotype Analysis
Description: Currently only the standalone version available, but moving to LIMIX software suite; offers set tests- allows for testing between variants and traits; accounts for confounding factors ex. relatedness
Input: sample-to-sample genetic covariance matrix needs to be computed; multiple types of input; simulator requires input genotype and relatedness component;
Output: resdir (result file of analysis), outfile (test statistics and p-values), manhattan_plot (flag)
Description: command-line tool, supports SNPs, INDELs, CNVs and block substitutions, provides wide variety of annotation techniques, depends upon multiple databases (each needing to be downloaded); annotates genetic variants; utilizes RefSeq, UCSC Genes, and the Ensembl gene annotation systems; can compare detected mutations against dbSNP or the 1000 Genomes Project; Very popular. From the documentation: “The final command run TABLE_ANNOVAR, using dbSNP version 138, 1000 Genomes Project 2014 Oct version, NIH-NHLBI 6500 exome database version 2 (referred to as esp6500siv2), dbNFSP version 2.6 (referred to as ljb26), dbSNP version 138 (referred to as snp138) databases and remove all temporary files, and generates the output file called myanno.hg19_multianno.txt”
Input: VCF, ANNOVAR input format (simple text-based format); can convert other formats into ANNOVAR input format
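To show what the simple text-based input looks like, the sketch below converts a single-base substitution from a VCF line into the five leading columns of the ANNOVAR input format (chromosome, start, end, reference allele, alternate allele). It handles only simple SNVs; indels and multi-allelic sites need extra coordinate handling and are better left to ANNOVAR's own conversion script. The function name vcf_snv_to_annovar and the example record are hypothetical.

```python
def vcf_snv_to_annovar(vcf_line):
    """Convert one single-base VCF record into an ANNOVAR-style input line.

    Raises ValueError for anything that is not a simple SNV, because
    this sketch does not implement indel coordinate adjustment.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
    if len(ref) != 1 or len(alt) != 1 or alt in {".", "*"}:
        raise ValueError("only simple single-base substitutions are supported")
    # For an SNV the start and end coordinates are identical.
    return "\t".join([chrom, str(pos), str(pos), ref, alt])


# Hypothetical record for illustration.
annovar_line = vcf_snv_to_annovar("1\t12345\trs0\tA\tG\t50\tPASS\t.")
```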
Description: Very popular; Polymorphism Phenotyping; Web application; predicts impact of amino acid substitution on protein; Calculates Bayes posterior probability (Last update July 2015)
Description: predicts how an amino acid substitution will affect protein function; Based on degree of conservation of amino acid residues, collected through PSI-BLAST; can be applied to nonsynonymous polymorphisms or laboratory-induced missense mutations; links to dbSNP 132, GRCh37; Standalone or web app program; Very popular
Input: Uniprot ID or Accession, Go term ID, Function name, Species Name or ID, etc
Description: Genetic variant annotation and effect prediction toolbox; integrated with Galaxy, GATK, and GKNO; can annotate SNPs, MNPs, and insertions and deletions; categorizes effects into classes by functionality; Very popular; Standalone or Web app; Claims to analyze all SNPs in 1000 genomes (EMBI) in less than 15 minutes; Provides assessment of impact of the variant (low, medium or high)
Input: VCF, BED
Output: VCF (with new ANN field, also used in ANNOVAR and VEP), HTML summary files
Description: Filter and manipulate annotated files; Part of SnpEff main distribution; once variants have been annotated, this can be used to filter your data to find relevant variants
Description: Variant Annotation, Analysis, and Search Tool; probabilistic search tool for identifying damaged genes and the disease-causing variants; can score both coding and non-coding variants; Four tools: VAT (Variant Annotation Tool), VST (Variant Selection Tool), VAAST, pVAAST (for pedigree data); updated April 2015
Input: FASTA, GFF3, GVF
Output: CDR (condenser file), VAAST file (both unique to VAAST)
Description: (Broad Institute); can estimate purity and ploidy to compute absolute copy number and mutation multiplicities; re-extracts data from the mixed DNA population
Input: HAPSEG segdat or segmentation file
Output: per-sample output directory and subdirectory providing per-sample text files containing standard out being emitted from R
Description: high-throughput annotation software for NGS analysis; for “intensive variant analysis workflows”; “enriches raw NGS variants with dozens of attributes”; based on clinically oriented Alamut database; Supports human genes; easy to integrate into pipeline (Latest Release- July 2015)
Description: Annotation, Visualization, and Impact Analysis; “The tool is based on coupling a comprehensive annotation pipeline with a flexible visualization method. We leveraged the ANNOVAR (Wang et. al, 2010) framework for assigning functional impact to genomic variations by extending its list of reference annotation databases (RefSeq, UCSC, SIFT, Polyphen etc.) with additional in-house developed sources (Non-B DB, PolyBrowse).”
Input: BED
Output: Table of annotations with gene annotation features
Description: (Mayo Clinic) (Page last updated June 2015) Biological Reference Repository; “data integration tool that enables coordinate based searches and joins based on strings”; “BioR consists of two parts 1) the BioR toolkit which depends on Java…. 2) the BioR catalogs which are the data files used by the system”
Description: Cancer-specific High-throughput Annotation of Somatic Mutations; Last updated May 2014; uses Random Forest Method to “distinguish between driver and passenger somatic mutations”; Positive driver class curated from COSMIC database; packed together with SNVBox (database)
Input: Passenger mutation rates, Transcript and amino acid change, Genomic coordinates
Description: Cancer-Related Analysis of Variants Toolkit; Web application; Uses CHASM, VEST, SNVGet; “CRAVAT provides predictive scores for germline variants, somatic mutations and relative gene importance, as well as annotations from published literature and databases” Latest Release May 2015;
Input: VCF, CRAVAT format
Output: CRAVAT report- MS Excel spreadsheet or tab-separated file (emailed)
Description: Cologne University Protein Stability Analysis Tool; “tool to predict changes in protein stability upon point mutations”; web service program; Can predict mutant stability from existing PDB structures or custom protein structures
Input: for PDB- provide PDB ID and Amino Acid Residue Number; for custom- PDB file format
Description: Deleterious Annotation of genetic variants; standalone program, uses “the same feature set and training data as CADD to train a deep neural network”; can catch nonlinear relationships; “There are four different datasets: training, validation, testing, and ClinVar_ESP...The ClinVar_ESP dataset is also a testing set containing a set of “gold standard” pathogenic and benign variants”
Description: Exonic Splicing Enhancer; useful for interpretation of point mutations/polymorphisms that are disease-associated; GUI interface; web app program
Input: FASTA
Output: html or plain text format, graphical display of results
Description: Wellcome Trust Sanger; functionally annotates variants from whole-exome sequencing data; Based on Jannovar and uses UCSC KnownGene; Java program; web app program (Page last modified Feb 2015)
Description: Automated variant annotation pipeline for family-based sequencing studies; Annotates SNVs and INDELs; 4 models- autosomal dominant, autosomal recessive, de novo mutations and a general model; “A variety of annotations are provided for each segregating variant: number of family (and family ID) each variant hits, variant genomic location and coding effect (based on snpEff), loss-of-function mutation annotation, selected ENCODE annotation, allele frequency in the 1000 Genomes Project, allele frequency in the Exome Variant Server (ESP6500), segmental duplication annotation, SIFT, PolyPhen2, LRT, MutationTaster, GERP++, PhyloP, SiPhy, etc.” (Last updated May 2014)
Description: Combines tool for filtering and data analysis with an online network for genetic professionals; Different degrees- basic license, premium license, in-house solution (the last ones are paid for- Commercial tool?)
Description: “GeneVetter is a tool designed for investigation of the background prevalence of exonic variation in the Phase 3 1000 Genomes data under user defined filtering criteria”; web app program; GeneVetter uses GRch37p4 (hs37d5.fa.gz), dbSNP build 138, 1000G Phase 3, clinvar_2014072
Description: (Broad Institute) Last update- July 2014; Identifies genomic regions that are significantly “amplified or deleted”; Each is given a G score; gives genomic locations and q-values from aberrant regions
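For readers unfamiliar with q-values: one common way to obtain them from a set of p-values is the Benjamini-Hochberg false discovery rate procedure, sketched below. Whether the tool computes its q-values in exactly this way is an assumption, and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of its own scaled p-value and those of all larger p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-region p-values.
region_p_values = [0.01, 0.04, 0.03, 0.005]
region_q_values = benjamini_hochberg(region_p_values)
```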
Description: Have yOur Protein Explained; Web app program; Automatic mutant analysis server that provides structural effects of a mutation; Uses BLAST against UniProt and PDB along with homology modeling
Input: FASTA protein sequence, or accession code of protein of interest
Output: a report containing information from a “decision tree” and illustrated figures and animations
Description: Last update: May 2013; aimed to help study pre-mRNA splicing; combines 12 algorithms to identify mutations’ effect on splicing motifs; uses ensembl database 70
Input: Gene Name, Ensembl transcript ID, Ensembl Gene ID, Consensus CDS, RefSeq Peptide ID, or own sequence (looks like you can enter FASTA)
Output: Chart with columns for predicted signal, predicted algorithm, cDNA position and interpretation
Description: Large-scale Analysis of Variants in noncoding Annotations; New version released July 2015; Command-line program; used for studying noncoding variants; integrates comprehensive set of noncoding elements, modeling their mutation count; Dependent on C++ and BEDtools
Description: three main programs: mlink (calculates lod scores at fixed values for the recombination fraction in one interval of a genetic map), linkmap (calculates location scores for positions of a disease locus along a marker), and ilink (estimates parameters including recombination fractions, allele frequencies, penetrances, etc)
Input: pedfile (processed by MAKEPED) and datafile (reflects loci for each individual; set in PREPLINK)
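The idea of computing lod scores at fixed values of the recombination fraction can be illustrated for the simplest textbook case: phase-known meioses where recombinants can be counted directly. This is only a sketch of that special case, not what mlink does on general pedigrees; the function name lod_score and the counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for counted recombinants at recombination fraction theta.

    Compares the likelihood at theta with the likelihood under free
    recombination (theta = 0.5) on a log10 scale.
    """
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    likelihood = theta ** recombinants * (1 - theta) ** non_recombinants
    null_likelihood = 0.5 ** meioses
    if likelihood == 0:
        return float("-inf")  # the data are impossible at this theta
    return math.log10(likelihood / null_likelihood)


# Hypothetical data: 2 recombinants in 10 informative meioses,
# evaluated on a fixed grid of recombination fractions.
theta_grid = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
lod_by_theta = {theta: lod_score(2, 10, theta) for theta in theta_grid}
best_theta = max(lod_by_theta, key=lod_by_theta.get)
```

On this grid the score peaks at theta = 0.2, which equals the observed recombinant proportion, and it is zero at theta = 0.5 by construction.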
Description: MNV Annotation Corrector; Ad hoc software, fixes incorrect amino acid predictions that are caused by multiple nucleotide variations; Uses existing annotators ANNOVAR, SnpEff, VEP (last update April 2015) (only 1 download this week → not popular)
Input: List of called SNVs and corresponding BAM
Output: Report identifying block of mutation within codon (BMCs)
Description: focuses on mtDNA, provides clinically relevant information from different resources; two-component pipeline: command-line tool for alignment of NGS reads and online version that provides a genetic report on mitochondrial variants
Input: FASTQ, pileup
Reference sequence: rCRSm
Output: Online version gives comprehensive genetic report
Description: Web App program; “This application generates reports on inherited mutations in five genes (ANK1, SLC4A1, SPTA1, SPTB and EPB42) associated with the following rare Mendelian blood disorders: Hereditary Spherocytosis (HS), Hereditary Elliptocytosis (HE) and Hereditary Pyropoikilocytosis”; Newer program- recently validated on omictools
Input: Can upload coordinates of DNA variants or VEP
Output: Displayed on web or can be downloaded in Excel or RDF format
Description: web app tool; Classifies amino acid substitutions as disease-associated or neutral in humans; Last modified Feb. 2014; Based on SIFT, trained using Human Gene Mutation Database
Input:
Output: “The output of MutPred contains a general score (g), i.e., the probability that the amino acid substitution is deleterious/disease-associated, and top 5 property scores (p), where p is the P-value that certain structural and functional properties are impacted.”
Description: (Broad Institute) Mutation Significance (CV= covariates); Analyzes mutations discovered in DNA sequencing to identify genes that were mutated more often than expected
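The notion of a gene being "mutated more often than expected" can be sketched as a one-sided binomial test against a background mutation rate. This is a deliberately simplified illustration: the tool described above uses covariates to estimate the background, which this sketch does not attempt. The function name and all numbers are hypothetical.

```python
import math


def binomial_upper_tail(n_sites, n_mutations, background_rate):
    """P(X >= n_mutations) for X ~ Binomial(n_sites, background_rate)."""
    lower = sum(
        math.comb(n_sites, i)
        * background_rate ** i
        * (1 - background_rate) ** (n_sites - i)
        for i in range(n_mutations)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - lower)


# Hypothetical gene: 1000 sequenced sites, 5 observed mutations,
# background rate of 0.001 mutations per site (about 1 expected).
gene_p_value = binomial_upper_tail(1000, 5, 0.001)
```

A small p-value here only says the count is unlikely under the assumed background rate; choosing that rate well is the hard part of the real analysis.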
Description: Collection of command-line scripts for providing rich SNP annotations; “NCBI, Ensembl, and Uniprot IDs are provided for genes, transcripts and proteins when applicable”;
Input: Samtools consensus pileup, Maq, diBayes, Genetic format, VCF
Output: File containing annotated SNPs is copied from SNP list and some classes are added
Description: (Broad Institute) “Tool for annotating human genomic point mutations and data relevant to cancer researchers”; Web app; Supports annotation of data from ClinVar, dbSNP, 1000 genomes (plus many other external sites); Only GRCh37 coordinates supported; Last update: April 2015
Description: Protein ANalysis THrough Evolutionary Relationships; Web app program, also has its own database; Classification system used to classify proteins and their genes; Also, “Estimates the likelihood of a particular nonsynonymous (amino-acid changing) coding SNP to cause a functional impact on the protein”; Updated in 2015
Input: Data from PANTHER, IDs from Ensembl, EntrezGene, NCBI GI numbers, NCBI UniGene IDs HUGO, UniProt; if ID type is not one of the above, can input txt file or excel format
Description: Combines patients’ disease symptoms with sequencing data; Standalone or Web app version; Only accepts 1 family per run; to evaluate unrelated individuals, each sample needs to be run individually
Input: Variant- VCF; Phenotype- HPO; Pedigree- PED
Output: Combined scores file, variants for top genes file
Description: Aimed at annotation and prediction of pathological mutations; based on different kinds of sequence info and neural networks to process information
Description: Protein Variation Effect Analyzer; predicts whether an amino acid substitution or indel has impact on biological function of the protein; “comparable to SIFT or Polyphen-2”; Standalone, Web app, Command line or GUI; Last update May 2014
Input: FASTA, list of variants;
Output: tab-separated columns including Variant, PROVEAN score and prediction
Description: Web application program, includes a database as well; Database contains physical-based SNP annotations and functional annotations; “Information on physical, functional, and LD annotation served on the SCAN database comes directly from public resources, including the HapMap (release 23a), NCBI (dbSNP 129), or is information created by us using data downloaded from these public resources”; “SCAN can be utilized in several ways including: (i) queries of the SNP and gene databases; (ii) analysis using the attached tools and algorithms; (iii) downloading files with SNP annotation for various GWA platforms”
Description: “SeattleSeqAnnotation137 was most recently updated October 13, 2013. The current version is 8.08. The most recent site, based on dbSNP build 141, and hg38/NCBI 38”; Provides annotations for SNVs and Indels- includes dbSNP rsID, gene names and accession numbers, variation functions, protein positions and amino acid changes, conservation scores, HapMap frequencies, PolyPhen predictions and clinical association.
Input: Maq, gff, CASAVA, VCF, GATK bed, custom
Output: “default output file format is a header line (starting with "#") followed by tab-separated annotations”; VCF
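An output layout of this kind (a header line starting with "#" followed by tab-separated rows) is straightforward to load into records. The sketch below is a generic parser for that layout; the function name parse_annotation_table and the column names in the example are hypothetical and not the service's actual column set.

```python
def parse_annotation_table(lines):
    """Parse a '#'-prefixed header plus tab-separated rows into dicts."""
    columns = None
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # The first '#' line is taken as the header; later ones are skipped.
            if columns is None:
                columns = line.lstrip("#").strip().split("\t")
            continue
        if columns is None:
            raise ValueError("data row encountered before the header line")
        records.append(dict(zip(columns, line.split("\t"))))
    return records


# Hypothetical content for illustration.
example_output = [
    "# chrom\tpos\tannotation",
    "1\t12345\tmissense",
    "2\t67890\tintron",
]
annotation_records = parse_annotation_table(example_output)
```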
Description: Scripps Genome Annotation and Distributed Variant Interpretation Server, web developed applications for variant annotation, “Downstream applications of variant annotation include: Clinical sequencing applications including: carrier testing, or identification of causal variants in molecular diagnosis, tumor sequencing, or diagnostic odyssey. Prioritization of variants prior to statistical analysis of sequence based disease association studies, especially for automated set-generation and enrichment of likely functional variants within sets. Identification of causal variants in post-GWAS/linkage sequencing studies. Identification of causal variants in forward genetic screens (stay tuned for non-human annotation)”
Input: SNV- VCF, BED, and a few others; CNV- BED, CNVnator, plus others
Description: “SNAP2 is a trained classifier that is based on a machine learning device called "neural network". It distinguishes between effect and neutral variants/non-synonymous SNPs by taking a variety of sequence and variant features into account”; predicts impact of amino acid substitution on protein