Tools to Predict the Impact of Missense Variants

Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen‐2, SIFT, FatHMM, MutationTaster‐2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important practical question to answer is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants.
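To make the idea of combining tool scores concrete, here is a minimal sketch of a logistic-regression-style combination of two scores. It is not the published Condel or Logit model: the weights, the intercept, and the function name `combine` are invented for illustration. The only assumptions taken from the tools themselves are the score directions: PolyPhen‐2 scores lie in [0, 1] with higher values indicating damage, while SIFT scores lie in [0, 1] with lower values indicating damage.

```python
import math

# Hypothetical weights and intercept, chosen only for illustration.
# A real combined predictor would fit these on labelled training variants.
WEIGHTS = {"polyphen2": 2.0, "sift_damaging": 1.5}
INTERCEPT = -1.75


def combine(polyphen2, sift):
    """Return a combined pathogenicity score in (0, 1).

    SIFT is flipped (1 - sift) so that both features point the same way:
    larger values mean "more likely damaging".
    """
    features = {"polyphen2": polyphen2, "sift_damaging": 1.0 - sift}
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


likely_damaging = combine(polyphen2=0.98, sift=0.01)
likely_benign = combine(polyphen2=0.05, sift=0.80)
```

With these made-up parameters, the first variant scores above 0.5 and the second below it. The actual values carry no biological meaning.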

A study of 10 tools on five datasets shows that such a comparative evaluation is hindered by two types of circularity. They arise when (1) the same variants or (2) different variants from the same protein occur both in the datasets used for training and in those used for evaluating these tools, which may lead to overly optimistic results. Comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity-confounded tools are the most accurate of all tools, and that they even outperform optimized combinations of tools.
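The two types of circularity can be illustrated with a small filtering sketch. It assumes each variant is described by a protein identifier, a position, and the reference and alternate amino acids. The `Variant` record, the helper names, and the toy data are hypothetical and not taken from the study.

```python
from collections import namedtuple

# Hypothetical minimal variant record; label is 1 (pathogenic) or 0 (neutral).
Variant = namedtuple("Variant", ["protein", "position", "ref", "alt", "label"])


def variant_key(v):
    """Identify a variant independently of its label."""
    return (v.protein, v.position, v.ref, v.alt)


def remove_type1(train, evaluation):
    """Drop evaluation variants that also occur in the training set."""
    seen = {variant_key(v) for v in train}
    return [v for v in evaluation if variant_key(v) not in seen]


def remove_type2(train, evaluation):
    """Drop evaluation variants from any protein that has training variants.

    This is stricter than remove_type1: it removes identical variants too.
    """
    train_proteins = {v.protein for v in train}
    return [v for v in evaluation if v.protein not in train_proteins]


train = [
    Variant("P1", 10, "A", "V", 1),
    Variant("P2", 55, "G", "R", 0),
]
evaluation = [
    Variant("P1", 10, "A", "V", 1),  # same variant as in training (type 1)
    Variant("P2", 80, "L", "P", 1),  # same protein, different variant (type 2)
    Variant("P3", 7, "K", "E", 0),   # protein unseen during training
]

no_type1 = remove_type1(train, evaluation)
no_type2 = remove_type2(train, evaluation)
```

Here `no_type1` keeps the `P2` and `P3` variants, while `no_type2` keeps only the `P3` variant. In practice the protein-level filter can shrink an evaluation set considerably, so the two filters trade dataset size against independence from the training data.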

The following tools are useful for predicting the impact of missense mutations:

PolyPhen‐2 (PP2)
“Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations”

MutationTaster‐2 (MT2)
“Evaluation of the disease‐causing potential of DNA sequence alterations”

MutationAssessor (MASS)
“Predicts the functional impact of amino acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms”

LRT
“Identify a subset of deleterious mutations that disrupt highly conserved amino acids within protein‐coding sequences, which are likely to be unconditionally deleterious”

SIFT
“Predicts whether an amino acid substitution affects protein function”

GERP++
“Identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. We refer to these deficits as ‘rejected substitutions.’ Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element”

phyloP
“Compute conservation or acceleration P values based on an alignment and a model of neutral evolution”

FatHMM unweighted (FatHMM‐U)
Predicts “functional consequences of both coding variants, that is, nonsynonymous single‐nucleotide variants, and noncoding variants”

FatHMM weighted (FatHMM‐W)
Predicts “functional consequences of both coding variants, that is, nonsynonymous single‐nucleotide variants, and noncoding variants”. Its weighting scheme attributes higher tolerance scores to SNVs in proteins, related proteins, or domains that already include a high fraction of pathogenic variants.

Combined Annotation Dependent Depletion (CADD)
“CADD is a tool for scoring the deleteriousness of single‐nucleotide variants as well as insertion/deletions variants in the human genome”