<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44288?</link>
	<atom:link href="https://bioinformaticsonline.com/related/44288?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/8265/list-of-generic-simulation-softwaretoolsresource-with-brief-description-and-homepage</guid>
	<pubDate>Mon, 10 Feb 2014 05:57:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/8265/list-of-generic-simulation-softwaretoolsresource-with-brief-description-and-homepage</link>
	<title><![CDATA[List of generic simulation software/tools/resource with brief description and homepage !!!]]></title>
	<description><![CDATA[<p>List of generic simulation software, tools, and resources, with brief descriptions and homepages</p><p><img src="http://www.evolution-of-life.com/fileadmin/images/carousel/genetic.PNG" alt="image" style="border: 0px;"></p><p>ALF <br />A Simulation Framework for Genome Evolution <br />http://www.cbrg.ethz.ch/alf<br /><br />Bayesian Serial SimCoal <br />Bayesian Serial SimCoal (BayeSSC) is a modification of SIMCOAL 1.0, a program written by Laurent Excoffier, John Novembre, and Stefan Schneider. <br />http://www.stanford.edu/group/hadlylab/ssc/index.html<br /><br />BEERS <br />BEERS was designed to benchmark RNA-Seq alignment algorithms and also algorithms that aim to reconstruct different isoforms and alternative splicing from RNA-Seq data <br />http://cbil.upenn.edu/beers/<br /><br />BOTTLENECK <br />Bottleneck is a program for detecting recent effective population size reductions from allele frequency data <br />http://www.ensam.inra.fr/urlb/bottleneck/bottleneck.html<br /><br />BottleSim <br />BottleSim is a computer simulation program for simulating the process of population bottlenecks <br />http://chkuo.name/software/bottlesim.html<br /><br />CASS <br />Protein Sequence Simulation <br />http://www.wyomingbioinformatics.org/liberlesgroup/cass/<br /><br />CDPOP <br />CDPOP is a landscape genetics tool for simulating the emergence of spatial genetic structure in populations resulting from specified landscape processes governing organism movement behavior. <br />http://cel.dbs.umt.edu/cdpop<br /><br />CoalFace <br />CoalFace is a simulation of the coalescent process with the visual display of gene genealogies. <br />http://web.up.ac.za/default.asp?ipkcategoryid=3283<br /><br />CoaSim <br />CoaSim is a tool for simulating the coalescent process with recombination and gene conversion under various demographic models. 
<br />http://users-birc.au.dk/mailund/coasim/index.html<br /><br />cosi <br />The cosi package is written in C and is available as a tar file. <br />http://www.broadinstitute.org/~sfs/cosi/<br /><br />CS-PSeq-Gen <br />A program to simulate the evolution of protein sequences under the constraints of the information of a particular reconstructed phylogeny <br />http://bioserv.rpbs.univ-paris-diderot.fr/software/cs-pseq-gen.html<br /><br />DAWG <br />An application designed to simulate the evolution of recombinant DNA sequences in continuous time <br />http://scit.us/projects/dawg<br /><br />Easypop <br />EASYPOP is an individual-based model intended to simulate datasets under a very broad range of conditions <br />http://www.unil.ch/dee/page36926_fr.html<br /><br />EggLib <br />EggLib is a C++/Python library and program package for evolutionary genetics and genomics. <br />http://egglib.sourceforge.net/<br /><br />EvolSimulator <br />A simulation test bed for hypotheses of genome evolution <br />http://acb.qfab.org/acb/evolsim/<br /><br />EvolveAGene <br />A realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions <br />http://bellinghamresearchinstitute.com/software/index.html<br /><br />fastsimcoal <br />A continuous-time coalescent simulator of genomic diversity under arbitrarily complex evolutionary scenarios <br />http://cmpg.unibe.ch/software/fastsimcoal/<br /><br />FastSLINK <br />Simulation of Marker and Phenotype Data in Pedigrees <br />http://watson.hgen.pitt.edu/<br /><br />FFPopSim <br />C++/Python library for population genetics. <br />http://webdav.tuebingen.mpg.de/ffpopsim/<br /><br />FLUX SIMULATOR <br />The Flux Simulator aims at providing a deterministic in silico reproduction of the experimental pipelines for RNA-Seq, employing a minimal set of parameters. 
<br />http://flux.sammeth.net/simulator.html<br /><br />ForSim <br />ForSim: A Forward Evolutionary Computer Simulation <br />http://www.anthro.psu.edu/weiss_lab/research.shtml<br /><br />ForwSim <br />The program given below is based on the algorithm described in Padhukasahasram et al. 2008 to simulate genetic drift in a standard Wright-Fisher process. <br />http://badri-populationgeneticsimulators.blogspot.com/<br /><br />FPG <br />Forward Population Genetic simulation <br />http://genfaculty.rutgers.edu/hey/software#fpg<br /><br />FREGENE <br />FREGENE is a C++ program that simulates sequence-like data over large genomic regions in large diploid populations. <br />http://www.ebi.ac.uk/projects/bargen/download/fregen/documentation_html.html<br /><br />GAMETES <br />Genetic Architecture Model Emulator for Testing and Evaluating Software: Simulates complex SNP models with pure, strict epistatic interactions with n-loci. <br />http://sourceforge.net/projects/gametes/?source=navbar<br /><br />GASP <br />Genometric Analysis Simulation Program. A software tool for testing and investigating methods in statistical genetics by generating samples of family data based on user specified models. <br />http://research.nhgri.nih.gov/gasp/<br /><br />GemSIM <br />Next generation sequencing read simulator <br />http://sourceforge.net/projects/gemsim/<br /><br />GeneArtisan <br />Simulation of Markers in Case-Control Study Designs <br />http://www.rannala.org/?page_id=241<br /><br />GENOME <br />A rapid coalescent-based whole genome simulator <br />http://www.sph.umich.edu/csg/liang/genome/<br /><br />GenomePop2 <br />GenomePop2 is a specialization of the program GenomePop just to manage SNPs under more flexible and useful settings. If you need models with more than 2 alleles please use the GenomePop program version. 
<br />http://webs.uvigo.es/acraaj/genomepop2.htm<br /><br />GenomeSimla <br />GenomeSIMLA is currently under development- however, we have a beta release that we are asking to be tested <br />http://chgr.mc.vanderbilt.edu/genomesimla/<br /><br />GENS2 <br />Simulates interactions among two genetic and one environmental factor and also allows for epistatic interactions. <br />https://sourceforge.net/projects/gensim/<br /><br />GWAsimulator <br />A rapid whole genome simulation program <br />http://biostat.mc.vanderbilt.edu/wiki/main/gwasimulator<br /><br />HAP-SAMPLE <br />An association simulator for candidate regions or genome scans <br />http://www.hapsample.org/<br /><br />HAPGEN <br />A simulator for the simulation of case control datasets at SNP markers <br />https://mathgen.stats.ox.ac.uk/genetics_software/hapgen/hapgen2.html<br /><br />HapSim <br />A simulation tool for generating haplotype data with pre-specified allele frequencies and LD coefficients <br />http://cran.r-project.org/web/packages/hapsim/index.html<br /><br />HAPSIMU <br />A program that simulates heterogeneous populations with various known and controllable structures under the continuous migration model or the discrete model <br />http://l.web.umkc.edu/liujian/<br /><br />IBDsim <br />IBDSim is a computer package for the simulation of genotypic data under general isolation by distance models. 
<br />http://raphael.leblois.free.fr/<br /><br />indel-Seq-Gen <br />A biological sequence simulation program that simulates highly divergent DNA sequences and protein superfamilies <br />http://bioinfolab.unl.edu/~cstrope/isg/<br /><br />Indelible <br />A powerful and flexible simulator of biological evolution <br />http://abacus.gene.ucl.ac.uk/software/indelible/<br /><br />invertFREGENE <br />InvertFREGENE is a forward-in-time simulator of inversions in population genetic data <br />http://www.ebi.ac.uk/projects/bargen/<br /><br />kernelPop <br />A spatially explicit population genetic simulation engine <br />http://cran.r-project.org/src/contrib/archive/kernelpop/<br /><br />MaCS <br />Markovian Coalescent Simulator <br />http://www-hsc.usc.edu/~garykche/<br /><br />Mason <br />A package for the simulation of nucleotide data. <br />http://www.seqan.de/projects/mason/<br /><br />mbs <br />A modification of Hudson's ms software that generates samples of DNA sequences with a biallelic site under selection <br />http://www.sendou.soken.ac.jp/esb/innan/innanlab/software.html<br /><br />Mendel's Accountant <br />Mendel's Accountant (MENDEL) is an advanced numerical simulation program for modeling genetic change over time and was developed collaboratively by Sanford, Baumgardner, Brewer, Gibson and ReMine <br />http://mendelsaccount.sourceforge.net/<br /><br />MetaSim <br />A tool to generate collections of synthetic reads that reflect the diverse taxonomical composition of typical metagenome data sets <br />http://ab.inf.uni-tuebingen.de/software/metasim/<br /><br />mlcoalsim <br />Multilocus Coalescent Simulations <br />http://code.google.com/p/mlcoalsim-v1/<br /><br />ms <br />The purpose of this program is to allow one to investigate the statistical properties of such samples, to evaluate estimators or statistical tests, and generally to aid in the interpretation of polymorphism data sets. 
<br />http://home.uchicago.edu/~rhudson1/source/mksamples.html<br /><br />msHOT <br />The purpose of this program is to allow one to investigate the statistical properties of such samples, to evaluate estimators or statistical tests, and generally to aid in the interpretation of polymorphism data sets. <br />http://home.uchicago.edu/~rhudson1/<br /><br />msms <br />A coalescent simulation tool with selection. <br />http://www.mabs.at/ewing/msms/index.shtml<br /><br />MySSP <br />A program for the simulation of DNA sequence evolution across a phylogenetic tree <br />http://www.rosenberglab.net/software.php<br /><br />Nemo <br />A forward-time, individual-based, genetically explicit, and stochastic simulation program designed to study the evolution of genetic markers, life history traits, and phenotypic traits in a flexible (meta-)population framework. <br />http://nemo2.sourceforge.net/<br /><br />NetRecodon <br />Coalescent simulation of coding DNA sequences with recombination (inter and intracodon), migration and demography <br />http://code.google.com/p/netrecodon/<br /><br />PEDAGOG <br />Software for simulating eco-evolutionary population dynamics <br />https://bcrc.bio.umass.edu/pedigreesoftware/node/5<br /><br />phenosim <br />A tool to add phenotypes to simulated genotypes <br />http://evoplant.uni-hohenheim.de/doku.php?id=software:software<br /><br />PhyloSim <br />An R package for the Monte Carlo simulation of sequence evolution <br />http://bit.ly/rlsim-git<br /><br />pIRS <br />Profile-based Illumina pair-end reads simulator <br />https://code.google.com/p/pirs/<br /><br />ProteinEvolver <br />Simulation of protein evolution along phylogenies under structure-based substitution models <br />http://code.google.com/p/proteinevolver/<br /><br />QMSim <br />QTL and Marker Simulator <br />http://www.aps.uoguelph.ca/~msargol/qmsim/<br /><br />quantiNEMO <br />An individual-based program for the analysis of quantitative traits with explicit genetic architecture 
potentially under selection in a structured population <br />http://www2.unil.ch/popgen/softwares/quantinemo/<br /><br />RECOAL <br />Simulates new haplotype data from a reference population of haplotypes. <br />ftp://popgen.usc.edu/<br /><br />Recodon <br />Coalescent simulation of coding DNA sequences with recombination, migration and demography <br />http://code.google.com/p/recodon/<br /><br />rlsim <br />A package for simulating RNA-seq library preparation with parameter estimation <br />http://bit.ly/rlsim-git<br /><br />Rmetasim <br />Rmetasim is a front-end for the metasim engine that is implemented as a package that runs in the statistical computing environment R <br />http://linum.cofc.edu/software.html#metasim<br /><br />RNA Seq Simulator <br />RSS takes SAM alignment files from RNA-Seq data and simulates over dispersed, multiple replica, differential, non-stranded RNA-Seq datasets. <br />http://useq.sourceforge.net/cmdlnmenus.html#rnaseqsimulator<br /><br />Rose <br />Random model of sequence evolution <br />http://bibiserv.techfak.uni-bielefeld.de/rose/<br /><br />SelSim <br />SelSim is a program for Monte Carlo simulation of DNA polymorphism data for a recombining region within which a single bi-allelic site has experienced natural selection <br />http://www.well.ox.ac.uk/~spencer/selsim/<br /><br />Seq-Gen <br />An application for the Monte Carlo simulation of molecular sequence evolution along phylogenetic trees. <br />http://tree.bio.ed.ac.uk/software/seqgen/<br /><br />SEQPower <br />Statistical power analysis for sequence-based association studies <br />http://bioinformatics.org/spower/<br /><br />SeqSIMLA <br />SeqSIMLA can simulate sequence data with user-specified disease and quantitative trait models. Family or unrelated case-control data can be simulated. 
<br />http://seqsimla.sourceforge.net/<br /><br />Serial NetEvolve <br />A flexible utility for generating serially-sampled sequences along a tree or recombinant network <br />http://biorg.cis.fiu.edu/sne/<br /><br />SFS_CODE <br />SFS_CODE can perform forward population genetic simulations under a general Wright-Fisher model with arbitrary migration, demographic, selective, and mutational effects. <br />http://sfscode.sourceforge.net/sfs_code/index/index.html<br /><br />SIBSIM <br />Quantitative phenotype simulation in extended pedigrees <br />http://sourceforge.net/projects/sibsim/<br /><br />SIMCOAL2 <br />A coalescent program for the simulation of complex recombination patterns over large genomic regions under various demographic models <br />http://cmpg.unibe.ch/software/simcoal2/<br /><br />SimCopy <br />An R package simulating the evolution of copy number profiles along a tree. <br />http://bit.ly/simcopy<br /><br />SIMLA <br />SIMLA is a SIMuLAtion program that generates data sets of families for use in Linkage and Association studies. <br />http://www.chg.duke.edu/research/simla.html<br /><br />SimPed <br />A Simulation Program to Generate Haplotype and Genotype Data for Pedigree Structures <br />http://www.hgsc.bcm.tmc.edu/content/simped<br /><br />Simprot <br />A program to simulate protein evolution by substitution, insertion and deletion <br />http://www.uhnresearch.ca/labs/tillier/software.htm#3<br /><br />SimRare <br />Rare variant simulation and analysis tool <br />http://code.google.com/p/simrare/<br /><br />simuGWAS <br />A forward-time simulator that simulates realistic samples for genome-wide association studies. <br />http://simupop.sourceforge.net/cookbook/simucomplexdisease<br /><br />simuPOP <br />simuPOP is a general-purpose individual-based forward-time population genetics simulation environment. 
<br />http://simupop.sourceforge.net/<br /><br />SISSI <br />A software tool to generate data of related sequences along a given phylogeny, taking into account a user-defined system of neighbourhoods and instantaneous rate matrices. <br />http://www.cibiv.at/software/sissi/<br /><br />SNPsim <br />Coalescent simulation of hotspot recombination <br />http://code.google.com/p/phylosoftware/<br /><br />SPIP <br />SPIP simulates the transmission of genes from parents to offspring in a population having demographic structure defined by the user <br />http://swfsc.noaa.gov/textblock.aspx?division=fed&amp;id=3434<br /><br />Splatche <br />Spatial and Temporal Coalescences in Heterogeneous Environment <br />http://www.splatche.com/<br /><br />srv <br />Simulator of Rare Variants (srv) is a simulator for the introduction and evolution of (rare) genetic variants. <br />http://simupop.sourceforge.net/cookbook/simurarevariants<br /><br />SUP <br />SLINK/FastSLINK utility program <br />http://mlemire.freeshell.org/software.html<br /><br />TreesimJ <br />A flexible, forward-time population genetic simulator <br />http://code.google.com/p/treesimj/<br /><br />Vortex <br />VORTEX is an individual-based simulation model for population viability analysis (PVA). <br />http://www.vortex9.org/vortex.html<br /><br />References:</p><p>Image www.evolution-of-life.com</p><p>www.cancer.gov</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28807/organellargenomedraw</guid>
	<pubDate>Tue, 16 Aug 2016 08:13:13 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28807/organellargenomedraw</link>
	<title><![CDATA[OrganellarGenomeDRAW]]></title>
	<description><![CDATA[<p><span>OrganellarGenomeDRAW&nbsp;converts genetic information stored in GenBank entries into graphical maps. The input text file has to be in GenBank flat file format, whereas the output format can be chosen among several formats. The application is especially optimized and adapted for the creation of high-quality, detailed circular maps of organellar genomes like the plastid genome (plastome) or the mitochondrial genome (chondriome). Nevertheless, you can upload any GenBank entry. The workflow is divided into three steps.&nbsp;</span></p>
<p><span>More at&nbsp;http://ogdraw.mpimp-golm.mpg.de/cgi-bin/ogdraw.pl</span></p><p>Address of the bookmark: <a href="http://ogdraw.mpimp-golm.mpg.de/index.shtml" rel="nofollow">http://ogdraw.mpimp-golm.mpg.de/index.shtml</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33924/figtree</guid>
	<pubDate>Wed, 19 Jul 2017 08:06:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33924/figtree</link>
	<title><![CDATA[FigTree]]></title>
	<description><![CDATA[<p><span>FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures. As with most of my programs, it was written for my own needs so may not be as polished and feature-complete as a commercial program. In particular it is designed to display summarized and annotated trees produced by BEAST.</span></p><p>Address of the bookmark: <a href="http://tree.bio.ed.ac.uk/software/figtree/" rel="nofollow">http://tree.bio.ed.ac.uk/software/figtree/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34172/orthodotter-synteny-plots-oxford-grid</guid>
	<pubDate>Wed, 09 Aug 2017 07:16:16 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34172/orthodotter-synteny-plots-oxford-grid</link>
	<title><![CDATA[orthodotter: Synteny plots (oxford grid)]]></title>
	<description><![CDATA[<pre><code>orthodotter -h
--------------------------------------------------------------------------------
orthodotter - Plot orthologous genes on an oxford grid.
       -f &lt;file&gt;     : input file, containing orthologous genes, default is stdin
                       species chr-name start end species chr-name start end
       -toPlot &lt;arg&gt; : give the x and y sets and the color separated by double-dots,
                       for example set1:set2:red will plot set1 on x, set2 on y with
                       red points. Could give several -toPlot arguments.
                       To launch the clustering of dots, use extra-option 1=dist,min_nb_genes
                       where dist is the minimal distance (euclidian) between two points and min_nb_genes the minimal
                       number of genes in a cluster to be valid.
       -o &lt;file&gt;     : output file, default is stdout
       -x &lt;int&gt;      : resolution of x axis, default is 600
       -y &lt;int&gt;      : resolution on y axis, default is 600
       -r &lt;int&gt;      : radius of circle representing orthologous genes
       -format       : could be png, gif, jpg, pdf or ps. Default is png.
       -fg           : foreground color, default is black
       -bg           : background color, default is transparent
       -fSize &lt;int&gt;  : fontSize, default is 1
       -filter       : check chromosome names
       -h            : help
--------------------------------------------------------------------------------
orthodotter -f Vigne_Banane.ortho -toPlot Vigne:Banane:black:1=10,5 -x 1200 -y 1200 -bg white -o Vigne_vs_Banane.png &gt; Vigne_vs_Banane.clusters
--------------------------------------------------------------------------------</code></pre><p>Address of the bookmark: <a href="https://github.com/institut-de-genomique/orthodotter" rel="nofollow">https://github.com/institut-de-genomique/orthodotter</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44734/data-visualization-in-bioinformatics-useful-and-eye-catching-plots-for-data-analysis</guid>
	<pubDate>Sat, 14 Dec 2024 12:41:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44734/data-visualization-in-bioinformatics-useful-and-eye-catching-plots-for-data-analysis</link>
	<title><![CDATA[Data Visualization in Bioinformatics: Useful and Eye-Catching Plots for Data Analysis]]></title>
	<description><![CDATA[<p>Data visualization is a cornerstone of bioinformatics, enabling researchers to interpret complex datasets effectively. With a plethora of data types&mdash;genomic sequences, expression profiles, protein interactions, and more&mdash;the right visualizations can make or break an analysis. This blog highlights some of the most useful and visually compelling plots for bioinformatics data analysis, along with tools to create them.</p><h4><strong>1. Heatmaps: Exploring Patterns in High-Dimensional Data</strong></h4><p>Heatmaps are a go-to visualization for representing high-dimensional datasets, such as gene expression or metabolomics data. They use color gradients to display data intensity, making patterns and clusters easily detectable.</p><ul>
<li>
<p><strong>Applications</strong>: Gene expression analysis, pathway enrichment, methylation studies.</p>
</li>
<li>
<p><strong>Tools</strong>: Seaborn (Python), ComplexHeatmap (R), Morpheus (web-based).</p>
</li>
</ul><p><strong>Tip</strong>: Add dendrograms to visualize clustering of rows and columns for hierarchical relationships.</p><h4><strong>2. Volcano Plots: Highlighting Differential Features</strong></h4><p>Volcano plots are indispensable for identifying significantly differentially expressed genes or proteins. They plot the log2 fold change against &ndash;log10(p-value), making it easy to spot statistically significant changes.</p><ul>
<li>
<p><strong>Applications</strong>: RNA-seq, proteomics, and metabolomics.</p>
</li>
<li>
<p><strong>Tools</strong>: ggplot2 (R), EnhancedVolcano (R), Plotly (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use color to highlight significant features and label key genes or proteins.</p><h4><strong>3. PCA Plots: Reducing Complexity with Principal Component Analysis</strong></h4><p>Principal Component Analysis (PCA) plots are used to reduce dimensionality and uncover trends or clusters in data. They provide insights into sample variability and grouping.</p><ul>
<li>
<p><strong>Applications</strong>: Transcriptomics, metabolomics, microbiome studies.</p>
</li>
<li>
<p><strong>Tools</strong>: scikit-learn + Matplotlib (Python), prcomp (R), ClustVis (web-based).</p>
</li>
</ul><p><strong>Tip</strong>: Annotate clusters with metadata to enhance interpretability.</p><h4><strong>4. Manhattan Plots: Genome-Wide Association Studies</strong></h4><p>Manhattan plots show the &ndash;log10(p-value) of each variant against its genomic position, making it easy to identify significant associations in genome-wide studies. They resemble city skylines, with the highest peaks indicating loci of interest.</p><ul>
<li>
<p><strong>Applications</strong>: GWAS, QTL mapping.</p>
</li>
<li>
<p><strong>Tools</strong>: qqman (R), Matplotlib (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use alternating colors for chromosomes and highlight significant SNPs for clarity.</p><h4><strong>5. Circular Plots (Circos): Visualizing Genomic Relationships</strong></h4><p>Circular plots are ideal for visualizing relationships across the genome, such as structural variations, gene duplications, or synteny.</p><ul>
<li>
<p><strong>Applications</strong>: Comparative genomics, structural variation studies.</p>
</li>
<li>
<p><strong>Tools</strong>: Circos (standalone), RCircos (R), pyCircos (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Keep the plot clean and avoid overcrowding to maintain readability.</p><h4><strong>6. Sankey Diagrams: Tracking Data Flows</strong></h4><p>Sankey diagrams visualize flows or relationships between categories, often used to track changes in gene expression or pathway enrichment across conditions.</p><ul>
<li>
<p><strong>Applications</strong>: Pathway analysis, gene set enrichment analysis.</p>
</li>
<li>
<p><strong>Tools</strong>: Plotly (Python), networkD3 (R).</p>
</li>
</ul><p><strong>Tip</strong>: Use gradients or distinct colors to highlight key transitions.</p><h4><strong>7. Network Graphs: Mapping Interactions</strong></h4><p>Network graphs represent relationships between entities, such as protein-protein interactions or gene regulatory networks. Nodes represent entities, and edges represent relationships.</p><ul>
<li>
<p><strong>Applications</strong>: Systems biology, interactomics.</p>
</li>
<li>
<p><strong>Tools</strong>: Cytoscape (standalone), igraph (R), NetworkX (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use edge thickness or node size to represent interaction strength or centrality.</p><h4><strong>8. Violin Plots: Visualizing Data Distribution</strong></h4><p>Violin plots combine a boxplot with a density plot, showing the distribution and variability of data.</p><ul>
<li>
<p><strong>Applications</strong>: Single-cell RNA-seq, quantitative trait analysis.</p>
</li>
<li>
<p><strong>Tools</strong>: Seaborn (Python), ggplot2 (R).</p>
</li>
</ul><p><strong>Tip</strong>: Split violins by groups for side-by-side comparisons.</p><h4><strong>9. Time-Series Plots: Monitoring Changes Over Time</strong></h4><p>Time-series plots display changes in variables across time points, useful for tracking gene expression dynamics or metabolic fluxes.</p><ul>
<li>
<p><strong>Applications</strong>: Time-course experiments, cell cycle studies.</p>
</li>
<li>
<p><strong>Tools</strong>: Matplotlib (Python), ggplot2 (R).</p>
</li>
</ul><p><strong>Tip</strong>: Smooth the data to highlight trends while avoiding overfitting.</p><h4><strong>10. Genome Tracks: Visualizing Genomic Features</strong></h4><p>Genome tracks display multiple layers of genomic data, such as gene annotations, sequencing coverage, and epigenetic marks.</p><ul>
<li>
<p><strong>Applications</strong>: ChIP-seq, ATAC-seq, whole-genome sequencing.</p>
</li>
<li>
<p><strong>Tools</strong>: IGV (standalone), pyGenomeTracks (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Stack related tracks for direct comparisons.</p><h4><strong>11. UpSet Plots: Visualizing Set Intersections</strong></h4><p>UpSet plots are a powerful alternative to Venn diagrams for visualizing intersections between multiple datasets.</p><ul>
<li>
<p><strong>Applications</strong>: Overlap analysis for gene sets, pathways, or variants.</p>
</li>
<li>
<p><strong>Tools</strong>: UpSetR (R), ComplexUpset (R), UpSetPlot (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use bar plots to represent the size of each intersection for added clarity.</p><h4><strong>12. Ridge Plots: Comparing Distributions</strong></h4><p>Ridge plots visualize the distributions of multiple datasets, stacked for easy comparison.</p><ul>
<li>
<p><strong>Applications</strong>: Transcriptomics, single-cell RNA-seq.</p>
</li>
<li>
<p><strong>Tools</strong>: ggridges (R), Matplotlib (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use transparency and consistent scaling for better readability.</p><h4><strong>13. Chord Diagrams: Visualizing Connections Between Groups</strong></h4><p>Chord diagrams illustrate relationships between categories, such as shared genes between pathways or overlaps in regulatory elements.</p><ul>
<li>
<p><strong>Applications</strong>: Pathway overlap, synteny, co-expression networks.</p>
</li>
<li>
<p><strong>Tools</strong>: circlize (R), HoloViews (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use distinct colors for each group to emphasize relationships.</p><h4><strong>14. Treemaps: Hierarchical Data Representation</strong></h4><p>Treemaps visualize hierarchical data as nested rectangles, with area proportional to data size.</p><ul>
<li>
<p><strong>Applications</strong>: Ontology enrichment, pathway analysis.</p>
</li>
<li>
<p><strong>Tools</strong>: treemapify (R), Plotly (Python).</p>
</li>
</ul><p><strong>Tip</strong>: Use colors to represent additional variables, like significance or enrichment scores.</p><h4><strong>15. t-SNE/UMAP Plots: Dimensionality Reduction for Clustering</strong></h4><p>t-SNE and UMAP plots project high-dimensional data into two dimensions. t-SNE mainly preserves local neighborhoods, while UMAP is often reported to retain more of the global structure as well.</p><ul>
<li>
<p><strong>Applications</strong>: Single-cell transcriptomics, clustering analyses.</p>
</li>
<li>
<p><strong>Tools</strong>: scikit-learn (Python, t-SNE), umap-learn (Python, UMAP), Seurat (R).</p>
</li>
</ul><p><strong>Tip</strong>: Combine with metadata annotations for better cluster interpretation.</p><h4><strong>Bringing It All Together</strong></h4><p>The choice of visualization can significantly impact the insights gained from bioinformatics data. By selecting plots tailored to your data type and analysis goals, you can effectively communicate your findings and make your research more impactful. Whether you&rsquo;re a seasoned bioinformatician or a beginner, mastering these visualizations will elevate your analyses and presentations.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/7986/list-of-bioinformatics-open-source-projectssoftware</guid>
	<pubDate>Tue, 21 Jan 2014 14:28:37 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/7986/list-of-bioinformatics-open-source-projectssoftware</link>
	<title><![CDATA[List of bioinformatics open source projects/software.]]></title>
	<description><![CDATA[<p>Open source software is software that can be freely used, changed, and shared (in modified or unmodified form) by anyone. Open source software is made by many people, and distributed under licenses that comply with the Open Source Definition. The Open Source Initiative (OSI) is a global non-profit that supports and promotes the open source movement. The following are open source bioinformatics projects and software:</p><p><strong>.NET Bio</strong></p><p>http://blogs.msdn.com/b/msr_er/archive/2011/10/18/microsoft-biology-foundation-evolves-into-new-toolkit-net-bio.aspx</p><p>A language-neutral bioinformatics toolkit built using the Microsoft 4.0 .NET Framework to help developers, researchers, and scientists.</p><p><strong>AMPHORA</strong> ("AutoMated Phylogenomic infeRence Application")</p><p>http://wolbachia.biology.virginia.edu/WuLab/Software.html</p><p><a href="http://en.wikipedia.org/wiki/Metagenomics" title="Metagenomics">Metagenomics</a> analysis software</p><p><strong>Anduril</strong></p><p>http://www.anduril.org/anduril/site/</p><p>Component-based <a href="http://en.wikipedia.org/wiki/Workflow" title="Workflow">workflow</a> framework for data analysis</p><p><strong>Armadillo workflow platform</strong></p><p>Tool for designing and executing phylogenetic workflows</p><p><strong>AutoDock</strong></p><p>http://autodock.scripps.edu/</p><p>Suite of automated docking tools</p><p><strong>Biochemical Algorithms Library (BALL)</strong></p><p>http://www.ball-project.org/</p><p>C++ library and framework for molecular modeling and visualization designed for rapid prototyping</p><p><strong>Bio4j</strong></p><p>http://bio4j.com/</p><p>Bio4j is a <a href="http://en.wikipedia.org/wiki/Bioinformatics" title="Bioinformatics">bioinformatics</a> platform and <a href="http://en.wikipedia.org/wiki/Chart" title="Chart">graph</a> based <a href="http://en.wikipedia.org/wiki/Database" title="Database">database</a> built around most data available in <a 
href="http://en.wikipedia.org/wiki/UniProt" title="UniProt">UniProt</a> KB(<a href="http://en.wikipedia.org/wiki/Swiss-Prot" title="Swiss-Prot">Swiss-Prot</a> + <a href="http://en.wikipedia.org/wiki/TrEMBL" title="TrEMBL">TrEMBL</a>), <a href="http://en.wikipedia.org/wiki/Gene_Ontology" title="Gene Ontology">Gene Ontology</a> (GO), <a href="http://en.wikipedia.org/w/index.php?title=UniRef&amp;action=edit&amp;redlink=1" title="UniRef (page does not exist)">UniRef</a> (50,90,100), <a href="http://en.wikipedia.org/wiki/RefSeq" title="RefSeq">RefSeq</a>, <a href="http://en.wikipedia.org/wiki/National_Center_for_Biotechnology_Information" title="National Center for Biotechnology Information">NCBI</a> taxonomy, and Expasy Enzyme DB</p><p><strong>Bioclipse</strong></p><p>www.bioclipse.net</p><p>Visual platform for <a href="http://en.wikipedia.org/wiki/Cheminformatics" title="Cheminformatics">chemo</a>- and <a href="http://en.wikipedia.org/wiki/Bioinformatics" title="Bioinformatics">bioinformatics</a> based on the <a href="http://en.wikipedia.org/wiki/Eclipse_%28software%29" title="Eclipse (software)">Eclipse</a> Rich Client Platform (RCP).</p><p><strong>Bioconductor</strong></p><p>http://www.bioconductor.org/</p><p><a href="http://en.wikipedia.org/wiki/R_%28programming_language%29" title="R (programming language)">R (programming language)</a> language toolkit</p><p><strong>Bioinformatics Learning Tutorial (BLT)</strong></p><p>http://sourceforge.net/projects/biotutorial/</p><p>Educational <a href="http://en.wikipedia.org/wiki/Interactive_tutorials" title="Interactive tutorials">interactive tutorials</a> and 3D animations for Replication, Transcription, and Translation</p><p><strong>BioHaskell</strong></p><p>http://biohaskell.org/</p><p><a href="http://en.wikipedia.org/wiki/Haskell_%28programming_language%29" title="Haskell (programming language)">Haskell (programming language)</a></p><p><strong>BioJava</strong></p><p>http://biojava.org/wiki/Main_Page</p><p><a 
href="http://en.wikipedia.org/wiki/Java_%28programming_language%29" title="Java (programming language)">Java (programming language)</a></p><p><strong>BioMOBY</strong></p><p>http://biomoby.org/</p><p>registry of <a href="http://en.wikipedia.org/wiki/Web_services" title="Web services">web services</a></p><p><strong>BioPerl</strong></p><p>http://www.bioperl.org/wiki/Main_Page</p><p><a href="http://en.wikipedia.org/wiki/Perl" title="Perl">Perl</a> language toolkit</p><p><strong>BioPHP</strong></p><p>http://www.biophp.org/</p><p><a href="http://en.wikipedia.org/wiki/PHP" title="PHP">PHP</a> language toolkit</p><p><strong>Biopython</strong></p><p>http://biopython.org/wiki/Main_Page</p><p><a href="http://en.wikipedia.org/wiki/Python_%28programming_language%29" title="Python (programming language)">Python</a> language toolkit</p><p><strong>BioRails</strong></p><p>https://github.com/biorails</p><p>a <a href="http://en.wikipedia.org/wiki/Data_management_system" title="Data management system">data management system</a> designed to support researchers in <a href="http://en.wikipedia.org/wiki/Drug_discovery" title="Drug discovery">drug discovery</a></p><p><strong>BioRuby</strong></p><p>http://bioruby.org/</p><p><a href="http://en.wikipedia.org/wiki/Ruby_%28programming_language%29" title="Ruby (programming language)">Ruby</a> language toolkit</p><p><strong>BioSmalltalk</strong></p><p>https://code.google.com/p/biosmalltalk/</p><p><a href="http://en.wikipedia.org/wiki/Smalltalk_%28programming_language%29" title="Smalltalk (programming language)">Smalltalk</a> language toolkit</p><p><strong>BioUno</strong></p><p>http://www.biouno.org/</p><p><a href="http://en.wikipedia.org/w/index.php?title=BioUno&amp;action=edit&amp;redlink=1" title="BioUno (page does not exist)">BioUno</a> is a project that applies <a href="http://en.wikipedia.org/wiki/Continuous_Integration" title="Continuous Integration">Continuous Integration</a> tools and techniques in <a 
href="http://en.wikipedia.org/wiki/Bioinformatics" title="Bioinformatics">Bioinformatics</a>. It uses <a href="http://en.wikipedia.org/wiki/Jenkins_%28software%29" title="Jenkins (software)">Jenkins</a> and its plug-in API to create <a href="http://en.wikipedia.org/wiki/Bioinformatics_workflow_management_system" title="Bioinformatics workflow management system">biology workflows</a> and manage <a href="http://en.wikipedia.org/wiki/Computer_clusters" title="Computer clusters">computer clusters</a>.</p><p><strong>caCORE</strong></p><p>&nbsp;</p><p>ontologic representation environment</p><p><strong>caArray</strong></p><p>https://cabig-stage.nci.nih.gov/community/tools/caArray</p><p>ontologic representation environment</p><p><strong>EMBOSS</strong></p><p>http://emboss.sourceforge.net/</p><p>Suite of packages for sequencing, searching, etc.</p><p><strong>Gaggle</strong></p><p>https://www.gaggle.net/</p><p>A framework for interoperability between systems biology software</p><p><strong>Galaxy</strong></p><p>http://galaxyproject.org/</p><p><a href="http://en.wikipedia.org/wiki/Scientific_workflow_system" title="Scientific workflow system">Scientific workflow</a> and <a href="http://en.wikipedia.org/wiki/Data_integration" title="Data integration">data integration</a> system</p><p><strong>GenePattern</strong></p><p>http://www.broadinstitute.org/cancer/software/genepattern/</p><p><a href="http://en.wikipedia.org/wiki/Scientific_workflow_system" title="Scientific workflow system">Scientific workflow system</a> that provides access to more than 150 genomic analysis tools</p><p><strong>GeWorkbench</strong></p><p>http://wiki.c2b2.columbia.edu/workbench/index.php/Home</p><p>Genomic <a href="http://en.wikipedia.org/wiki/Data_integration" title="Data integration">data integration</a> platform</p><p><strong>GMOD</strong></p><p>http://www.gmod.org/wiki/Main_Page</p><p>Toolkit for addressing many common challenges at biological 
databases.</p><p><strong>GeneProf</strong></p><p>http://www.geneprof.org/GeneProf/</p><p>A web-based, bioinformatics software suite for the analysis of functional genomics experiments, e.g. RNA-seq or ChIP-seq.</p><p><strong>GeneTalk</strong></p><p>http://www.gene-talk.de/</p><p>Tool for filtering sequence variants in <a href="http://en.wikipedia.org/wiki/Variant_Call_Format" title="Variant Call Format">VCF</a> files. Network for scientists and clinicians for expertise and knowledge exchange. Database of annotations about sequence variants with clinically relevant information.</p><p><strong>GenGIS</strong></p><p>http://kiwi.cs.dal.ca/GenGIS/Main_Page</p><p>Application that allows users to combine digital map data with information about biological sequences collected from the environment.</p><p><strong>GenomeSpace</strong></p><p>http://www.genomespace.org/</p><p>Centralized web application that provides data format transformations and facilitates connections with other bioinformatics tools</p><p><strong>GENtle</strong></p><p>http://directory.fsf.org/wiki/GENtle</p><p>An equivalent to the proprietary <a href="http://en.wikipedia.org/wiki/Vector_NTI" title="Vector NTI">Vector NTI</a>, a tool to analyze and edit <a href="http://en.wikipedia.org/wiki/DNA" title="DNA">DNA</a> sequence files</p><p><strong>Integrated Genome Browser</strong></p><p>http://bioviz.org/igb/</p><p><a href="http://en.wikipedia.org/wiki/Java_%28software_platform%29" title="Java (software platform)">Java</a>-based desktop <a href="http://en.wikipedia.org/wiki/Genome_browser" title="Genome browser">genome browser</a></p><p><strong>Integrative Genomics Viewer (IGV)</strong></p><p>http://www.broadinstitute.org/igv/</p><p>High-performance desktop tool for interactive visual exploration of diverse genomic data</p><p><strong>IntAct</strong></p><p>http://www.ebi.ac.uk/intact/</p><p>molecular interaction 
database</p><p><strong>InterMine</strong></p><p>http://intermine.github.io/intermine.org/</p><p>Extensive data warehouse system for the analysis and integration of biological datasets</p><p><strong>Java Treeview</strong></p><p>http://jtreeview.sourceforge.net/</p><p>microarray data viewer</p><p><strong>LabKey Server</strong></p><p>http://labkey.com/</p><p>platform for integrating, analyzing and sharing data</p><p><strong>OpenClinica</strong></p><p>https://www.openclinica.com/</p><p>software for capturing and managing data in clinical trials</p><p><a href="http://www.biomedcentral.com/1471-2164/13/512">PromKappa</a></p><p>http://xbioinformatics.wordpress.com/tag/promkappa/</p><p>PromKappa (Promoter analysis by Kappa) software program used for promoter pattern generation and promoter analysis.</p><p><strong>MeV: Multi-Experiment Viewer</strong></p><p>http://www.tm4.org/mev.html</p><p>a desktop application for the analysis, visualization and data-mining of large-scale genomic data</p><p><strong>PathVisio</strong></p><p>http://www.pathvisio.org/</p><p>a desktop software for drawing, analysis and visualization of biological pathways</p><p>REDCRAFT</p><p>software for determining tertiary protein structure given assigned Residual Dipolar Coupling data</p><p>SAM Tools</p><p>Data format (SAM) and accompanying tool suite, for storing large nucleotide sequence alignments</p><p><a href="http://en.wikipedia.org/wiki/Staden_Package" title="Staden Package">Staden Package</a></p><p>Sequence assembly, editing and analysis, primarily consisting of gap4, gap5 and spin.</p><p><a href="http://en.wikipedia.org/wiki/STAMP" title="STAMP">STAMP</a></p><p>Software package for analyzing metagenomic profiles that promotes &lsquo;best practices&rsquo; in choosing appropriate statistical techniques and reporting results.</p><p><a href="http://supfam.org/supraHex">supraHex</a></p><p>An open-source R/Bioconductor package for omics data analysis using a supra-hexagonal map</p><p><a 
href="http://en.wikipedia.org/wiki/Taverna_workbench" title="Taverna workbench">Taverna workbench</a></p><p>Tool for designing and executing workflows</p><p>TGAC Browser</p><p>Genome Browser, visualisation solutions for big data in the genomic era</p><p>T-REX WebServer</p><p>Bioinformatics and phylogenetics webserver (NJ, PhyML, RAxML, MAFFT, MUSCLE, Newick viewer, <a href="http://en.wikipedia.org/wiki/Horizontal_gene_transfer" title="Horizontal gene transfer">Horizontal gene transfer</a> detection, Reticulograms, Substitution models)</p><p><a href="http://en.wikipedia.org/wiki/UGENE" title="UGENE">UGENE</a></p><p>integrated bioinformatics tools</p><p>Visomics</p><p>bioinformatics tools for omics data</p><p>Genome Analysis Toolkit 1.0 (GATK 1.0)</p><p>a software package to analyse next-generation resequencing data</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/17924/software-developed-in-pevsner-lab</guid>
	<pubDate>Mon, 06 Oct 2014 12:41:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/17924/software-developed-in-pevsner-lab</link>
	<title><![CDATA[Software developed in the Pevsner lab]]></title>
	<description><![CDATA[<div>
<div id="block-system-main">
<div>
<div id="node-7">
<div>
<div>
<div>
<div>
<p><a href="http://pevsnerlab.kennedykrieger.org/dragon.htm">DRAGON</a>: Database Referencing of Array Genes Online</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/96">SNOMAD</a>: Standardization and Normalization of Microarray Data</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/70">SNPduo</a>: SNP Analysis Between Two Individuals</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/71">SNPtrio</a>: Analyzing and Visualizing Inheritance Patterns in Trios</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/64">SNPscan</a>: Data Analysis and Visualization of SNP Data</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/64">pediSNP</a>: Analyze SNP Data From a Pedigree of Two Generations</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/73">kcoeff</a>: Calculate Cotterman Coefficients of SNP Genotype Data</p>
<p><a href="http://pevsnerlab.kennedykrieger.org/php/node/113">triPOD</a>: Detects chromosomal abnormalities in parent-child trio-based microarray data</p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div><p>Address of the bookmark: <a href="http://pevsnerlab.kennedykrieger.org/php/?q=software" rel="nofollow">http://pevsnerlab.kennedykrieger.org/php/?q=software</a></p>]]></description>
	<dc:creator>Robert M Willioms</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/26617/list-of-bioinformatics-software-tools-for-next-generation-sequencing</guid>
	<pubDate>Fri, 11 Mar 2016 20:22:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/26617/list-of-bioinformatics-software-tools-for-next-generation-sequencing</link>
	<title><![CDATA[List of Bioinformatics Software Tools for Next Generation Sequencing]]></title>
	<description><![CDATA[<p><strong>Commercial tools</strong></p><ol>
<li><strong><a href="http://www.strand-ngs.com/">Strand NGS</a></strong>
<ul>
<li>Offers many different tools, including alignment, RNA-Seq, DNA-Seq, ChIP-Seq, Small RNA-Seq, Genome Browser, visualizations, Biological Interpretation, etc. Supported import workflows: &ldquo;one can import the sample data in FASTA, FASTQ or tag-count format. In addition, prealigned data in SAM, BAM or Illumina-specific ELAND format can be directly imported for analysis.&rdquo;</li>
<li>Alignment feature: Supports alignment from Illumina, Ion Torrent, 454 (Roche), and Pac Bio</li>
<li>DNA-Seq Feature, can annotate with dbSNP</li>
</ul>
</li>
<li><strong><a href="http://www.clcbio.com/desktop-applications/top-features/">CLC Genomics Workbench</a></strong><br />
<ul>
<li>(QIAGEN). Features include: resequencing, workflow, read mapping, de novo assembly, variant detection, RNA-Seq, ChIP-Seq, Genome Browser, etc (entire list on website); Main Workbench offers database search (Genbank, Blast, Pubmed); 2000 organizations have invested in CLC</li>
<li>Accepts VCF files from 1000 Genomes Project</li>
<li>Accepts downloaded tracks from dbSNP</li>
<li>Also accepts: FASTA, GFF/GTF/GVF, BED, Wiggle, COSMIC, UCSC variant database, Complete Genomics masterVar file</li>
<li>Read mapping: &ldquo;In addition to Sanger sequence data, reads from these high-throughput sequencing machines are supported: The 454 FLX System and the 454 GS Junior System from Roche, Illumina Genome Analyzer, Illumina HiSeq, Illumina HiScan, and Illumina MiSeq sequencing systems, SOLiD system from Life Technologies, Ion Torrent system from Life Technologies, Helicos from Helicos BioSciences&rdquo;</li>
<li>De novo assembly: &ldquo;In addition to Sanger sequence data, reads from these high-throughput sequencing machines are supported: The 454 FLX System and the 454 GS Junior System from Roche, Illumina Genome Analyzer, Illumina HiSeq, Illumina HiScan, and Illumina MiSeq sequencing systems, SOLiD system from Life Technologies, Ion Torrent system from Life Technologies&rdquo;</li>
<li>Annotation tracks from Ensembl</li>
</ul>
</li>
<li><strong><a href="https://www.dnanexus.com/product-overview">DNAnexus</a></strong>
<ul>
<li>Private cloud repository, formerly a redistributor of SRA and other NCBI resources; accessible from the command line or the web; can fetch data from a URL and build custom pipelines/workflows; the sra.dnanexus.com site serves data downloads directly from NCBI</li>
</ul>
</li>
<li><strong><a href="http://www.ingenuity.com/products/variant-analysis">Ingenuity Variant Analysis</a></strong>
<ul>
<li>(QIAGEN) Allows for variant identification and analysis; uses the NCI-60 data set for cancer. Supported third-party information: Entrez Gene, RefSeq, ClinVar; gives contextual details of results instead of just an A-to-B relationship</li>
<li>Has its own database, a &ldquo;knowledge base&rdquo; built on the COSMIC, OMIM, and TCGA databases</li>
</ul>
</li>
<li><strong><a href="http://www.dnastar.com/t-products-dnastar-lasergene-genomics.aspx">Lasergene Genomics Suite</a></strong>
<ul>
<li>Comprehensive NGS software pipeline for assembly, alignment, variant calling and analysis of NGS data</li>
<li>Supported workflows include: reference-guided and de novo genome and transcriptome assembly and analysis, metagenomics sample assembly, targeted resequencing, exome alignment, gene panels with validation control, variant analysis, and RNA-Seq, ChIP-Seq and miRNA alignment and analysis.</li>
<li>#1 in accuracy: fewer false negatives and better sensitivity compared to results obtained from other aligners</li>
<li>Aligns exome data and performs variant calling an average of 3 times faster than alternative pipelines</li>
<li>Annotates genomic data with allele and genotype frequency, functional impact predictions, evolutionary conservation scores and pathogenicity</li>
<li>Supports all major NGS technologies (Illumina, Ion Torrent, Pac Bio and Roche 454) and project types</li>
<li>Available on Windows, Mac OS X, Linux, and the Amazon Cloud</li>
</ul>
</li>
<li><strong><a href="http://www.softgenetics.com/NextGENe.html">NextGENe</a></strong>
<ul>
<li>&ldquo;perfect analytical partner for the analysis of desktop sequencing data produced by the ION PGM&trade;, Roche Junior, Illumina MiSeq as well as high throughput systems as the Ion Torrent Proton, Roche FLX, Applied BioSystems SOLiD&trade; and Illumina&reg; platforms.&rdquo; runs on Windows, free-standing multi-application package-- SNP/Indel analysis, CNV prediction and disease discovery, whole genome alignment, etc.</li>
<li>Data can be imported from ClinVar, dbSNP, GenBank: <a href="http://www.softgenetics.com/PDF/NextGene_UsersManual_web.pdf">http://www.softgenetics.com/PDF/NextGene_UsersManual_web.pdf</a></li>
</ul>
</li>
<li><strong><a href="http://www.partek.com/pgs">Partek Genomics Suite</a></strong>
<ul>
<li>Cited in over 3,500 peer-reviewed scientific publications</li>
<li>Workflows for microarray and PCR data include: Gene expression including alternative splicing, miRNA expression, Genome Wide Association Studies, Mother-Father-Child Trio analysis, DNA Copy number including allele specific copy number and Loss of Heterozygosity (LOH), and ChIP, and methylation. Next Generation Sequencing (NGS) workflows include: RNA-Seq, miRNA-Seq, ChIP-Seq, DNA-Seq, and Methylation</li>
<li>Powerful statistics and interactive, publication ready visualizations</li>
<li>Supports all commercial next generation sequencing and microarray file formats as well as text files</li>
<li>Can input GEO SOFT files</li>
</ul>
</li>
<li><strong><a href="http://www.partek.com/partekflow">Partek Flow</a></strong>
<ul>
<li>Installation can be cloud-based or on a local cluster or Linux server</li>
<li>Easy to use point-and-click interface</li>
<li>Takes NGS data (.fastq, BAM, SAM), microarrays (Affymetrix, Illumina) and text files</li>
<li>Supports custom genome builds and annotation databases</li>
<li>Performs base trimming, alignment, quantification, quality analysis, statistics, and visualization</li>
<li>Includes ten fully customizable aligners (Bowtie, Bowtie 2, BWA, GSNAP, Isaac 2, SHRiMP 2, STAR, TMAP, TopHat and TopHat 2)</li>
<li>Applications for RNA-Seq, Small RNA-Seq, WGS/WES, Pathway enrichment, Fusion detection and Variant calling</li>
<li>Allows users to create, save, share, or download analysis pipelines for automated and repeatable analysis</li>
<li>Collaborate with others without transferring data</li>
<li>Integrates microarray and next generation sequencing data</li>
</ul>
</li>
<li><strong><a href="http://goldenhelix.com/SNP_Variation/">Golden Helix: SNP and Variation Suite</a></strong>
<ul>
<li>Used for managing, analyzing and visualizing genotypic and phenotypic data. Features: genome-wide association studies, genomic prediction, copy number analysis, small sample DNA-Seq workflows, large sample DNA-Seq analysis, RNA-Seq analysis. Supported files: .txt, Excel XLS &amp; XLSX, CEL, CHP, CNT, Illumina, Plink PED, TPED, BED, Agilent files, NimbleGen data summary files, VCF files, Impute2 GWAS files, HapMap format, MACH output, and 50+ other formats. Consumes NCBI data directly</li>
</ul>
</li>
<li><strong><a href="https://www.genomatix.de/">Genomatix</a></strong>
<ul>
<li>Applications: ChIP-Seq, DNA-Seq, RNA-Seq, DNA methylation; aimed at enabling personalized medicine</li>
<li>Mining Stations: Supports all established NGS sequencing platforms- SOLiD, 454 Life Sciences, Genome Analyzer, HiSeq, MiSeq, IonTorrent</li>
<li>Software Suite: can upload sequence or BED files</li>
<li>Genome browser: BED and BAM files, Public data- 1500 BED files available for every user</li>
</ul>
</li>
<li><strong><a href="http://www.biodatomics.com/">Biodatomics</a></strong>
<ul>
<li>Open source platform (SaaS) with analysis and genome sequencing tools; integrates over 400 open source genomic analysis tools and pipelines; has private and public cloud versions. Features: genomic data visualization, drag and drop interface, accelerated analysis, real-time collaboration</li>
<li>They have a couple modules to do so, and have enabled parts of the sra toolkit</li>
</ul>
</li>
<li><strong><a href="https://www.solvebio.com/">SolveBio</a></strong>
<ul>
<li>Software product, for clinical genomics professionals, manage, curate, report genomic variation</li>
<li>Has own data library -- data from NCBI</li>
</ul>
</li>
<li><strong><a href="http://www.basepairtech.com">Basepair</a></strong>
<ul>
<li>Offers high quality workflows for all common NGS applications (RNA-Seq, ChIP-Seq, DNA-Seq, etc.)</li>
<li>Very fast - get all results in 1-2 hours. Cloud-based, no storage or computing limits.</li>
<li>Easy to use - less than a minute to run an analysis</li>
<li>REST and Python API to manage large projects.</li>
</ul>
</li>
</ol><h2>Variant Identification</h2><h3>Germline Callers</h3><ol>
<li><strong><a href="http://mathgen.stats.ox.ac.uk/impute/impute_v2.html">IMPUTE2</a></strong>
<ul>
<li>Description: phases observed genotypes and imputes missing genotypes; uses reference panels to provide all available haplotypes; does not use population labels or genome-wide measures; designed to represent variation in one population; fairly popular</li>
<li>Input:</li>
<li>Reference Haplotypes: Links to 1000 Genomes and HapMap downloads</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://github.com/ekg/freebayes">FreeBayes</a></strong>
<ul>
<li>Description: finds SNPs, Indels, MNPs; reports variants based on alignment; haplotype based</li>
<li>Input: BAM- uses BAMtools API to parse</li>
<li>Reference genome: FASTA</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://soap.genomics.org.cn/soapindel.html">SOAPindel</a></strong>
<ul>
<li>Description: detects indels from NGS paired-end sequencing</li>
<li>Input: read alignment files in SOAP or SAM format; users must also supply the raw reads in FASTA or FASTQ</li>
<li>Reference Sequence used to align reads: FASTA</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://github.com/danmaclean/2kplus2">2Kplus2</a></strong>
<ul>
<li>Description: algorithm searches graphs produced by de novo assembler Cortex; c++ source code for SNP detection &ldquo;2kplus2.cpp is a c++ source code for the detection and the classification of single nucleotide polymorphisms in transformed De Bruijn graphs using Cortex assembler.&rdquo;</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://www.hgsc.bcm.edu/software/atlas-2">Atlas 2</a></strong>
<ul>
<li>Description: specializes in separation of true SNPs and indels from sequencing and mapping errors, last update January 2013</li>
<li>Input: BAM file</li>
<li>Reference Genome: FASTA</li>
<li>Output: produces VCF</li>
</ul>
</li>
<li><strong><a href="https://sites.google.com/site/vibansal/software/crisp">CRISP</a></strong>
<ul>
<li>Description: identifies SNPs and INDELs from pooled high-throughput NGS, not used for analysis of single samples; implemented in C and uses SAMtools API; latest version should work with diploid genomes</li>
<li>Input: requires BAM files (aligned with GATK)</li>
<li>Reference Genome: indexed FASTA file</li>
<li>Output: VCF files</li>
</ul>
</li>
<li><strong><a href="http://www.sanger.ac.uk/resources/software/dindel/">Dindel</a></strong>
<ul>
<li>Description: (Wellcome Trust Sanger) calls small indels from short-read sequences; can only handle Illumina data; cannot test candidate indels; written in C++; used on Linux-based and Mac computers (not tested on Windows)</li>
<li>Input: BAM files</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://colibread.inria.fr/software/discosnp/">discoSnp++</a></strong>
<ul>
<li>Description: detects homozygous and heterozygous SNPs and Indels; software composed of 2 modules (kissnp2 and kissreads)</li>
<li>Input: raw NGS datasets; fasta, fastq, gzipped or not;</li>
<li>no reference genome required; read pairs can be given</li>
<li>Output: FASTA</li>
</ul>
</li>
<li><strong><a href="http://odin.mdacc.tmc.edu/~wwang7/FamSeqIndex.html">FamSeq</a></strong>
<ul>
<li>Description: family-based sequencing studies- provides probability of an individual carrying variant based on family&rsquo;s raw measurements; accommodates de novo mutations, can perform variant calling at chrX;</li>
<li>Input: VCF</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://sourceforge.net/p/%20geneticthesaurus/wiki/Example/">GeneticThesaurus</a></strong>
<ul>
<li>Description: &ldquo;Annotation of genetic variants in repetitive regions&rdquo;</li>
<li>Input: Initial variant calling from bam &rarr; vcf output</li>
<li>Reference Genome: need to provide own fasta file for hg19 genome,</li>
<li>Output: vcf.gz, vtf.gz, and baf.tsv.gz output</li>
</ul>
</li>
<li><strong><a href="http://genome.sph.umich.edu/wiki/GlfMultiples">glfMultiples</a></strong>
<ul>
<li>Description: command-line, variant caller</li>
<li>Input: GLF</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://genome.sph.umich.edu/wiki/GlfSingle">glfSingle</a></strong>
<ul>
<li>Description: uses a likelihood-based model for variant calling; starts from genotype likelihoods that have been computed by other tools (e.g. Samtools BAQ); the likelihoods are combined with an individual-based prior p(genotype) to generate posterior probabilities</li>
<li>Input: GLF</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://github.com/ddcap/halvade">Halvade</a></strong>
<ul>
<li>Description: command-line; written in Java, &ldquo;to run halvade a reference is needed for both GATK and BWA and a SNP (dbSNP!) database is required&rdquo;</li>
<li>Input: FASTQ</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://github.com/aakrosh/indelMINER">indelMINER</a></strong>
<ul>
<li>Description: identifies indels from paired-end reads</li>
<li>Input: BAM (aligned in SAMtools API)</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://www.broadinstitute.org/cancer/cga/indelocator">Indelocator</a></strong>
<ul>
<li>Description: (Broad Institute) does not perform realignment; relies on the alignments in the BAM files (reads must be aligned before the files are passed to Indelocator); running GATK beforehand is recommended</li>
<li>Input: 2 BAM files (tumor &amp; normal), annotated as germline or somatic; also has single sample mode</li>
<li>Output: &ldquo;Output of Indelocator is a high-sensitivity list of putative indel events containing large numbers of false positives. The statistics reported for each event have to be used to custom-filter the list in order to lower false positive rate&rdquo;</li>
</ul>
</li>
<li><strong><a href="https://github.com/sequencing/isaac_variant_caller">Isaac Variant Caller</a></strong>
<ul>
<li>Description: detects SNPs and small indels from diploid sample; designed to run on &ldquo;*nux-like platforms&rdquo;</li>
<li>Input: BAM</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://www.swisstph.ch/kvarq">KvarQ</a></strong>
<ul>
<li>Description: in silico genotyping for selected loci in bacterial genome, written in Python and C</li>
<li>Input: FASTQ</li>
<li>reference genome or de novo assembly not needed</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://sourceforge.net/projects/lofreq/files/">LoFreq</a></strong>
<ul>
<li>Description: SNV caller, Python language, standalone program, uncovers cell-population heterogeneity from high-throughput sequencing datasets; calls variants found in &lt;.05% of the population</li>
<li>Input: BAM file; running it through GATK first is suggested</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://github.com/Illumina/manta">Manta</a></strong>
<ul>
<li>Description: Calls indels and SVs from paired end reads; standalone, command line program; Written in C++ and Python</li>
<li>Input: BAM (can tolerate non-paired-end reads); a matched tumor sample may be provided as well</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://github.com/benedictpaten/marginAlign">MarginAlign</a></strong>
<ul>
<li>Description: SNV caller, specifically tailored to Oxford Nanopore Reads, written in Python; Package comes with 3 programs, marginAlign, marginCaller (calls SNVs), marginStats (computes qc stats on sam files)</li>
<li>Input: SAM</li>
<li>Output: SAM</li>
</ul>
</li>
<li><strong><a href="http://gmt.genome.wustl.edu/packages/mendelscan/">MendelScan</a></strong>
<ul>
<li>Description: Last release March 2014; for analyzing sequencing data in family studies of inherited diseases; variant calls for a family in VCF file; still in alpha-testing on github, example data uses 1000 genomes dataset</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://github.com/mitenjain/nanopore">nanopore</a></strong>
<ul>
<li>Description: software pipeline from the UCSC Nanopore group (a UCSC group that studies the use of ion channels for analysis of single RNA/DNA structures); tailored to Oxford Nanopore reads; command line program</li>
<li>Input: FASTQ</li>
<li>Reference files: FASTA</li>
<li>Output: &ldquo;For each possible pair of read file, reference genome and mapping algorithm an experiment directory will be created in the nanopore/output directory.&rdquo;</li>
</ul>
</li>
<li><strong><a href="http://omictools.com/platypus-s1989.html">Platypus</a></strong>
<ul>
<li>Description: Package program, written in C, Python, Cython; Can identify SNPs, MNPs, short indels, and larger variants; has been tested on very large datasets (1000 genomes)</li>
<li>Input: BAM</li>
<li>Reference Genome: FASTA (files must be indexed using Samtools or a similar program)</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://www.bioinformatics.nl/QualitySNPng/">QualitySNPng</a></strong>
<ul>
<li>Description: detection of SNPs; &ldquo;can be used as a standalone application with graphical user interface as part of pipeline system&rdquo;; does not require fully sequenced reference genome; haplotype strategy</li>
<li>Input: SAM, ACE</li>
<li>Output: GUI</li>
</ul>
</li>
<li><strong><a href="http://revister.sourceforge.net/">ReviSTER</a></strong>
<ul>
<li>Description: command line program; automated pipeline; utilizes BWA (for mapping), BLAT, and SAMtools</li>
<li>Input: FASTQ</li>
<li>Reference sequence file and list file containing STR locations as inputs</li>
<li>Output: SAM</li>
</ul>
</li>
<li><strong><a href="http://dna-discovery.stanford.edu/software/rvd/">RVD</a></strong>
<ul>
<li>Description: command-line program, detection of rare SNVs, relies upon Samtools, can be run in MATLAB</li>
<li>Input: BAM</li>
<li>Reference Genome: FASTA</li>
<li>Output: &ldquo;The algorithm output is a call table -- a comma-separated file with one line for each base position and each line in the following format:</li>
<li>AlignmentReferencePosition, AlignmentBase, Call, SecondBase, CenteredErrorPrc, ReferenceErrorPrc, SecondBasePrc&rdquo;</li>
</ul>
</li>
<li><strong><a href="http://snver.sourceforge.net/">SNVer</a></strong>
<ul>
<li>Description: calls common and rare variants in pool or individual NGS data, reports overall p-value, operating system independent statistical tool, identifies SNPs and INDELs, written in Java, no dependencies, straightforward command-line</li>
<li>SNVerGUI: GUI version; desktop tool for variant detection</li>
<li>Input: chrX annotation, sam.zip, bam.zip</li>
<li>reference file must be aligned to the data file</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://compbio.bccrc.ca/software/snvmix/">SNVMix</a></strong>
<ul>
<li>Description: detects SNVs from NGS, post-alignment tool</li>
<li>Input: pileup format (Maq or Samtools)</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/structural-variant-machine--sv-m-.html">SV-M</a></strong>
<ul>
<li>Description: Structural Variant Machine - predicts indels, uses split read alignment profiles, validated by Sanger Sequencing</li>
<li>Input: paired-end Illumina reads from 1001 genomes project (uses ref plant- 1001genomes.org)</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://github.com/slindgreen/SNPest">SNPest</a></strong>
<ul>
<li>Description: Standalone program, language C++, Perl</li>
<li>Input: mpileup (SAMtools)</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://genome.sph.umich.edu/wiki/TrioCaller">TrioCaller</a></strong>
<ul>
<li>Description: Command line program, relies on BWA and samtools; genotype calling for unrelated individuals and parent-offspring trios</li>
<li>Input: BAM (that has been aligned with BWA and Samtools)</li>
<li>Output: BCF that can be formatted to VCF using bcftools</li>
</ul>
</li>
<li><strong><a href="http://www.vicbioinformatics.com/software.snippy.shtml">Snippy</a></strong>
<ul>
<li>Description: finds indels between haploid reference genome and NGS sequence reads</li>
<li>Input: read files in FASTQ or FASTA (can be .gz compressed)</li>
<li>Reference genome in FASTA or GENBANK</li>
<li>Output: .aln, .tab, .txt</li>
</ul>
</li>
<li><strong><a href="http://orca.bu.edu/vntrseek/">VntrSeek</a></strong>
<ul>
<li>Description: pipeline for discovering microsatellite tandem repeats with high-throughput sequencing data</li>
<li>Input: gzip-compressed FASTA or FASTQ</li>
<li>Output: VCF files; one for TRs and observed alleles, another file contains link to viewer</li>
</ul>
</li>
</ol><h3><a href="https://github.com/NCBI-Hackathons/Community_Software_Tools_for_NGS/blob/master/NGS_Tools_List.md#somatic-callers"></a>Somatic Callers</h3><ol>
<li><strong><a href="http://cakesomatic.sourceforge.net/">Cake</a></strong>
<ul>
<li>Description: standalone program, &ldquo;pipeline for the integrated analysis of somatic variants in cancer genomes&rdquo;; integrates four algorithms; written in Perl; required tools: samtools, tabix, vcftools, VarScan2, bambino, cmake, somaticsniper (User guide; workflow page)</li>
<li>Input: tumor and normal reads in BAM files, run through variant calling programs to generate intermediate VCF</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://www.broadinstitute.org/cancer/cga/mutect">MuTect</a></strong>
<ul>
<li>Description: Broad Institute, identification of somatic point mutations in cancer genomes; requires preprocessing of reads (GATK)</li>
<li>Input: same as GATK (FASTA reference genome, SAM read files)</li>
<li>Output: call-stats, VCF, wiggle files</li>
</ul>
</li>
<li><strong><a href="http://genome.sph.umich.edu/wiki/Polymutt">Polymutt</a></strong>
<ul>
<li>Description: calls SNVs and detects de novo point mutations in families</li>
<li>Input: GLF or BAM or VCF (must have identical chromosome orders)</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://tvap.genome.wustl.edu/tools/bassovac/">Bassovac</a></strong>
<ul>
<li>Description: Improved Bayesian inversion somatic caller; unlike other software packages, treats effects fully probabilistically instead of using ad-hoc modeling; effects are integrated at the atomic level and standard probability theory integrates read tallies to the sample level and to the tumor-normal pair level; "pending public release"</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://bioinformatics.ustc.edu.cn/CLImAT/">CLImAT</a></strong>
<ul>
<li>Description: standalone program; &ldquo;accurate detection of copy number alteration and loss of heterozygosity in impure and aneuploid tumor samples using whole genome sequencing data&rdquo;</li>
<li>Input: depth file generated by DFExtract and a config file</li>
<li>Output: .results file, .Gtype, LOG.txt, also generates visualization</li>
</ul>
</li>
<li><strong><a href="http://denovogear.sourceforge.net/">DeNovoGear</a></strong>
<ul>
<li>Description: de-novo variant calling and interpretation; standalone program; dependencies C++ compiler, CMake, HTSlib, Eigen, Boost</li>
<li>Input: PED and BCF</li>
<li>Output: &ldquo;The output format is a single row for each putative de novo mutation (DNM), with the following fields&rdquo;</li>
</ul>
</li>
<li><strong><a href="https://github.com/friend1ws/EBCall">EBCall</a></strong>
<ul>
<li>Description: Empirical Bayesian Mutation Calling; standalone program; uses tumor/normal paired reads and non-paired normal reference samples; dependent on samtools, R and the VGAM package for R</li>
<li>Input: BAM</li>
<li>Output: file type not specified; &ldquo;The format of the result is suitable for adding annotation by annovar.&rdquo;</li>
</ul>
</li>
<li><strong><a href="https://github.com/usuyama/hapmuc">HapMuc</a></strong>
<ul>
<li>Description: standalone program; &ldquo;utilizes the information of heterozygous germline variants near candidate mutations&rdquo;; Dependent upon- Boost, SAMtools, BEDtools; 3 step workflow</li>
<li>Input: BAM</li>
<li>Output: BED</li>
</ul>
</li>
<li><strong><a href="https://github.com/cui-lab/multigems">MultiGeMS</a></strong>
<ul>
<li>Description: Multi-sample Genotype Model Selection</li>
<li>Input: .txt, pileup (SAM/BAM converted to pileup format)</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://bitbucket.org/joseph07/multisnv/wiki/Home">MultiSNV</a></strong>
<ul>
<li>Description: command-line program; calls SNVs from NGS data from multiple samples from the same patient; dependent on R, Git, cmake, Boost and compile libraries</li>
<li>Input: BAM or pileup</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://compbio.bccrc.ca/software/mutationseq/">MutationSeq</a></strong>
<ul>
<li>Description: standalone program, somatic SNV detection in tumor/normal samples; dependent on python, bamtools, boost, and LAPACK</li>
<li>Input: BAM</li>
<li>Output: VCF4.1 consisting of two parts (meta information &amp; data lines)</li>
</ul>
</li>
<li><strong><a href="http://www.qcmg.org/bioinformatics/tiki-index.php">qSNP</a></strong>
<ul>
<li>Description: standalone program; SNV caller for somatic variants in &ldquo;low cellularity cancer samples&rdquo;</li>
<li>Input: BAM, dbSNP data, Illumina data, chrConv</li>
<li>Output: &ldquo;qSNP output files are named using a 4-element pattern: ...&rdquo;</li>
</ul>
</li>
<li><strong><a href="https://github.com/aradenbaugh/radia/">RADIA</a></strong>
<ul>
<li>Description: RNA and DNA Integrated Analysis for Somatic Mutation Detection; DNA only Method (tumor/normal pair, ignores RNA) or Triple BAM Method (uses all three datasets from same patient); dependent upon python, samtools, pysam API, BLAT, SnpEff</li>
<li>Input: BAM</li>
<li>Reference Genome: FASTA indexed with SAMtools faidx</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://genomics.wpi.edu/rvd2/">RVD2</a></strong>
<ul>
<li>Description: sensitive variant detection for low-depth targeted NGS data; Python module or command-line program</li>
<li>Input: tab-delimited depth chart format (converted from pileup files)</li>
<li>Output: three hdf5 files and a vcf file</li>
</ul>
</li>
<li><strong><a href="https://github.com/nhansen/Shimmer">Shimmer</a></strong>
<ul>
<li>Description: standalone program; detects somatic SNVs with multiple testing correction, uses Fisher&rsquo;s exact test; dependent on git, samtools, R, R statmod package; for tumor/normal matched samples</li>
<li>Input: BAM</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://www.cs.helsinki.fi/en/gsa/snv-ppilp/">SNV-PPILP</a></strong>
<ul>
<li>Description: Refines GATK&rsquo;s Unified Genotyper SNV calls for &ldquo;multiple samples assumed to form a phylogeny&rdquo;</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://gmt.genome.wustl.edu/packages/somatic-sniper/">SomaticSniper</a></strong>
<ul>
<li>Description: command-line application to identify SNPs between tumor/normal pairs- predicts probability of difference between two</li>
<li>Input: BAM</li>
<li>Reference Genome in FASTA</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://sites.google.com/site/strelkasomaticvariantcaller/">Strelka</a></strong>
<ul>
<li>Description: somatic variant calling workflow for matched tumor-normal samples; detects indels; runs on *nix-like platforms</li>
<li>Input: BAM (must be sorted and indexed); Strelka does its own realignment around indels, so that pre-processing step is not needed</li>
<li>Output: pair of VCF files</li>
</ul>
</li>
<li><strong><a href="http://www.pitt.edu/~wec47/triodenovo.html">Triodenovo</a></strong>
<ul>
<li>Description: Bayesian framework for calling de novo mutations in trios</li>
<li>Input: VCF file with PL or GL fields (recommend using GATK or samtools to generate)</li>
<li>Output: out_vcf</li>
</ul>
</li>
<li><strong><a href="http://lbg.med.unc.edu/~mwilkers/unceqr_dist/">UNCeqr</a></strong>
<ul>
<li>Description: finds somatic mutations using integration of DNA and RNA seq data-- boosts sensitivity for low purity tumors and rare mutations;</li>
<li>Input: &ldquo;can accept a variety of sequencing inputs and configurations&rdquo;</li>
<li>Output: &ldquo;table of somatically mutated sites and associated information. These somatic mutations can be annotated with predicted transcript and protein effects using third party tools, such as Annovar&rdquo;</li>
</ul>
</li>
<li><strong><a href="http://sourceforge.net/projects/virmid/">Virmid</a></strong>
<ul>
<li>Description: Virtual Microdissection for SNP calling; Java based; for disease-control matched samples; uncovers SNPs with low allele frequency by considering alpha contamination</li>
<li>Input: BAM (must be sorted and indexed- samtools sort)</li>
<li>Output: VCF and report file</li>
</ul>
</li>
</ol><h3><a href="https://github.com/NCBI-Hackathons/Community_Software_Tools_for_NGS/blob/master/NGS_Tools_List.md#germline--somatic--callers"></a>Germline + Somatic Callers</h3><ol>
<li><strong><a href="http://massgenomics.org/varscan">VarScan 2</a></strong>
<ul>
<li>Description: identify germline variants, private and shared variants, somatic mutations, and somatic CNVs; detects indels</li>
<li>Input: SAMtools pileup</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://genformatic.com/baysic/">BAYSIC</a></strong>
<ul>
<li>Description: Bayesian method; combines variant calls from different methods (GATK, FreeBayes, Atlas, Samtools, etc)</li>
<li>Input: VCF format from one or more variant calling programs</li>
<li>Output: VCF file containing integrated set of variant calls</li>
</ul>
</li>
<li><strong><a href="https://github.com/ding-lab/msisensor">MSIsensor</a></strong>
<ul>
<li>Description: Microsatellite instability detection; C++ program, detects somatic and germline variants in tumor-normal paired data</li>
<li>Input: BAM index files (normal and tumor)</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://faculty.washington.edu/browning/beagle/beagle.html">Beagle version 4</a></strong>
<ul>
<li>Description: software package for genotype calling, phasing, imputation of ungenotyped markers, and identity-by-descent segment detection (may not belong in this category)</li>
<li>Input: VCF</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://www.iro.umontreal.ca/~csuros/quadgt/">QuadGT</a></strong>
<ul>
<li>Description: software package, SNV calling from normal-tumor pair and two parent genomes; quantifies descent-by-modification relationships; Written in Java</li>
<li>Input: BAM files (parsed by Picard/Samtools API)</li>
<li>Reference Genome: FASTA</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="http://sourceforge.net/projects/rarevator/">RAREVATOR</a></strong>
<ul>
<li>Description: RAre REference VAriant annotaTOR; command line; &ldquo;identification and annotation of germline and somatic variants in rare reference allele loci from second generation sequencing data&rdquo;; Bayesian genotype likelihood model</li>
<li>Input: BED or VCF files from GATK</li>
<li>Output: two VCF files (one for SNVs, one for Indels)</li>
</ul>
</li>
<li><strong><a href="http://scalpel.sourceforge.net/">Scalpel</a></strong>
<ul>
<li>Description: Used for detecting indels in a reference genome; performs localized micro-assembly of specific regions of interest; can do single, de novo, somatic reads; requires that raw reads are aligned with BWA</li>
<li>Input: BAM</li>
<li>Output: either VCF or ANNOVAR</li>
</ul>
</li>
<li><strong><a href="http://soap.genomics.org.cn/soapsnp.html">SOAPsnp</a></strong>
<ul>
<li>Description: based on Bayes&rsquo; theorem; calls consensus genotype</li>
<li>Input: SOAP short read alignment results</li>
<li>Output: GLF, option of flat tabular format</li>
</ul>
</li>
<li><strong><a href="http://sourceforge.net/projects/variantmaster/">VariantMaster</a></strong>
<ul>
<li>Description: &ldquo;extract causative variants for monogenic and sporadic genetic diseases&rdquo;; uses ANNOVAR;</li>
<li>Input: BAM or VCF files (from SAMtools, GATK)</li>
<li>Output:</li>
</ul>
</li>
</ol><h2><a href="https://github.com/NCBI-Hackathons/Community_Software_Tools_for_NGS/blob/master/NGS_Tools_List.md#downstream-analysis-of-variants"></a>Downstream Analysis of Variants</h2><ol>
<li><strong><a href="https://github.com/hakyimlab/PrediXcan%20https://github.com/hriordan/PrediXcan/">PrediXcan</a></strong>
<ul>
<li>Description: command-line, standalone package program; available in Perl, Python, and R versions; predicts likelihood of a gene being related to a certain phenotype- &ldquo;that directly tests the molecular mechanisms through which genetic variation affects phenotype.&rdquo;; no actual expression data used, only in silico expression; &ldquo;PrediXcan can detect known and novel genes associated with disease traits and provide insights into the mechanism of these associations.&rdquo;</li>
<li>Input: genotype and phenotype file (doesn&rsquo;t specify file type)</li>
<li>Output: default values: genelist, dosages (file format: snpid rsid), dosage_prefix, weights, output</li>
</ul>
</li>
<li><strong><a href="http://ritchielab.psu.edu/software/athena-downloads">ATHENA</a></strong>
<ul>
<li>Description: Analysis Tool for Heritable and Environmental Network Associations; software package, combines machine learning model with biology and statistics to predict non-linear interactions</li>
<li>Input: Configuration file, Data file, Map file (includes rsID)</li>
<li>Output: Summary file, Best model file, dot file, individual score file, cross-validation file</li>
</ul>
</li>
<li><strong><a href="http://www.sanger.ac.uk/resources/software/rarevariant/#t_2">CCRaVAT and QuTie</a></strong>
<ul>
<li>Description: (Wellcome Trust Sanger) Case-Control Rare Variant Analysis Tool and Quantitative Trait; software packages for large-scale analysis of rare variants</li>
<li>Input: PED file and MAP file</li>
<li>Output: Five tab-delimited txt files</li>
</ul>
</li>
<li><strong><a href="http://cnsgenomics.com/software/gcta/">GCTA</a></strong>
<ul>
<li>Description: Genome Wide Complex Trait Analysis; package program, command line interface; estimates variance by all SNPs; 5 main functions: &ldquo;data management, estimation of the genetic relationships from SNPs, mixed linear model analysis of variance explained by the SNPs, estimation of the linkage disequilibrium structure, and GWAS simulation&rdquo;</li>
<li>Input: PLINK binary PED files, MACH output format</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://genomecomb.sourceforge.net/">GenomeComb</a></strong>
<ul>
<li>Description: package for analysis of complete genome data; annotation using public data or custom tracks, automated primer design for Sanger or Sequenom validation; &ldquo;The cg process_illumina command can be used to generate annotated multisample data starting from fastq files, using tools such as bwa for alignment and GATK and samtools for variant calling. Sequencing data can also be imported from Complete Genomics (cg_process_sample command), Real Time Genomics (cg_process_rtgsample command) and VariantCallFormat (VCF) variant files (vcf2sft command).&rdquo;</li>
<li>Input: Sequencing data from Complete Genomics, Illumina, SOLiD and VCF;</li>
<li>Output: standard file format used is a simple tab delimited file (.sft, .tsv)</li>
</ul>
</li>
<li><strong><a href="http://ancorr.eimb.ru/">Genome Track Analyzer</a></strong>
<ul>
<li>Description: compares genome tracks; allows user to compare DNA expression/binding;</li>
<li>Input: multiple: SGR/TXT, BED, BED6, GFF; if using prealigned sequence data- use MACS peak caller: BAM, BED, SAM, ELAND</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://animalgene.umn.edu/gvcblub">GVCBLUP</a></strong>
<ul>
<li>Description: animal gene mapping; &ldquo;genomic prediction and variance component estimation of additive and dominance effects&rdquo;; standalone program, command line interface, written in C++ and Java</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://www.jurgott.org/linkage/homog.htm">HOMOG</a></strong>
<ul>
<li>Description: Analyzes heterogeneity with respect to single marker loci or known maps of markers; Carries out homogeneity test for alternative hypothesis &ldquo;Two family types, one with linkage between a trait to a marker or map of markers, the other without linkage&rdquo;</li>
<li>Input: HOMOG.DAT - described on website</li>
<li>Output: HOMOG.OUT</li>
</ul>
</li>
<li><strong><a href="http://intersnp.meb.uni-bonn.de/">INTERSNP</a></strong>
<ul>
<li>Description: GWIA for case-control SNP and quantitative traits; SNPs are selected for joint analysis using a priori information; Provides linear regression framework, Pathway Association Analysis, Genome-wide Haplotype Analysis</li>
<li>Input: PLINK input formats (ped/map, tped/tfam, bed/bim/fam); compatible with SetID files</li>
<li>Gene reference file: Ensembl Release 75</li>
<li>Output: covariance matrix for regression models</li>
</ul>
</li>
<li><strong><a href="https://github.com/PMBio/mtSet">mtSet</a></strong>
<ul>
<li>Description: Currently only the standalone version is available, but it is moving to the LIMIX software suite; offers set tests, which allow for association testing between variants and traits; accounts for confounding factors, e.g. relatedness</li>
<li>Input: sample-to-sample genetic covariance matrix needs to be computed; multiple types of input; simulator requires input genotype and relatedness component;</li>
<li>Output: resdir (result file of analysis), outfile (test statistics and p-values), manhattan_plot (flag)</li>
</ul>
</li>
<li><strong><a href="http://dougspeed.com/multiblup/">MultiBLUP</a></strong>
<ul>
<li>Description: Package program, command line interface; constructs linear prediction models; Best Linear Unbiased Prediction; improves upon BLUP involving kinship matrices; options: pre-specified kinships, regional kinships, adaptive multiblups, LD weightings</li>
<li>Input: PLINK format</li>
<li>Output: .reml, .indi.blp</li>
</ul>
</li>
</ol><h2><a href="https://github.com/NCBI-Hackathons/Community_Software_Tools_for_NGS/blob/master/NGS_Tools_List.md#variant-annotation"></a>Variant Annotation</h2><ol>
<li><strong><a href="http://annovar.openbioinformatics.org/en/latest/">ANNOVAR</a></strong>
<ul>
<li>Description: command-line tool, supports SNPs, INDELs, CNVs and block substitutions, provides wide variety of annotation techniques, depends upon multiple databases (each needing to be downloaded); annotates genetic variants; utilizes RefSeq, UCSC Genes, and the Ensembl gene annotation systems; can compare detected mutations with those in dbSNP or the 1000 Genomes Project; Very popular; &ldquo;The final command run TABLE_ANNOVAR, using dbSNP version 138, 1000 Genomes Project 2014 Oct version, NIH-NHLBI 6500 exome database version 2 (referred to as esp6500siv2), dbNFSP version 2.6 (referred to as ljb26), dbSNP version 138 (referred to as snp138) databases and remove all temporary files, and generates the output file called myanno.hg19_multianno.txt&rdquo;</li>
<li>Input: VCF, ANNOVAR input format (simple text-based format); can convert other formats into ANNOVAR input format</li>
<li>Output: VCF (if input VCF), output file with multiple columns, tab-delimited output file</li>
</ul>
</li>
<li><strong><a href="http://wannovar.usc.edu/">wANNOVAR</a></strong>
<ul>
<li>provides web-based access to ANNOVAR software</li>
</ul>
</li>
<li><strong><a href="http://genetics.bwh.harvard.edu/pph2/">PolyPhen-2</a></strong>
<ul>
<li>Description: Very popular; Polymorphism Phenotyping; Web application; predicts impact of amino acid substitution on protein; Calculates Bayes posterior probability (Last update July 2015)</li>
<li>Input: FASTA</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://sift.jcvi.org/">SIFT</a></strong>
<ul>
<li>Description: predicts how an amino acid substitution will affect protein function; Based on degree of conservation of amino acid residues, collected through PSI-BLAST; can be applied to nonsynonymous polymorphisms or laboratory-induced missense mutations; links to dbSNP 132, GRCh37; Standalone or web app program; Very popular</li>
<li>Input: Uniprot ID or Accession, Go term ID, Function name, Species Name or ID, etc</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://snpeff.sourceforge.net/">snpEff</a></strong>
<ul>
<li>Description: Genetic variant annotation and effect prediction toolbox; integrated with Galaxy, GATK, and GKNO; can annotate SNPs, MNPs, and insertions and deletions; categorizes effects into classes by functionality; Very popular; Standalone or Web app; Claims to calculate all SNPs in 1000 genomes (EMBI) in less than 15 minutes; Provides assessment of impact of the variant (low, medium or high)</li>
<li>Input: VCF, BED</li>
<li>Output: VCF (with new ANN field, also used in ANNOVAR and VEP), HTML summary files</li>
</ul>
</li>
<li><strong><a href="http://snpeff.sourceforge.net/SnpSift.html">SnpSIFT</a></strong>
<ul>
<li>Description: Filter and manipulate annotated files; Part of SnpEff main distribution; once variants have been annotated, this can be used to filter your data to find relevant variants</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://www.yandell-lab.org/software/vaast.html">VAAST 2</a></strong>
<ul>
<li>Description: Variant Annotation, Analysis, and Search Tool; probabilistic search tool for identifying damage genes and the disease causing variants; can score both coding and non-coding variants; Four tools: VAT (Variant annotation tool), VST (Variant Selection Tool), VAAST, pVAAST (for pedigree data); updated April 2015</li>
<li>Input: FASTA, GFF3, GVF</li>
<li>Output: CDR (condenser file), VAAST file (both unique to VAAST)</li>
</ul>
</li>
<li><strong><a href="http://useast.ensembl.org/info/docs/tools/vep/index.html?redirect=no">VEP</a></strong>
<ul>
<li>Description: (Ensembl) Variant Effect Predictor; determines effect of variants on genes, transcripts, and protein sequence; uses SIFT and PolyPhen</li>
<li>Input: Coordinates of variants and nucleotide changes; whitespace-separated format, VCF, pileup, HGVS</li>
<li>Output: VCF, JSON, Statistics</li>
</ul>
</li>
<li><strong><a href="http://www.broadinstitute.org/cancer/cga/absolute">ABSOLUTE</a></strong>
<ul>
<li>Description: (Broad Institute); can estimate purity and ploidy to compute absolute copy number and mutation multiplicities; reextracts data from the mixed DNA population</li>
<li>Input: HAPSEQ segdat or segmentation file</li>
<li>Output: per-sample output directory and subdirectory providing per-sample text files containing standard out being emitted from R</li>
</ul>
</li>
<li><strong><a href="http://www.interactive-biosoftware.com/alamut-batch/">Alamut Batch</a></strong>
<ul>
<li>Description: high-throughput annotation software for NGS analysis; for &ldquo;intensive variant analysis workflows&rdquo;; &ldquo;enriches raw NGS variants with dozens of attributes&rdquo;; based on clinically oriented Alamut database; Supports human genes; easy to integrate into pipeline (Latest Release- July 2015)</li>
<li>Input: VCF, tab-delimited file</li>
<li>Output: tab-separated file of annotations</li>
</ul>
</li>
<li><strong><a href="http://avia.abcc.ncifcrf.gov/apps/site/index">AVIA</a></strong>
<ul>
<li>Description: Annotation, Visualization, and Impact Analysis; &ldquo;The tool is based on coupling a comprehensive annotation pipeline with a flexible visualization method. We leveraged the ANNOVAR (Wang et al., 2010) framework for assigning functional impact to genomic variations by extending its list of reference annotation databases (RefSeq, UCSC, SIFT, Polyphen etc.) with additional in-house developed sources (Non-B DB, PolyBrowse).&rdquo;</li>
<li>Input: BED</li>
<li>Output: Table of annotations with gene annotation features</li>
</ul>
</li>
<li><strong><a href="http://bioinformaticstools.mayo.edu/research/bior/">BioR</a></strong>
<ul>
<li>Description: (Mayo Clinic) (Page last updated June 2015) Biological Reference Repository; &ldquo;data integration tool that enables coordinate based searches and joins based on strings&rdquo;; &ldquo;BioR consists of two parts 1) the BioR toolkit which depends on Java&hellip;. 2) the BioR catalogs which are the data files used by the system&rdquo;</li>
<li>Input: VCF</li>
<li>BioR-Supported Catalogs (tar-gzip files): dbSNP, 1000 genomes, HapMap, OMIM, NCBIGene</li>
<li>Output: VCF + JSON</li>
</ul>
</li>
<li><strong><a href="http://cadd.gs.washington.edu/">CADD</a></strong>
<ul>
<li>Description: Combined Annotation Dependent Depletion; tool for scoring SNVs and insertions/deletions; &ldquo;integrates multiple annotations into one metric&rdquo;; Score strongly correlates with allelic diversity and pathogenicity; links to 1000 Genome variants; uses Ensembl Variant Effect Predictor</li>
<li>Input: VCF</li>
<li>Output: CADD score</li>
</ul>
</li>
<li><strong><a href="http://www2.hu-berlin.de/wikizbnutztier/software/CandiSNPer/">CandiSNPer</a></strong>
<ul>
<li>Description: web application, characterizes SNPs located in vicinity of SNP of interest;</li>
<li>Input: enter SNP ID (rsID), choose population, region, measure for LD, threshold, plot format, color of SNPs, and choose whether to show genes</li>
<li>Output: Image file</li>
</ul>
</li>
<li><strong><a href="https://github.com/UppsalaGenomeCenter/CanvasDB">CanvasDB</a></strong>
<ul>
<li>Description: &ldquo;local database infrastructure for analysis of targeted- and whole genome re-sequencing projects&rdquo;; dependent on MySQL, R, and ANNOVAR</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://www.sanger.ac.uk/resources/software/carol/">CAROL</a></strong>
<ul>
<li>Description: (Wellcome Trust Sanger); Combined Annotation scoRing toOL; Combined functional annotation score of nonsynonymous coding variants; Combines information from PolyPhen-2 and SIFT</li>
<li>Input: tab-delimited with columns obtained from PolyPhen-2 and SIFT output</li>
<li>Output: tab-delimited file</li>
</ul>
</li>
<li><strong><a href="http://wiki.chasmsoftware.org/index.php/Main_Page">CHASM</a></strong>
<ul>
<li>Description: Cancer-specific High-throughput Annotation of Somatic Mutations; Last updated May 2014; uses Random Forest Method to &ldquo;distinguish between driver and passenger somatic mutations&rdquo;; Positive driver class curated from COSMIC database; packed together with SNVBox (database)</li>
<li>Input: Passenger mutation rates, Transcript and amino acid change, Genomic coordinates</li>
<li>Output: CHASM score, p-value, FDR</li>
</ul>
</li>
<li><strong><a href="http://www.cravat.us/">CRAVAT</a></strong>
<ul>
<li>Description: Cancer-Related Analysis of Variants Toolkit; Web application; Uses CHASM, VEST, SNVGet; &ldquo;CRAVAT provides predictive scores for germline variants, somatic mutations and relative gene importance, as well as annotations from published literature and databases&rdquo; Latest Release May 2015;</li>
<li>Input: VCF, CRAVAT format</li>
<li>Output: CRAVAT report- MS Excel spreadsheet or tab-separated file (emailed)</li>
</ul>
</li>
<li><strong><a href="http://cupsat.tu-bs.de/">CUPSAT</a></strong>
<ul>
<li>Description: Cologne University Protein Stability Analysis Tool; &ldquo;tool to predict changes in protein stability upon point mutations&rdquo;; web service program; Can predict mutant stability from existing PDB structures or custom protein structures</li>
<li>Input: for PDB- provide PDB ID and Amino Acid Residue Number; for custom- PDB file format</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="https://cbcl.ics.uci.edu/public_data/DANN/">DANN</a></strong>
<ul>
<li>Description: Deleterious Annotation of genetic variants; standalone program, uses &ldquo;the same feature set and training data as CADD to train a deep neural network&rdquo;; can catch nonlinear relationships; &ldquo;There are four different datasets: training, validation, testing, and ClinVar_ESP...The ClinVar_ESP dataset is also a testing set containing a set of &ldquo;gold standard&rdquo; pathogenic and benign variants&rdquo;</li>
<li>Input:</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://rulai.cshl.edu/cgi-bin/tools/ESE3/esefinder.cgi?process=matrices">ESEfinder</a></strong>
<ul>
<li>Description: Exonic Splicing Enhancer; useful for interpretation of point mutations/polymorphisms that are disease-associated; GUI interface; web app program</li>
<li>Input: FASTA</li>
<li>Output: html or plain text format, graphical display of results</li>
</ul>
</li>
<li><strong><a href="http://www.sanger.ac.uk/resources/software/exomiser/">Exomiser</a></strong>
<ul>
<li>Description: Wellcome Trust Sanger; functionally annotates variants from whole-exome sequencing data; Based on Jannovar and uses UCSC KnownGene; Java program; web app program (Page last modified Feb 2015)</li>
<li>Input: VCF</li>
<li>Output: TSV, VCF</li>
</ul>
</li>
<li><strong><a href="https://sites.google.com/site/famannotation/home">FamAn</a></strong>
<ul>
<li>Description: Automated variant annotation pipeline for family-based sequencing studies; Annotates SNVs and INDELs; 4 models- autosomal dominant, autosomal recessive, de novo mutations and a general model; &ldquo;A variety of annotations are provided for each segregating variant: number of family (and family ID) each variant hits, variant genomic location and coding effect (based on snpEff), loss-of-function mutation annotation, selected ENCODE annotation, allele frequency in the 1000 Genomes Project, allele frequency in the Exome Variant Server (ESP6500), segmental duplication annotation, SIFT, PolyPhen2, LRT, MutationTaster, GERP++, PhyloP, SiPhy, etc.&rdquo; (Last updated May 2014)</li>
<li>Input: VCF</li>
<li>Output: two excel compatible outputs</li>
</ul>
</li>
<li><strong><a href="http://www.gene-talk.de/">GeneTalk</a></strong>
<ul>
<li>Description: Combines a tool for filtering and data analysis with an online network for genetic professionals; Different tiers: basic license, premium license, in-house solution (the latter are paid for, so possibly a commercial tool)</li>
<li>Input: VCF</li>
<li>Output: GeneTalk Annotation- includes clinical data, medical relevance, scientific relevance (<a href="http://www.gene-talk.de/public/GeneTalk_Whitepaper_Annotations.pdf">http://www.gene-talk.de/public/GeneTalk_Whitepaper_Annotations.pdf</a>)</li>
</ul>
</li>
<li><strong><a href="http://genevetter.kidneyomics.org/">GeneVetter</a></strong>
<ul>
<li>Description: &ldquo;GeneVetter is a tool designed for investigation of the background prevalence of exonic variation in the Phase 3 1000 Genomes data under user defined filtering criteria&rdquo;; web app program; GeneVetter uses GRCh37p4 (hs37d5.fa.gz), dbSNP build 138, 1000G Phase 3, clinvar_2014072</li>
<li>Input: VCF</li>
<li>Output: TIMS score, summary table, PCA plot</li>
</ul>
</li>
<li><strong><a href="http://www.broadinstitute.org/software/cprg/?q=node/31">GISTIC</a></strong>
<ul>
<li>Description: (Broad Institute) Last update- July 2014; Identifies genomic regions that are significantly &ldquo;amplified or deleted&rdquo;; Each is given a G score; gives genomic locations and q-values from aberrant regions</li>
<li>Input: segmentation file -seg, markers file -mk (required); -array file list -alf, CNV file -cnv</li>
<li>Reference genome: -refgene (created in MATLAB; GISTIC provides four reference genomes: hg16.mat, hg17.mat, hg18.mat, hg19.mat)</li>
<li>Output: All lesions file (text file), amplifications file (text file), deletion genes file (text file), GISTIC scores file, Segmented copy number (pdf file), amplification score GISTIC plot (pdf file), Deletion score/q-value GISTIC plot (pdf file)</li>
</ul>
</li>
<li><strong><a href="http://www.cmbi.ru.nl/hope/about">HOPE</a></strong>
<ul>
<li>Description: Have yOur Protein Explained; Web app program; Automatic mutant analysis server that provides structural effects of a mutation; Uses BLAST against UniProt and PDB along with homology modeling</li>
<li>Input: FASTA protein sequence, or accession code of protein of interest</li>
<li>Output: a report containing information from a &ldquo;decision tree&rdquo; and illustrated figures and animations</li>
</ul>
</li>
<li><strong><a href="http://umd.be/HSF/">Human Splicing Finder</a></strong>
<ul>
<li>Description: Last update: May 2013; aimed to help study pre-mRNA splicing; combines 12 algorithms to identify mutations&rsquo; effect on splicing motifs; uses Ensembl database 70</li>
<li>Input: Gene Name, Ensembl transcript ID, Ensembl Gene ID, Consensus CDS, RefSeq Peptide ID, or own sequence (looks like you can enter FASTA)</li>
<li>Output: Chart with columns for predicted signal, predicted algorithm, cDNA position and interpretation</li>
</ul>
</li>
<li><strong><a href="http://larva.gersteinlab.org/">LARVA</a></strong>
<ul>
<li>Description: Large-scale Analysis of Variants in noncoding Annotations; New version released July 2015; Command-line program; used for studying noncoding variants; integrates comprehensive set of noncoding elements, modeling their mutation count; Dependent on C++ and BEDtools</li>
<li>Input: multiple</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://www.jurgott.org/linkage/LinkagePC.html">LINKAGE</a></strong>
<ul>
<li>Description: Three main programs: mlink (calculates lod scores at fixed values for the recombination fraction in one interval of a genetic map), linkmap (calculates location scores for positions of a disease locus along a marker), and ilink (estimates parameters including recombination fractions, allele frequencies, penetrances, etc.)</li>
<li>Input: pedfile (processed by MAKEPED) and datafile (reflects loci for each individual; set in PREPLINK)</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://sourceforge.net/projects/mnvannotationcorrector/">MAC</a></strong>
<ul>
<li>Description: MNV Annotation Corrector; Ad hoc software, fixes incorrect amino acid predictions that are caused by multiple nucleotide variations; Uses existing annotators ANNOVAR, SnpEff, VEP (last update April 2015) (only 1 download this week &rarr; not popular)</li>
<li>Input: List of called SNVs and corresponding BAM</li>
<li>Output: Report identifying block of mutation within codon (BMCs)</li>
</ul>
</li>
<li><strong><a href="http://genome.igib.res.in/mitomatic/">mit-o-matic</a></strong>
<ul>
<li>Description: Focuses on mtDNA; provides clinically relevant information from different resources; two-component pipeline: a command-line tool for alignment of NGS reads and an online version that provides a genetic report on mitochondrial variants</li>
<li>Input: FASTQ, pileup</li>
<li>Reference sequence: rCRSm</li>
<li>Output: Online version gives comprehensive genetic report</li>
</ul>
</li>
<li><strong><a href="http://krauthammerlab.med.yale.edu/mutadelic/index.html">Mutadelic</a></strong>
<ul>
<li>Description: Web App program; &ldquo;This application generates reports on inherited mutations in five genes (ANK1, SLC4A1, SPTA1, SPTB and EPB42) associated with the following rare Mendelian blood disorders: Hereditary Spherocytosis (HS), Hereditary Elliptocytosis (HE) and Hereditary Pyropoikilocytosis&rdquo;; Newer program- recently validated on omictools</li>
<li>Input: Can upload coordinates of DNA variants or VEP</li>
<li>Output: Displayed on web or can be downloaded in Excel or RDF format</li>
</ul>
</li>
<li><strong><a href="http://www.mutationtaster.org/">MutationTaster</a></strong>
<ul>
<li>Description: (Last post on site 2014) Web app program; Rapid evaluation of disease causing alterations; uses NCBI 37 and Ensembl 69</li>
<li>Input: HGNC symbol, NCBI GeneID, or Ensembl ID</li>
<li>Output: Report containing prediction, summary, name of alteration, etc</li>
</ul>
</li>
<li><strong><a href="http://mutpred.mutdb.org/">MutPred</a></strong>
<ul>
<li>Description: Web app tool; Classifies amino acid substitutions as disease-associated or neutral in humans; Last modified Feb. 2014; Based on SIFT, trained using Human Gene Mutation Database</li>
<li>Input:</li>
<li>Output: &ldquo;The output of MutPred contains a general score (g), i.e., the probability that the amino acid substitution is deleterious/disease-associated, and top 5 property scores (p), where p is the P-value that certain structural and functional properties are impacted.&rdquo;</li>
</ul>
</li>
<li><strong><a href="http://www.broadinstitute.org/cancer/cga/mutsig">MutSigCV</a></strong>
<ul>
<li>Description: (Broad Institute) Mutation Significance (CV= covariates); Analyzes mutations discovered in DNA sequencing to identify genes that were mutated more often than expected</li>
<li>Input: mutations.maf, coverage.txt, covariates.txt</li>
<li>Output: output.txt</li>
</ul>
</li>
<li><strong><a href="http://stothard.afns.ualberta.ca/downloads/NGS-SNP/">NGS-SNP</a></strong>
<ul>
<li>Description: Collection of command-line scripts for providing rich SNP annotations; &ldquo;NCBI, Ensembl, and Uniprot IDs are provided for genes, transcripts and proteins when applicable&rdquo;;</li>
<li>Input: Samtools consensus pileup, Maq, diBayes, Genetic format, VCF</li>
<li>Output: File containing annotated SNPs is copied from SNP list and some classes are added</li>
</ul>
</li>
<li><strong><a href="http://www.broadinstitute.org/oncotator">Oncotator</a></strong>
<ul>
<li>Description: (Broad Institute) &ldquo;Tool for annotating human genomic point mutations and data relevant to cancer researchers&rdquo;; Web app; Supports annotation of data from ClinVar, dbSNP, 1000 Genomes (plus many other external sites); Only GRCh37 coordinates supported; Last update: April 2015</li>
<li>Input: tab-delimited file</li>
<li>Output: tab-delimited MAF</li>
</ul>
</li>
<li><strong><a href="http://omictools.com/panther-s649.html">PANTHER</a></strong>
<ul>
<li>Description: Protein ANalysis THrough Evolutionary Relationships; Web app program, also has its own database; Classification system used to classify proteins and their genes; Also, &ldquo;Estimates the likelihood of a particular nonsynonymous (amino-acid changing) coding SNP to cause a functional impact on the protein&rdquo;; Updated in 2015</li>
<li>Input: Data from PANTHER, IDs from Ensembl, EntrezGene, NCBI GI numbers, NCBI UniGene IDs HUGO, UniProt; if ID type is not one of the above, can input txt file or excel format</li>
<li>Output: Analysis results displayed online</li>
</ul>
</li>
<li><strong><a href="http://cubio.biology.columbia.edu/pesx/pesx/">PESX</a></strong>
<ul>
<li>Description: Putative Exonic Splicing Enhancers/Silencers; (Can&rsquo;t tell if this is outdated or not)</li>
<li>Input: FASTA or plain text</li>
<li>Output: Excel spreadsheet</li>
</ul>
</li>
<li><strong><a href="http://phen-gen.org/index.html">Phen-Gen</a></strong>
<ul>
<li>Description: Combines patients&rsquo; disease symptoms with sequencing data; Standalone or Web app version; Only accepts 1 family per run; to evaluate unrelated individuals, each sample needs to be run individually</li>
<li>Input: Variant- VCF; Phenotype- HPO; Pedigree- PED</li>
<li>Output: Combined scores file, variants for top genes file</li>
</ul>
</li>
<li><strong><a href="http://mmb.pcb.ub.es/PMut/">PMUT</a></strong>
<ul>
<li>Description: Aimed at annotation and prediction of pathological mutations; based on different kinds of sequence info and neural networks to process information</li>
<li>Input: FASTA</li>
<li>Output: Simple yes/no and reliability index</li>
</ul>
</li>
<li><strong><a href="http://provean.jcvi.org/index.php">PROVEAN</a></strong>
<ul>
<li>Description: Protein Variation Effect Analyzer; predicts whether an amino acid substitution or indel has impact on biological function of the protein; &ldquo;comparable to SIFT or Polyphen-2&rdquo;; Standalone, Web app, Command line or GUI; Last update May 2014</li>
<li>Input: FASTA, list of variants</li>
<li>Output: tab-separated columns including Variant, PROVEAN score and prediction</li>
</ul>
</li>
<li><strong><a href="http://genes.mit.edu/burgelab/rescue-ese/">Rescue-ESE</a></strong>
<ul>
<li>Description: &ldquo;An online tool for identifying candidate ESEs in vertebrate exons&rdquo;; Web application; For human, mouse, zebrafish, pufferfish</li>
<li>Input: multi-FASTA or plain text</li>
<li>Output:</li>
</ul>
</li>
<li><strong><a href="http://scandb.org/newinterface/index_v1.html">SCAN</a></strong>
<ul>
<li>Description: Web application program, includes a database as well; Database contains physical-based SNP annotations and functional annotations; &ldquo;Information on physical, functional, and LD annotation served on the SCAN database comes directly from public resources, including the HapMap (release 23a), NCBI (dbSNP 129), or is information created by us using data downloaded from these public resources&rdquo;; &ldquo;SCAN can be utilized in several ways including: (i) queries of the SNP and gene databases; (ii) analysis using the attached tools and algorithms; (iii) downloading files with SNP annotation for various GWA platforms&rdquo;</li>
<li>Input:</li>
<li>Output: HTML, comma-delimited, tab-delimited</li>
</ul>
</li>
<li><strong><a href="http://snp.gs.washington.edu/SeattleSeqAnnotation137/">SeattleSeq Annotation</a></strong>
<ul>
<li>Description: &ldquo;SeattleSeqAnnotation137 was most recently updated October 13, 2013. The current version is 8.08. The most recent site, based on dbSNP build 141, and hg38/NCBI 38&rdquo;; Provides annotations for SNVs and Indels- includes dbSNP rsID, gene names and accession numbers, variation functions, protein positions and amino acid changes, conservation scores, HapMap frequencies, PolyPhen predictions and clinical association.</li>
<li>Input: Maq, gff, CASAVA, VCF, GATK bed, custom</li>
<li>Output: &ldquo;default output file format is a header line (starting with "#") followed by tab-separated annotations&rdquo;; VCF</li>
</ul>
</li>
<li><strong><a href="https://cran.r-project.org/web/packages/seqminer/">seqminer 3.7</a></strong>
<ul>
<li>Description: &ldquo;Efficiently Read Sequence Data (VCF Format, BCF Format and METAL Format) into R&rdquo;; Command line package program; Published August 2015</li>
<li>Input: VCF, BCF</li>
<li>Output: VCF</li>
</ul>
</li>
<li><strong><a href="https://genomics.scripps.edu/ADVISER/Home.jsp">SG Adviser</a></strong>
<ul>
<li>Description: Scripps Genome Annotation and Distributed Variant Interpretation Server, web developed applications for variant annotation, &ldquo;Downstream applications of variant annotation include: Clinical sequencing applications including: carrier testing, or identification of causal variants in molecular diagnosis, tumor sequencing, or diagnostic odyssey. Prioritization of variants prior to statistical analysis of sequence based disease association studies, especially for automated set-generation and enrichment of likely functional variants within sets. Identification of causal variants in post-GWAS/linkage sequencing studies. Identification of causal variants in forward genetic screens (stay tuned for non-human annotation)&rdquo;</li>
<li>Input: SNV- VCF, BED, and a few others; CNV- BED, CNVnator, plus others</li>
<li>Output: tab-delimited file</li>
</ul>
</li>
<li><strong><a href="https://rostlab.org/services/snap/">SNAP-2</a></strong>
<ul>
<li>Descriptio</li></ul></li></ol>]]></description>
	<dc:creator>Jitendra Prajapati</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27104/gatb-genome-analysis-toolbox-with-de-bruijn-graph</guid>
	<pubDate>Thu, 28 Apr 2016 11:16:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27104/gatb-genome-analysis-toolbox-with-de-bruijn-graph</link>
	<title><![CDATA[GATB : Genome Analysis Toolbox with de-Bruijn graph]]></title>
	<description><![CDATA[<p>The&nbsp;<strong>Genome Analysis Toolbox with de-Bruijn graph (GATB)</strong> provides a set of <a href="https://gatb.inria.fr/gatb-global-architecture/">highly efficient algorithms to analyse NGS data sets</a>. These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large volumes of read data from any kind of organism, such as bacteria, plants, animals and even complex samples (<em>e.g.</em> metagenomes).</p>
<p>More at https://gatb.inria.fr/</p><p>Address of the bookmark: <a href="https://gatb.inria.fr/" rel="nofollow">https://gatb.inria.fr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27959/darkhorse</guid>
	<pubDate>Wed, 22 Jun 2016 05:37:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27959/darkhorse</link>
	<title><![CDATA[DarkHorse]]></title>
	<description><![CDATA[<p><em>DarkHorse</em>&nbsp;is a bioinformatic method for rapid, automated identification and ranking of phylogenetically atypical proteins on a genome-wide basis. It works by selecting potential ortholog matches from a reference database of amino acid sequences, then using these matches to calculate a lineage probability index (LPI) score for each genome protein.</p>
<p>LPI scores are inversely proportional to the phylogenetic distance between database match sequences and the query genome. These scores are useful not only for large-scale <em>de novo</em>&nbsp;predictions of horizontally transferred proteins, but can also serve as an independent quality control test for potential horizontal transfer candidates identified by alternative methods, especially those based on nucleic acid signatures. Candidates having high LPI scores are unlikely to have been horizontally transferred, since they are highly conserved among closely related organisms.</p>
<p>One unique and powerful feature of the DarkHorse HGT Candidate database is the opportunity to explore the phylogenetic background of potential HGT donors as well as recipients. The breadth of the database allows not only query sequences, but also their database match partners to be evaluated for sequence similarity or novelty compared to taxonomically related organisms.</p>
<p><em>DarkHorse</em>&nbsp;is configurable for varying degrees of phylogenetic granularity and protein sequence conservation. Users should consult the&nbsp;<a href="http://darkhorse.ucsd.edu/#references">references</a>&nbsp;cited below for a complete explanation of parameter selection and result interpretation. A brief&nbsp;<a href="http://darkhorse.ucsd.edu/tutorial.html">tutorial</a>&nbsp;page is also available on-line.</p><p>Address of the bookmark: <a href="http://darkhorse.ucsd.edu/download.html" rel="nofollow">http://darkhorse.ucsd.edu/download.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>