<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36395?offset=0</link>
	<atom:link href="https://bioinformaticsonline.com/related/36395?offset=0" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36398/tools-for-protein-protein-docking</guid>
	<pubDate>Wed, 25 Apr 2018 05:15:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36398/tools-for-protein-protein-docking</link>
	<title><![CDATA[Tools for Protein-Protein Docking !]]></title>
	<description><![CDATA[<p>Predicting the structure of protein&ndash;protein complexes by docking is a difficult problem whose major challenges include identifying correct solutions and properly handling molecular flexibility and conformational changes. The following tools predict&nbsp;<span>the structure of protein&ndash;protein complexes:</span></p><p><a href="http://www.sbg.bio.ic.ac.uk/docking/index.html" target="_blank">3D-Dock Suite</a></p><p>Global rigid search: FFT</p><p>Shape complementarity and electrostatics</p><p>Re-scoring and clustering. Refinement of interface side-chains</p><p><a href="http://www.sbg.bio.ic.ac.uk/~3dgarden/" target="_blank">3D-Garden</a></p><p>Global rigid search in ensemble</p><p>Shape complementarity and Lennard&ndash;Jones potential</p><p>Side chain and backbone dihedral refinement</p><p><a href="http://www.sdsc.edu/CCMS/DOT/" target="_blank">DOT</a></p><p>Global rigid search: FFT</p><p>Shape complementarity, electrostatics and VDW</p><p>Refinement: none</p><p><a href="http://users.unimi.it/~ddl/escherng/index.htm" target="_blank">Escher NG</a></p><p>Global rigid search</p><p>Shape complementarity, hydrogen bonds and electrostatics</p><p>Integrated in&nbsp;<a href="http://users.unimi.it/~ddl/vega/download.htm" target="_blank">VEGA</a></p><p><a href="http://vakser.bioinformatics.ku.edu/resources/gramm/gramm1" target="_blank">GRAMM</a>&nbsp;</p><p>Global rigid search: FFT. Smooth protein surface representation for soft docking</p><p>Shape complementarity and Lennard-Jones potential</p><p>Clustering of conformations</p><p><a href="http://vakser.bioinformatics.ku.edu/resources/gramm/grammx/" target="_blank">GRAMM-X</a>&nbsp;</p><p>Global rigid search: FFT. Smooth protein surface representation for soft docking</p><p>Shape complementarity and Lennard-Jones potential</p><p>Minimization and re-scoring with multiple filters</p>
<p><a href="http://www.loria.fr/~ritchied/hex_server/" target="_blank">HEX</a></p><p>Global rigid search: Fourier correlation of spherical harmonics</p><p>Shape complementarity</p><p><a href="http://haddock.chem.uu.nl/Haddock/haddock.php" target="_blank">HADDOCK</a></p><p>Global rigid search</p><p>Electrostatic, VDW and desolvation energy terms</p><p>MD simulated annealing refinement. Filtering based on external data.</p><p><a href="http://www.molsoft.com/docking.html">ICM</a></p><p>Global rigid search: Monte Carlo</p><p>Empirical scoring function</p><p>Clustering and selection of conformations. Refinement of interface side-chains and re-scoring</p><p><a href="http://www.weizmann.ac.il/Chemical_Research_Support/molfit/" target="_blank">MolFit</a></p><p>Global rigid search: FFT</p><p>Shape complementarity</p><p>Clustering of good solutions, filtering using&nbsp;<em>a priori&nbsp;</em>information and small, local rigid rotations around selected conformations</p><p><a href="http://bioinfo3d.cs.tau.ac.il/PatchDock/" target="_blank">PatchDock</a></p><p>Global rigid search</p><p>Shape complementarity and atomic desolvation energy</p><p>Clustering of conformations</p><p><a href="http://inb.bsc.es/gn6/PyDock" target="_blank">PyDock</a></p><p>Global rigid search: FFT</p><p>Shape complementarity</p><p>Re-scoring by binding electrostatics and desolvation energy</p><p><a href="http://rosettadock.graylab.jhu.edu/" target="_blank">RosettaDock</a></p><p>Local rigid search: Monte Carlo with low and high resolution structure representation levels</p><p>Different scoring parameters for the different resolutions&nbsp;</p>
<p><a href="http://zlab.bu.edu/zdock/" target="_blank">ZDOCK</a></p><p>Global rigid search: FFT</p><p>Shape complementarity, desolvation energy, and electrostatics.</p><p>Energy minimization and re-scoring</p><p>Free for academics</p><p>&nbsp;</p><p>Point to note:</p><p>The proper treatment of flexibility in protein&ndash;protein docking is still an active field of research. You should first analyze your proteins to define their conformational space, then choose the most suitable method for your docking problem.</p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36384/binding-site-prediction-in-protein</guid>
	<pubDate>Wed, 25 Apr 2018 04:35:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36384/binding-site-prediction-in-protein</link>
	<title><![CDATA[Binding Site Prediction in Protein !]]></title>
	<description><![CDATA[<p><span>The interaction between proteins and other molecules is fundamental to all biological functions. This section lists tools that help predict interaction sites on the protein surface and tools that predict the structure of the intermolecular complex formed between two or more molecules (docking).</span></p><h4>Pocket Identification</h4><p><a href="http://sts.bioengr.uic.edu/castp/" target="_blank">CASTp</a></p><div style="text-align: justify;">Automatic identification of pockets and cavities in protein structures, and quantitation of their volumes using Delaunay triangulation. Also available as a PyMOL plugin.</div><p><a href="http://www.bioinformatics.leeds.ac.uk/pocketfinder/" target="_blank">Pocket-Finder</a></p><div style="text-align: justify;">Automatic identification of pockets and cavities in protein structures, and quantitation of their volumes.</div><p><a href="http://gecco.org.chemie.uni-frankfurt.de/pocketpicker/index.html" target="_blank">PocketPicker</a></p><div style="text-align: justify;">Grid-based technique for the analysis of protein pockets. PocketPicker is available as a plugin for&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/pymol.htm">PyMOL</a>.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><h4>Binding Site Prediction</h4>
<p><a href="http://consurf.tau.ac.il/" target="_blank">ConSurf</a></p>
</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Identification of functional regions in proteins by surface-mapping of phylogenetic information</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://www-cryst.bioc.cam.ac.uk/~crescendo/crescendo.php" target="_blank">CRESCENDO</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Identification of protein interaction sites. It uses sequence conservation patterns in homologous proteins to distinguish residues that are conserved due to structural restraints from those conserved due to functional restraints.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><strong>Ligand Binding Sites</strong></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://www.sbg.bio.ic.ac.uk/~3dligandsite/" target="_blank">3DLigandSite</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">The server uses protein-structure prediction to provide structural models of the binding site. Ligands bound to structures are superimposed onto the model and used to predict the binding site.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://cssb.biology.gatech.edu/skolnick/files/FINDSITE/" target="_blank">FINDSITE</a></div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">A threading-based method for ligand-binding site prediction and functional annotation based on binding-site similarity across superimposed groups of threading templates.</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">
<p><a href="http://scoppi.biotec.tu-dresden.de/pocket/" target="_blank">LIGSITE<sup>csc</sup></a></p>
<div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;">Prediction of binding site by pocket identification using the Connolly surface and degree of conservation</div>
</div><div style="text-align: justify;">&nbsp;</div><div style="text-align: justify;"><a href="http://metapocket.eml.org/" target="_blank">metaPocket</a><br />A meta server for ligand-binding site prediction. metaPocket uses&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#ligsite">LIGSITE<sup>csc</sup></a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#pass">PASS</a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#qsite">Q-SiteFinder</a>&nbsp;and&nbsp;<a href="http://www.biochem.ucl.ac.uk/~roman/surfnet/surfnet.html" target="_blank">SURFNET</a>.</div>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42974/list-of-bioinformatics-packages-for-ngs-analysis</guid>
	<pubDate>Sat, 20 Mar 2021 00:28:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42974/list-of-bioinformatics-packages-for-ngs-analysis</link>
	<title><![CDATA[List of bioinformatics packages for NGS analysis !]]></title>
	<description><![CDATA[<p>Package suites gather software packages and installation tools for specific languages or platforms. The following suites target bioinformatics software.</p><ul>
<li><a href="https://github.com/Bioconductor">Bioconductor</a>&nbsp;&ndash; A plethora of tools for analysis and comprehension of high-throughput genomic data, including 1500+ software packages. [&nbsp;<a href="https://link.springer.com/article/10.1186/gb-2004-5-10-r80">paper-2004</a>&nbsp;|&nbsp;<a href="https://www.bioconductor.org/">web</a>&nbsp;]</li>
<li><a href="https://github.com/biopython/biopython">Biopython</a>&nbsp;&ndash; Freely available tools for biological computing in Python, with included cookbook, packaging and thorough documentation. Part of the&nbsp;<a href="http://open-bio.org/">Open Bioinformatics Foundation</a>. Contains the very useful&nbsp;<a href="https://biopython.org/DIST/docs/api/Bio.Entrez-module.html">Entrez</a>&nbsp;package for API access to the NCBI databases. [&nbsp;<a href="https://pubmed.ncbi.nlm.nih.gov/19304878">paper-2009</a>&nbsp;|&nbsp;<a href="https://biopython.org/">web</a>&nbsp;]</li>
<li><a href="https://github.com/bioconda">Bioconda</a>&nbsp;&ndash; A channel for the&nbsp;<a href="http://conda.pydata.org/docs/intro.html">conda package manager</a>&nbsp;specializing in bioinformatics software. Includes a repository with 3000+ ready-to-install (with&nbsp;<code>conda install</code>) bioinformatics packages. [&nbsp;<a href="https://pubmed.ncbi.nlm.nih.gov/29967506">paper-2018</a>&nbsp;|&nbsp;<a href="https://bioconda.github.io/">web</a>&nbsp;]</li>
<li><a href="https://github.com/BioJulia">BioJulia</a>&nbsp;&ndash; Bioinformatics and computational biology infrastructure for the Julia programming language. [&nbsp;<a href="https://biojulia.net/">web</a>&nbsp;]</li>
<li><a href="https://github.com/rust-bio/rust-bio">Rust-Bio</a>&nbsp;&ndash; Rust implementations of algorithms and data structures useful for bioinformatics. [&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/early/2015/10/06/bioinformatics.btv573.short?rss=1">paper-2016</a>&nbsp;]</li>
<li><a href="https://github.com/seqan/seqan3">SeqAn</a>&nbsp;&ndash; The modern C++ library for sequence analysis.</li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/8265/list-of-generic-simulation-softwaretoolsresource-with-brief-description-and-homepage</guid>
	<pubDate>Mon, 10 Feb 2014 05:57:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/8265/list-of-generic-simulation-softwaretoolsresource-with-brief-description-and-homepage</link>
	<title><![CDATA[List of generic simulation software/tools/resource with brief description and homepage !!!]]></title>
	<description><![CDATA[<p>List of generic simulation software/tools/resource with brief description and homepage</p><p><img src="http://www.evolution-of-life.com/fileadmin/images/carousel/genetic.PNG" alt="image" style="border: 0px;"></p><p>ALF <br />A Simulation Framework for Genome Evolution <br />http://www.cbrg.ethz.ch/alf<br /><br />Bayesian Serial SimCoal <br />Bayesian Serial SimCoal, (BayeSSC) is a modification of SIMCOAL 1.0, a program written by Laurent Excoffier, John Novembre, and Stefan Schneider. <br />http://www.stanford.edu/group/hadlylab/ssc/index.html<br /><br />BEERS <br />BEERS was designed to benchmark RNA-Seq alignment algorithms and also algorithms that aim to reconstruct different isoforms and alternate splicing from RNA-Seq data <br />http://cbil.upenn.edu/beers/<br /><br />BOTTLENECK <br />Bottleneck is a program for detecting recent effective population size reductions from allele frequency data <br />http://www.ensam.inra.fr/urlb/bottleneck/bottleneck.html<br /><br />BottleSim <br />BottleSim is a computer simulation program for simulating the process of population bottlenecks <br />http://chkuo.name/software/bottlesim.html<br /><br />CASS <br />Protein Sequence Simulation <br />http://www.wyomingbioinformatics.org/liberlesgroup/cass/<br /><br />CDPOP <br />CDPOP is a landscape genetics tool for simulating the emergence of spatial genetic structure in populations resulting from specified landscape processes governing organism movement behavior. <br />http://cel.dbs.umt.edu/cdpop<br /><br />CoalFace <br />CoalFace is a simulation of the coalescent process with the visual display of gene genealogies. <br />http://web.up.ac.za/default.asp?ipkcategoryid=3283<br /><br />CoaSim <br />CoaSim is a tool for simulating the coalescent process with recombination and gene conversion under various demographic models. 
<br />http://users-birc.au.dk/mailund/coasim/index.html<br /><br />cosi <br />The cosi package is written in C and is available as a tar file. <br />http://www.broadinstitute.org/~sfs/cosi/<br /><br />CS-PSeq-Gen <br />A program to simulate the evolution of protein sequences under the constraints of the information of a particular reconstructed phylogeny <br />http://bioserv.rpbs.univ-paris-diderot.fr/software/cs-pseq-gen.html<br /><br />DAWG <br />An application designed to simulate the evolution of recombinant DNA sequences in continuous time <br />http://scit.us/projects/dawg<br /><br />Easypop <br />EASYPOP is an individual based model intended to simulate datasets under a very broad range of conditions <br />http://www.unil.ch/dee/page36926_fr.html<br /><br />EggLib <br />EggLib is a C++/Python library and program package for evolutionary genetics and genomics. <br />http://egglib.sourceforge.net/<br /><br />EvolSimulator <br />A simulation test bed for hypotheses of genome evolution <br />http://acb.qfab.org/acb/evolsim/<br /><br />EvolveAGene <br />A realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions <br />http://bellinghamresearchinstitute.com/software/index.html<br /><br />fastsimcoal <br />A continuous-time coalescent simulator of genomic diversity under arbitrarily complex evolutionary scenarios <br />http://cmpg.unibe.ch/software/fastsimcoal/<br /><br />FastSLINK <br />Simulation of Marker and Phenotype Data in Pedigrees <br />http://watson.hgen.pitt.edu/<br /><br />FFPopSim <br />C++/Python library for population genetics. <br />http://webdav.tuebingen.mpg.de/ffpopsim/<br /><br />FLUX SIMULATOR <br />The Flux Simulator aims at providing a deterministic in silico reproduction of the experimental pipelines for RNA-Seq, employing a minimal set of parameters. 
<br />http://flux.sammeth.net/simulator.html<br /><br />ForSim <br />ForSim: A Forward Evolutionary Computer Simulation <br />http://www.anthro.psu.edu/weiss_lab/research.shtml<br /><br />ForwSim <br />The program given below is based on the algorithm described in Padhukasahasram et al. 2008 to simulate genetic drift in a standard Wright-Fisher process. <br />http://badri-populationgeneticsimulators.blogspot.com/<br /><br />FPG <br />Forward Population Genetic simulation <br />http://genfaculty.rutgers.edu/hey/software#fpg<br /><br />FREGENE <br />FREGENE is a C++ program that simulates sequence-like data over large genomic regions in large diploid populations. <br />http://www.ebi.ac.uk/projects/bargen/download/fregen/documentation_html.html<br /><br />GAMETES <br />Genetic Architecture Model Emulator for Testing and Evaluating Software: Simulates complex SNP models with pure, strict epistatic interactions with n-loci. <br />http://sourceforge.net/projects/gametes/?source=navbar<br /><br />GASP <br />Genometric Analysis Simulation Program. A software tool for testing and investigating methods in statistical genetics by generating samples of family data based on user specified models. <br />http://research.nhgri.nih.gov/gasp/<br /><br />GemSIM <br />Next generation sequencing read simulator <br />http://sourceforge.net/projects/gemsim/<br /><br />GeneArtisan <br />Simulation of Markers in Case-Control Study Designs <br />http://www.rannala.org/?page_id=241<br /><br />GENOME <br />A rapid coalescent-based whole genome simulator <br />http://www.sph.umich.edu/csg/liang/genome/<br /><br />GenomePop2 <br />GenomePop2 is a specialization of the program GenomePop just to manage SNPs under more flexible and useful settings. If you need models with more than 2 alleles please use the GenomePop program version. 
<br />http://webs.uvigo.es/acraaj/genomepop2.htm<br /><br />GenomeSimla <br />GenomeSIMLA is currently under development- however, we have a beta release that we are asking to be tested <br />http://chgr.mc.vanderbilt.edu/genomesimla/<br /><br />GENS2 <br />Simulates interactions among two genetic and one environmental factor and also allows for epistatic interactions. <br />https://sourceforge.net/projects/gensim/<br /><br />GWAsimulator <br />A rapid whole genome simulation program <br />http://biostat.mc.vanderbilt.edu/wiki/main/gwasimulator<br /><br />HAP-SAMPLE <br />An association simulator for candidate regions or genome scans <br />http://www.hapsample.org/<br /><br />HAPGEN <br />A simulator for the simulation of case control datasets at SNP markers <br />https://mathgen.stats.ox.ac.uk/genetics_software/hapgen/hapgen2.html<br /><br />HapSim <br />A simulation tool for generating haplotype data with pre-specified allele frequencies and LD coefficients <br />http://cran.r-project.org/web/packages/hapsim/index.html<br /><br />HAPSIMU <br />A program that simulates heterogeneous populations with various known and controllable structures under the continuous migration model or the discrete model <br />http://l.web.umkc.edu/liujian/<br /><br />IBDsim <br />IBDSim is a computer package for the simulation of genotypic data under general isolation by distance models. 
<br />http://raphael.leblois.free.fr/<br /><br />indel-Seq-Gen <br />A biological sequence simulation program that simulates highly divergent DNA sequences and protein superfamilies <br />http://bioinfolab.unl.edu/~cstrope/isg/<br /><br />Indelible <br />A powerful and flexible simulator of biological evolution <br />http://abacus.gene.ucl.ac.uk/software/indelible/<br /><br />invertFREGENE <br />InvertFREGENE is a forward-in-time simulator of inversions in population genetic data <br />http://www.ebi.ac.uk/projects/bargen/<br /><br />kernelPop <br />A spatially explicit population genetic simulation engine <br />http://cran.r-project.org/src/contrib/archive/kernelpop/<br /><br />MaCS <br />Markovian Coalescent Simulator <br />http://www-hsc.usc.edu/~garykche/<br /><br />Mason <br />A package for the simulation of nucleotide data. <br />http://www.seqan.de/projects/mason/<br /><br />mbs <br />A modification of Hudson's ms software that generates samples of DNA sequences with a biallelic site under selection <br />http://www.sendou.soken.ac.jp/esb/innan/innanlab/software.html<br /><br />Mendel's Accountant <br />Mendel's Accountant (MENDEL) is an advanced numerical simulation program for modeling genetic change over time and was developed collaboratively by Sanford, Baumgardner, Brewer, Gibson and ReMine <br />http://mendelsaccount.sourceforge.net/<br /><br />MetaSim <br />A tool to generate collections of synthetic reads that reflect the diverse taxonomical composition of typical metagenome data sets <br />http://ab.inf.uni-tuebingen.de/software/metasim/<br /><br />mlcoalsim <br />Multilocus Coalescent Simulations <br />http://code.google.com/p/mlcoalsim-v1/<br /><br />ms <br />The purpose of this program is to allow one to investigate the statistical properties of such samples, to evaluate estimators or statistical tests, and generally to aid in the interpretation of polymorphism data sets. 
<br />http://home.uchicago.edu/~rhudson1/source/mksamples.html<br /><br />msHOT <br />The purpose of this program is to allow one to investigate the statistical properties of such samples, to evaluate estimators or statistical tests, and generally to aid in the interpretation of polymorphism data sets. <br />http://home.uchicago.edu/~rhudson1/<br /><br />msms <br />A coalescent simulation tool with selection. <br />http://www.mabs.at/ewing/msms/index.shtml<br /><br />MySSP <br />A program for the simulation of DNA sequence evolution across a phylogenetic tree <br />http://www.rosenberglab.net/software.php<br /><br />Nemo <br />A forward-time, individual-based, genetically explicit, and stochastic simulation program designed to study the evolution of genetic markers, life history traits, and phenotypic traits in a flexible (meta-)population framework. <br />http://nemo2.sourceforge.net/<br /><br />NetRecodon <br />Coalescent simulation of coding DNA sequences with recombination (inter and intracodon), migration and demography <br />http://code.google.com/p/netrecodon/<br /><br />PEDAGOG <br />Software for simulating eco-evolutionary population dynamics <br />https://bcrc.bio.umass.edu/pedigreesoftware/node/5<br /><br />phenosim <br />A tool to add phenotypes to simulated genotypes <br />http://evoplant.uni-hohenheim.de/doku.php?id=software:software<br /><br />PhyloSim <br />An R package for the Monte Carlo simulation of sequence evolution <br />http://bit.ly/rlsim-git<br /><br />pIRS <br />Profile-based Illumina pair-end reads simulator <br />https://code.google.com/p/pirs/<br /><br />ProteinEvolver <br />Simulation of protein evolution along phylogenies under structure-based substitution models <br />http://code.google.com/p/proteinevolver/<br /><br />QMSim <br />QTL and Marker Simulator <br />http://www.aps.uoguelph.ca/~msargol/qmsim/<br /><br />quantiNEMO <br />An individual-based program for the analysis of quantitative traits with explicit genetic architecture 
potentially under selection in a structured population <br />http://www2.unil.ch/popgen/softwares/quantinemo/<br /><br />RECOAL <br />Simulates new haplotype data from a reference population of haplotypes. <br />ftp://popgen.usc.edu/<br /><br />Recodon <br />Coalescent simulation of coding DNA sequences with recombination, migration and demography <br />http://code.google.com/p/recodon/<br /><br />rlsim <br />A package for simulating RNA-seq library preparation with parameter estimation <br />http://bit.ly/rlsim-git<br /><br />Rmetasim <br />Rmetasim is a front-end for the metasim engine that is implemented as a package that runs in the statistical computing environment R <br />http://linum.cofc.edu/software.html#metasim<br /><br />RNA Seq Simulator <br />RSS takes SAM alignment files from RNA-Seq data and simulates over dispersed, multiple replica, differential, non-stranded RNA-Seq datasets. <br />http://useq.sourceforge.net/cmdlnmenus.html#rnaseqsimulator<br /><br />Rose <br />Random model of sequence evolution <br />http://bibiserv.techfak.uni-bielefeld.de/rose/<br /><br />SelSim <br />SelSim is a program for Monte Carlo simulation of DNA polymorphism data for a recombining region within which a single bi-allelic site has experienced natural selection <br />http://www.well.ox.ac.uk/~spencer/selsim/<br /><br />Seq-Gen <br />An application for the Monte Carlo simulation of molecular sequence evolution along phylogenetic trees. <br />http://tree.bio.ed.ac.uk/software/seqgen/<br /><br />SEQPower <br />Statistical power analysis for sequence-based association studies <br />http://bioinformatics.org/spower/<br /><br />SeqSIMLA <br />SeqSIMLA can simulate sequence data with user-specified disease and quantitative trait models. Family or unrelated case-control data can be simulated. 
<br />http://seqsimla.sourceforge.net/<br /><br />Serial NetEvolve <br />A flexible utility for generating serially-sampled sequences along a tree or recombinant network <br />http://biorg.cis.fiu.edu/sne/<br /><br />SFS_CODE <br />SFS_CODE can perform forward population genetic simulations under a general Wright-Fisher model with arbitrary migration, demographic, selective, and mutational effects. <br />http://sfscode.sourceforge.net/sfs_code/index/index.html<br /><br />SIBSIM <br />Quantitative phenotype simulation in extended pedigrees <br />http://sourceforge.net/projects/sibsim/<br /><br />SIMCOAL2 <br />A coalescent program for the simulation of complex recombination patterns over large genomic regions under various demographic models <br />http://cmpg.unibe.ch/software/simcoal2/<br /><br />SimCopy <br />An R package simulating the evolution of copy number profiles along a tree. <br />http://bit.ly/simcopy<br /><br />SIMLA <br />SIMLA is a SIMuLAtion program that generates data sets of families for use in Linkage and Association studies. <br />http://www.chg.duke.edu/research/simla.html<br /><br />SimPed <br />A Simulation Program to Generate Haplotype and Genotype Data for Pedigree Structures <br />http://www.hgsc.bcm.tmc.edu/content/simped<br /><br />Simprot <br />A program to simulate protein evolution by substitution, insertion and deletion <br />http://www.uhnresearch.ca/labs/tillier/software.htm#3<br /><br />SimRare <br />Rare variant simulation and analysis tool <br />http://code.google.com/p/simrare/<br /><br />simuGWAS <br />A forward-time simulator that simulates realistic samples for genome-wide association studies. <br />http://simupop.sourceforge.net/cookbook/simucomplexdisease<br /><br />simuPOP <br />simuPOP is a general-purpose individual-based forward-time population genetics simulation environment. 
<br />http://simupop.sourceforge.net/<br /><br />SISSI <br />A software tool to generate data of related sequences along a given phylogeny, taking into account user defined system of neighbourhoods and instantaneous rate matrices. <br />http://www.cibiv.at/software/sissi/<br /><br />SNPsim <br />Coalescent simulation of hotspot recombination <br />http://code.google.com/p/phylosoftware/<br /><br />SPIP <br />SPIP simulates the transmission of genes from parents to offspring in a population having demographic structure defined by the user <br />http://swfsc.noaa.gov/textblock.aspx?division=fed&amp;id=3434<br /><br />Splatche <br />Spatial and Temporal Coalescences in Heterogeneous Environment <br />http://www.splatche.com/<br /><br />srv <br />Simulator of Rare Variants (srv) is a simulator for the introduction and evolution of (rare) genetic variants. <br />http://simupop.sourceforge.net/cookbook/simurarevariants<br /><br />SUP <br />SLINK/FastSLINK utility program <br />http://mlemire.freeshell.org/software.html<br /><br />TreesimJ <br />A flexible, forward-time population genetic simulator <br />http://code.google.com/p/treesimj/<br /><br />Vortex <br />VORTEX is an individual-based simulation model for population viability analysis (PVA). <br />http://www.vortex9.org/vortex.html<br /><br />References:</p><p>Image www.evolution-of-life.com</p><p>www.cancer.gov</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/13523/megadock-40</guid>
	<pubDate>Thu, 07 Aug 2014 18:08:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/13523/megadock-40</link>
	<title><![CDATA[MEGADOCK 4.0]]></title>
	<description><![CDATA[<p>An ultra&ndash;high-performance protein&ndash;protein docking software for heterogeneous supercomputers</p>
<p id="p-4"><strong>Summary:</strong> The application of protein&ndash;protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of over 97% strong scaling.</p>
<p id="p-5"><strong>Availability and Implementation:</strong> MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at: <a href="http://www.bi.cs.titech.ac.jp/megadock">http://www.bi.cs.titech.ac.jp/megadock</a>.</p>
<p id="p-6"><strong>Contact:</strong> <a href="mailto:akiyama@cs.titech.ac.jp">akiyama@cs.titech.ac.jp</a></p><p>Address of the bookmark: <a href="http://bioinformatics.oxfordjournals.org/content/early/2014/08/06/bioinformatics.btu532.short" rel="nofollow">http://bioinformatics.oxfordjournals.org/content/early/2014/08/06/bioinformatics.btu532.short</a></p>]]></description>
	<dc:creator>Suleman Khan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36392/protein-protein-interaction-sites-predictions</guid>
	<pubDate>Wed, 25 Apr 2018 04:53:20 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36392/protein-protein-interaction-sites-predictions</link>
	<title><![CDATA[Protein-Protein Interaction Sites Predictions !]]></title>
	<description><![CDATA[<p><span>The study of Protein&ndash;Protein Interactions (PPIs) has a crucial role in biology, medicine and the pharmaceutical industry. PPIs can be investigated from two aspects: the interaction partners of a specific protein and the amino acid residues participating in a given PPI. Information about a protein&rsquo;s interaction partners allows scientists to construct protein interaction networks, such as signaling pathways, which in turn facilitate the understanding of many biological and clinical observations.&nbsp;</span></p><p><span>The following tools are commonly used for PPI site prediction:</span></p><p>Protein-Protein Interaction Sites</p><p><a href="http://pipe.scs.fsu.edu/ppisp.html" target="_blank">PPISP</a></p><p>A consensus neural network method for predicting protein-protein interaction sites</p><p><a href="http://biunit.naist.jp/homcos/" target="_blank">HOMCOS</a></p><p>A server to predict interacting protein pairs and interacting sites by homology modeling of complex structures</p><p><a href="http://prism.ccbb.ku.edu.tr/hotpoint/" target="_blank">HotPOINT</a></p><p>Prediction of protein interfaces using an empirical model</p><p><a href="http://cubic.bioc.columbia.edu/services/isis/" target="_blank">ISIS</a></p><p>Prediction of interaction hotspots from sequence</p><p><a href="http://kfc.mitchell-lab.org/" target="_blank">KFC server</a></p><p>Automated decision-tree approach to predicting protein-protein interaction hot spots</p><p><a href="http://pipe.scs.fsu.edu/meta-ppisp.html" target="_blank">meta-PPISP</a></p><p>A meta server for predicting protein-protein interaction sites. 
meta-PPISP is built on three individual web servers:&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#cons">cons-PPISP</a>,&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#pin">PINUP</a>, and&nbsp;<a href="https://bip.weizmann.ac.il/toolbox/structure/binding.htm#pro">Promate</a></p><p><a href="http://www.molsoft.com/oda.html" target="_blank">ODA</a></p><p>Identification of optimal surface patches with the lowest docking desolvation energy values</p><p><a href="http://sparks.informatics.iupui.edu/PINUP/" target="_blank">PINUP</a></p><p>Protein binding site prediction with an empirical scoring function</p><p>Other Sites (DNA, RNA, Metals)</p><p><a href="http://ligin.weizmann.ac.il/~lpgerzon/mbs4/mbs.cgi" target="_blank">CHED</a>&nbsp;</p><p>Web server for predicting soft metal binding sites in proteins</p><p><a href="http://cssb.biology.gatech.edu/skolnick/webservice/DBD-Hunter/" target="_blank">DBD-Hunter</a></p><p>A knowledge-based method for the prediction of DNA-protein interactions</p><p><a href="http://pipe.scs.fsu.edu/displar.html" target="_blank">DISPLAR</a></p><p>Given the structure of a protein known to bind DNA, the method predicts the residues that contact DNA using a neural network method</p><p><a href="http://idbps.tau.ac.il/" target="_blank">iDBPs</a></p><p>Predicts DNA-binding proteins among proteins with known 3D structure.</p><p><a href="http://pfp.technion.ac.il/" target="_blank">PFplus</a></p><div style="text-align: left;">A tool for extracting and displaying positive electrostatic patches on protein surfaces, which can be indicative of nucleic acid binding interfaces.</div>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/1295/five-points-for-bioinformatics-softwaretools</guid>
	<pubDate>Mon, 05 Aug 2013 04:12:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/1295/five-points-for-bioinformatics-softwaretools</link>
	<title><![CDATA[Five points for bioinformatics software/tools]]></title>
	<description><![CDATA[<p><span>In bioinformatics we spend most of our time on the computational analysis of huge amounts of data, trying to make biological sense of it. But most newbie bioinformaticians face a dilemma when they receive biological sequence data for the first time: they are confused about whether to choose open source tools, user-friendly GUI programs, or commercial bioinformatics software. Don&rsquo;t be surprised; this is not an easy decision, because the analysis step is the most crucial part of a project and is believed to be the biggest bottleneck in publishing papers in high-impact journals. In this blog I would like to address the pros and cons of both kinds of software/tools and try to help you (hmm, or rather convince you) decide on your software selection.</span></p><p><span><img src="http://bioinformaticsonline.com/mod/photo/five.jpg" alt="image" style="border: 0px;"></span></p><p><span>The most common newbie questions are:</span><span></span></p><p><span>Should I try to use these free open source programs? &nbsp;Why are we not trying GUI software for computational analysis? Should I use commercial bioinformatics programs/software?</span><span><br /></span><span><br />1. Let&rsquo;s be open</span><span></span></p><p><span>We generally think that anything free or cheap is useless. But this idea does not apply to open source software. Most bioinformatics software is developed by highly competitive biological programmers who believe in the open sharing of knowledge. Many of them come under the Open Bioinformatics Foundation (O|B|F), a non-profit, volunteer-run organization focused on supporting open source programming in bioinformatics. The best part about open source tools/software is that you are free to download the source code and read exactly what the program does. If you are so inclined, you can view all of the parts of the program and see the logical flow of the pipeline. 
In addition, open source makes an excellent learning tool for any beginning bioinformatician. Moreover, you can modify existing open source programs to deal with cutting-edge problems or to customize your pipeline. Beyond your own computational and analysis work, most reviewers also prefer results based on open source tools, so that they can validate the results if validation is required.</span></p><p><span>2. Code headache</span><span></span></p><p><span>As a bioinformatician you are supposed to know the basics of programming, and if you are not good at it, then please learn it as soon as possible, because you are not a bio-analyst but a biological programmer. Open source programs usually lack dedicated service and support teams (often because they are the product of an overworked PhD student or postdoc!), so you are responsible for troubleshooting your own errors most of the time. We commonly receive HELP emails asking for support in setting up a pipeline; you can also find this kind of request on any Q&amp;A forum. I personally believe this coding horror is the biggest downside of open source programs: you need some programming skills in order to implement the program in your pipeline. But if you are not able to fix the pipeline and modify the open source code according to your requirements, then you should rethink your bioinformatician name tag!!!</span><span></span></p><p><span>3. Dive into the codes</span><span></span></p><p><span>Some biologists-turned-bioinformaticians say, &ldquo;if you can do the same thing with commercial software, then why get a migraine from weird code?&rdquo; To me, this sounds like people who are keen to learn swimming but still don&rsquo;t want to get wet. 
If you are still using paid software, doing your work through customer support and clicking a few well-designed GUI buttons, then perhaps you are not interested in learning and trying new and challenging bioinformatics work. You are missing the basic flavour of bioinformatics. Let&rsquo;s dive into the coding world; I am sure you will enjoy it. I recommend that you swim freely in the sea of code and enjoy the journey; do not merely watch it from the outside. &nbsp;</span></p><p><span>4. Paid does not mean better</span><span></span></p><p><span>Companies that specialize in bioinformatics solutions develop well-designed, well-packaged, user-friendly software, employing a large number of specialised scientists, programmers and support staff. They also provide good services to help you accomplish your biological analysis work. This means that if you hit a &lsquo;snag&rsquo; with your data, help is likely only a phone call away! These companies price their products competitively against the cost of a dedicated bioinformatician. You may be able to afford the program, but not the additional staff! Additionally, most of the functionality that you need in your analysis is already coded into the program. Need to plot a graph? Just click this button right here. It is that easy.</span><span>&nbsp;</span><span>But for a bioinformatician this is generally not an encouraged approach to biological analysis, because the software is not available to everyone and your results can&rsquo;t be validated. Moreover, there is very little chance that anyone will repeat your work or take up similar research (because not all the labs in the world are as rich as yours).</span></p><p><span>5. Take caution<br /><br />In biological analysis work, where you deal with GB/TB of data, the chances of errors are high, so please be careful and always cross-check your data before coming to any conclusion. 
Even an error in two lines of code can alter your entire analysis and produce weird results. Some scientists blindly trust commercial software, which is entirely wrong. Using proprietary tools does not absolve you of the need to actually read and research the type of analysis that you are doing. This is particularly true in the case of genome assembly and annotation.</span></p><p><span><br />In the end, I would like to say only one thing: open source solutions allow you to do more cutting-edge analysis than commercial tools. So let&rsquo;s go for it.</span></p><p>Disclaimer:</p><p>This is my personal view. I have nothing to do with any company or open source community.&nbsp;The views expressed on these pages are mine alone and not those of my current/past employers. I do reserve the right to remove comments left by spammers or off-topic comments.</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</guid>
	<pubDate>Wed, 15 Mar 2017 14:31:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</link>
	<title><![CDATA[Software and Tools to detect structure variation with long reads !!]]></title>
	<description><![CDATA[<p>Uncovering the connection between genetics and heritable diseases requires an approach that looks at all the variant bases and types in a genome. While a PacBio&nbsp;<em>de novo</em>&nbsp;assembly resolves the most novel structural variants (SVs), 8-10X PacBio coverage of single genomes or trios reveals triple the SVs detectable by short-read data.</p><p>With&nbsp;<span style="text-decoration: underline;"><a href="http://www.pacb.com/smrt-science/">Single Molecule, Real-Time (SMRT) Sequencing</a></span>, you can access structural variations having a broad range of sizes, types, and GC content with the ability to:</p><ul>
<li>Uncover missing heritability linked to structural variation</li>
<li>Unambiguously identify genomic context and variant breakpoints at the sequence level to unravel the genetic etiology of disease</li>
<li>Resolve structural variation across the complete size spectrum with basepair resolution</li>
</ul><p>The following SV tools can help you achieve this goal.</p><p><strong>Sniffles:</strong>&nbsp;Structural variation caller using third generation sequencing</p><p>Sniffles is a structural variation caller using third generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs using evidence from split-read alignments, high-mismatch regions, and coverage analysis. Please note that the current version of Sniffles requires sorted output from BWA-MEM (use the -M and -x parameters) or NGM-LR with the optional SAM attributes enabled!&nbsp;</p><p>More at&nbsp;https://github.com/fritzsedlazeck/Sniffles</p><p><strong style="font-size: 12.8px;"><br />MultiBreak-SV:</strong> It identifies structural variants from next-generation paired-end data, third-generation long-read data, or data from a combination of sequencing platforms.</p><p>There are two pieces of software in this release: (1) a pre-processor that takes machine-format (.m5) BLASR files, and (2) MultiBreak-SV. For installation and usage instructions, see doc/MultiBreakSV-Manual.txt.</p><p>More at&nbsp;https://github.com/raphael-group/multibreak-sv</p><p><strong style="font-size: 12.8px;"><br />Parliament:</strong>&nbsp;A Structural Variation Tool. 
Why ask a single sv-detection approach to find every variant when you can have a parliament of tools deciding?</p><p>Publication about the algorithm and &ldquo;&hellip;the first long-read characterization of structural variation in a diploid human personal genome&hellip;&rdquo; (HS1011) -&nbsp;<a href="http://www.biomedcentral.com/1471-2164/16/286">&ldquo;Assessing structural variation in a personal genome&mdash;towards a human reference diploid genome&rdquo;</a></p><p>More at&nbsp;https://sourceforge.net/projects/parliamentsv/</p><p>https://www.dnanexus.com/papers/Parliament_Info_Sheet.pdf</p><p><br /><strong>PBHoney:</strong>&nbsp;the structural variation discovery tool&nbsp;<br /><br />PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.</p><p>Read The Paper&nbsp;<a href="http://www.biomedcentral.com/1471-2105/15/180/abstract" target="_blank">http://www.biomedcentral.com/1471-2105/15/180/abstract</a></p><p>More at&nbsp;https://sourceforge.net/projects/pb-jelly/</p><p><strong><br />SMRT-SV:</strong> Structural variant and indel caller for PacBio reads</p><p>Structural variant (SV) and indel caller for PacBio reads based on methods from&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>.</p><p>SMRT-SV provides an official software package for tools described in&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>&nbsp;and adds several key features including the following.</p><ul>
<li>Unified variant calling user interface with built-in cluster compute support</li>
<li>Small indel calling (2-49 bp)</li>
<li>Improved inversion calling (<code>screenInversions</code>)</li>
<li>Quality metric for SV calls based on number of local assemblies supporting each call</li>
<li>Higher sensitivity for SV calls using tiled local assemblies across the entire genome instead of "signature" regions</li>
<li>Genotyping of SVs with Illumina paired-end reads from WGS samples</li>
</ul><p>More at&nbsp;https://github.com/EichlerLab/pacbio_variant_caller</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/37514/list-of-non-commercial-ngs-genotype-calling-software</guid>
	<pubDate>Thu, 09 Aug 2018 04:21:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/37514/list-of-non-commercial-ngs-genotype-calling-software</link>
	<title><![CDATA[List of non-commercial NGS genotype-calling software]]></title>
	<description><![CDATA[<p><span>Meaningful analysis of next-generation sequencing (NGS) data, which are produced extensively by genetics and genomics studies, relies crucially on the accurate calling of SNPs and genotypes. Recently developed statistical methods both improve and quantify the considerable uncertainty associated with genotype calling, and will especially benefit the growing number of studies using low- to medium-coverage data.&nbsp;</span></p><p><span>A list of programs for genotype and SNP calling:</span></p><p><br />SOAP2&nbsp;http://soap.genomics.org.cn/index.html</p><p>Calling method: single-sample. Prerequisites: high-quality variant database (for example, dbSNP). Comments: package for NGS data analysis, which includes a single-individual genotype caller (SOAPsnp).</p><p>realSFS&nbsp;http://128.32.118.212/thorfinn/realSFS/</p><p>Calling method: single-sample. Prerequisites: aligned reads. Comments: software for SNP and genotype calling using single individuals and allele frequencies; site frequency spectrum (SFS) estimation.</p><p>Samtools http://samtools.sourceforge.net/</p><p>Calling method: multi-sample. Prerequisites: aligned reads. Comments: package for manipulation of NGS alignments, which includes computation of genotype likelihoods (samtools) and SNP and genotype calling (bcftools).</p><p>GATK http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit</p><p>Calling method: multi-sample. Prerequisites: aligned reads. Comments: package for aligned NGS data analysis, which includes a SNP and genotype caller (Unified Genotyper), SNP filtering (Variant Filtration) and SNP quality recalibration (Variant Recalibrator).</p><p>Beagle http://faculty.washington.edu/browning/beagle/beagle.html</p><p>Calling method: multi-sample LD. Prerequisites: candidate SNPs, genotype likelihoods. Comments: software for imputation, phasing and association that includes a mode for genotype calling.</p><p>IMPUTE2 http://mathgen.stats.ox.ac.uk/impute/impute_v2.html</p><p>Calling method: multi-sample LD. Prerequisites: candidate SNPs, genotype likelihoods. Comments: software for imputation and phasing, including a mode for genotype calling. Requires a fine-scale linkage map.</p>
<p>QCall ftp://ftp.sanger.ac.uk/pub/rd/QCALL</p><p>Calling method: multi-sample LD. Prerequisites: &lsquo;feasible&rsquo; genealogies at a dense set of loci, genotype likelihoods. Comments: software for SNP and genotype calling, including a method for generating candidate SNPs without LD information (NLDA) and a method for incorporating LD information (LDA). The &lsquo;feasible&rsquo; genealogies can be generated using Margarita (http://www.sanger.ac.uk/resources/software/margarita).</p><p>MaCH http://genome.sph.umich.edu/wiki/Thunder</p><p>Calling method: multi-sample LD. Prerequisites: genotype likelihoods. Comments: software for SNP and genotype calling, including a method (GPT_Freq) for generating candidate SNPs without LD information and a method (thunder_glf_freq) for incorporating LD information.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</guid>
	<pubDate>Sun, 07 Mar 2021 00:32:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</link>
	<title><![CDATA[Ancient whole genome duplication (WGD) detection tools !]]></title>
	<description><![CDATA[<p>There are two methods for detecting ancient WGD: one is collinearity analysis, and the other is based on the Ks distribution. Ks is defined as the average number of synonymous substitutions per synonymous site; its counterpart, Ka, is the average number of non-synonymous substitutions per non-synonymous site.</p><p>Several WGD analysis pipelines have already been published. I searched for the keyword "wgd pipeline" and found the following:</p><p><strong>GenoDup: https://github.com/MaoYafei/GenoDup-Pipeline</strong><br /><strong>https://peerj.com/articles/6303/</strong><br /><strong>WGDdetector: https://github.com/yongzhiyang2012/WGDdetector</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2670-3</strong><br /><strong>wgd: https://github.com/arzwa/wgd</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2#Sec1</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>GeNoGAP: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>https://github.com/dfguan/purge_dups</strong><br /><strong>https://www.biorxiv.org/content/10.1101/2020.01.24.917997v1</strong></p><p>This article introduces the usage of wgd.</p><p>wgd cannot currently be installed directly with bioconda, so installation is a little troublesome. 
wgd depends on the following software:</p><p><strong>BLAST</strong><br /><strong>MCL</strong><br /><strong>MUSCLE/MAFFT/PRANK</strong><br /><strong>PAML</strong><br /><strong>PhyML/FastTree</strong><br /><strong>i-ADHoRe</strong></p><p>The good news is that most of these dependencies can be installed with bioconda:</p><blockquote><p>conda create -n wgd python=3.5 blast mcl muscle mafft prank paml fasttree cmake libpng mpi=1.0=mpich<br />conda activate wgd</p></blockquote><p>Here mpi=1.0=mpich is selected because i-ADHoRe depends on mpich. If openmpi is installed instead, the following error appears: error while loading shared libraries: libmpi_cxx.so.40: cannot open shared object file: No such file or directory</p><p>After that, the installation is much simpler:</p><blockquote><p>git clone https://github.com/arzwa/wgd.git<br />cd wgd<br />pip install .</p></blockquote><p>Alternatively, pip can install straight from GitHub without cloning:</p><blockquote><p>pip install git+https://github.com/arzwa/wgd.git</p></blockquote><p>For i-ADHoRe, you need to register at http://bioinformatics.psb.ugent.be/webtools/i-adhore/licensing/ and agree to the license to download i-ADHoRe-3.0.</p><p>Since my miniconda3 is installed in ~/opt/, the installation prefix is ~/opt/miniconda3/envs/wgd/:</p><blockquote><p>tar -zxvf i-adhore-3.0.01.tar.gz<br />cd i-adhore-3.0.01<br />mkdir -p build &amp;&amp; cd build<br />cmake .. -DCMAKE_INSTALL_PREFIX=~/opt/miniconda3/envs/wgd/<br />make -j 4<br />make install</p></blockquote><p>Take the sugarcane genome Saccharum spontaneum L. as an example. 
The genome is octoploid with 32 chromosomes (2n = 4x8 = 32).</p><p><strong>Download the CDS and GFF annotation files for the tutorial</strong></p><blockquote><p><strong>mkdir -p wgd_tutorial &amp;&amp; cd wgd_tutorial</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.cds.fasta.gz</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.gff3.gz</strong><br /><strong>gunzip *.gz</strong></p></blockquote><p>First run conda activate wgd to start our analysis environment, and then begin the analysis.</p><p>Step 1: Use wgd mcl to identify homologous genes in the genome</p><blockquote><p>wgd mcl -n 20 --cds --mcl -s Sspon.v20190103.cds.fasta -o Sspon_cds.out</p></blockquote><p>Step 2: Use wgd ksd to build the Ks distribution</p><blockquote><p>wgd ksd --n_threads 80 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl Sspon.v20190103.cds.fasta</p></blockquote><p>Step 3: If the quality of the genome is good, wgd syn can be used for collinearity analysis. It helps us find the collinear blocks in the genome and the corresponding anchor points</p><blockquote><p>wgd syn --feature gene --gene_attribute ID \<br /> -ks wgd_ksd/Sspon.v20190103.cds.fasta.ks.tsv \<br /> Sspon.v20190103.gff3 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl</p></blockquote><p>For further reading, wgd has 9 sub-modules:</p><ul>
<li><span>kde: KDE fitting to the Ks distribution</span></li>
<li><span>ksd: Ks distribution construction</span></li>
<li><span>mcl: all-vs-all BLASTP comparison + MCL clustering analysis.</span></li>
<li><span><span>mix: Mixture modeling of the Ks distribution.</span></span></li>
<li><span>pre: preprocess the CDS file</span></li>
<li><span>syn: Call I-ADHoRe 3.0 to use GFF files for collinearity analysis</span></li>
<li><span>viz: draw histogram and density plot</span></li>
<li><span>wf1: standard Ks analysis workflow for the whole paranome; calls mcl, ksd and syn</span></li>
<li><span>wf2: standard Ks analysis workflow for one-vs-one homologous genes (orthologs); calls mcl and ksd</span></li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>