<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44329?offset=90</link>
	<atom:link href="https://bioinformaticsonline.com/related/44329?offset=90" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35033/bbsplit-read-binning-tool-for-metagenomes-and-contaminated-libraries</guid>
	<pubDate>Wed, 03 Jan 2018 00:25:27 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35033/bbsplit-read-binning-tool-for-metagenomes-and-contaminated-libraries</link>
	<title><![CDATA[BBSplit: Read Binning Tool for Metagenomes and Contaminated Libraries]]></title>
	<description><![CDATA[<p>BBSplit internally uses BBMap to map reads to multiple genomes at once and determine which genome they match best. This differs from ordinary mapping. If a genome (say, human) contains an exact repeat somewhere, reads mapping to it will be mapped ambiguously. But if you want to determine whether reads are mouse or human, it does not matter whether they map ambiguously within human, only whether they are ambiguous between human and mouse. BBSplit tracks this additional ambiguity information and decides how to use it based on the &ldquo;ambig2&rdquo; flag. The normal use of BBSplit is like Seal, either quantifying how many reads go to each reference, or splitting the reads into multiple output files, one per reference. BBSplit can only be run using references indexed with BBSplit, as they contain additional information regarding which sequences came from which reference file.</p><p><span>BBSplit is a tool that bins reads by mapping to multiple references simultaneously, using&nbsp;</span><a href="http://seqanswers.com/forums/showthread.php?t=41057" target="_blank">BBMap</a><span>. The reads go to the bin of the reference they map to best. There are also disambiguation options, such that reads that map to multiple references can be binned with all of them, none of them, one of them, or put in a special "ambiguous" file for each of them. 
Paired reads will always be kept together.</span><br /><br /><span>For example, if you had a library that was contaminated with E. coli and Salmonella, you could do this:</span><br /><br /><strong>bbsplit.sh in=reads.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu=clean.fq int=t</strong><br /><br /><span>This will produce 3 output files:</span><br /><strong>out_ecoli.fq</strong><span>&nbsp;(E. coli reads)</span><br /><strong>out_salmonella.fq</strong><span>&nbsp;(Salmonella reads)</span><br /><strong>clean.fq</strong><span>&nbsp;(unmapped reads)</span><br /><br /><span>In this case, "int=t" means that the input file is paired and interleaved. For single-end reads you would leave that out. For paired reads in two files, you would do this:</span><br /><strong>bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq</strong></p><p><strong><span>BBSplit is available here:</span><br /><a href="https://sourceforge.net/projects/bbmap/" target="_blank">https://sourceforge.net/projects/bbmap/</a></strong></p><p><span>The sensitivity can be raised to be equivalent to BBMap with these flags: "minratio=0.56 minhits=1 maxindel=16000"</span></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43977/read-simulators</guid>
	<pubDate>Fri, 30 Sep 2022 06:48:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43977/read-simulators</link>
	<title><![CDATA[Read Simulators]]></title>
	<description><![CDATA[<h1>Short Read Simulators</h1><p>With the popularity of next-generation sequencing (NGS) technologies, many NGS read simulators have been developed. Most of the popular short read simulators are designed to mimic reads from the Illumina, 454 and SOLiD platforms. Some popular short read simulators are listed below, with links to their publications or source repositories.</p><ol>
<li><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0003373" target="_blank">MetaSim</a></li>
<li><a href="https://github.com/lh3/wgsim" target="_blank">wgsim</a></li>
<li><a href="https://github.com/timmassingham/simNGS" target="_blank">SimNGS</a></li>
<li><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0049110" target="_blank">ArtificialFastqGenerator</a></li>
<li id="e943"><a href="https://academic.oup.com/bioinformatics/article/35/3/521/5055123" target="_blank">InSilicoSeq</a></li>
</ol><h1>Long Read Simulators</h1><p id="d469">With the advancements in sequencing technologies, scientists have shown an increasing interest in using third-generation sequencing (TGS) technologies. Most of the popular long read simulators are designed to mimic reads from the two main TGS technologies: (1)&nbsp;<em>Pacific Biosciences (PacBio)</em>&nbsp;and (2)&nbsp;<em>Oxford Nanopore (ONT)</em>. Some popular and recently introduced PacBio and ONT simulators are listed below, with links to their publications.</p><h2><span>PacBio Simulators</span></h2><ol>
<li><a href="https://academic.oup.com/bioinformatics/article/29/1/119/273243" target="_blank">PBSIM</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/32/24/3829/2525710" target="_blank">LongISLND</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/32/17/2704/2450740" target="_blank">SimLoRD</a></li>
<li><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2208-0" target="_blank">NPBSS</a></li>
<li id="fed0"><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2901-7" target="_blank">PaSS</a></li>
</ol><h2><span>ONT Simulators</span></h2><ol>
<li id="f145"><a href="https://academic.oup.com/gigascience/article/6/4/gix010/3051934" target="_blank">NanoSim</a></li>
<li id="c6f5"><a href="https://ieeexplore.ieee.org/document/8621253" target="_blank">Nanopore SimulatION</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/34/17/2899/4962495" target="_blank">DeepSimulator</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/36/8/2578/5698265" target="_blank">DeepSimulator1.5</a></li>
</ol>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35061/proovread-large-scale-high-accuracy-pacbio-correction-through-iterative-short-read-consensus</guid>
	<pubDate>Fri, 05 Jan 2018 04:12:20 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35061/proovread-large-scale-high-accuracy-pacbio-correction-through-iterative-short-read-consensus</link>
	<title><![CDATA[proovread : large-scale high-accuracy PacBio correction through iterative short read consensus]]></title>
	<description><![CDATA[<p>proovread : large-scale high-accuracy PacBio correction through iterative short read consensus</p>
<ul>
<li>outperforms PacBioToCA/LSC in terms of accuracy and contiguity/sensitivity (<a href="http://dx.doi.org/10.1093/bioinformatics/btu392">http://dx.doi.org/10.1093/bioinformatics/btu392</a>)</li>
<li>is easy to install/run/configure</li>
<li>supports various types of data
<ul>
<li><strong>HiSeq/MiSeq&nbsp;</strong>(100-500bp)</li>
<li><strong>Unitigs</strong></li>
<li>454, ...</li>
</ul>
</li>
</ul>
<p>proovread maps high-coverage data to PacBio reads (bwa mem, blasr, daligner) in multiple iterations.</p><p>Address of the bookmark: <a href="https://github.com/BioInf-Wuerzburg/proovread" rel="nofollow">https://github.com/BioInf-Wuerzburg/proovread</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36950/salsa-a-tool-to-scaffold-long-read-assemblies-with-hi-c</guid>
	<pubDate>Fri, 15 Jun 2018 04:01:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36950/salsa-a-tool-to-scaffold-long-read-assemblies-with-hi-c</link>
	<title><![CDATA[SALSA: A tool to scaffold long read assemblies with Hi-C]]></title>
	<description><![CDATA[This code is used to scaffold your assemblies using Hi-C data. This version implements some improvements over the original SALSA algorithm. If you want to use the old version, it can be found in the old_salsa branch.

To use the latest version, first run the following commands:

  cd SALSA
  make
To run the code, you will need Python 2.7, the Boost libraries and NetworkX (version lower than 1.2).

If you consider using this tool, please cite our publication which describes the methods used for scaffolding.

Ghurye, J., Pop, M., Koren, S., Bickhart, D., &amp; Chin, C. S. (2017). Scaffolding of long read assemblies using long range contact information. BMC Genomics, 18(1), 527.

Ghurye, J., Rhie, A., Walenz, B.P., Schmitt, A., Selvaraj, S., Pop, M., Phillippy, A.M. and Koren, S., 2018. Integrating Hi-C links with assembly graphs for chromosome-scale assembly. bioRxiv, p.261149.

For any queries, please either open an issue on the GitHub issues page or send an email to Jay Ghurye (jayg@cs.umd.edu).<p>Address of the bookmark: <a href="https://github.com/machinegun/SALSA" rel="nofollow">https://github.com/machinegun/SALSA</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37776/rhat-a-seed-and-extension-based-noisy-long-read-alignment-tool</guid>
	<pubDate>Sun, 23 Sep 2018 05:12:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37776/rhat-a-seed-and-extension-based-noisy-long-read-alignment-tool</link>
	<title><![CDATA[rHAT: a seed-and-extension-based noisy long read alignment tool]]></title>
	<description><![CDATA[<p><span>rHAT is a seed-and-extension-based noisy long read alignment tool. It is suitable for aligning third-generation sequencing reads, which have long read lengths and relatively high error rates, especially PacBio's Single Molecule Real-Time (SMRT) sequencing reads.</span></p><p>Address of the bookmark: <a href="https://github.com/dfguan/rHAT" rel="nofollow">https://github.com/dfguan/rHAT</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42280/urmap-an-ultra-fast-read-mapper</guid>
	<pubDate>Thu, 29 Oct 2020 23:03:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42280/urmap-an-ultra-fast-read-mapper</link>
	<title><![CDATA[URMAP, an ultra-fast read mapper]]></title>
	<description><![CDATA[<p><span>URMAP is a new read mapping algorithm. It is an order of magnitude faster than BWA with comparable accuracy on several validation tests. On a Genome in a Bottle (GIAB) variant calling test with 30&times; coverage 2&times;150 reads, URMAP achieves high accuracy (precision 0.998, sensitivity 0.982 and F-measure 0.990) with the strelka2 caller. However, GIAB reference variants are shown to be biased against repetitive regions, which are difficult to map, and may therefore pose an unrealistically easy challenge to read mappers and variant callers.</span></p>
<p><span>More at&nbsp;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7320720/</span></p><p>Address of the bookmark: <a href="https://github.com/rcedgar/urmap" rel="nofollow">https://github.com/rcedgar/urmap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36026/mmseqs20-ultra-fast-and-sensitive-protein-search-and-clustering-suite</guid>
	<pubDate>Thu, 22 Mar 2018 10:40:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36026/mmseqs20-ultra-fast-and-sensitive-protein-search-and-clustering-suite</link>
	<title><![CDATA[MMseqs2.0: ultra fast and sensitive protein search and clustering suite]]></title>
	<description><![CDATA[<p>MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, macOS, and (as a beta version, via Cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times the speed of BLAST, it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times its speed.</p>
<p>The MMseqs2 user guide is available as a&nbsp;<a href="https://github.com/soedinglab/mmseqs2/wiki">GitHub Wiki</a>&nbsp;or as a&nbsp;<a href="https://mmseqs.com/latest/userguide.pdf">PDF file</a>&nbsp;(Thanks to&nbsp;<a href="https://github.com/jgm/pandoc">pandoc</a>!)</p>
<p>Please cite:&nbsp;<a href="https://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3988.html">Steinegger M and Soeding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, doi: 10.1038/nbt.3988 (2017)</a>.</p><p>Address of the bookmark: <a href="https://github.com/soedinglab/MMseqs2" rel="nofollow">https://github.com/soedinglab/MMseqs2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42645/mmseqs2-ultra-fast-and-sensitive-sequence-search-and-clustering-suite</guid>
	<pubDate>Mon, 18 Jan 2021 10:47:56 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42645/mmseqs2-ultra-fast-and-sensitive-sequence-search-and-clustering-suite</link>
	<title><![CDATA[MMseqs2: ultra fast and sensitive sequence search and clustering suite]]></title>
	<description><![CDATA[<p><span>MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, macOS, and (as a beta version, via Cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times the speed of BLAST, it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times its speed.</span></p><p>Address of the bookmark: <a href="https://github.com/soedinglab/MMseqs2" rel="nofollow">https://github.com/soedinglab/MMseqs2</a></p>]]></description>
	<dc:creator>Manisha Mishra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43583/pango-lineage-analysis</guid>
	<pubDate>Mon, 15 Nov 2021 03:38:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43583/pango-lineage-analysis</link>
	<title><![CDATA[Pango Lineage Analysis!]]></title>
	<description><![CDATA[<p>The Pango nomenclature is being used by researchers and public health agencies worldwide to track the transmission and spread of SARS-CoV-2, including variants of concern. This website documents all current Pango lineages and their spread, as well as various software tools that researchers can use to analyze SARS-CoV-2 sequence data.</p><p>Address of the bookmark: <a href="https://cov-lineages.org/resources/pangolin/output.html" rel="nofollow">https://cov-lineages.org/resources/pangolin/output.html</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33901/rnacon-web-server-for-the-prediction-and-classification-of-non-coding-rnas</guid>
	<pubDate>Mon, 17 Jul 2017 04:55:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33901/rnacon-web-server-for-the-prediction-and-classification-of-non-coding-rnas</link>
	<title><![CDATA[RNAcon: web-server for the prediction and classification of non-coding RNAs]]></title>
	<description><![CDATA[<p style="text-align: justify;">RNAcon is a web-server for the prediction and classification of non-coding RNAs. It uses an SVM-based model to discriminate between coding and non-coding RNAs (ncRNAs) and a RandomForest-based prediction model to classify ncRNAs into different classes. Graph properties based on structural information were used to develop the prediction model.</p>
<p style="text-align: justify;">The&nbsp;<a href="http://crdd.osdd.net/raghava/rnacon/RNAcon_v1.0.tar.gz">standalone version (Linux-based command-line) of RNAcon</a>&nbsp;is freely available for the global scientific community.</p>
<p style="text-align: justify;">Reference:&nbsp;<a href="http://www.biomedcentral.com/1471-2164/15/127/abstract">Panwar, B.; Arora, A. and Raghava, G.P.S. (2014) Prediction and classification of ncRNAs using structural information</a>. BMC Genomics 2014, 15:127</p><p>Address of the bookmark: <a href="http://crdd.osdd.net/raghava/rnacon/" rel="nofollow">http://crdd.osdd.net/raghava/rnacon/</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>

</channel>
</rss>