<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34416?offset=200</link>
	<atom:link href="https://bioinformaticsonline.com/related/34416?offset=200" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</guid>
	<pubDate>Mon, 11 Jun 2018 03:08:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</link>
	<title><![CDATA[PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++.]]></title>
	<description><![CDATA[We are pleased to release PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++. Its name describes the strategy that it implements for genome assembly: PRICE uses paired-read information to iteratively increase the size of existing contigs. Initially, those contigs can be individual reads from a subset of the paired-read dataset, non-paired reads from sequencing technologies that provide non-paired data, or contigs that were output from a prior run of PRICE or any other assembler.

http://derisilab.ucsf.edu/software/price/<p>Address of the bookmark: <a href="http://derisilab.ucsf.edu/software/price/" rel="nofollow">http://derisilab.ucsf.edu/software/price/</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39830/the-extensive-de-novo-te-annotator-edta</guid>
	<pubDate>Thu, 08 Aug 2019 04:05:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39830/the-extensive-de-novo-te-annotator-edta</link>
	<title><![CDATA[The Extensive de novo TE Annotator (EDTA)]]></title>
	<description><![CDATA[<p><span>The EDTA package was designed to filter out false discoveries in raw TE candidates and generate a high-quality non-redundant TE library for whole-genome TE annotations. Selection of initial search programs were based on benckmarkings on the annotation performance using a manually curated TE library in the rice genome.</span></p><p>Address of the bookmark: <a href="https://github.com/oushujun/EDTA" rel="nofollow">https://github.com/oushujun/EDTA</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36618/lamsa-fast-split-read-alignment-with-long-approximate-matches</guid>
	<pubDate>Tue, 15 May 2018 04:44:42 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36618/lamsa-fast-split-read-alignment-with-long-approximate-matches</link>
	<title><![CDATA[LAMSA: fast split read alignment with long approximate matches]]></title>
	<description><![CDATA[LAMSA (Long Approximate Matches-based Split Aligner) is a novel split alignment approach with faster speed and good ability of handling SV events. It is well-suited to align long reads (over thousands of base-pairs).

LAMSA takes takes the advantage of the rareness of SVs to implement a specifically designed two-step strategy. That is, LAMSA initially splits the read into relatively long fragments and co-linearly align them to solve the small variations or sequencing errors, and mitigate the effect of repeats. The alignments of the fragments are then used for implementing a sparse dynamic programming (SDP)-based split alignment approach to handle the large or non-co-linear variants.

We benchmarked LAMSA with simulated and real datasets having various read lengths and sequencing error rates, the results demonstrate that it is substantially faster than the state-of-the-art long read aligners; mean-while, it also has good ability to handle various categories of SVs.

LAMSA is open source and free for non-commercial use.

LAMSA is mainly designed by Bo Liu &amp; Yan Gao and developed by Yan Gao in Center for Bioinformatics, Harbin Institute of Technology, China.<p>Address of the bookmark: <a href="https://github.com/hitbc/LAMSA" rel="nofollow">https://github.com/hitbc/LAMSA</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34394/tulip-the-uncorrected-long-read-itegration-pipeline</guid>
	<pubDate>Thu, 23 Nov 2017 09:30:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34394/tulip-the-uncorrected-long-read-itegration-pipeline</link>
	<title><![CDATA[TULIP - The Uncorrected Long read Itegration Pipeline]]></title>
	<description><![CDATA[<p>#Running TULIP (The Uncorrected Long-read Integration Process), version 0.4 late 2016 (European eel)</p>
<p>TULIP currently consists of to Perl scripts, tulipseed.perl and tulipbulb.perl. These are very much intended as prototypes, and additional components and/or implementations are likely to follow.&nbsp;<br>Tulipseed takes as input alignments files of long reads to sparse short seeds, and outputs a graph and scaffold structures. Tulipbulb adds long read sequencing data to these.</p>
<p>&nbsp;</p>
<p>https://github.com/Generade-nl/TULIP</p><p>Address of the bookmark: <a href="https://github.com/Generade-nl/TULIP" rel="nofollow">https://github.com/Generade-nl/TULIP</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36632/tulip-the-uncorrected-long-read-integration-pipeline</guid>
	<pubDate>Tue, 15 May 2018 09:06:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36632/tulip-the-uncorrected-long-read-integration-pipeline</link>
	<title><![CDATA[TULIP - The Uncorrected Long read Integration Pipeline]]></title>
	<description><![CDATA[TULIP currently consists of two Perl scripts, tulipseed.perl and tulipbulb.perl. These are very much intended as prototypes, and additional components and/or implementations are likely to follow.

Tulipseed takes as input alignments files of long reads to sparse short seeds, and outputs a graph and scaffold structures.<p>Address of the bookmark: <a href="https://github.com/Generade-nl/TULIP" rel="nofollow">https://github.com/Generade-nl/TULIP</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34702/run-miniasm-assembler-on-nanopore-reads</guid>
	<pubDate>Mon, 18 Dec 2017 04:07:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34702/run-miniasm-assembler-on-nanopore-reads</link>
	<title><![CDATA[Run miniasm assembler on nanopore reads !]]></title>
	<description><![CDATA[<p>Miniasm is a very fast OLC-based&nbsp;<em>de novo</em>&nbsp;assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by&nbsp;<a href="https://github.com/lh3/minimap">minimap</a>) as input and outputs an assembly graph in the&nbsp;<a href="https://github.com/pmelsted/GFA-spec/blob/master/GFA-spec.md">GFA</a>&nbsp;format. Different from mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final&nbsp;<a href="http://wgs-assembler.sourceforge.net/wiki/index.php/Celera_Assembler_Terminology">unitig</a>&nbsp;sequences. Thus the per-base error rate is similar to the raw input reads.</p><p>Find the detail of the reads repeats:</p><blockquote><p>fq2fa ONT_A.fastq ONT_A.fasta&nbsp;<br /><br />minimap2 -xava-ont ONT_A.fasta ONT_A.fasta -t10 -X &gt; AONT.paf&nbsp;<br /><br />awk '{if($1==$6){print}}' AONT.paf &gt; AONTself.paf&nbsp;<br /><br />awk '$5=="-"' AONTself.paf | awk '{print $1}'| sort|uniq &gt; invertedrepeat.list</p></blockquote><p>Generated a few palindrome and repeats plots (highlighting only repeats largest than 10, 20 and 30 kb)</p><blockquote><p>minidot -f 5 -m 30000 AONTself.paf &gt; AONTself30000.eps&nbsp;<br />sed 's/_template_pass_FAH31515//' AONTself30000.eps &gt; AONTself30000final.eps&nbsp;<br /><br />minidot -f 5 -m 20000 AONTself.paf &gt; AONTself20000.eps&nbsp;<br />sed 's/_template_pass_FAH31515//' AONTself20000.eps &gt; AONTself20000final.eps&nbsp;<br /><br />minidot -f 5 -m 10000 AONTself.paf &gt; AONTself10000.eps&nbsp;<br />sed 's/_template_pass_FAH31515//' AONTself10000.eps &gt; AONTself10000final.eps&nbsp;</p></blockquote><p>Assemble with miniasm:</p><blockquote><p>miniasm -f ONT_A.fasta AONT.paf &gt; AONT.gfa&nbsp;</p><p>grep '^S' AONT.gfa |awk '{print "&gt;"$2"\n"$3}' &gt; AONT_miniasm.fasta&nbsp;<br /><br />minimap2 -xasm10 AONT_miniasm.fasta AONT_miniasm.fasta -t1 -X &gt; AONT_miniasm.paf&nbsp;<br /><br />awk '{if($1==$6){print}}' AONT_miniasm.paf &gt; AONT_miniasm_self.paf&nbsp;<br /><br />minidot -f 5 -m 10000 AONT_miniasm_self.paf &gt; AONT_miniasm_self10000.eps&nbsp;</p></blockquote><p>Njoy the assembly !</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36607/tarean-a-computational-tool-for-identification-and-characterization-of-satellite-dna-from-unassembled-short-reads</guid>
	<pubDate>Tue, 15 May 2018 02:53:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36607/tarean-a-computational-tool-for-identification-and-characterization-of-satellite-dna-from-unassembled-short-reads</link>
	<title><![CDATA[TAREAN: A computational tool for identification and characterization of satellite DNA from unassembled short reads]]></title>
	<description><![CDATA[<p><strong>TA</strong>ndem&nbsp;<strong>RE</strong>peat&nbsp;<strong>AN</strong>alyzer -TAREAN &ndash; is a computational pipeline for&nbsp;<strong>unsupervised identification of satellite repeats</strong>&nbsp;from unassembled sequence reads. The pipeline uses low-pass whole genome sequence reads and performs their graph-based clustering. Resulting clusters, representing all types of repeats, are then examined for the presence of circular structures and putative satellite repeats are reported.</p>
<p><em><strong>How to use TAREAN</strong></em>:</p>
<ul>
<li>Install a local instance of the pipeline using its source code available from&nbsp;<a href="https://bitbucket.org/petrnovak/repex_tarean" target="_blank" title="TAREAN source code">bitbucket repository</a>.</li>
<li>Use&nbsp; public Galaxy-based server at&nbsp;<a href="https://repeatexplorer-elixir.cerit-sc.cz/" target="_blank">https://repeatexplorer-elixir.cerit-sc.cz/</a>. The server is provided in frame of the&nbsp;<a href="https://www.elixir-czech.cz/" target="_blank">Elixir CZ project</a>&nbsp;and is maintained by&nbsp;<a href="https://www.cesnet.cz/" target="_blank">CESNET</a>&nbsp;and&nbsp;<a href="https://www.cerit-sc.cz/en/index.html" target="_blank">CERIT-SC</a>. Simple registration is required to use this service.</li>
</ul>
<p>Development of TAREAN was supported by&nbsp;<a href="https://www.elixir-czech.cz/" target="_blank" title="ELIXIR-CZ">ELIXIR CZ</a>&nbsp;research infrastructure project (MEYS Grant No: LM2015047).</p>
<p><strong><em>References</em></strong></p>
<p>Novak, P., Avila Robledillo, L., Koblizkova, A., Vrbova, I., Neumann, P., Macas, J. (2017) &ndash;&nbsp;<a href="https://academic.oup.com/nar/article/3574061/" target="_blank">TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads</a>.&nbsp;<em>Nucleic Acids Res.</em>, doi:10.1093/nar/gkx257</p><p>Address of the bookmark: <a href="https://bitbucket.org/petrnovak/repex_tarean" rel="nofollow">https://bitbucket.org/petrnovak/repex_tarean</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36739/blasr-mapping-single-molecule-sequencing-reads-using-basic-local-alignment-with-successive-refinement-blasr-theory-and-application</guid>
	<pubDate>Wed, 23 May 2018 06:54:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36739/blasr-mapping-single-molecule-sequencing-reads-using-basic-local-alignment-with-successive-refinement-blasr-theory-and-application</link>
	<title><![CDATA[BlasR Mapping single molecule sequencing reads using Basic Local Alignment with Successive Refinement (BLASR): Theory and Application,]]></title>
	<description><![CDATA[<p><span>BLASR (Basic Local Alignment with Successive Refinement) for mapping Single Molecule Sequencing (SMS) reads that are thousands to tens of thousands of bases long with divergence between the read and genome dominated by insertion and deletion error.</span></p>
<p>Here is how I use the blasr to align PacBio reads to the contigs (target.fasta). The &ldquo;target.fasta.sa&rdquo; is the suffix array from &ldquo;target.fasta&rdquo; generated by sawriter.</p>
<blockquote>
<p>blasr query.fa ./target.fasta -sa ./target.fasta.sa -bestn 40 -maxScore -500 -m 4 -nproc 24 -out target.m4 -maxLCPLength 15</p>
</blockquote>
<p>the output format option &ldquo;-m 4&Prime; generate the alignment coordinate. Not fully documented, but I can explain that to you.&nbsp;</p>
<p>I use a 24 cores / 48G ram server for the alignment. It took about 2 to 3 hours aligning 3G PacBio Reads to 10^6 sequences of short read contigs with a mean 3.5kbp length.</p><p>Address of the bookmark: <a href="http://bix.ucsd.edu/projects/blasr/" rel="nofollow">http://bix.ucsd.edu/projects/blasr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36895/npscarf-real-time-scaffolder-using-spades-contigs-and-nanopore-sequencing-reads</guid>
	<pubDate>Mon, 11 Jun 2018 05:14:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36895/npscarf-real-time-scaffolder-using-spades-contigs-and-nanopore-sequencing-reads</link>
	<title><![CDATA[npScarf: real-time scaffolder using SPAdes contigs and Nanopore sequencing reads]]></title>
	<description><![CDATA[npScarf (jsa.np.npscarf) is a program that connect contigs from a draft genomes to generate sequences that are closer to finish. These pipelines can run on a single laptop for microbial datasets. In real-time mode, it can be integrated with simple structural analyses such as gene ordering, plasmid forming.<p>Address of the bookmark: <a href="http://japsa.readthedocs.io/en/latest/tools/jsa.np.npscarf.html" rel="nofollow">http://japsa.readthedocs.io/en/latest/tools/jsa.np.npscarf.html</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</guid>
	<pubDate>Wed, 22 Aug 2018 10:40:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</link>
	<title><![CDATA[SimLoRD: A read simulator for third generation sequencing reads]]></title>
	<description><![CDATA[<p>SimLoRD is a read simulator for third generation sequencing reads and is currently focused on the Pacific Biosciences SMRT error model.</p>
<p>Reads are simulated from both strands of a provided or randomly generated reference sequence.</p>
<div id="rst-header-features">
<ul>
<li>The reference can be read from a FASTA file or randomly generated with a given GC content. It can consist of several chromosomes, whose structure is respected when drawing reads. (Simulation of genome rearrangements may be incorporated at a later stage.)</li>
<li>The read lengths can be determined in four ways: drawing from a log-normal distribution (typical for genomic DNA), sampling from an existing FASTQ file (typical for RNA), sampling from a a text file with integers (RNA), or using a fixed length</li>
<li>Quality values and number of passes depend on fragment length.</li>
<li>Provided subread error probabilities are modified according to number of passes</li>
<li>Outputs reads in FASTQ format and alignments in SAM format</li>
</ul>
</div><p>Address of the bookmark: <a href="https://bitbucket.org/genomeinformatics/simlord/" rel="nofollow">https://bitbucket.org/genomeinformatics/simlord/</a></p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>

</channel>
</rss>