<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34528?offset=340</link>
	<atom:link href="https://bioinformaticsonline.com/related/34528?offset=340" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34530/musket-a-multistage-k-mer-spectrum-based-corrector</guid>
	<pubDate>Wed, 06 Dec 2017 02:09:56 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34530/musket-a-multistage-k-mer-spectrum-based-corrector</link>
	<title><![CDATA[Musket: a multistage k-mer spectrum based corrector]]></title>
	<description><![CDATA[<p><strong>Musket</strong><span>&nbsp;is a well-established leading next-generation sequencing read error correction algorithm targeting Illumina sequencing. This corrector employs the&nbsp;</span><em>k</em><span>-mer spectrum approach and introduces three correction techniques in a multistage workflow. Our performance evaluation results, in terms of correction quality and de novo genome assembly measures, reveal that Musket is consistently one of the top performing substitution-error-based correctors. In addition, Musket is multi-threaded using a master-slave model and demonstrates superior parallel scalability compared to all other evaluated correctors as well as a highly competitive overall execution time.</span></p><p>Address of the bookmark: <a href="http://musket.sourceforge.net/homepage.htm" rel="nofollow">http://musket.sourceforge.net/homepage.htm</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35619/tallymer-method-to-compute-k-mer-frequencies-and-its-application-to-annotate-large-repetitive-plant-genomes</guid>
	<pubDate>Thu, 15 Feb 2018 10:21:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35619/tallymer-method-to-compute-k-mer-frequencies-and-its-application-to-annotate-large-repetitive-plant-genomes</link>
	<title><![CDATA[Tallymer: method to compute K-mer frequencies and its application to annotate large repetitive plant genomes]]></title>
	<description><![CDATA[<p>Tallymer is based on enhanced suffix arrays. This gives much greater flexibility in the choice of the&nbsp;<span>k</span>-mer size. Tallymer can process large data sizes of several billion bases. We used it in a variety of applications to study the genomes of maize and other plant species. In particular, Tallymer was used to index a set of whole genome shotgun sequences from maize (B73) (total size 10<sup>9</sup>&nbsp;bp).&nbsp;<br>Tallymer was effective in a variety of applications to aid genome annotation in maize, despite limitations imposed by the relatively low coverage of sequence available.</p>
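<p>As a rough illustration of what computing k-mer frequencies means, here is a minimal Python sketch. It is a toy under stated assumptions: it counts k-mers with a plain hash table rather than the enhanced suffix arrays Tallymer is built on, and the function name <code>count_kmers</code> is hypothetical, not part of Tallymer.</p>
<pre><code>from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a DNA sequence (naive, hash-based)."""
    sequence = sequence.upper()
    counts = Counter()
    # Slide a window of width k along the sequence, one base at a time.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption for this sketch: windows with ambiguous bases are skipped.
        if "N" not in kmer:
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    counts = count_kmers("ACGTACGTAC", 4)
    for kmer, frequency in sorted(counts.items()):
        print(kmer, frequency)
</code></pre>
<p>A hash table like this needs memory proportional to the number of distinct k-mers, so it would likely be impractical for the billion-base inputs described above; it is only meant to show what is being counted.</p>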
<p>A manual can be found&nbsp;<a href="https://www.zbh.uni-hamburg.de/fileadmin/gi/tallymer/tallymer.pdf" target="_blank" title="tallymer.pdf (111 KB)">here</a>.</p><p>Address of the bookmark: <a href="https://www.zbh.uni-hamburg.de/forschung/arbeitsgruppe-genominformatik/software/tallymer.html" rel="nofollow">https://www.zbh.uni-hamburg.de/forschung/arbeitsgruppe-genominformatik/software/tallymer.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35899/reference-free-prediction-of-rearrangement-breakpoint-reads</guid>
	<pubDate>Thu, 08 Mar 2018 05:05:25 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35899/reference-free-prediction-of-rearrangement-breakpoint-reads</link>
	<title><![CDATA[Reference-free prediction of rearrangement breakpoint reads]]></title>
	<description><![CDATA[<p><span>SlideSort-BPR (breakpoint reads) is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100&times;, it finds &sim;88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome.</span></p>
<p><span>https://github.com/ewijaya/slidesort-bpr</span></p><p>Address of the bookmark: <a href="https://code.google.com/archive/p/slidesort-bpr/" rel="nofollow">https://code.google.com/archive/p/slidesort-bpr/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36739/blasr-mapping-single-molecule-sequencing-reads-using-basic-local-alignment-with-successive-refinement-blasr-theory-and-application</guid>
	<pubDate>Wed, 23 May 2018 06:54:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36739/blasr-mapping-single-molecule-sequencing-reads-using-basic-local-alignment-with-successive-refinement-blasr-theory-and-application</link>
	<title><![CDATA[BLASR: Mapping single molecule sequencing reads using Basic Local Alignment with Successive Refinement (BLASR): Theory and Application]]></title>
	<description><![CDATA[<p><span>BLASR (Basic Local Alignment with Successive Refinement) is a method for mapping Single Molecule Sequencing (SMS) reads that are thousands to tens of thousands of bases long, with divergence between the read and genome dominated by insertion and deletion error.</span></p>
<p>Here is how I use blasr to align PacBio reads to the contigs (target.fasta). The &ldquo;target.fasta.sa&rdquo; file is the suffix array of &ldquo;target.fasta&rdquo;, generated by sawriter.</p>
<blockquote>
<p>blasr query.fa ./target.fasta -sa ./target.fasta.sa -bestn 40 -maxScore -500 -m 4 -nproc 24 -out target.m4 -maxLCPLength 15</p>
</blockquote>
<p>The output format option &ldquo;-m 4&rdquo; generates the alignment coordinates. It is not fully documented, but I can explain it to you.&nbsp;</p>
<p>I use a server with 24 cores and 48 GB of RAM for the alignment. It took about 2 to 3 hours to align 3G of PacBio reads to 10^6 short-read contigs with a mean length of 3.5 kbp.</p><p>Address of the bookmark: <a href="http://bix.ucsd.edu/projects/blasr/" rel="nofollow">http://bix.ucsd.edu/projects/blasr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36895/npscarf-real-time-scaffolder-using-spades-contigs-and-nanopore-sequencing-reads</guid>
	<pubDate>Mon, 11 Jun 2018 05:14:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36895/npscarf-real-time-scaffolder-using-spades-contigs-and-nanopore-sequencing-reads</link>
	<title><![CDATA[npScarf: real-time scaffolder using SPAdes contigs and Nanopore sequencing reads]]></title>
	<description><![CDATA[npScarf (jsa.np.npscarf) is a program that connects contigs from a draft genome to generate sequences that are closer to finished. These pipelines can run on a single laptop for microbial datasets. In real-time mode, it can be integrated with simple structural analyses such as gene ordering and plasmid forming.<p>Address of the bookmark: <a href="http://japsa.readthedocs.io/en/latest/tools/jsa.np.npscarf.html" rel="nofollow">http://japsa.readthedocs.io/en/latest/tools/jsa.np.npscarf.html</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37561/hercules-a-profile-hmm-based-hybrid-error-correction-algorithm-for-long-reads</guid>
	<pubDate>Mon, 20 Aug 2018 14:14:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37561/hercules-a-profile-hmm-based-hybrid-error-correction-algorithm-for-long-reads</link>
	<title><![CDATA[Hercules: a profile HMM-based hybrid error correction algorithm for long reads]]></title>
	<description><![CDATA[<p><span>Choosing whether to use second or third generation sequencing platforms can lead to trade-offs between accuracy and read length. Several studies require long and accurate reads, including de novo assembly, fusion and structural variation detection. In such cases researchers often combine both technologies, and the more erroneous long reads are corrected using the short reads. Current approaches rely on various graph-based alignment techniques and do not take the error profile of the underlying technology into account. Memory- and time-efficient machine learning algorithms that address these shortcomings have the potential to achieve better and more accurate integration of these two technologies. Results: We designed and developed Hercules, the first machine learning-based long read error correction algorithm. The algorithm models every long read as a profile Hidden Markov Model with respect to the underlying platform&rsquo;s error profile. The algorithm learns a posterior transition/emission probability distribution for each long read and uses this to correct errors in these reads. Using datasets from two DNA-seq BAC clones (CH17-157L1 and CH17-227A2), and human brain cerebellum polyA RNA-seq, we show that Hercules-corrected reads have the highest mapping rate among all competing algorithms and highest accuracy when most of the basepairs of a long read are covered with short reads. Availability: </span></p>
<p><span>Hercules source code is available at https://github.com/BilkentCompGen/Hercules</span></p><p>Address of the bookmark: <a href="https://github.com/BilkentCompGen/Hercules" rel="nofollow">https://github.com/BilkentCompGen/Hercules</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37959/rainbow-an-integrated-tool-for-efficient-clustering-and-assembling-rad-seq-reads</guid>
	<pubDate>Fri, 19 Oct 2018 08:23:42 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37959/rainbow-an-integrated-tool-for-efficient-clustering-and-assembling-rad-seq-reads</link>
	<title><![CDATA[Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads]]></title>
	<description><![CDATA[<p><span>Rainbow was developed to provide an ultra-fast and memory-efficient solution for clustering and assembling short reads produced by RAD-seq. First, Rainbow clusters reads using a spaced seed method. Then, Rainbow implements a heterozygote-calling-like strategy to divide potential groups into haplotypes in a top&ndash;down manner. Along a guided tree, it then iteratively merges sibling leaves in a bottom&ndash;up manner if they are similar enough. Here, similarity is defined by comparing the 2nd reads of a RAD segment. This approach tries to collapse heterozygotes while discriminating repetitive sequences. Finally, Rainbow uses a greedy algorithm to locally assemble merged reads into contigs. Rainbow outputs not only the optimal but also suboptimal assembly results. Based on simulation and a real guppy RAD-seq dataset, we show that Rainbow is more competent than the other tools in dealing with RAD-seq data.</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/bio-rainbow/files/" rel="nofollow">https://sourceforge.net/projects/bio-rainbow/files/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40460/sviper-swipe-your-structural-variants-called-on-long-ontpacbio-reads-with-short-exact-illumina-reads</guid>
	<pubDate>Sun, 22 Dec 2019 03:48:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40460/sviper-swipe-your-structural-variants-called-on-long-ontpacbio-reads-with-short-exact-illumina-reads</link>
	<title><![CDATA[SViper: Swipe your Structural Variants called on long (ONT/PacBio) reads with short exact (Illumina) reads.]]></title>
	<description><![CDATA[<p>Call sviper</p>
<pre><code>~$ ./sviper -s short-reads.bam -l long-reads.bam -r ref.fa -c variants.vcf -o polished_variants
</code></pre>
<p>This will output a&nbsp;<code>polished_variants.vcf</code>&nbsp;file that contains all the refined variants.</p>
<p>Sometimes it is helpful to look at the polished sequence, e.g. with the IGV browser. In that case you want SViper to output the polished and aligned sequences in a bam file via the option&nbsp;<code>--output-polished-bam</code>:</p>
<pre><code>~$ ./sviper -s short-reads.bam -l long-reads.bam -r ref.fa -c variants.vcf -o polished_variants --output-polished-bam</code></pre><p>Address of the bookmark: <a href="https://github.com/smehringer/SViper" rel="nofollow">https://github.com/smehringer/SViper</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42485/fastprongs-fast-preprocessing-of-next-generation-sequencing-reads</guid>
	<pubDate>Sat, 26 Dec 2020 08:35:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42485/fastprongs-fast-preprocessing-of-next-generation-sequencing-reads</link>
	<title><![CDATA[FastProNGS: fast preprocessing of next-generation sequencing reads]]></title>
	<description><![CDATA[<p><span>FastProNGS integrates the quality control process with automatic adapter removal. Parallel processing was implemented to speed up the process by allocating multiple threads. Compared with similar up-to-date preprocessing tools, FastProNGS is by far the fastest.&nbsp;</span></p><p>Address of the bookmark: <a href="https://github.com/Megagenomics/FastProNGS" rel="nofollow">https://github.com/Megagenomics/FastProNGS</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44171/hairsplitter-assembling-long-reads-in-an-unknown-number-of-haplotypes</guid>
	<pubDate>Wed, 07 Dec 2022 00:13:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44171/hairsplitter-assembling-long-reads-in-an-unknown-number-of-haplotypes</link>
	<title><![CDATA[HairSplitter: assembling long reads in an unknown number of haplotypes]]></title>
	<description><![CDATA[<p>Pros and cons of HairSplitter</p>
<p>Limitations of HairSplitter:</p>
<p>Not very fast: it re-polishes the whole assembly&nbsp;</p>
<p>Limited in the number of haplotypes</p>
<p>Strengths of HairSplitter:</p>
<p>Very modular, can be used with any assembler</p>
<p>Naive: makes no assumption on ploidy, parameter-free</p>
<p>Safe: won&rsquo;t artificially duplicate contigs</p>
<p>&nbsp;</p>
<p>HairSplitter splits collapsed assemblies from &ldquo;draft&rdquo; assemblies obtained by any means</p>
<p>HairSplitter can recover haplotypes and distinguish repeated elements</p>
<p>Only needs sequencing reads, potentially error-prone</p>
<p>Not really available yet (github.com/RolandFaure/HairSplitter)</p>
<p>https://hal.archives-ouvertes.fr/hal-03864075/file/RolandFaure_presentation_SeqBIM_2022.pdf</p><p>Address of the bookmark: <a href="https://hal.archives-ouvertes.fr/hal-03817928/document" rel="nofollow">https://hal.archives-ouvertes.fr/hal-03817928/document</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

</channel>
</rss>