<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/38224?offset=40</link>
	<atom:link href="https://bioinformaticsonline.com/related/38224?offset=40" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36897/gmcloser-closing-gaps-in-assemblies-accurately-with-a-likelihood-based-selection-of-contig-or-long-read-alignments</guid>
	<pubDate>Mon, 11 Jun 2018 05:43:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36897/gmcloser-closing-gaps-in-assemblies-accurately-with-a-likelihood-based-selection-of-contig-or-long-read-alignments</link>
	<title><![CDATA[GMcloser: closing gaps in assemblies accurately with a likelihood-based selection of contig or long-read alignments]]></title>
	<description><![CDATA[GMcloser uses likelihood-based classifiers calculated from the alignment statistics between scaffolds, contigs and paired-end reads to correctly assign contigs or long reads to gap regions of scaffolds, thereby achieving accurate and efficient gap closure. We demonstrate with sequencing data from various organisms that the gap-closing accuracy of GMcloser is 3–100-fold higher than those of other available tools, with similar efficiency.

https://academic.oup.com/bioinformatics/article/31/23/3733/209212<p>Address of the bookmark: <a href="https://academic.oup.com/bioinformatics/article/31/23/3733/209212" rel="nofollow">https://academic.oup.com/bioinformatics/article/31/23/3733/209212</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37416/gfinisher-a-new-strategy-to-refine-and-finish-bacterial-genome-assemblies</guid>
	<pubDate>Thu, 26 Jul 2018 09:31:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37416/gfinisher-a-new-strategy-to-refine-and-finish-bacterial-genome-assemblies</link>
	<title><![CDATA[GFinisher: a new strategy to refine and finish bacterial genome assemblies]]></title>
	<description><![CDATA[<p>GFinisher is an application for the refinement and finalization of prokaryotic genome assemblies. It uses GC skew bias to identify assembly errors and organizes the contigs/scaffolds using reference genomes.</p>
<pre>java -Xms2G -Xmx4G -jar GenomeFinisher.jar  \
    -i target_contigs.fasta  \
    -ds alternative_assemblies.fasta -ref reference.fasta  \
    -o outputDirectory</pre><p>Address of the bookmark: <a href="http://gfinisher.sourceforge.net" rel="nofollow">http://gfinisher.sourceforge.net</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41831/merqury-reference-free-quality-and-phasing-assessment-for-genome-assemblies</guid>
	<pubDate>Sat, 06 Jun 2020 05:38:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41831/merqury-reference-free-quality-and-phasing-assessment-for-genome-assemblies</link>
	<title><![CDATA[Merqury: reference-free quality and phasing assessment for genome assemblies]]></title>
	<description><![CDATA[<p><span>Genome assembly projects often have Illumina whole-genome sequencing reads available for the assembled individual. The k-mer spectrum of this read set can be used to evaluate assembly quality independently, without the need for a high-quality reference. Merqury provides a set of tools for this purpose.</span></p>
<p><span><a href="https://github.com/marbl/meryl">https://github.com/marbl/meryl</a></span></p><p>Address of the bookmark: <a href="https://github.com/marbl/merqury" rel="nofollow">https://github.com/marbl/merqury</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41937/merqury-evaluate-genome-assemblies-with-k-mers</guid>
	<pubDate>Fri, 03 Jul 2020 19:29:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41937/merqury-evaluate-genome-assemblies-with-k-mers</link>
	<title><![CDATA[merqury: Evaluate genome assemblies with k-mers]]></title>
	<description><![CDATA[<p><span>Genome assembly projects often have Illumina whole-genome sequencing reads available for the assembled individual. The k-mer spectrum of this read set can be used to evaluate assembly quality independently, without the need for a high-quality reference. Merqury provides a set of tools for this purpose.</span></p>
<p><span>More at&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2020.03.15.992941v1.full">https://www.biorxiv.org/content/10.1101/2020.03.15.992941v1.full</a></span></p><p>Address of the bookmark: <a href="https://github.com/marbl/merqury" rel="nofollow">https://github.com/marbl/merqury</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36257/aligngraph-algorithm-for-secondary-de-novo-genome-assembly-guided-by-closely-related-references</guid>
	<pubDate>Tue, 17 Apr 2018 16:21:20 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36257/aligngraph-algorithm-for-secondary-de-novo-genome-assembly-guided-by-closely-related-references</link>
	<title><![CDATA[AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references]]></title>
	<description><![CDATA[<p>AlignGraph is software that extends and joins contigs or scaffolds by reassembling them with the help of a reference genome from a closely related organism.</p>
<p>Using AlignGraph</p>
<pre><code>AlignGraph --read1 reads_1.fa --read2 reads_2.fa --contig contigs.fa --genome genome.fa --distanceLow distanceLow --distanceHigh distanceHigh --extendedContig extendedContigs.fa --remainingContig remainingContigs.fa [--kMer k --insertVariation insertVariation --coverage coverage --part p --fastMap --ratioCheck --iterativeMap --misassemblyRemoval --resume]</code></pre>
<p>Address of the bookmark: <a href="https://github.com/baoe/AlignGraph" rel="nofollow">https://github.com/baoe/AlignGraph</a></p>]]></description>
	<dc:creator>Manisha Mishra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</guid>
	<pubDate>Mon, 11 Jun 2018 03:08:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</link>
	<title><![CDATA[PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++.]]></title>
	<description><![CDATA[We are pleased to release PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++. Its name describes the strategy that it implements for genome assembly: PRICE uses paired-read information to iteratively increase the size of existing contigs. Initially, those contigs can be individual reads from a subset of the paired-read dataset, non-paired reads from sequencing technologies that provide non-paired data, or contigs that were output from a prior run of PRICE or any other assembler.

http://derisilab.ucsf.edu/software/price/<p>Address of the bookmark: <a href="http://derisilab.ucsf.edu/software/price/" rel="nofollow">http://derisilab.ucsf.edu/software/price/</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39624/cogent-a-tool-for-reconstructing-the-coding-genome-using-high-quality-full-length-transcriptome-sequences</guid>
	<pubDate>Tue, 18 Jun 2019 05:33:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39624/cogent-a-tool-for-reconstructing-the-coding-genome-using-high-quality-full-length-transcriptome-sequences</link>
	<title><![CDATA[Cogent: a tool for reconstructing the coding genome using high-quality full-length transcriptome sequences.]]></title>
	<description><![CDATA[<div id="yui_3_14_1_1_1560853173251_3865">Cogent is a tool that identifies gene&nbsp;families and reconstructs the coding genome using high-quality transcriptome data without a reference genome, and can be used to check&nbsp;assemblies&nbsp;for the presence of&nbsp;these known coding sequences.</div>
<div>&nbsp;</div>
<div>
<p>Cogent is a tool for reconstructing the coding genome using high-quality full-length transcriptome sequences. It is designed to be used on&nbsp;<a href="https://github.com/PacificBiosciences/cDNA_primer/wiki">Iso-Seq data</a>&nbsp;and in cases where there is no reference genome or the reference genome is highly incomplete.</p>
<p>See a&nbsp;<a href="https://www.dropbox.com/s/mn6hwhguh0pqceu/20160106_Cogent_developers_conference_slides_Cuttlefish.pdf?dl=0">recent presentation</a>&nbsp;on Cogent being applied to the Cuttlefish Iso-Seq data.</p>
<p><a href="https://www.dropbox.com/s/kz0gi7qg0w82k9a/20161026_Cogent_manuscript_forGitHub.pdf?dl=0">Cogent preliminary draft paper (updated 2016Dec version)</a>,&nbsp;<a href="https://www.dropbox.com/s/37412o8glvnfhf9/20161026_Cogent_ManuscriptPlusSupplement_forGitHub.pdf?dl=0">Supplementary</a></p>
<p>Please see&nbsp;<a href="https://github.com/Magdoll/Cogent/wiki">wiki</a>&nbsp;for details on usage.</p>
</div><p>Address of the bookmark: <a href="https://github.com/Magdoll/Cogent" rel="nofollow">https://github.com/Magdoll/Cogent</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40465/airlift-a-methodology-and-tool-for-comprehensively-moving-mappings-and-annotations-from-one-genome-to-another-similar-genome</guid>
	<pubDate>Mon, 23 Dec 2019 10:20:13 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40465/airlift-a-methodology-and-tool-for-comprehensively-moving-mappings-and-annotations-from-one-genome-to-another-similar-genome</link>
	<title><![CDATA[AirLift, a methodology and tool for comprehensively moving mappings and annotations from one genome to another similar genome]]></title>
	<description><![CDATA[<p>We propose AirLift, a methodology and tool for comprehensively moving mappings and annotations from one genome to another similar genome while maintaining the accuracy of a full mapper.</p><p>Address of the bookmark: <a href="https://github.com/CMU-SAFARI/AirLift" rel="nofollow">https://github.com/CMU-SAFARI/AirLift</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</guid>
	<pubDate>Sun, 07 Mar 2021 00:32:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</link>
	<title><![CDATA[Ancient whole genome duplication (WGD) detection tools]]></title>
	<description><![CDATA[<p>There are two approaches to detecting ancient WGD: collinearity analysis and analysis of the Ks distribution. Ks is the average number of synonymous substitutions per synonymous site; its counterpart, Ka, is the average number of non-synonymous substitutions per non-synonymous site.</p><p>Several WGD analysis pipelines have already been published. I searched for the keyword "wgd pipeline" and found the following:</p><p><strong>GenoDup: https://github.com/MaoYafei/GenoDup-Pipeline</strong><br /><strong>https://peerj.com/articles/6303/</strong><br /><strong>WGDdetector: https://github.com/yongzhiyang2012/WGDdetector</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2670-3</strong><br /><strong>wgd: https://github.com/arzwa/wgd</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2#Sec1</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>GeNoGAP: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>https://github.com/dfguan/purge_dups</strong><br /><strong>https://www.biorxiv.org/content/10.1101/2020.01.24.917997v1</strong></p><p>This article shows how to use wgd.</p><p>At present, wgd cannot be installed directly with bioconda, so installation is somewhat tedious because it depends on a lot of other software. 
wgd depends on the following:</p><p><strong>BLAST</strong><br /><strong>MCL</strong><br /><strong>MUSCLE/MAFFT/PRANK</strong><br /><strong>PAML</strong><br /><strong>PhyML/FastTree</strong><br /><strong>i-ADHoRe</strong></p><p>The good news is that most of these dependencies can be installed with bioconda:</p><blockquote><p>conda create -n wgd python=3.5 blast mcl muscle mafft prank paml fasttree cmake libpng mpi=1.0=mpich<br />conda activate wgd</p></blockquote><p>Here mpi=1.0=mpich is selected because i-ADHoRe depends on mpich. If openmpi is installed instead, the following error appears: error while loading shared libraries: libmpi_cxx.so.40: cannot open shared object file: No such file or directory</p><p>After that, installing wgd itself is much simpler:</p><blockquote><p>git clone https://github.com/arzwa/wgd.git<br />cd wgd<br />pip install .<br />pip install git+https://github.com/arzwa/wgd.git</p></blockquote><p>For i-ADHoRe, you need to register at http://bioinformatics.psb.ugent.be/webtools/i-adhore/licensing/ and agree to the license before you can download i-ADHoRe-3.0.</p><p>Since my miniconda3 is installed under ~/opt/, the installation prefix is ~/opt/miniconda3/envs/wgd/:</p><blockquote><p>tar -zxvf i-adhore-3.0.01.tar.gz<br />cd i-adhore-3.0.01<br />mkdir -p build &amp;&amp; cd build<br />cmake .. -DCMAKE_INSTALL_PREFIX=~/opt/miniconda3/envs/wgd/<br />make -j 4 <br />make install</p></blockquote><p>Take the sugarcane genome Saccharum spontaneum L. as an example. 
The genome is 8-ploid with 32 chromosomes (2n = 4x8 = 32).</p><p><strong>Download the CDS and GFF annotation files for the tutorial</strong></p><blockquote><p><strong>mkdir -p wgd_tutorial &amp;&amp; cd wgd_tutorial</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.cds.fasta.gz</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.gff3.gz</strong><br /><strong>gunzip *.gz</strong></p></blockquote><p>First, run <code>conda activate wgd</code> to start the analysis environment, then begin the analysis.</p><p>Step 1: Use <code>wgd mcl</code> to identify homologous genes in the genome</p><blockquote><p>wgd mcl -n 20 --cds --mcl -s Sspon.v20190103.cds.fasta -o Sspon_cds.out</p></blockquote><p>Step 2: Use <code>wgd ksd</code> to build the Ks distribution</p><blockquote><p>wgd ksd --n_threads 80 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl Sspon.v20190103.cds.fasta</p></blockquote><p>Step 3: If the genome assembly is of good quality, <code>wgd syn</code> can be used for collinearity analysis. It helps find the collinear blocks in the genome and their corresponding anchor points.</p><blockquote><p>wgd syn --feature gene --gene_attribute ID \<br /> -ks wgd_ksd/Sspon.v20190103.cds.fasta.ks.tsv \<br /> Sspon.v20190103.gff3 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl</p></blockquote><p>For further reading: wgd has 9 sub-modules.</p><ul>
<li><span>kde: KDE fitting to the Ks distribution</span></li>
<li><span>ksd: Ks distribution construction</span></li>
<li><span>mcl: all-vs-all BLASTP comparison + MCL clustering</span></li>
<li><span><span>mix: mixture modeling of the Ks distribution</span></span></li>
<li><span>pre: preprocess the CDS file</span></li>
<li><span>syn: calls I-ADHoRe 3.0 to run collinearity analysis using GFF files</span></li>
<li><span>viz: draw histogram and density plot</span></li>
<li><span>wf1: standard Ks analysis workflow for the whole paranome; calls mcl, ksd and syn</span></li>
<li><span>wf2: standard Ks analysis workflow for one-vs-one orthologs; calls mcl and ksd</span></li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43736/odgi-optimized-dynamic-genomegraph-implementation</guid>
	<pubDate>Tue, 01 Feb 2022 23:42:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43736/odgi-optimized-dynamic-genomegraph-implementation</link>
	<title><![CDATA[odgi: optimized dynamic genome/graph implementation]]></title>
	<description><![CDATA[<p dir="auto"><code>odgi</code>&nbsp;provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.</p>
<p dir="auto">Careful encoding of graph entities allows&nbsp;<code>odgi</code>&nbsp;to efficiently compute and transform&nbsp;<a href="https://pangenome.github.io/">pangenomes</a>&nbsp;with minimal overhead.&nbsp;<code>odgi</code>&nbsp;implements a dynamic data structure that leverages multi-core CPUs and can be updated on the fly.</p>
<p dir="auto">The edges and path steps are recorded as deltas between the current node id and the target node id, where the node id corresponds to the rank in the global array of nodes. Graphs built from biological data sets tend to have local partial order and, when sorted, the deltas will be small. This allows them to be compressed with a variable-length integer representation, resulting in a small in-memory footprint at the cost of packing and unpacking.</p>
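<p dir="auto">As a rough illustration of why small deltas compress well, the sketch below stores edge targets as deltas from the source node rank and packs them as variable-length integers. It is not odgi's actual code: the function names, the zigzag mapping for negative deltas, and the base-128 byte layout are all assumptions made for this example.</p>

```python
def encode_varint(value):
    """Encode a non-negative integer as a little-endian base-128 varint."""
    out = bytearray()
    while value > 127:
        out.append(value % 128 + 128)  # low 7 bits, continuation bit set
        value //= 128
    out.append(value)  # final byte, continuation bit clear
    return bytes(out)


def zigzag(delta):
    """Map signed deltas to non-negative integers: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return 2 * delta if delta >= 0 else -2 * delta - 1


def encode_edges(source_rank, target_ranks):
    """Store each target as a varint-encoded delta from the source rank."""
    return b"".join(
        encode_varint(zigzag(target - source_rank)) for target in target_ranks
    )


# Hypothetical node ranks: neighbours close in rank vs. scattered across the graph.
near = encode_edges(1000000, [1000001, 1000003, 999998])
far = encode_edges(1000000, [5, 2500000, 42])
print(len(near), len(far))
```

<p dir="auto">In this sketch each of the three nearby targets packs into a single byte, while the three distant targets need several bytes apiece. That is the effect sorting aims for: connected nodes end up with nearby ranks, so most deltas stay small.</p>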
<p dir="auto">The RAM and computational savings are substantial. In partially ordered regions of the graph, most deltas will require only a single byte.</p><p>Address of the bookmark: <a href="https://github.com/pangenome/odgi" rel="nofollow">https://github.com/pangenome/odgi</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>