<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/35384?offset=430</link>
	<atom:link href="https://bioinformaticsonline.com/related/35384?offset=430" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</guid>
	<pubDate>Sun, 07 Mar 2021 00:32:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</link>
	<title><![CDATA[Ancient whole genome duplication (WGD) detection tools !]]></title>
	<description><![CDATA[<p>There are two main approaches to detecting ancient WGD: collinearity analysis and analysis of the Ks distribution. Ks is the number of synonymous substitutions per synonymous site; its counterpart, Ka, is the number of non-synonymous substitutions per non-synonymous site.</p><p>Several WGD analysis pipelines have already been published. Searching for the keyword "wgd pipeline" turned up the following:</p><p><strong>GenoDup: https://github.com/MaoYafei/GenoDup-Pipeline</strong><br /><strong>https://peerj.com/articles/6303/</strong><br /><strong>WGDdetector: https://github.com/yongzhiyang2012/WGDdetector</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2670-3</strong><br /><strong>wgd: https://github.com/arzwa/wgd</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2#Sec1</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>GeNoGAP: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>https://github.com/dfguan/purge_dups</strong><br /><strong>https://www.biorxiv.org/content/10.1101/2020.01.24.917997v1</strong></p><p>This post walks through the use of wgd.</p><p>At the time of writing, wgd cannot be installed directly from Bioconda, so installation takes a little work. wgd depends on the following software:</p><p><strong>BLAST</strong><br /><strong>MCL</strong><br /><strong>MUSCLE/MAFFT/PRANK</strong><br /><strong>PAML</strong><br /><strong>PhyML/FastTree</strong><br /><strong>i-ADHoRe</strong></p><p>The good news is that most of these dependencies can be installed with Bioconda:</p><blockquote><p>conda create -n wgd python=3.5 blast mcl muscle mafft prank paml fasttree cmake libpng mpi=1.0=mpich<br />conda activate wgd</p></blockquote><p>mpi=1.0=mpich is pinned here because i-ADHoRe depends on MPICH. If OpenMPI is installed instead, i-ADHoRe fails with the following error: error while loading shared libraries: libmpi_cxx.so.40: cannot open shared object file: No such file or directory</p><p>After that, installing wgd itself is much simpler:</p><blockquote><p>git clone https://github.com/arzwa/wgd.git<br />cd wgd<br />pip install .</p></blockquote><p>Alternatively, install it straight from GitHub without cloning:</p><blockquote><p>pip install git+https://github.com/arzwa/wgd.git</p></blockquote><p>For i-ADHoRe, you need to register at http://bioinformatics.psb.ugent.be/webtools/i-adhore/licensing/ and agree to the license before you can download i-ADHoRe 3.0.</p><p>Since my miniconda3 is installed under ~/opt/, the install prefix is ~/opt/miniconda3/envs/wgd/:</p><blockquote><p>tar -zxvf i-adhore-3.0.01.tar.gz<br />cd i-adhore-3.0.01<br />mkdir -p build &amp;&amp; cd build<br />cmake .. -DCMAKE_INSTALL_PREFIX=~/opt/miniconda3/envs/wgd/<br />make -j 4<br />make install</p></blockquote><p>Take the sugarcane genome, <em>Saccharum spontaneum</em> L., as an example. The genome is octoploid with 32 chromosomes (2n = 4 × 8 = 32).</p><p><strong>Download the CDS and GFF annotation files for the tutorial</strong></p><blockquote><p>mkdir -p wgd_tutorial &amp;&amp; cd wgd_tutorial<br />wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.cds.fasta.gz<br />wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.gff3.gz<br />gunzip *.gz</p></blockquote><p>First activate the analysis environment with <code>conda activate wgd</code>, then start the analysis.</p><p>Step 1: Use <code>wgd mcl</code> to identify homologous genes within the genome.</p><blockquote><p>wgd mcl -n 20 --cds --mcl -s Sspon.v20190103.cds.fasta -o Sspon_cds.out</p></blockquote><p>Step 2: Use <code>wgd ksd</code> to build the Ks distribution.</p><blockquote><p>wgd ksd --n_threads 80 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl Sspon.v20190103.cds.fasta</p></blockquote><p>Step 3: If the genome assembly is of good quality, <code>wgd syn</code> can be used for collinearity analysis. It finds the collinear blocks in the genome and their corresponding anchor points.</p><blockquote><p>wgd syn --feature gene --gene_attribute ID \<br /> -ks wgd_ksd/Sspon.v20190103.cds.fasta.ks.tsv \<br /> Sspon.v20190103.gff3 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl</p></blockquote><p>For reference, wgd has 9 sub-modules:</p><ul>
<li><span>kde: KDE fitting to the Ks distribution</span></li>
<li><span>ksd: Ks distribution construction</span></li>
<li><span>mcl: all-vs-all BLASTP search followed by MCL clustering</span></li>
<li><span>mix: mixture modeling of the Ks distribution</span></li>
<li><span>pre: preprocess the CDS file</span></li>
<li><span>syn: collinearity analysis from GFF files, run through i-ADHoRe 3.0</span></li>
<li><span>viz: draw histogram and density plot</span></li>
<li><span>wf1: standard Ks workflow for the whole-genome paranome; calls mcl, ksd and syn</span></li>
<li><span>wf2: standard Ks workflow for one-vs-one orthologs; calls mcl and ksd</span></li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43112/calling-variants-in-non-diploid-systems</guid>
	<pubDate>Sat, 26 Jun 2021 15:37:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43112/calling-variants-in-non-diploid-systems</link>
	<title><![CDATA[Calling variants in non-diploid systems]]></title>
	<description><![CDATA[<p><span>The main challenge in non-diploid variant calling is distinguishing sequencing noise (abundant in all NGS platforms) from true low-frequency variants. Some of the early attempts to do this well were made on human mitochondrial&nbsp;</span><span>DNA</span><span>, although the same approaches will work equally well on viral and bacterial genomes (</span><a href="https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/non-dip/tutorial.html#Rebolledo-Jaramillo2014">Rebolledo-Jaramillo&nbsp;<em>et al.</em>&nbsp;2014</a><span>,&nbsp;</span><a href="https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/non-dip/tutorial.html#Li2015">Li&nbsp;<em>et al.</em>&nbsp;2015</a><span>).</span></p><p>Address of the bookmark: <a href="https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/non-dip/tutorial.html" rel="nofollow">https://training.galaxyproject.org/training-material/topics/variant-analysis/tutorials/non-dip/tutorial.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43614/mitoz-a-toolkit-for-animal-mitochondrial-genome-assembly-annotation-and-visualization</guid>
	<pubDate>Tue, 30 Nov 2021 23:23:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43614/mitoz-a-toolkit-for-animal-mitochondrial-genome-assembly-annotation-and-visualization</link>
	<title><![CDATA[MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization]]></title>
	<description><![CDATA[<p>MitoZ consists of independent modules for <em>de novo</em> assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, and can generate a mitogenome assembly together with annotation and visualization results from HTS raw reads.</p>
<p>https://academic.oup.com/nar/article/47/11/e63/5377471</p><p>Address of the bookmark: <a href="https://github.com/linzhi2013/MitoZ" rel="nofollow">https://github.com/linzhi2013/MitoZ</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43661/maftools</guid>
	<pubDate>Fri, 17 Dec 2021 03:18:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43661/maftools</link>
	<title><![CDATA[maftools]]></title>
	<description><![CDATA[<p>With advances in cancer genomics, <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a> (MAF) is becoming widely accepted as a way to store detected somatic variants. <a href="http://cancergenome.nih.gov">The Cancer Genome Atlas</a> project has sequenced over 30 different cancers, with a sample size of over 200 for each cancer type. The <a href="https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files">resulting data</a>, consisting of somatic variants, are stored in <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a>. This package attempts to summarize, analyze, annotate and visualize MAF files efficiently, whether they come from TCGA sources or from in-house studies, as long as the data are in MAF format.</p>
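<p>To give a feel for the kind of summary described above, here is a minimal Python sketch. It is not maftools itself (maftools is an R package) and does not use its API; it only assumes a tab-separated MAF file with the standard <code>Hugo_Symbol</code>, <code>Variant_Classification</code> and <code>Tumor_Sample_Barcode</code> columns. The function name <code>summarize_maf</code> and the tiny example records are made up for illustration.</p>

```python
import csv
import io
from collections import Counter


def summarize_maf(handle):
    """Count variants per classification, per gene and per sample in a MAF stream."""
    # MAF files may begin with comment lines such as "#version 2.4"; skip them.
    rows = csv.DictReader(
        (line for line in handle if not line.startswith("#")),
        delimiter="\t",
    )
    by_class, by_gene, by_sample = Counter(), Counter(), Counter()
    for row in rows:
        by_class[row["Variant_Classification"]] += 1
        by_gene[row["Hugo_Symbol"]] += 1
        by_sample[row["Tumor_Sample_Barcode"]] += 1
    return by_class, by_gene, by_sample


# Tiny made-up MAF excerpt (hypothetical samples, for illustration only).
EXAMPLE_MAF = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\n"
    "TP53\tMissense_Mutation\tSAMPLE-01\n"
    "TP53\tNonsense_Mutation\tSAMPLE-02\n"
    "KRAS\tMissense_Mutation\tSAMPLE-01\n"
)

by_class, by_gene, by_sample = summarize_maf(io.StringIO(EXAMPLE_MAF))
print(by_class.most_common())
print(by_gene.most_common())
print(by_sample.most_common())
```

<p>For a real file, pass an open file handle instead of the in-memory example. maftools goes well beyond these counts, so treat this only as a quick sanity check on a MAF file.</p>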
<p>https://www.bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/maftools.html</p><p>Address of the bookmark: <a href="https://github.com/PoisonAlien/maftools" rel="nofollow">https://github.com/PoisonAlien/maftools</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</guid>
	<pubDate>Mon, 31 Jan 2022 07:18:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</link>
	<title><![CDATA[Short-read assembly using Spades !]]></title>
	<description><![CDATA[<h2 id="short-read-assembly-a-comparison">If we only had Illumina reads, we could also assemble these using the tool Spades.</h2><p>You can try this here, or try it later on your own data.</p><h2 id="get-data">Get data</h2><p>We will use the same Illumina data as we used above:</p><ul>
<li>illumina_R1.fastq.gz: the Illumina forward reads</li>
<li>illumina_R2.fastq.gz: the Illumina reverse reads</li>
</ul><h2 id="assemble">Assemble</h2><p>Run Spades:</p><div><pre>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
</pre></div><ul>
<li><code>-1</code>&nbsp;is input file of forward reads</li>
<li><code>-2</code>&nbsp;is input file of reverse reads</li>
<li><code>--careful</code>&nbsp;minimizes mismatches and short indels</li>
<li><code>--cov-cutoff auto</code>&nbsp;computes the coverage threshold (rather than the default setting, &ldquo;off&rdquo;)</li>
<li><code>-o</code>&nbsp;is the output directory</li>
</ul><h2 id="results">Results</h2><p>Move into the output directory and look at the contigs:</p><div><pre>cd spades_assembly_all_illumina
infoseq contigs.fasta</pre></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43923/monkeypox-virus-isolate-mpxv-usa-2022-ma001-complete-genome</guid>
	<pubDate>Tue, 26 Jul 2022 06:21:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43923/monkeypox-virus-isolate-mpxv-usa-2022-ma001-complete-genome</link>
	<title><![CDATA[Monkeypox virus isolate MPXV_USA_2022_MA001, complete genome]]></title>
	<description><![CDATA[<pre>LOCUS       ON563414              197205 bp    DNA     linear   VRL 30-MAY-2022
DEFINITION  Monkeypox virus isolate MPXV_USA_2022_MA001, complete genome.
ACCESSION   ON563414
VERSION     ON563414.3
KEYWORDS    .
SOURCE      Monkeypox virus (monkeypox)
  ORGANISM  <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10244">Monkeypox virus</a>
            Viruses; Varidnaviria; Bamfordvirae; Nucleocytoviricota;
            Pokkesviricetes; Chitovirales; Poxviridae; Chordopoxvirinae;
            Orthopoxvirus.</pre><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/nuccore/ON563414" rel="nofollow">https://www.ncbi.nlm.nih.gov/nuccore/ON563414</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44373/mitohifi-a-python-pipeline-for-mitochondrial-genome-assembly-from-pacbio-high-fidelity-reads</guid>
	<pubDate>Tue, 05 Sep 2023 07:31:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44373/mitohifi-a-python-pipeline-for-mitochondrial-genome-assembly-from-pacbio-high-fidelity-reads</link>
	<title><![CDATA[MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads]]></title>
	<description><![CDATA[<p dir="auto">MitoHiFi v3.2 is a Python pipeline distributed under the&nbsp;<a href="https://github.com/marcelauliano/MitoHiFi/blob/master/LICENSE">MIT License</a>.</p>
<p dir="auto">MitoHiFi was first developed to assemble the mitogenomes for a wide range of species in the Darwin Tree of Life Project (DToL).</p>
<p dir="auto">https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05385-y&nbsp;</p>
<p>Address of the bookmark: <a href="https://github.com/marcelauliano/MitoHiFi" rel="nofollow">https://github.com/marcelauliano/MitoHiFi</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44549/quartet-a-telomere-to-telomere-toolkit-for-gap-free-genome-assembly-and-centromeric-repeat-identification</guid>
	<pubDate>Sat, 08 Jun 2024 15:54:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44549/quartet-a-telomere-to-telomere-toolkit-for-gap-free-genome-assembly-and-centromeric-repeat-identification</link>
	<title><![CDATA[quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification.]]></title>
	<description><![CDATA[<p><span>quarTeT is a collection of tools for T2T genome assembly and basic analysis in an automated workflow.</span><br><br><span>Tasks include:</span></p>
<ul>
<li><a href="http://www.atcgn.com:8080/quarTeT/docuWeb.html#AssemblyMapper">AssemblyMapper</a>&nbsp;: reference-guided genome assembly</li>
<li><a href="http://www.atcgn.com:8080/quarTeT/docuWeb.html#GapFiller">GapFiller</a>&nbsp;: long-reads based gap filling</li>
<li><a href="http://www.atcgn.com:8080/quarTeT/docuWeb.html#TeloExplorer">TeloExplorer</a>&nbsp;: telomere identification</li>
<li><a href="http://www.atcgn.com:8080/quarTeT/docuWeb.html#CentroMiner">CentroMiner</a>&nbsp;: centromere candidate prediction</li>
</ul>
<p>https://academic.oup.com/hr/article/10/8/uhad127/7197191?login=false&nbsp;</p><p>Address of the bookmark: <a href="http://www.atcgn.com:8080/quarTeT/home.html" rel="nofollow">http://www.atcgn.com:8080/quarTeT/home.html</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/view/982</guid>
	<pubDate>Wed, 17 Jul 2013 15:25:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/view/982</link>
	<title><![CDATA[Is reference genome necessary for gene expression study in transcriptome sequencing or for variant discovery in genome sequencing?]]></title>
	<description><![CDATA[<p><span>Take plant genomes, for example, where the genome is often too complex and too large for a complete<em> de novo</em> assembly with current sequencing technology. What would be an alternative solution? Can we live in a reference-free world?</span></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/4183/320000-viruses-in-mammals-yet-to-sequenced-in-future</guid>
	<pubDate>Tue, 03 Sep 2013 08:35:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/4183/320000-viruses-in-mammals-yet-to-sequenced-in-future</link>
	<title><![CDATA[320000 viruses in mammals yet to be sequenced in future!!!]]></title>
	<description><![CDATA[<p>With recent improvements in biological techniques, it is finally possible to look at millions of unknown viruses at the genomic level and understand their mechanisms. According to available data, close to 70 per cent of emerging viral diseases, such as HIV/AIDS, West Nile, Ebola, SARS and influenza, are zoonoses: infections of animals that cross into humans.</p><p>To address the challenges of describing and estimating virodiversity, a team of investigators from the Center for Infection and Immunity (CII) and EcoHealth Alliance began in the jungles of Bangladesh, home to the flying fox.</p><p>Reference:</p><p><a href="http://economictimes.indiatimes.com/news/news-by-industry/et-cetera/mammals-harbour-at-least-320000-new-viruses/articleshow/22253268.cms">http://economictimes.indiatimes.com/news/news-by-industry/et-cetera/mammals-harbour-at-least-320000-new-viruses/articleshow/22253268.cms</a></p><p><a href="http://www.bbc.co.uk/news/science-environment-23932400">http://www.bbc.co.uk/news/science-environment-23932400</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>