<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/19555?offset=1260</link>
	<atom:link href="https://bioinformaticsonline.com/related/19555?offset=1260" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</guid>
	<pubDate>Sun, 07 Mar 2021 00:32:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</link>
	<title><![CDATA[Ancient whole genome duplication (WGD) detection tools !]]></title>
	<description><![CDATA[<p>There are two methods for ancient WGD detection, one is collinearity analysis, and the other is based on the Ks distribution map. Among them, Ks is defined as the average number of synonymous substitutions at each synonymous site, and there is also a Ka corresponding to it, which refers to the average number of non-synonymous substitutions at each non-synonymous site.</p><p>At present, some people have posted articles about the analysis process of WGD. I searched for the keyword "wgd pipeline" and found the following:</p><p><strong>GenoDup: https:// github.com/MaoYafei/GenoDup-Pipeline</strong><br /><strong>https://peerj.com/articles/6303/</strong><br /><strong>WGDdetector: https:// github.com/yongzhiyang2 012/WGDdetector</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2670-3</strong><br /><strong>wgd: https:// github.com/arzwa/wgd</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2#Sec1</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>GeNoGAP https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>https://github.com/dfguan/purge_dups</strong><br /><strong>https://www.biorxiv.org/content/10.1101/2020.01.24.917997v1</strong></p><p>This article introduces the usage of wgd.</p><p>Wgd cannot be installed directly with bioconda at present, so it is a little troublesome to install, because it depends on a lot of software. wgd depends on the following software</p><p><strong>BLAST</strong><br /><strong>MCL</strong><br /><strong>MUSCLE/MAFFT/PRANK</strong><br /><strong>PAML</strong><br /><strong>PhyML/FastTree</strong><br /><strong>i-ADHoRe</strong></p><p>But the good news is that most of the software it depends on can be installed with bioconda</p><blockquote><p>conda create -n wgd python=3.5 blast mcl muscle mafft prank paml fasttree cmake libpng mpi=1.0=mpich<br />conda activate wgd</p></blockquote><p>Here mpi=1.0=mpich is selected, because i-adhore depends on mpich. If openmpi is installed, an error will appear while loading shared libraries: libmpi_cxx.so.40: cannot open shared object file: No such file or directory</p><p>After that, the installation is much simpler</p><blockquote><p>git clone https://github.com/arzwa/wgd.git<br />cd wgd<br />pip install .<br />pip install git+https://github.com/arzwa/wgd.git<br />For i-ADHoRe, you need to register at http:// bioinformatics.psb.ugent.be /webtools/i-adhore/licensing/Agree to the license to download i-ADHoRe-3.0</p></blockquote><p>Since my miniconda3 installed ~/opt/, the installation path is so~/opt/miniconda3/envs/wgd/</p><blockquote><p>tar -zxvf i-adhore-3.0.01.tar.gz<br />cd i-adhore-3.0.01<br />mkdir -p build &amp;&amp; cd build<br />cmake .. -DCMAKE_INSTALL_PREFIX=~/opt/miniconda3/envs/wgd/<br />make -j 4 <br />make insatall</p></blockquote><p>Take the sugarcane genome Saccharum spontaneum L as an example. The genome is 8-ploid with 32 chromosomes (2n = 4x8 = 32)</p><p><strong>Download the tutorial for CDS and GFF annotation files</strong></p><blockquote><p><strong>mkdir -p wgd_tutorial &amp;&amp; cd wgd_tutorial</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.cds.fasta.gz</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.gff3.gz</strong><br /><strong>gunzip *.gz</strong></p></blockquote><p>First conda activate wgdstart our analysis environment, and then start the analysis</p><p>Step 1 : Use to wgd mclidentify homologous genes in the genome</p><blockquote><p>wgd mcl -n 20 --cds --mcl -s Sspon.v20190103.cds.fasta -o Sspon_cds.out</p></blockquote><p>Step 2 : Use to wgd ksdbuild Ks distribution</p><blockquote><p>wgd ksd --n_threads 80 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl Sspon.v20190103.cds.fasta</p></blockquote><p>Step 3 : If the quality of the genome is good, then wgd syncollinearity analysis can be used . It can help us find the collinearity block in the genome and the corresponding anchor point</p><blockquote><p>wgd syn --feature gene --gene_attribute ID \<br /> -ks wgd_ksd/Sspon.v20190103.cds.fasta.ks.tsv \<br /> Sspon.v20190103.gff3 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl</p></blockquote><p>&nbsp;For more reading - There are 9 sub-modules in WGD</p><ul>
<li><span>kde: KDE fitting to the Ks distribution</span></li>
<li><span>ksd: Ks distribution construction</span></li>
<li><span>mcl: BLASP comparison of All-vs-ALl + MCL classification analysis.</span></li>
<li><span><span>mix: Hybrid modeling of Ks distribution.</span></span></li>
<li><span>pre: preprocess the CDS file</span></li>
<li><span>syn: Call I-ADHoRe 3.0 to use GFF files for collinearity analysis</span></li>
<li><span>viz: draw histogram and density plot</span></li>
<li><span>wf1: Ks standard analysis procedure of the whole genome paranome (paranome), call mcl, ksd and syn</span></li>
<li><span>wf2: Ks standard analysis procedure of one-vs-one homologous gene (ortholog), call wcl and kSD</span></li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43273/understanding-kmer</guid>
	<pubDate>Wed, 18 Aug 2021 04:27:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43273/understanding-kmer</link>
	<title><![CDATA[Understanding kmer !]]></title>
	<description><![CDATA[<p><a href="https://en.wikipedia.org/wiki/k-mer">What is a&nbsp;<em>k-mer</em>&nbsp;anyway?</a><span>&nbsp;A&nbsp;</span><em>k-mer</em><span>&nbsp;is just a sequence of&nbsp;</span><em>k</em><span>&nbsp;characters in a string (or nucleotides in a DNA sequence). Now, it is important to remember that to get&nbsp;</span><em>all k-mers</em><span>&nbsp;from a sequence you need to get the first&nbsp;</span><em>k</em><span>&nbsp;characters, then move just a single character for the start of the next&nbsp;</span><em>k-mer</em><span>&nbsp;and so on. Effectively, this will create sequences that overlap in&nbsp;</span><code>k-1</code><span>&nbsp;positions.</span></p><p>Address of the bookmark: <a href="https://bioinfologics.github.io/post/2018/09/17/k-mer-counting-part-i-introduction/" rel="nofollow">https://bioinfologics.github.io/post/2018/09/17/k-mer-counting-part-i-introduction/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/19580/internship-program-for-bioinformatics-biotechnology-mba-mca-no-of-vacancy-5</guid>
  <pubDate>Mon, 15 Dec 2014 08:11:02 -0600</pubDate>
  <link></link>
  <title><![CDATA[Internship Program for Bioinformatics / Biotechnology / MBA / MCA (No. Of Vacancy: 5)]]></title>
  <description><![CDATA[
<p>ArrayGen is offering an Internship Program for Post graduate Bioinformatics / Biotechnology / MBA / MCA students and professionals. ArrayGen Technologies provide an excellent opportunity to gain research experience and explore if a scientific career is right for you. Currently we offer positions to outstanding students interested in Next Generation Sequencing (NGS) data analysis or marketing or software development. Applications are accepted throughout the year. Accepted students will be notified through email.</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43661/maftools</guid>
	<pubDate>Fri, 17 Dec 2021 03:18:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43661/maftools</link>
	<title><![CDATA[maftools]]></title>
	<description><![CDATA[<p>With advances in Cancer Genomics, <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a> (MAF) is being widely accepted and used to store somatic variants detected. <a href="http://cancergenome.nih.gov">The Cancer Genome Atlas</a> Project has sequenced over 30 different cancers with sample size of each cancer type being over 200. <a href="https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files">Resulting data</a> consisting of somatic variants are stored in the form of <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a>. This package attempts to summarize, analyze, annotate and visualize MAF files in an efficient manner from either TCGA sources or any in-house studies as long as the data is in MAF format.</p>
<p>https://www.bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/maftools.html</p><p>Address of the bookmark: <a href="https://github.com/PoisonAlien/maftools" rel="nofollow">https://github.com/PoisonAlien/maftools</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</guid>
	<pubDate>Mon, 31 Jan 2022 07:18:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</link>
	<title><![CDATA[Short-read assembly using Spades !]]></title>
	<description><![CDATA[<h2 id="short-read-assembly-a-comparison">If we only had Illumina reads, we could also assemble these using the tool Spades.</h2><p>You can try this here, or try it later on your own data.</p><h2 id="get-data">Get data</h2><p>We will use the same Illumina data as we used above:</p><ul>
<li>illumina_R1.fastq.gz: the Illumina forward reads</li>
<li>illumina_R2.fastq.gz: the Illumina reverse reads</li>
</ul><h2 id="assemble">Assemble</h2><p>Run Spades:</p><div><pre>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
</pre></div><ul>
<li><code>-1</code>&nbsp;is input file of forward reads</li>
<li><code>-2</code>&nbsp;is input file of reverse reads</li>
<li><code>--careful</code>&nbsp;minimizes mismatches and short indels</li>
<li><code>--cov-cutoff auto</code>&nbsp;computes the coverage threshold (rather than the default setting, &ldquo;off&rdquo;)</li>
<li><code>-o</code>&nbsp;is the output directory</li>
</ul><h2 id="results">Results</h2><p>Move into the output directory and look at the contigs:</p><div><pre>infoseq contigs.fasta</pre></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43801/smudgeplot-inference-of-ploidy-and-heterozygosity-structure-using-whole-genome-sequencing-data</guid>
	<pubDate>Fri, 25 Feb 2022 04:42:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43801/smudgeplot-inference-of-ploidy-and-heterozygosity-structure-using-whole-genome-sequencing-data</link>
	<title><![CDATA[Smudgeplot: Inference of ploidy and heterozygosity structure using whole genome sequencing data]]></title>
	<description><![CDATA[<p dir="auto">This tool extracts heterozygous kmer pairs from kmer count databases and performs gymnastics with them. We are able to disentangle genome structure by comparing the sum of kmer pair coverages (CovA + CovB) to their relative coverage (CovB / (CovA + CovB)). Such an approach also allows us to analyze obscure genomes with duplications, various ploidy levels, etc.</p>
<p dir="auto">Smudgeplots are computed from raw or even better from trimmed reads and show the haplotype structure using heterozygous kmer pairs. For example:</p>
<p dir="auto"><a href="https://user-images.githubusercontent.com/8181573/45959760-f1032d00-c01a-11e8-8576-ff0512c33da9.png" target="_blank"><img src="https://user-images.githubusercontent.com/8181573/45959760-f1032d00-c01a-11e8-8576-ff0512c33da9.png" alt="smudgeexample" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://github.com/KamilSJaron/smudgeplot" rel="nofollow">https://github.com/KamilSJaron/smudgeplot</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44307/genomenotebook</guid>
	<pubDate>Thu, 20 Apr 2023 13:19:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44307/genomenotebook</link>
	<title><![CDATA[genomenotebook]]></title>
	<description><![CDATA[<p><a href="https://dbikard.github.io/genomenotebook/">https://dbikard.github.io/genomenotebook/</a></p>
<h2>Install<a href="https://dbikard.github.io/genomenotebook/#install"></a></h2>
<pre><code>pip install genomenotebook</code></pre>
<h2>How to use<a href="https://dbikard.github.io/genomenotebook/#how-to-use"></a></h2>
<p>Create a simple genome browser with a search bar. The sequence appears when zooming in.</p>
<div>
<div id="cb2">
<pre><code><span><a href="https://dbikard.github.io/genomenotebook/#cb2-1"></a><span>import</span> genomenotebook <span>as</span> gn</span>
<span><a href="https://dbikard.github.io/genomenotebook/#cb2-2"></a></span>
<span><a href="https://dbikard.github.io/genomenotebook/#cb2-3"></a>g<span>=</span>gn.GenomeBrowser(genome_path, gff_path, init_pos<span>=</span><span>10000</span>)</span>
<span><a href="https://dbikard.github.io/genomenotebook/#cb2-4"></a>g.show()</span></code><button title="Copy to Clipboard"></button></pre>
</div>
</div>
<p>Tracks can be added to visualize your favorite genomics data. See&nbsp;<code>Examples</code>&nbsp;for more !!!!</p><p>Address of the bookmark: <a href="https://dbikard.github.io/genomenotebook/" rel="nofollow">https://dbikard.github.io/genomenotebook/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44377/mitochondrial-genome-assembly-tools</guid>
	<pubDate>Wed, 06 Sep 2023 00:37:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44377/mitochondrial-genome-assembly-tools</link>
	<title><![CDATA[Mitochondrial genome assembly tools !]]></title>
	<description><![CDATA[<p>Mitochondrial genome assembly tools are specialized software and algorithms designed to accurately reconstruct the mitochondrial genome (mitogenome) from sequencing data, typically obtained through techniques like next-generation sequencing (NGS). The mitochondrial genome is relatively small compared to the nuclear genome, making it an ideal target for assembly. Here are some commonly used mitochondrial genome assembly tools:</p><p><strong>MitoFinder:</strong> Mitofinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data.</p><p><strong>MitoHiFi:</strong> a python pipeline for mitochondrial genome assembly from PacBio high fidelity reads</p><p>MITObim: MITObim is a tool specifically developed for the iterative assembly of mitochondrial genomes. It starts with a reference mitogenome and iteratively refines the assembly using the read data.</p><p><strong>MITOS:</strong> MITOS is a web-based platform that provides a pipeline for annotating mitochondrial genomes. It integrates multiple software tools for assembly, annotation, and visualization of mitogenomes.</p><p><strong>MIRA:</strong> MIRA (Mimicking Intelligent Read Assembly) is a versatile genome assembly tool that can be used for mitochondrial genome assembly. It supports various sequencing technologies and allows for reference-based or de novo assembly.</p><p><strong>NOVOPlasty:</strong> NOVOPlasty is a user-friendly tool designed for de novo assembly of organelle genomes, including mitochondria. It utilizes a seed-and-extend algorithm and is suitable for both short-read and long-read data.</p><p><strong>MITOS2:</strong> MITOS2 is an updated version of the MITOS pipeline, which automates the annotation of mitochondrial genomes. It provides improved accuracy and additional features for mitochondrial genome analysis.</p><p><strong>GetOrganelle:</strong> While primarily designed for chloroplast genome assembly, GetOrganelle can also be used for mitochondrial genome assembly. It is particularly useful for dealing with high-throughput sequencing data.</p><p><strong>SPAdes:</strong> SPAdes (St. Petersburg genome assembler) is a versatile genome assembly tool that can be employed for mitochondrial genome assembly, especially when dealing with complex datasets that may contain nuclear mitochondrial DNA sequences (numts).</p><p><strong>IDBA-UD:</strong> IDBA-UD (Iterative De Bruijn Graph De Novo Assembler) is another de novo assembly tool that can be used for mitochondrial genome assembly, especially in cases with relatively low coverage.</p><p><strong>Velvet:</strong> Velvet is a de novo assembly tool that can be applied to mitochondrial genome assembly, especially when working with short-read data.</p><p>When selecting a mitochondrial genome assembly tool, it's important to consider the specific characteristics of your sequencing data, such as read length and coverage, as well as the complexity of the mitochondrial genome. Additionally, some tools are better suited for specific organisms or research objectives, so choosing the right tool will depend on your particular project requirements.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/88/regular-expression-cheat-sheet</guid>
	<pubDate>Tue, 09 Jul 2013 17:38:42 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/88/regular-expression-cheat-sheet</link>
	<title><![CDATA[Regular Expression Cheat Sheet]]></title>
	<description><![CDATA[<p><span>The Regular Expression are the sole of Perl language, and for bioinformatician it is just a magical stick to resolve gingatic string data. We did not find any good and user friendly regular expression cheat sheet, hence write our own cheat sheet.&nbsp;</span><span>The Regular Expressions Cheat Sheet, a quick reference guide for regular expressions, including symbols, ranges, grouping, assertions and some sample patterns to get you started.</span></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/88" length="14944" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/3917/the-story-of-you-encode-and-the-human-genome</guid>
	<pubDate>Sat, 24 Aug 2013 18:49:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/3917/the-story-of-you-encode-and-the-human-genome</link>
	<title><![CDATA[The Story of You: ENCODE and the human genome]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/TwXXgEz9o4w" frameborder="0" allowfullscreen></iframe><p>Ever since a monk called Mendel started breeding pea plants we've been learning about our genomes. In 1953, Watson, Crick and Franklin described the structure of the molecule that makes up our genomes: the DNA double helix. Then, in 2001, scientists wrote down the entire 3-billion letter code contained in the average human genome. Now they're trying to interpret that code; to work out how it's used to make different types of cells and different people. The ENCODE project, as it's called, is the latest chapter in the story of you. To read the ENCODE research papers and more, visit http://www.nature.com/ENCODE</p>]]></description>
	
</item>

</channel>
</rss>