<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44551?offset=100</link>
	<atom:link href="https://bioinformaticsonline.com/related/44551?offset=100" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44618/important-bioinformatics-tools</guid>
	<pubDate>Tue, 30 Jul 2024 05:03:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44618/important-bioinformatics-tools</link>
	<title><![CDATA[Important Bioinformatics Tools !]]></title>
	<description><![CDATA[<p><span>1. Ktrim: An extra-fast, accurate adapter trimmer for sequencing data. It processes FASTQ files from multiple lanes with minimal mismatching and over-trimming of adapters.</span><span><br /></span><span><br /></span><span>2. BWA-MEM: A reliable alignment tool, particularly for mapping ALT contigs and HLA genes, which are not fully addressed in BWA-MEM2.</span><span><br /></span><span><br /></span><span>3. Sambamba markdup: Quickly marks or removes duplicate reads using Picard's criteria.</span><span><br /></span><span><br /></span><span>4. ichorCNA: Estimates the tumor DNA fraction in cell-free DNA from ultra-low-pass whole genome sequencing (0.1x coverage) based on copy number alterations (CNA).</span><span><br /></span><span><br /></span><span>5. Fragle: A deep learning method for quantifying ctDNA levels from cell-free DNA fragmentomic profiles. It detects tumor fractions (TF) as low as ~1% and works with targeted genomic panel sequencing data.</span><span><br /></span><span><br /></span><span>6. AlfredQC: A quality control tool for high-throughput sequencing data. It assesses metrics like read quality scores, GC content, and duplication rates, visualized through detailed plots and summary statistics.</span><span><br /></span><span><br /></span><span>7. Mosdepth: A fast tool for calculating sequencing coverage depth from BAM and CRAM files, offering a quicker alternative to samtools depth and sambamba depth.</span><span><br /></span><span><br /></span><span>8. Bedtools: A versatile toolkit for genomics, enabling operations like intersect, merge, count, and shuffle on genomic intervals across formats such as BAM, BED, GFF/GTF, and VCF.</span><span><br /></span><span><br /></span><span>9. Datamash: A command-line tool for basic numeric, textual, and statistical operations on input data streams. 
It supports operations such as grouping, sorting, transposing, and performing arithmetic calculations on tabular data.</span><span><br /></span><span><br /></span><span>10.</span><span> </span><a href="http://gwf.app/" target="_self">gwf.app</a><span>: A pragmatic alternative to Snakemake. Developed at</span><span> </span><a href="https://www.linkedin.com/company/aarhus-university-denmark-/" target="_self"><span>Aarhus University</span></a><span>, this flexible, generic workflow tool builds and runs large scientific workflows.</span></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44914/predicting-pathogen-virulence-using-bioinformatics-tools</guid>
	<pubDate>Tue, 04 Nov 2025 07:55:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44914/predicting-pathogen-virulence-using-bioinformatics-tools</link>
	<title><![CDATA[Predicting Pathogen Virulence Using Bioinformatics Tools]]></title>
	<description><![CDATA[<p>In the genomic era, the ability to predict the virulence potential of pathogens has become an indispensable part of infectious disease research. With the exponential growth of microbial genome data, bioinformatics tools now enable scientists to identify virulence factors, model pathogen behavior, and even forecast outbreak risks &mdash; all from sequence data.</p><p>In an age where pathogens continue to evolve and cross boundaries, understanding <strong>what makes them virulent</strong>&mdash;that is, capable of causing disease&mdash;has become a critical focus in modern microbiology and genomics. <strong>Virulence prediction</strong> bridges computational biology, genomics, and machine learning to forecast the pathogenic potential of microbes before they strike.</p><h3>What Is Virulence?</h3><p><em>Virulence</em> refers to the degree of damage a pathogen can inflict on its host. It is determined by a combination of genetic factors&mdash;called <strong>virulence factors (VFs)</strong>&mdash;that allow the organism to attach, invade, evade, and harm the host. These include genes coding for toxins, secretion systems, adhesins, and enzymes that disrupt host defenses.</p><p>Understanding virulence factors not only helps in deciphering the mechanisms of infection but also provides early warning signs for emerging threats.</p><h3>Why Predict Virulence?</h3><p>Traditional virulence studies relied heavily on experimental infection models, which, although accurate, are <strong>time-consuming, expensive, and ethically constrained</strong>.<br /> Today, the availability of whole-genome sequences and large-scale pathogen databases has paved the way for <strong>in silico virulence prediction</strong>&mdash;a computational approach that can screen thousands of genomes within hours.</p><p>This approach enables researchers to:</p><ul>
<li>
<p>Rapidly identify potential <strong>high-risk strains</strong>.</p>
</li>
<li>
<p>Prioritize pathogens for <strong>containment, surveillance, or further study</strong>.</p>
</li>
<li>
<p>Guide <strong>vaccine development</strong> and <strong>drug target discovery</strong>.</p>
</li>
<li>
<p>Support <strong>One Health frameworks</strong>, linking animal, human, and environmental health data.</p>
</li>
</ul><h3>How Is Virulence Predicted?</h3><p>Virulence prediction combines <strong>bioinformatics pipelines</strong> with <strong>machine learning</strong> and <strong>comparative genomics</strong>. The process generally involves:</p><ol>
<li>
<p><strong>Genome Annotation:</strong> Identifying genes and coding sequences in microbial genomes.</p>
</li>
<li>
<p><strong>Feature Extraction:</strong> Comparing sequences with curated databases like <strong>VFDB (Virulence Factor Database)</strong>, <strong>PATRIC</strong>, or <strong>Victors</strong>.</p>
</li>
<li>
<p><strong>Pattern Recognition:</strong> Using algorithms (e.g., Random Forest, SVM, or deep learning models) to classify genes or strains as virulent or non-virulent based on sequence patterns, motifs, and protein domains.</p>
</li>
<li>
<p><strong>Scoring and Visualization:</strong> Assigning a virulence score or confidence level and visualizing it through heatmaps or genome maps.</p>
</li>
</ol><h3>Tools and Resources for Virulence Prediction</h3><p>A number of tools and databases make virulence prediction accessible to the scientific community:</p><ul>
<li>
<p><strong>VFanalyzer</strong> &ndash; For identifying virulence genes based on VFDB.</p>
</li>
<li>
<p><strong>PathoFact</strong> &ndash; Predicts virulence, antimicrobial resistance (AMR), and toxin genes from metagenomic data.</p>
</li>
<li>
<p><strong>Pangenome-based models</strong> &ndash; Identify virulence-associated gene clusters across strains.</p>
</li>
<li>
<p><strong>Machine learning models</strong> &ndash; Use features like GC content, codon usage bias, or protein domains to predict pathogenicity.</p>
</li>
</ul><p>Emerging tools now integrate <strong>multi-omic data</strong>&mdash;including transcriptomics, proteomics, and metabolomics&mdash;to understand virulence in a systems biology framework.</p><h3>Applications in the Real World</h3><p>Virulence prediction has major implications across public health and research sectors:</p><ul>
<li>
<p><strong>Epidemic preparedness:</strong> Early identification of virulent strains in outbreak samples.</p>
</li>
<li>
<p><strong>AMR surveillance:</strong> Linking virulence profiles with antibiotic resistance determinants.</p>
</li>
<li>
<p><strong>Environmental monitoring:</strong> Predicting pathogenic potential of soil or waterborne microbes.</p>
</li>
<li>
<p><strong>Clinical diagnostics:</strong> Supporting personalized treatment through pathogen profiling.</p>
</li>
</ul><p>For instance, integrating virulence prediction pipelines into <strong>national surveillance networks</strong> could enable faster risk assessment and response to infectious outbreaks.</p><h3>The Road Ahead</h3><p>As machine learning and genomics advance, virulence prediction will evolve from simple gene-based detection to <strong>dynamic, context-aware models</strong> that account for host&ndash;pathogen interactions, environmental signals, and evolutionary adaptation.</p><p>Future tools may predict <strong>not just if a strain is virulent</strong>, but <strong>under what conditions</strong> it expresses that virulence&mdash;bridging the gap between genotype and phenotype.</p><h3>In Summary</h3><p>Virulence prediction is redefining how we understand and anticipate infectious diseases. By coupling <strong>genomic insights</strong> with <strong>computational intelligence</strong>, researchers can identify potential threats earlier, design smarter interventions, and ultimately, strengthen our preparedness against emerging pathogens.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35395/comprehensive-list-of-visualization-tools-for-biological-pathways</guid>
	<pubDate>Tue, 30 Jan 2018 06:01:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35395/comprehensive-list-of-visualization-tools-for-biological-pathways</link>
	<title><![CDATA[Comprehensive list of visualization tools for biological pathways]]></title>
	<description><![CDATA[<p>The study of biological pathways is key to understanding the different processes inside a cell: proteins exert their function not in isolation but in a tightly controlled network of interactions and reactions. Activation of a pathway typically leads to a change of state in the cell. Pathways come in different flavors, depending on their functions in the cell &ndash; the three main types are metabolic pathways, gene regulatory pathways, and signaling pathways. Biological pathways and networks are not only an appropriate way to visualize molecular reactions; they have also become a leading method in -omics data analysis and visualization.</p><p><img src="https://photos-1.dropbox.com/t/2/AABemz29qAuSTqSzr5mEsQE7JIMxZlU1CBy0E5n0yUVYbA/12/85115969/png/32x32/1/_/1/2/pathway.png/EOfXoUIYrJ8CIAcoBw/01qsT2eykyPvSH-rNpy3cqioDzZPc4i-xULG3BEZvCk?preserve_transparency=1&amp;size=1280x960&amp;size_mode=3" width="800" height="533" alt="image" style="border: 0px;"></p><p>The following is a comprehensive list of visualization tools for biological pathways:</p><p>BiNA</p><p>Drawings of metabolic networks supporting hiding of cofactors and drawing of chemical structures</p><p>http://bina.unipax.info/</p><p>BioTapestry</p><p>Interactive tool for building, visualizing and sharing gene regulatory network models over the web</p><p>http://www.biotapestry.org/</p><p>Caleydo</p><p>Visual analysis framework targeted at biomolecular data. 
Visualization of interdependencies between multiple datasets</p><p>http://www.caleydo.org/</p><p>CellDesigner</p><p>A modeling tool for biochemical networks</p><p>http://www.celldesigner.org/</p><p>Edinburgh Pathway Editor</p><p>Edit and draw pathway diagrams</p><p>http://epe.sourceforge.net/SourceForge/EPE.html</p><p>GenMAPP</p><p>Visualization of gene expression and other genomic data on maps representing biological pathways and groupings of genes</p><p>http://www.genmapp.org/</p><p>Ingenuity IPA</p><p>Data integration platform and manually annotated pathways</p><p>http://tinyurl.com/IngenuityPath</p><p>JDesigner</p><p>Graphical modeling environment for biochemical reaction networks</p><p>http://jdesigner.sourceforge.net/Site/JDesigner.html</p><p>KaPPA View</p><p>Plant pathways</p><p>http://kpv.kazusa.or.jp/</p><p>KEGG Atlas</p><p>Interactive Kyoto Encyclopedia of Genes and Genomes pathways</p><p>http://www.genome.jp/kegg/</p><p>Omix&nbsp;</p><p>Visualizing multi-omics data in metabolic networks</p><p>https://www.omix-visualization.com</p><p>PathVisio&nbsp;</p><p>Biological pathway analysis software that allows drawing, editing and analysis of biological pathways</p><p>http://www.pathvisio.org/</p><p>VitaPad&nbsp;</p><p>Application to visualize biological pathways and map experimental data to them</p><p>http://tinyurl.com/vitapad/</p><p>Web tools for pathways</p><p>ArrayXPath&nbsp;</p><p>Mapping and visualizing microarray gene-expression data and integrated biological pathway resources using SVG</p><p>http://tinyurl.com/ArrayXPath/</p><p>GEPAT&nbsp;</p><p>Integrated analysis of transcriptome data in genomic, proteomic and metabolic contexts</p><p>http://gepat.sourceforge.net/</p><p>iPath&nbsp;</p><p>Web-based tool for the visualization, analysis and customization of pathway maps</p><p>http://pathways.embl.de/</p><p>Kegg-Based Viewer&nbsp;</p><p>KEGG-based pathway visualization tool for complex high-throughput 
data</p><p>http://www.g-language.org/data/marray/</p><p>MapMan&nbsp;</p><p>User-driven tool that displays large datasets onto diagrams of metabolic pathways or other processes</p><p>http://mapman.gabipd.org/web/guest/mapman</p><p>MetPA&nbsp;</p><p>Analysis and visualization of metabolomic data within the biological context of metabolic pathways</p><p>http://metpa.metabolomics.ca</p><p>Omics Viewer&nbsp;</p><p>Data mapping on BioCyc pathways (collection of 5500 pathway/genome databases)</p><p>http://www.biocyc.org/</p><p>Pathway Explorer</p><p>Interactive Java drawing tool for the construction of biological pathway diagrams in a visual way and the annotation of the components and interactions between them</p><p>http://genome.tugraz.at/pathwayexplorer/pathwayexplorer_description.shtml</p><p>Pathway projector&nbsp;</p><p>Zoomable pathway browser using KEGG atlas and Google Maps API</p><p>http://www.g-language.org/PathwayProjector/</p><p>PATIKA&nbsp;</p><p>Integrated environment composed of a central database and a visual editor, built around an extensive ontology and an integration framework</p><p>http://www.cs.bilkent.edu.tr/~patikaweb/</p><p>Reactome SkyPainter&nbsp;</p><p>Visualization of over-represented pathways and reactions from gene lists</p><p>http://www.reactome.org/skypainter-2</p><p>WikiPathways</p><p>Wiki-based, open, public platform dedicated to the curation of biological pathways by and for the scientific community</p><p>http://www.wikipathways.org/</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35057/ectools-long-read-correction-and-other-correction-tools</guid>
	<pubDate>Fri, 05 Jan 2018 04:02:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35057/ectools-long-read-correction-and-other-correction-tools</link>
	<title><![CDATA[ECTOOLS: Long Read Correction and other Correction tools]]></title>
	<description><![CDATA[<p>Long Read Correction and other Correction tools</p>
<p>This package is a loose collection of scripts. To run the correction<br>routine see the section below. Descriptions of the other scripts<br>are at the bottom of this file.</p>
<p>Contact: gurtowsk@cshl.edu</p>
<p>In short, the correction algorithm takes as input the unitigs from a short-read assembly and uses them to correct long-read data. More background information on the algorithm can be found at:<br>http://schatzlab.cshl.edu/presentations/2013-06-18.PBUserMeeting.pdf</p><p>Address of the bookmark: <a href="https://github.com/jgurtowski/ectools" rel="nofollow">https://github.com/jgurtowski/ectools</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36516/metassembler-merging-and-optimizing-de-novo-genome-assemblies</guid>
	<pubDate>Tue, 08 May 2018 04:52:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36516/metassembler-merging-and-optimizing-de-novo-genome-assemblies</link>
	<title><![CDATA[Metassembler: merging and optimizing de novo genome assemblies]]></title>
	<description><![CDATA[<p><span>Metassembler combines multiple whole genome de novo assemblies into a combined consensus assembly using the best segments of the individual assemblies.</span></p>
<p><span><span>Genome assembly projects typically run multiple algorithms in an attempt to find the single best assembly, although those assemblies often have complementary, if untapped, strengths and weaknesses. We present our metassembler algorithm that merges multiple assemblies of a genome into a single superior sequence.&nbsp;</span></span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/metassembler/?source=directory" rel="nofollow">https://sourceforge.net/projects/metassembler/?source=directory</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37915/dna-nucleotide-counter</guid>
	<pubDate>Fri, 12 Oct 2018 04:37:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37915/dna-nucleotide-counter</link>
	<title><![CDATA[DNA Nucleotide Counter]]></title>
	<description><![CDATA[<p style="margin: 2px 5px 4px 6px; color: #000011; font-size: 12px; font-style: normal; font-weight: 400; text-align: justify;">DNA Nucleotide Counter is delivered in a DNA Baser package together with other free molecular biology tools.<span>&nbsp;</span><a href="http://www.dnabaser.com/download/biology-tools-package-download-count.html">Download</a><span>&nbsp;</span>the package and double click it. The programs inside the package will be extracted to the destination folder (specified by you). Go to the destination folder&nbsp;and double click the program you want to use.</p>
<p style="margin: 2px 5px 4px 6px; color: #000011; font-size: 12px; font-style: normal; font-weight: 400; text-align: justify;">It<span>&nbsp;</span><a href="http://www.dnabaser.com/download/install-anywhere.html">installs on any computer</a><span>&nbsp;</span>even if you don't have administrator rights!</p><p>Address of the bookmark: <a href="http://www.dnabaser.com/download/DNA-Counter/index.html" rel="nofollow">http://www.dnabaser.com/download/DNA-Counter/index.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</guid>
	<pubDate>Tue, 14 Jan 2020 06:47:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</link>
	<title><![CDATA[Shasta long read assembler]]></title>
	<description><![CDATA[<p>The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by&nbsp;<a href="https://nanoporetech.com/">Oxford Nanopore</a>&nbsp;flow cells.</p>
<p>Computational methods used by the Shasta assembler include:</p>
<ul>
<li>Using a&nbsp;<a href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length</a>&nbsp;representation of the read sequence. This makes the assembly process more resilient to errors in homopolymer repeat counts, which are the most common type of errors in Oxford Nanopore reads.</li>
<li>Using in some phases of the computation a representation of the read sequence based on&nbsp;<em>markers</em>, a fixed subset of short k-mers (k &asymp; 10).</li>
</ul>
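<p>The run-length idea above can be sketched in a few lines of Python. This is a minimal illustration, not Shasta's actual code: the function name run_length_encode and the two example reads are hypothetical. Each read is collapsed into its sequence of distinct consecutive bases plus a separate list of repeat counts, so two reads that differ only in the length of a homopolymer run share the same collapsed sequence.</p>
<pre>
```python
from itertools import groupby


def run_length_encode(seq):
    """Collapse homopolymer runs; return (collapsed bases, repeat counts)."""
    bases = []
    counts = []
    for base, run in groupby(seq):
        bases.append(base)
        counts.append(len(list(run)))
    return "".join(bases), counts


# Hypothetical reads: the second one is missing a single T in the TTTT run.
true_read = "GATTTTACCA"
noisy_read = "GATTTACCA"

true_bases, true_counts = run_length_encode(true_read)
noisy_bases, noisy_counts = run_length_encode(noisy_read)

print(true_bases, true_counts)    # GATACA [1, 1, 4, 1, 2, 1]
print(noisy_bases, noisy_counts)  # GATACA [1, 1, 3, 1, 2, 1]
print(true_bases == noisy_bases)  # True
```
</pre>
<p>In this toy case the homopolymer error changes only one repeat count, while the collapsed base sequences remain identical and can be compared directly.</p>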
<p>More at&nbsp;<a href="https://chanzuckerberg.github.io/shasta/index.html">https://chanzuckerberg.github.io/shasta/index.html</a></p><p>Address of the bookmark: <a href="https://github.com/chanzuckerberg/shasta" rel="nofollow">https://github.com/chanzuckerberg/shasta</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41996/wgd%E2%80%94simple-command-line-tools-for-the-analysis-of-ancient-whole-genome-duplications</guid>
	<pubDate>Thu, 23 Jul 2020 05:49:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41996/wgd%E2%80%94simple-command-line-tools-for-the-analysis-of-ancient-whole-genome-duplications</link>
	<title><![CDATA[wgd—simple command line tools for the analysis of ancient whole-genome duplications]]></title>
	<description><![CDATA[<p><span>wgd is an easy-to-use command-line tool for<span>&nbsp;</span></span><em>K</em><sub>S</sub><span><span>&nbsp;</span>distribution construction. The wgd suite provides commonly used<span>&nbsp;</span></span><em>K</em><sub>S</sub><span><span>&nbsp;</span>and colinearity analysis workflows together with tools for modeling and visualization, rendering these analyses accessible to genomics researchers in a convenient manner.</span></p>
<p><a href="https://academic.oup.com/bioinformatics/article/35/12/2153/5162749">https://academic.oup.com/bioinformatics/article/35/12/2153/5162749</a></p><p>Address of the bookmark: <a href="https://github.com/arzwa/wgd" rel="nofollow">https://github.com/arzwa/wgd</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</guid>
	<pubDate>Sun, 07 Mar 2021 00:32:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42936/ancient-whole-genome-duplication-wgd-detection-tools</link>
	<title><![CDATA[Ancient whole genome duplication (WGD) detection tools !]]></title>
	<description><![CDATA[<p>There are two methods for detecting ancient WGD: collinearity analysis and analysis of the Ks distribution. Here, Ks is the average number of synonymous substitutions per synonymous site; its counterpart, Ka, is the average number of non-synonymous substitutions per non-synonymous site.</p><p>Several WGD analysis pipelines have already been published. Searching for the keyword "wgd pipeline" turned up the following:</p><p><strong>GenoDup: https://github.com/MaoYafei/GenoDup-Pipeline</strong><br /><strong>https://peerj.com/articles/6303/</strong><br /><strong>WGDdetector: https://github.com/yongzhiyang2012/WGDdetector</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2670-3</strong><br /><strong>wgd: https://github.com/arzwa/wgd</strong><br /><strong>https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2#Sec1</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>GeNoGAP: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1142-2</strong><br /><strong>https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-017-0399-x</strong><br /><strong>https://github.com/dfguan/purge_dups</strong><br /><strong>https://www.biorxiv.org/content/10.1101/2020.01.24.917997v1</strong></p><p>This article introduces the usage of wgd.</p><p>At present, wgd cannot be installed directly with bioconda, so installation is a little troublesome because it depends on a lot of software. 
wgd depends on the following software:</p><p><strong>BLAST</strong><br /><strong>MCL</strong><br /><strong>MUSCLE/MAFFT/PRANK</strong><br /><strong>PAML</strong><br /><strong>PhyML/FastTree</strong><br /><strong>i-ADHoRe</strong></p><p>The good news is that most of these dependencies can be installed with bioconda:</p><blockquote><p>conda create -n wgd python=3.5 blast mcl muscle mafft prank paml fasttree cmake libpng mpi=1.0=mpich<br />conda activate wgd</p></blockquote><p>Here mpi=1.0=mpich is selected because i-ADHoRe depends on mpich. If openmpi is installed instead, the following error appears: error while loading shared libraries: libmpi_cxx.so.40: cannot open shared object file: No such file or directory</p><p>After that, installing wgd itself is much simpler:</p><blockquote><p>git clone https://github.com/arzwa/wgd.git<br />cd wgd<br />pip install .<br /># or, without cloning first:<br />pip install git+https://github.com/arzwa/wgd.git</p></blockquote><p>For i-ADHoRe, you need to register at http://bioinformatics.psb.ugent.be/webtools/i-adhore/licensing/ and agree to the license to download i-ADHoRe-3.0.</p><p>Since my miniconda3 is installed in ~/opt/, the installation prefix is ~/opt/miniconda3/envs/wgd/:</p><blockquote><p>tar -zxvf i-adhore-3.0.01.tar.gz<br />cd i-adhore-3.0.01<br />mkdir -p build &amp;&amp; cd build<br />cmake .. -DCMAKE_INSTALL_PREFIX=~/opt/miniconda3/envs/wgd/<br />make -j 4<br />make install</p></blockquote><p>Take the sugarcane genome Saccharum spontaneum L. as an example. 
The genome is octoploid with 32 chromosomes (2n = 4x8 = 32).</p><p><strong>Download the CDS and GFF annotation files for the tutorial</strong></p><blockquote><p><strong>mkdir -p wgd_tutorial &amp;&amp; cd wgd_tutorial</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.cds.fasta.gz</strong><br /><strong>wget http://www.life.illinois.edu/ming/downloads/Spontaneum_genome/Sspon.v20190103.gff3.gz</strong><br /><strong>gunzip *.gz</strong></p></blockquote><p>First, activate the analysis environment with conda activate wgd, then start the analysis.</p><p>Step 1: Use wgd mcl to identify homologous genes in the genome</p><blockquote><p>wgd mcl -n 20 --cds --mcl -s Sspon.v20190103.cds.fasta -o Sspon_cds.out</p></blockquote><p>Step 2: Use wgd ksd to build the Ks distribution</p><blockquote><p>wgd ksd --n_threads 80 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl Sspon.v20190103.cds.fasta</p></blockquote><p>Step 3: If the genome assembly is of good quality, wgd syn can be used for collinearity analysis. It finds the collinear blocks in the genome and their corresponding anchor points.</p><blockquote><p>wgd syn --feature gene --gene_attribute ID \<br /> -ks wgd_ksd/Sspon.v20190103.cds.fasta.ks.tsv \<br /> Sspon.v20190103.gff3 Sspon_cds.out/Sspon.v20190103.cds.fasta.blast.tsv.mcl</p></blockquote><p>For further reading: wgd has 9 sub-modules</p><ul>
<li><span>kde: KDE fitting to the Ks distribution</span></li>
<li><span>ksd: Ks distribution construction</span></li>
<li><span>mcl: all-vs-all BLASTP comparison + MCL clustering</span></li>
<li><span><span>mix: mixture modeling of the Ks distribution</span></span></li>
<li><span>pre: preprocess the CDS file</span></li>
<li><span>syn: Call I-ADHoRe 3.0 to use GFF files for collinearity analysis</span></li>
<li><span>viz: draw histogram and density plot</span></li>
<li><span>wf1: standard Ks analysis workflow for the whole paranome; calls mcl, ksd and syn</span></li>
<li><span>wf2: standard Ks analysis workflow for one-vs-one orthologs; calls mcl and ksd</span></li>
</ul>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43364/ragtag-a-collection-of-software-tools-for-scaffolding-and-improving-modern-genome-assemblies</guid>
	<pubDate>Sat, 11 Sep 2021 00:28:14 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43364/ragtag-a-collection-of-software-tools-for-scaffolding-and-improving-modern-genome-assemblies</link>
	<title><![CDATA[RagTag: a collection of software tools for scaffolding and improving modern genome assemblies]]></title>
	<description><![CDATA[<p>RagTag is a collection of software tools for scaffolding and improving modern genome assemblies. Tasks include:</p>
<ul>
<li>Homology-based misassembly&nbsp;<a href="https://github.com/malonge/RagTag/wiki/correct">correction</a></li>
<li>Homology-based assembly&nbsp;<a href="https://github.com/malonge/RagTag/wiki/scaffold">scaffolding</a>&nbsp;and&nbsp;<a href="https://github.com/malonge/RagTag/wiki/patch">patching</a></li>
<li>Scaffold&nbsp;<a href="https://github.com/malonge/RagTag/wiki/merge">merging</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/malonge/RagTag" rel="nofollow">https://github.com/malonge/RagTag</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>