<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/38443?offset=590</link>
	<atom:link href="https://bioinformaticsonline.com/related/38443?offset=590" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a core task in bioinformatics: it reconstructs an organism's genome from sequencing reads that are far shorter than the genome itself. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide walks you through the steps of genome assembly using widely used tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly pieces together the reads generated by sequencing platforms (short reads from Illumina, long reads from PacBio or Oxford Nanopore) into longer, contiguous sequences called contigs. It can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, accurate reads, well suited to base-level accuracy and to polishing long-read assemblies.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use tools like <strong>FastQC</strong> or <strong>MultiQC</strong> to assess the quality of your reads:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq<br />multiqc . </code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq \<br />trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq \<br />ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 </code></div>
</div>
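<p>As a rough illustration of what a setting like <code>SLIDINGWINDOW:4:20</code> asks for, the Python sketch below scans a read from the 5' end and cuts where the mean quality of a 4-base window first drops below 20. It is a simplified model rather than Trimmomatic's actual implementation; the function name <code>sliding_window_trim</code> and the sample scores are hypothetical.</p>
<pre>
```python
def sliding_window_trim(quals, window=4, threshold=20):
    """Return how many leading bases to keep (simplified sliding-window model).

    Scans from the 5' end and cuts at the start of the first window whose
    mean Phred quality is below the threshold.
    """
    for start in range(0, len(quals) - window + 1):
        chunk = quals[start:start + window]
        if threshold > sum(chunk) / window:
            return start  # keep only the bases before the failing window
    return len(quals)  # no window failed: keep the whole read


# Hypothetical Phred scores for one read
scores = [38, 37, 36, 35, 30, 22, 12, 10, 8, 5]
keep = sliding_window_trim(scores)
print(keep)  # prints 4
```
</pre>
<p>A read trimmed this way would then be dropped entirely if fewer bases remain than the <code>MINLEN</code> cutoff.</p>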
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, total assembly length, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
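<p>To make these statistics concrete, here is a minimal Python sketch of how N50 and GC content can be computed by hand. It is an illustration only and not how QUAST computes them internally; the helper names <code>n50</code> and <code>gc_content</code> and the sample values are hypothetical.</p>
<pre>
```python
from bisect import bisect_left
from itertools import accumulate


def n50(lengths):
    """Length L such that contigs of length L or longer hold half the assembly."""
    ordered = sorted(lengths, reverse=True)
    running = list(accumulate(ordered))         # cumulative bases, longest first
    half = running[-1] / 2
    return ordered[bisect_left(running, half)]  # first contig that reaches half


def gc_content(seq):
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical contig lengths in bp
contigs = [100, 200, 300, 400, 500]
print(n50(contigs))  # prints 400
print(gc_content("ATGCGC"))
```
</pre>
<p>With these five contigs the assembly totals 1,500 bp, and the two longest contigs (500 + 400) are the first to cover half of it, so N50 is 400.</p>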
</li>
<li>
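<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by searching for conserved single-copy genes. Pick the lineage dataset (<code>-l</code>) that matches your organism; <code>fungi_odb10</code> below is only an example:</p>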
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>Opera-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>MAKER</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
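<p>Coverage is usually reported as mean sequencing depth. A quick way to estimate it, sketched below in Python, is to divide the total number of sequenced bases by the genome size. The function name <code>mean_coverage</code> and the read counts are hypothetical; the 4.7 Mb genome size simply reuses the value from the Canu example above.</p>
<pre>
```python
def mean_coverage(read_count, mean_read_length, genome_size):
    """Expected sequencing depth: total sequenced bases / genome size."""
    return read_count * mean_read_length / genome_size


# Hypothetical run: 2 million reads of 150 bp against a 4.7 Mb genome
depth = mean_coverage(2_000_000, 150, 4_700_000)
print(f"{depth:.1f}x")  # prints 63.8x
```
</pre>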
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44902/hite-a-fast-and-accurate-dynamic-boundary-adjustment-approach-for-full-length-transposable-elements-detection-and-annotation-in-genome-assemblies</guid>
	<pubDate>Sat, 20 Sep 2025 09:34:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44902/hite-a-fast-and-accurate-dynamic-boundary-adjustment-approach-for-full-length-transposable-elements-detection-and-annotation-in-genome-assemblies</link>
	<title><![CDATA[HiTE: a fast and accurate dynamic boundary adjustment approach for full-length Transposable Elements detection and annotation in Genome Assemblies]]></title>
	<description><![CDATA[<p dir="auto"><code>HiTE</code>&nbsp;is a Python software that uses a dynamic boundary adjustment approach to detect and annotate full-length Transposable Elements in Genome Assemblies. In comparison to other tools, HiTE demonstrates superior performance in detecting a greater number of full-length TEs.</p>
<div dir="auto">
<h2 dir="auto">panHiTE</h2>
</div>
<p dir="auto">We have developed panHiTE, a comprehensive and accurate pipeline for TE detection in large-scale population genomes. It has been successfully applied to hundreds of plant population genomes, demonstrating its effectiveness and scalability.</p>
<p dir="auto">For detailed instructions, please refer to the&nbsp;<a href="https://github.com/CSU-KangHu/HiTE/wiki/panHiTE-tutorial">panHiTE tutorial</a>.</p><p>Address of the bookmark: <a href="https://github.com/CSU-KangHu/HiTE" rel="nofollow">https://github.com/CSU-KangHu/HiTE</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41485/chromosight-computer-vision-based-program-for-pattern-recognition-in-chromosome-hi-c-contact-maps</guid>
	<pubDate>Mon, 23 Mar 2020 06:20:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41485/chromosight-computer-vision-based-program-for-pattern-recognition-in-chromosome-hi-c-contact-maps</link>
	<title><![CDATA[chromosight: Computer vision based program for pattern recognition in chromosome (Hi-C) contact maps]]></title>
	<description><![CDATA[<p>Python package to detect chromatin loops (and other patterns) in Hi-C contact maps.</p>
<p>Stable version with pip:</p>
<div>
<pre>pip3 install --user chromosight</pre>
</div>
<p>Stable version with conda:</p>
<div>
<pre>conda install -c bioconda -c conda-forge chromosight</pre>
</div>
<p>or, if you want to get the latest development version:</p>
<pre><code>pip3 install --user -e git+https://github.com/koszullab/chromosight.git@master#egg=chromosight</code></pre><p>Address of the bookmark: <a href="https://github.com/koszullab/Chromosight" rel="nofollow">https://github.com/koszullab/Chromosight</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44387/creating-genetic-maps-from-gbs-data</guid>
	<pubDate>Fri, 08 Sep 2023 06:31:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44387/creating-genetic-maps-from-gbs-data</link>
	<title><![CDATA[Creating Genetic Maps from GBS data]]></title>
	<description><![CDATA[<p><span>A genetic map, as the name suggests, gives the relative positions of specific sequences across the genome. There are various methods to generate one, but the most popular is to cross known parents and examine their progeny. A group of individuals of known ancestry created by such crosses is called a mapping population. Many types of mapping population exist. Here we will use data collected from a Recombinant Inbred Line (RIL) population (produced through selfing) to create a genetic map.</span></p><p>Address of the bookmark: <a href="https://bioinformaticsworkbook.org/dataAnalysis/GenomeAssembly/GeneticMaps/creating-genetic-maps.html" rel="nofollow">https://bioinformaticsworkbook.org/dataAnalysis/GenomeAssembly/GeneticMaps/creating-genetic-maps.html</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29917/gojs</guid>
	<pubDate>Tue, 22 Nov 2016 08:25:37 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29917/gojs</link>
	<title><![CDATA[GoJS]]></title>
	<description><![CDATA[<p><strong>GoJS</strong> is a feature-rich JavaScript library for implementing custom interactive diagrams and complex visualizations across modern web browsers and platforms. <strong>GoJS</strong> makes constructing JavaScript diagrams of complex nodes, links, and groups easy with customizable templates and layouts.</p>
<p><strong>GoJS</strong> offers many advanced features for user interactivity such as drag-and-drop, copy-and-paste, in-place text editing, tooltips, context menus, automatic layouts, templates, data binding and models, transactional state and undo management, palettes, overviews, event handlers, commands, and an extensible tool system for custom operations.</p>
<p><strong>GoJS</strong> is pure JavaScript, so users get interactivity without requiring round-trips to servers and without plugins. <strong>GoJS</strong> normally runs completely in the browser, rendering to an HTML5 Canvas element or SVG without any server-side requirements. <strong>GoJS</strong> does not depend on any JavaScript libraries or frameworks, so it should work with any HTML or JavaScript framework or with no framework at all.</p>
<p>More at&nbsp;http://gojs.net/latest/index.html</p><p>Address of the bookmark: <a href="http://gojs.net/latest/index.html" rel="nofollow">http://gojs.net/latest/index.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36109/sankeynetwork-with-networkd3</guid>
	<pubDate>Fri, 06 Apr 2018 12:07:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36109/sankeynetwork-with-networkd3</link>
	<title><![CDATA[sankeyNetwork with networkD3]]></title>
	<description><![CDATA[<p><span>You can also create&nbsp;</span><a href="http://en.wikipedia.org/wiki/Sankey_diagram">Sankey diagrams</a><span>&nbsp;with&nbsp;</span><code>sankeyNetwork</code><span>. Here is an example using downloaded JSON data:</span></p>
<p><span>https://en.wikipedia.org/wiki/Sankey_diagram</span></p><p>Address of the bookmark: <a href="https://christophergandrud.github.io/networkD3/#sankey" rel="nofollow">https://christophergandrud.github.io/networkD3/#sankey</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39039/dotplotly-generate-an-interactive-dot-plot-from-mummer-or-minimap-alignments</guid>
	<pubDate>Thu, 21 Feb 2019 10:22:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39039/dotplotly-generate-an-interactive-dot-plot-from-mummer-or-minimap-alignments</link>
	<title><![CDATA[dotPlotly: Generate an interactive dot plot from mummer or minimap alignments]]></title>
	<description><![CDATA[<p>Create an interactive dot plot from MUMmer output or PAF-format alignments.</p>
<p>An R script that produces an interactive plotly dot plot and/or a static (PNG/PDF) one.</p>
<p><a href="https://tom-poorten.shinyapps.io/dotplotly_shiny/">Shiny app available for testing</a></p><p>Address of the bookmark: <a href="https://github.com/tpoorten/dotPlotly" rel="nofollow">https://github.com/tpoorten/dotPlotly</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43319/k-mers-tutorial-classification-and-taxonomy</guid>
	<pubDate>Thu, 26 Aug 2021 10:28:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43319/k-mers-tutorial-classification-and-taxonomy</link>
	<title><![CDATA[k-mers tutorial - classification and taxonomy]]></title>
	<description><![CDATA[<p>DNA k-mers underlie much of our assembly work, and we (along with many others!) have spent a lot of time thinking about how to&nbsp;<a href="http://www.pnas.org/content/109/33/13272">store k-mer graphs efficiently</a>,&nbsp;<a href="http://ivory.idyll.org/blog/what-is-diginorm.html">discard redundant data</a>, and&nbsp;<a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0101271">count them efficiently</a>.</p>
<p>More recently, we've been enthused about&nbsp;<a href="http://joss.theoj.org/papers/3d793c6e7db683bee7c03377a4a7f3c9">using k-mer based similarity measures</a>&nbsp;and&nbsp;<a href="http://ivory.idyll.org/blog/2016-sourmash-sbt.html">computing and searching k-mer-based sketch search databases for all the things</a>.</p>
<p>But I haven't spent too much time talking about using k-mers for taxonomy, although that has become an&nbsp;<em>ahem</em>&nbsp;area of interest recently,&nbsp;<a href="http://www.biorxiv.org/content/early/2017/07/03/155358">if you read into our papers a bit</a>.</p>
<p>In this blog post I'm going to fix this by doing a little bit of a literature review and waxing enthusiastic about other people's work. Then in a future blog post I'll talk about how we're building off of this work in fun! and interesting? ways!</p><p>Address of the bookmark: <a href="http://ivory.idyll.org/blog/2017-something-about-kmers.html" rel="nofollow">http://ivory.idyll.org/blog/2017-something-about-kmers.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/24462/icar-project-ra-position-institute-of-bioinformatics-iob-bangalore</guid>
  <pubDate>Tue, 22 Sep 2015 23:41:31 -0500</pubDate>
  <link></link>
  <title><![CDATA[ICAR project RA position @ Institute of Bioinformatics (IOB) Bangalore]]></title>
  <description><![CDATA[
<p>Applications are invited for the post of Research Associate (RA) in the ICAR project on "Lactation stress associated postpartum anestrus SNP array in buffaloes". We are looking for a motivated candidate for handling Next Generation sequencing data analysis with a strong background in bioinformatics and programming.</p>

<p>The position is open for immediate appointment, is available for two years, and is extendable for an additional year. The applicant will be appointed as Research Associate based on the qualifications detailed below:</p>

<p>Research Associate:</p>

<p>-Master’s degree in bioinformatics with at least 2 years of research experience in Next Generation sequencing data analysis, as evidenced by Fellowship / Associateship / Training / other engagements.</p>

<p>-Familiarity with bioinformatics tools, database development, and programming</p>

<p>-Minimum 1 publication in any peer reviewed journal</p>

<p>Salary will be as per ICAR rules and guidelines. Applications will be shortlisted based on CV, reference letters from mentors, and a telephone interview. Candidates will be called for a personal interview at Bangalore before appointment. No travel expenses will be provided for attending the interview at Bangalore.</p>

<p>Interested candidates may send a Letter of Interest and CV by email to: keshav@ibioinformatics.org before September 29, 2015.</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32011/fools-guide</guid>
	<pubDate>Sun, 02 Apr 2017 14:31:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32011/fools-guide</link>
	<title><![CDATA[Fools guide]]></title>
	<description><![CDATA[<p><span>This website and accompanying documents are intended as a tool to help researchers dealing with non-model organisms acquire and process transcriptomic high-throughput sequencing data without having to learn extensive bioinformatics skills. It covers all steps from tissue collection, sample preparation and computer setup, through addressing biological questions with gene expression and SNP data.</span></p>
<p>http://sfg.stanford.edu/denovo.html</p>
<p>http://sfg.stanford.edu/sequencing.html</p>
<p>http://sfg.stanford.edu/BLAST.html</p>
<p>Address of the bookmark: <a href="http://sfg.stanford.edu/guide.html" rel="nofollow">http://sfg.stanford.edu/guide.html</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>

</channel>
</rss>