<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/31371?offset=190</link>
	<atom:link href="https://bioinformaticsonline.com/related/31371?offset=190" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a critical process in bioinformatics, enabling the reconstruction of an organism's genome from short DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide will walk you through the steps of genome assembly using state-of-the-art tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together short DNA sequence reads generated by sequencing platforms (e.g., Illumina, PacBio, Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, highly accurate reads, well suited to base-level accuracy and polishing.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use <strong>FastQC</strong> to assess the quality of your reads and <strong>MultiQC</strong> to aggregate the reports:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq<br />multiqc .</code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq \<br />trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq \<br />ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36</code></div>
</div>
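<p>To make the <code>SLIDINGWINDOW:4:20</code> and <code>MINLEN:36</code> settings more concrete, here is a minimal Python sketch of the idea: scan the read from the 5' end and cut it at the first window whose mean quality drops below the threshold. The function names are hypothetical, and this is a simplified illustration rather than Trimmomatic's actual implementation, so exact cut positions may differ.</p>

```python
def sliding_window_trim(quals, window=4, min_avg=20):
    """Return how many leading bases to keep from a read.

    quals is a list of per-base Phred scores. The read is cut at the
    start of the first window whose mean quality is below min_avg.
    Simplified illustration only, not Trimmomatic's own code.
    """
    last_start = max(len(quals) - window + 1, 0)
    for start in range(last_start):
        window_quals = quals[start:start + window]
        mean = sum(window_quals) / window
        if min_avg > mean:
            return start
    # No window failed (or the read is shorter than one window).
    return len(quals)


def trim_read(seq, quals, window=4, min_avg=20, min_len=36):
    """Trim a read; return None if it ends up shorter than min_len."""
    keep = sliding_window_trim(quals, window, min_avg)
    if min_len > keep:
        return None
    return seq[:keep], quals[:keep]
```

<p>With the defaults above, a read whose quality collapses after base 9 is cut to 9 bases and then discarded, because it is shorter than the 36-base minimum.</p>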
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, genome size, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
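<p>N50 is the contig length at which the running total, counted from the longest contig downward, first reaches half of the total assembly length. The sketch below computes it in plain Python so the statistic is easier to interpret. The function names are hypothetical, and QUAST remains the tool to use for real reports.</p>

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # N50 is the length at which the running sum reaches half the total.
        if running * 2 >= total:
            return length
    return 0


def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None
    for line in lines:
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line.strip())
    if length is not None:
        yield length
```

<p>For an assembly on disk, pass an open file handle: <code>n50(list(fasta_lengths(open("contigs.fasta"))))</code>.</p>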
</li>
<li>
<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by identifying conserved genes:</p>
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>OPERA-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>MAKER</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
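<p>The coverage figure reported in the metadata can be estimated as total sequenced bases divided by genome size. The following is a rough sketch with a hypothetical function name and illustrative numbers: one million 150-base reads against the 4.7 Mb genome size used in the Canu example.</p>

```python
def estimated_coverage(num_reads, mean_read_length, genome_size):
    """Estimate sequencing depth as total bases divided by genome size."""
    if not genome_size > 0:
        raise ValueError("genome_size must be positive")
    return num_reads * mean_read_length / genome_size


if __name__ == "__main__":
    # Illustrative values only: 1,000,000 reads of 150 bases, 4.7 Mb genome.
    depth = estimated_coverage(1_000_000, 150, 4_700_000)
    print(f"Estimated coverage: {depth:.1f}x")
```

<p>This is an average depth. Actual coverage varies along the genome, so a mapping-based estimate is preferable when one is available.</p>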
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34328/dfast-a-flexible-prokaryotic-genome-annotation-pipeline-for-faster-genome-publication</guid>
	<pubDate>Tue, 14 Nov 2017 10:26:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34328/dfast-a-flexible-prokaryotic-genome-annotation-pipeline-for-faster-genome-publication</link>
	<title><![CDATA[DFAST: a flexible prokaryotic genome annotation pipeline for faster genome publication]]></title>
	<description><![CDATA[<p>We developed a prokaryotic genome annotation pipeline, DFAST, that also supports genome submission to public sequence databases. DFAST was originally started as an on-line annotation server, and to date, over 7,000 jobs have been processed since its first launch in 2016. Here, we present a newly implemented background annotation engine for DFAST, which is also available as a standalone command-line program. The new engine can annotate a typical-sized bacterial genome within 10 minutes, with rich information such as pseudogenes, translation exceptions, and orthologous gene assignment between given reference genomes. In addition, the modular framework of DFAST allows users to customize the annotation workflow easily and will also facilitate extensions for new functions and incorporation of new tools in the future.</p>
<div>Availability and Implementation</div>
<p>The software is implemented in Python 3 and runs under both Python 2.7 and Python 3.4 or later on Macintosh and Linux systems. It is freely available at&nbsp;<a href="https://github.com/nigyta/dfast_core/" target="">https://github.com/nigyta/dfast_core/</a>&nbsp;under the GPLv3 license with external binaries bundled in the software distribution. An on-line version is also available at&nbsp;<a href="https://dfast.nig.ac.jp/" target="">https://dfast.nig.ac.jp/</a>.</p><p>Address of the bookmark: <a href="https://dfast.nig.ac.jp/" rel="nofollow">https://dfast.nig.ac.jp/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37211/jbrowse-embeddable-genome-browser-built-completely-with-javascript-and-html5</guid>
	<pubDate>Fri, 29 Jun 2018 09:19:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37211/jbrowse-embeddable-genome-browser-built-completely-with-javascript-and-html5</link>
	<title><![CDATA[JBrowse: Embeddable genome browser built completely with JavaScript and HTML5]]></title>
	<description><![CDATA[JBrowse is a fast, embeddable genome browser built completely with JavaScript and HTML5, with optional run-once data formatting tools written in Perl.

Headline Features:
Fast, smooth scrolling and zooming. Explore your genome with unparalleled speed.
Scales easily to multi-gigabase genomes and deep-coverage sequencing.
Quickly open and view data files on your computer without uploading them to any server.
Supports GFF3, BED, FASTA, Wiggle, BigWig, BAM, VCF (with either .tbi or .idx index), REST, and more.  BAM, BigBed, BigWig, and VCF data are displayed directly from chunks of the compressed binary files, no conversion needed.
Includes an optional “faceted” track selector (see demo) suitable for large installations with thousands of tracks.
Very light server resource requirements. In fact, JBrowse has no back-end server code, just tools for formatting data files to be read directly over HTTP. Serve huge datasets from a single low-cost cloud instance.
Can run as a stand-alone app on OSX and Windows using the Electron platform.
Highly extensible plugin architecture, with a large plugin registry of existing examples at https://gmod.github.io/jbrowse-registry

https://jbrowse.org/<p>Address of the bookmark: <a href="https://github.com/GMOD/jbrowse" rel="nofollow">https://github.com/GMOD/jbrowse</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/8504/update-genome-workbench-2715-released</guid>
	<pubDate>Wed, 26 Feb 2014 16:12:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/8504/update-genome-workbench-2715-released</link>
	<title><![CDATA[Update: Genome Workbench 2.7.15 released]]></title>
	<description><![CDATA[<p>NCBI Genome Workbench is an integrated application for viewing and analyzing sequence data. With Genome Workbench, you can view data in publicly available sequence databases at NCBI, and mix this data with your own private data.</p><p><img src="http://www.ncbi.nlm.nih.gov/core/assets/gbench/images/firstscreen_still.gif" alt="Introductory screen shot" style="border: 0px; border: 0px;"></p><p>Genome Workbench can display sequence data in many ways, including graphical sequence views, various alignment views, phylogenetic tree views, and tabular views of data. It can also align your private data to data in public databases, display your data in the context of public data, and retrieve BLAST results.</p><p>Genome Workbench is built on the NCBI C++ ToolKit and uses cross-platform APIs for graphics. It runs on your local machine, and is available for Windows 2000/XP, Linux, MacOS X, and various flavors of Unix.</p><p>Genome Workbench was developed entirely in-house at NCBI and makes use of the NCBI C++ ToolKit. The C++ ToolKit provides a convenient and flexible cross-platform API for managing system internals, database connections, network sockets, and the NCBI data model. In addition, the C++ ToolKit provides the Object Manager, which abstracts handling of sequences and sequence-related objects.</p><p>New Features in Genome Workbench 2.7.15 <br /><br /></p><ul>
<li>Multiple Alignment View: implemented adaptive feature display when zooming in</li>
<li>Active Objects Inspector replaces Selection Inspector. The new view offers improved examination of the selection context. See the "Using Active Objects Inspector" tutorial for more details.</li>
<li>Binary packages for Linux OpenSUSE 13.1 are now available</li>
</ul><p><br />Bug Fixes and Improvements in Genome Workbench 2.7.15 <br /><br /></p><ul>
<li>Fixed major issue with OpenGL overlay/scrolling. Could cause crashes or view scrolling irregularities</li>
<li>Multiple Pane View: fixed crash on loading BLAST results</li>
<li>Graphical Sequence View: fixed crash on zooming in and out, related to SNP track</li>
<li>Graphical Sequence View: fixed Go To Position dialog to give better diagnostics in case of a user error</li>
<li>Graphical Sequence View: PDF export fixed rendering of Markers with commas in the name</li>
<li>Text View / Flat File: fixed Mac OS rendering issues</li>
<li>Text View / Flat File: performance optimization, extended capabilities of real-time rendering of molecules to tens of thousands</li>
<li>File Import: optimization improvement to speed up load of files containing multiple project items</li>
<li>File Import: remapping stage now shows accession.version and description of molecules, instead of plain GI numbers</li>
<li>Mac OS: improved tooltips for toolbar buttons</li>
<li>Phylogenetic Tree Builder Tool: improved diagnostics of errors</li>
<li>Multiple Alignment View: optimizations to avoid main GUI freezes</li>
<li>Open Dialog: removed duplicate elements in table of genomes (load Genome)</li>
<li>PDF export: fixed issue with XREF table errors</li>
<li>Tree View: fixed issues with showing Force Layout progress on Mac OS</li>
<li>Tree View: PDF export fixed issues for showing labels of collapsed nodes</li>
<li>Tree View: added an option to stop layout</li>
<li>Tree View: broadcasting mechanism fixed not to accumulate selected nodes</li>
</ul><p>Reference:</p><p>NCBI news</p><p>http://www.ncbi.nlm.nih.gov/tools/gbench/</p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/14215/the-8000-years-old-tibetian-gene-mutation</guid>
	<pubDate>Wed, 20 Aug 2014 21:57:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/14215/the-8000-years-old-tibetian-gene-mutation</link>
	<title><![CDATA[The 8,000-year-old Tibetan gene mutation]]></title>
	<description><![CDATA[<p>A new study has provided insight into how a gene mutation around 8,000 years ago helped Tibetans survive in the thin air of the Tibetan Plateau, where the average elevation is 14,800 feet.<br /><br />The study, led by University of Utah scientists, is the first to find a genetic cause for the adaptation, a single DNA base pair change that dates back 8,000 years, and to demonstrate how it contributes to the Tibetans' ability to live in low-oxygen conditions.</p><p>About 8,000 years ago, the gene EGLN1 changed by a single DNA base pair. Today, a relatively short time later on the scale of human history, 88 percent of Tibetans have the genetic variation, and it is virtually absent from closely related lowland Asians. The findings indicate the genetic variation endows its carriers with an advantage.<br /><br />In those without the adaptation, low oxygen causes the blood to become thick with oxygen-carrying red blood cells, an attempt to feed starved tissues, which can cause long-term complications such as heart failure. The researchers found that the newly identified genetic variation protects Tibetans by decreasing the over-response to low oxygen.</p><p>Reference: http://www.nature.com/nature/journal/v512/n7513/abs/nature13408.html</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/19633/vital-it</guid>
	<pubDate>Thu, 18 Dec 2014 10:46:59 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/19633/vital-it</link>
	<title><![CDATA[Vital-IT]]></title>
	<description><![CDATA[<p>Vital-IT is a <strong>bioinformatics competence center</strong> that supports and collaborates with life scientists in Switzerland and beyond. The <a href="http://www.vital-it.ch/about/team.php">multi-disciplinary team</a> provides expertise, training and maintains a high-performance computing (HPC) and storage infrastructure, so as to help develop, maintain and extend life science and medical research (<a href="http://www.vital-it.ch/about/activities.php">activities</a>).</p><p>Address of the bookmark: <a href="http://www.vital-it.ch/" rel="nofollow">http://www.vital-it.ch/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/19648/mit-computational-biology-group</guid>
  <pubDate>Thu, 18 Dec 2014 14:47:01 -0600</pubDate>
  <link></link>
  <title><![CDATA[MIT Computational Biology Group]]></title>
  <description><![CDATA[
<p>My research group consists primarily of computer science graduate students and postdocs with expertise in algorithms, statistical inference, and machine learning, who share a passion for understanding fundamental biological problems.</p>

<p>We work in a highly interdisciplinary environment at the interface of Computer Science and Biology. Since its inception, our lab has eagerly engaged in collaborative research partnerships with biological and experimental collaborators, facilitated by our affiliation with the Broad Institute and the Computational and Systems Biology initiative (CSBi) at MIT, our participation in the Epigenome Roadmap, ENCODE, and modENCODE consortia, and by several other ongoing collaborations at MIT, Harvard, and the Harvard Medical School affiliated hospitals.</p>

<p>http://compbio.mit.edu/</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/26569/genome-stability-laboratory</guid>
  <pubDate>Mon, 07 Mar 2016 04:16:32 -0600</pubDate>
  <link></link>
  <title><![CDATA[Genome Stability Laboratory]]></title>
  <description><![CDATA[
<p>Baker's yeast, Saccharomyces cerevisiae, is an ideal model organism for understanding the mechanisms of meiotic chromosome segregation. In S. cerevisiae and in mammals, the majority of meiotic crossovers are formed through a highly conserved MSH4p-MSH5p, MLH1p-MLH3p-dependent pathway. We are interested in characterizing the role of these complexes in crossover formation and distribution among all homolog pairs. Errors in this process are linked to congenital birth defects in humans such as Down's syndrome. Our laboratory is also interested in understanding the effect of genetic background on mutation rate variation using S. cerevisiae as a model. These studies are relevant for understanding cancer progression, genome evolution and architecture. We use high-throughput genomic methods as well as classical genetics to achieve these aims.</p>

<p>More at http://faculty.iisertvm.ac.in/~nishantkt/index.html</p>
]]></description>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/26499/katju-lab</guid>
  <pubDate>Fri, 26 Feb 2016 03:25:32 -0600</pubDate>
  <link></link>
  <title><![CDATA[Katju Lab]]></title>
  <description><![CDATA[
<p>The lab seeks to understand the genetic factors contributing to genomic variation and phenotypic diversity. To this end, we employ molecular and bioinformatic tools to study evolutionary processes at the level of populations, both experimental and natural, and genomes. Our research interests encompass a wide range of topics, including the evolution of organellar and nuclear genomes, gene duplication and the origin of novel function, and the fitness and phenotypic consequences of mutation in evolution. For details regarding ongoing projects, please see the Research page.</p>

<p>http://katjulab.com/research.html</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/17176/arvados</guid>
	<pubDate>Sat, 20 Sep 2014 16:54:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/17176/arvados</link>
	<title><![CDATA[Arvados]]></title>
	<description><![CDATA[<p>Arvados is a free and open source bioinformatics platform for genomic and biomedical data. Users can store, organize, compute on, and share data for free.</p>
<p><img src="https://arvados.org/images/dax.png" width="400" height="535" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://arvados.org/" rel="nofollow">https://arvados.org/</a></p>]]></description>
	<dc:creator>Martin Jones</dc:creator>
</item>

</channel>
</rss>