<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/2726?offset=70</link>
	<atom:link href="https://bioinformaticsonline.com/related/2726?offset=70" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a core task in bioinformatics: it reconstructs an organism's genome from DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide walks you through the steps of genome assembly with widely used tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together DNA sequence reads generated by sequencing platforms (e.g., Illumina, PacBio, Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, highly accurate reads, well suited to base-level accuracy and to polishing long-read assemblies.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads that span repetitive regions and help link contigs into scaffolds.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use <strong>FastQC</strong> to assess the quality of your reads and <strong>MultiQC</strong> to combine the per-file reports into a single summary:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq<br />multiqc .</code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>. In paired-end mode Trimmomatic writes four output files: the paired and unpaired reads for each direction.</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq \<br />trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq \<br />ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36</code></div>
</div>
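<p>If you prefer <strong>Cutadapt</strong>, a paired-end run might look like the sketch below. The wrapper name <code>trim_pair</code> is hypothetical, and the adapter sequence is an assumption (the common Illumina TruSeq adapter prefix); substitute the adapters for your own library kit:</p>
<div>
<div dir="ltr"><pre><code># trim_pair: hypothetical wrapper around Cutadapt for one paired-end sample.
# Usage: trim_pair READS_R1 READS_R2
trim_pair() {
    # -a / -A: 3' adapter on R1 / R2 (assumed TruSeq prefix; check your kit)
    # -q 20:   trim 3' ends below Phred quality 20
    # -m 36:   discard pairs in which either read is shorter than 36 bases
    cutadapt -a AGATCGGAAGAGC -A AGATCGGAAGAGC -q 20 -m 36 \
        -o trimmed_R1.fastq -p trimmed_R2.fastq "$1" "$2"
}</code></pre></div>
</div>
<p>Call it as <code>trim_pair reads_R1.fastq reads_R2.fastq</code>. The quality and length thresholds loosely mirror the Trimmomatic settings above.</p>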
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Designed for noisy long reads; corrects and trims reads before assembling them.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST reports assembly statistics such as N50, total assembly length, number of contigs, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
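<p>To make the N50 statistic concrete, here is a minimal sketch that computes it directly from a FASTA file with <code>awk</code> and <code>sort</code>. N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name <code>n50</code> is a hypothetical helper, not part of QUAST, and QUAST's own report remains the reference value:</p>
<div>
<div dir="ltr"><pre><code># n50: hypothetical helper that prints the N50 of the sequences in a FASTA file.
# Usage: n50 ASSEMBLY.fasta
n50() {
    # One length per sequence; a sequence may span several lines.
    lengths=$(awk '/^>/ { if (len) print len; len = 0; next }
                   { len += length($0) }
                   END { if (len) print len }' "$1")
    # Total assembly length.
    total=$(printf '%s\n' "$lengths" | awk '{ s += $1 } END { print s }')
    # Walk from the longest contig down until half of the total is covered.
    printf '%s\n' "$lengths" | sort -rn |
        awk -v total="$total" '{ sum += $1; if (2 * sum >= total) { print $1; exit } }'
}</code></pre></div>
</div>
<p>For example, <code>n50 contigs.fasta</code> prints a single number. For contigs of length 8, 6, 4, and 2, the total is 20 and the running sum reaches 10 at the second contig, so the N50 is 6.</p>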
</li>
<li>
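<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by searching for conserved single-copy genes. Pick the lineage dataset (<code>-l</code>) that matches your organism; <code>fungi_odb10</code> below is only an example, and a bacterial assembly would use <code>bacteria_odb10</code>:</p>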
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
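<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (with short reads) or <strong>Racon</strong> (with long reads). Racon takes three inputs: the reads, their alignments to the draft contigs (SAM or PAF), and the contigs themselves.</p>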
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
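<p>The Racon command above assumes that <code>mapped_reads.sam</code> already exists. One common way to produce it is to align the long reads to the draft contigs with <strong>minimap2</strong>. The sketch below wraps both steps in a hypothetical helper, <code>polish_round</code>; the <code>map-ont</code> preset assumes Nanopore reads (use <code>map-pb</code> for PacBio CLR reads):</p>
<div>
<div dir="ltr"><pre><code># polish_round: hypothetical helper for one Racon polishing round.
# Usage: polish_round READS DRAFT_CONTIGS POLISHED_OUTPUT
polish_round() {
    # Align the long reads to the draft contigs and write SAM output.
    minimap2 -ax map-ont -o mapped_reads.sam "$2" "$1"
    # Racon reads the reads, the alignments, and the draft; it writes FASTA to stdout.
    racon "$1" mapped_reads.sam "$2" > "$3"
}</code></pre></div>
</div>
<p>Call it as <code>polish_round long_reads.fasta contigs.fasta polished_contigs.fasta</code>. To run a second round, pass the polished output back in as the draft.</p>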
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>Opera-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>MAKER</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
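<p>Coverage is usually reported as the total number of sequenced bases divided by the genome size. A minimal sketch for estimating it, assuming an uncompressed FASTQ file with four lines per record; the function name <code>estimate_coverage</code> is hypothetical:</p>
<div>
<div dir="ltr"><pre><code># estimate_coverage: hypothetical helper, prints sequenced bases / genome size.
# Usage: estimate_coverage READS.fastq GENOME_SIZE_IN_BASES
estimate_coverage() {
    # In a four-line FASTQ record the sequence is always the second line.
    awk -v genome="$2" 'NR % 4 == 2 { bases += length($0) }
                        END { printf "%.1f\n", bases / genome }' "$1"
}</code></pre></div>
</div>
<p>For the 4.7 Mb genome used in the Canu example, you would run <code>estimate_coverage reads.fastq 4700000</code>.</p>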
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly turns raw sequencing data into a coherent representation of an organism&rsquo;s genome. The steps in this guide (quality control, assembly, evaluation, polishing, annotation, and submission) apply whether you&rsquo;re assembling a microbial genome or tackling a more complex eukaryotic one; what changes is mainly the choice of tools.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/4099/sequencing-solutions-to-world-health</guid>
	<pubDate>Thu, 29 Aug 2013 15:05:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/4099/sequencing-solutions-to-world-health</link>
	<title><![CDATA[Sequencing Solutions to World Health]]></title>
	<description><![CDATA[<p>"<em>New technology that quickly, easily and economically reveals the genomes of viruses and pathogens transforms public health and medicine."</em></p>
<p><strong>Source</strong>: Life Technologies</p><p>Address of the bookmark: <a href="http://www.lifetechnologies.com/global/en/home/communities-social/blog/blogs/sequencing-solutions-to-world-health.html?cid=social_blogseries_20130829_11098264" rel="nofollow">http://www.lifetechnologies.com/global/en/home/communities-social/blog/blogs/sequencing-solutions-to-world-health.html?cid=social_blogseries_20130829_11098264</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/2518/genome-browsers</guid>
	<pubDate>Fri, 16 Aug 2013 19:04:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/2518/genome-browsers</link>
	<title><![CDATA[Genome Browsers]]></title>
	<description><![CDATA[<p>A genome browser is a platform or database for searching and retrieving the sequences and annotations of genomes from a wide range of organisms, including eukaryotes and prokaryotes.</p><p>Links to some of the available browsers:</p><p><a href="http://www.ensembl.org/index.html">http://www.ensembl.org/index.html</a></p><p><a href="http://ensemblgenomes.org/">http://ensemblgenomes.org/</a></p><p><a href="http://genome.ucsc.edu/">http://genome.ucsc.edu/</a></p><p><a href="http://www.ncbi.nlm.nih.gov/genome">http://www.ncbi.nlm.nih.gov/genome</a></p><p><a href="http://www.ebi.ac.uk/genomes/">http://www.ebi.ac.uk/genomes/</a></p><p><a href="http://flybase.org/">http://flybase.org/</a></p><p><a href="http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi">http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi</a></p><p><a href="http://www.sanger.ac.uk/resources/databases/">http://www.sanger.ac.uk/resources/databases/</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/4208/latest-paper-on-comparison-of-mapping-tools</guid>
	<pubDate>Tue, 03 Sep 2013 18:00:38 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/4208/latest-paper-on-comparison-of-mapping-tools</link>
	<title><![CDATA[Latest paper on comparison of mapping tools]]></title>
	<description><![CDATA[<p>A. Hatem, D. Bozdag, A. E. Toland, U. V. Catalyurek "Benchmarking short sequence mapping tools" BMC Bioinformatics, 14(1):184, 2013.</p>
<p>http://bmi.osu.edu/hpc/software/benchmark/</p>
<p><a href="http://bmi.osu.edu/hpc/software/pmap/pmap.html">http://bmi.osu.edu/hpc/software/pmap/pmap.html</a></p>
<p>Other similar papers:</p>
<p><a href="http://online.liebertpub.com/doi/pdf/10.1089/cmb.2012.0022">http://online.liebertpub.com/doi/pdf/10.1089/cmb.2012.0022</a></p>
<p><a href="http://bioinformatics.oxfordjournals.org/content/28/24/3169">http://bioinformatics.oxfordjournals.org/content/28/24/3169</a></p>
<p>Some new mapping tool links:</p>
<p><strong>GSNAP</strong></p>
<p><a href="http://research-pub.gene.com/gmap/">http://research-pub.gene.com/gmap/</a></p>
<p><strong>RMAP</strong></p>
<p><a href="http://rulai.cshl.edu/rmap/">http://rulai.cshl.edu/rmap/</a></p>
<p><strong>mrsFAST</strong></p>
<p><a href="http://mrsfast.sourceforge.net/Home">http://mrsfast.sourceforge.net/Home</a></p>
<p><a href="http://sourceforge.net/projects/mrsfast/files/mrsfast-ultra-3.1.0/">http://sourceforge.net/projects/mrsfast/files/mrsfast-ultra-3.1.0/</a></p>
<p><strong>BFAST</strong></p>
<p><a href="http://sourceforge.net/apps/mediawiki/bfast/index.php?title=Main_Page">http://sourceforge.net/apps/mediawiki/bfast/index.php?title=Main_Page</a></p>
<p><strong>SHRiMP (for&nbsp;AB SOLiD color-space reads)</strong></p>
<p><a href="http://compbio.cs.toronto.edu/shrimp/">http://compbio.cs.toronto.edu/shrimp/</a></p>
<p><strong>RazerS 3</strong></p>
<p><a href="http://www.seqan.de/projects/razers/">http://www.seqan.de/projects/razers/</a></p><p>Address of the bookmark: <a href="http://www.biomedcentral.com/1471-2105/14/184" rel="nofollow">http://www.biomedcentral.com/1471-2105/14/184</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/9400/largest-genome-sequenced</guid>
	<pubDate>Fri, 21 Mar 2014 13:57:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/9400/largest-genome-sequenced</link>
	<title><![CDATA[Largest Genome Sequenced]]></title>
	<description><![CDATA[<p>The <strong>loblolly pine genome</strong> is enormous: <strong>22 billion base pairs</strong>, compared with only 3 billion in the human genome. In other words, it is about&nbsp;<strong>seven times</strong> larger than the human genome, and it is also the largest and most complete&nbsp;<strong><a href="http://en.wikipedia.org/wiki/Pinophyta" target="_blank">conifer</a></strong>&nbsp;genome ever sequenced.</p>
<p><strong>Related Paper:</strong></p>
<p>http://genomebiology.com/2014/15/3/R59/abstract</p>
<p>Address of the bookmark: <a href="http://www.news.ucdavis.edu/search/news_detail.lasso?id=10859" rel="nofollow">http://www.news.ucdavis.edu/search/news_detail.lasso?id=10859</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/10243/new-rna-seq-tool</guid>
	<pubDate>Fri, 25 Apr 2014 10:59:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/10243/new-rna-seq-tool</link>
	<title><![CDATA[New RNA Seq tool]]></title>
	<description><![CDATA[<p>"<span>By removing the time-consuming step of read mapping, the authors reported, Sailfish is able to provide quantification estimates 20&ndash;30 times faster than current methods without loss of accuracy."</span></p>
<p><span>Tool link:</span></p>
<p><span>http://www.cs.cmu.edu/~ckingsf/software/sailfish/</span></p>
<p>Address of the bookmark: <a href="http://www.genengnews.com/gen-news-highlights/lightweight-algorithms-sail-through-rna-sequencing-data/81249765/" rel="nofollow">http://www.genengnews.com/gen-news-highlights/lightweight-algorithms-sail-through-rna-sequencing-data/81249765/</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/10966/genxpro-gmbh</guid>
	<pubDate>Thu, 22 May 2014 07:18:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/10966/genxpro-gmbh</link>
	<title><![CDATA[GenXPro GmbH]]></title>
	<description><![CDATA[<p><strong>GenXPro</strong>&nbsp;GmbH is a service provider for the entire spectrum of nucleotide-based information&nbsp;from any biological sample. By combining intelligent data reduction techniques with the&nbsp;latest next-generation sequencing technologies, our service portfolio provides the most accurate and cost-efficient solutions for&nbsp;transcriptomic, genomic, or epigenomic research.</p><p><span><span><strong><span>GENXPRO GMBH</span>,&nbsp;</strong></span></span><span>ALTENH&Ouml;FERALLEE 3,&nbsp;</span><span>60438 FRANKFURT MAIN,&nbsp;</span><span>GERMANY</span></p><p><span><span><strong>Website</strong></span>:&nbsp;<a href="http://www.genxpro.info/products_and_services/">http://www.genxpro.info/products_and_services/</a></span></p><p><span><strong>PHONE</strong>: +49 (0)69- 95 73 97 10,&nbsp;FAX: +49 (0)69- 95 73 97 06</span></p><p><span>EMAIL: info@genxpro.de</span></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/13415/genomics-and-sequencing-approach-for-identification-of-biomarkers-to-assess-the-efficacy-of-tgf-%CE%B2ri-inhibitors-of-liver-cancer-in-vivo</guid>
	<pubDate>Tue, 05 Aug 2014 13:55:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/13415/genomics-and-sequencing-approach-for-identification-of-biomarkers-to-assess-the-efficacy-of-tgf-%CE%B2ri-inhibitors-of-liver-cancer-in-vivo</link>
	<title><![CDATA[Genomics and sequencing approach for identification of biomarkers to assess the efficacy of TGF-βRI inhibitors (of liver cancer) in vivo]]></title>
	<description><![CDATA[<p>Liver cancer is the third leading cause of cancer deaths and the fourth most frequently occurring cancer worldwide. Multiple signaling pathways contribute to the disease; among them, TGF-β is the most important cytokine whose signaling pathway promotes cancer. The main problem is late-stage disease, for which there is still no effective treatment strategy, so new therapeutic targets are needed. One approach is to examine the relationships between mRNA, methylation, and miRNA data from patients under different conditions (cancer vs. control, with or without inhibitor). miRNAs are small RNA molecules that inhibit expression of a target gene by binding with imperfect complementarity to the 3' UTR of its mRNA, thereby blocking translation. CpG islands are typically located in the promoter region of a gene (near the 5' UTR) and are usually hypomethylated, which allows the gene to be transcribed and translated; sometimes, however, the region becomes hypermethylated, which prevents expression of the host gene. Integrating these three data types can therefore reveal new targets and pathways that are important in causing or preventing cancer, as well as biomarkers for checking the effect of an inhibitor on the signaling pathways underlying liver cancer.</p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/13415" length="26423" type="image/jpeg" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/23174/scaffolding-of-a-bacterial-genome-using-minion-nanopore-sequencing</guid>
	<pubDate>Tue, 07 Jul 2015 16:59:25 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/23174/scaffolding-of-a-bacterial-genome-using-minion-nanopore-sequencing</link>
	<title><![CDATA[Scaffolding of a bacterial genome using MinION nanopore sequencing]]></title>
	<description><![CDATA[<p><span>Second generation sequencing has revolutionized genomic studies. However, most genomes contain repeated DNA elements that are longer than the read lengths achievable with typical sequencers, so the genomic order of several generated contigs cannot be easily resolved. A new generation of sequencers offering substantially longer reads is emerging, notably the Pacific Biosciences (PacBio) RS II system and the MinION system, released in early 2014 by Oxford Nanopore Technologies through an early access program.</span></p><p>Address of the bookmark: <a href="http://www.nature.com/srep/2015/150707/srep11996/full/srep11996.html" rel="nofollow">http://www.nature.com/srep/2015/150707/srep11996/full/srep11996.html</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/29407/live-webinar-on-rna-seq-data-analysis-on-9-nov-2016</guid>
	<pubDate>Wed, 19 Oct 2016 05:25:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/29407/live-webinar-on-rna-seq-data-analysis-on-9-nov-2016</link>
	<title><![CDATA[Live Webinar on RNA-Seq Data Analysis on 9 Nov 2016]]></title>
	<description><![CDATA[<p><strong><a href="http://www.strand-ngs.com/webinar_registration">Live Webinar on RNA-Seq Data Analysis</a></strong></p><p><a href="http://www.strand-ngs.com/webinar_registration">Abstract: </a>Strand NGS supports an extensive workflow for the analysis and visualization of RNA-Seq data. The workflow includes transcriptome and genome alignment, statistical differential expression analysis, and detection of splicing events. Strand NGS also supports the discovery of novel genes, exons, and splice junctions, and it can detect gene fusion events. Downstream analyses such as GO and pathway analysis can be performed on the set of interesting genes. The product can also create pipelines for time-consuming jobs, which automates analysis and leaves more time for interpreting the results. This webinar will give an overview of the features in the RNA-Seq data analysis workflow in Strand NGS and highlight parameters within each feature that can be optimized depending on datasets and analysis needs.</p><p><a href="http://www.strand-ngs.com/webinar_registration">Speaker:</a> Mr. Sugandan Sivamani, Senior Application Scientist, Strand Life Sciences</p><p>Date: 9th Nov, <a href="http://www.strand-ngs.com/webinar_registration">Session 1</a> for SAPK/ APFO: 2:30 PM IST<br />Date: 9th Nov, <a href="http://www.strand-ngs.com/webinar_registration">Session 2</a> for AFO/ EMEA: 9:00 AM PST</p><p>Register here <a href="http://www.strand-ngs.com/webinar_registration">http://www.strand-ngs.com/webinar_registration</a></p>]]></description>
	<dc:creator>Strand</dc:creator>
</item>

</channel>
</rss>