<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43620?offset=210</link>
	<atom:link href="https://bioinformaticsonline.com/related/43620?offset=210" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a critical process in bioinformatics, enabling the reconstruction of an organism's genome from DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide walks you through the steps of genome assembly using widely used tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together DNA sequence reads generated by sequencing platforms (e.g., short reads from Illumina, long reads from PacBio or Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, accurate reads well suited to assembling small genomes and to polishing long-read assemblies.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use tools like <strong>FastQC</strong> or <strong>MultiQC</strong> to assess the quality of your reads:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq<br />multiqc .</code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
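<p>To make "low-quality bases" concrete, the sketch below decodes FASTQ quality strings into Phred scores and averages them per read. It is only an illustration of what a QC report summarizes, not a replacement for FastQC. The helper names <code>mean_phred</code> and <code>read_fastq</code> are hypothetical, and the code assumes Phred+33 encoding and exactly four lines per record.</p>

```python
import io


def mean_phred(quality_line, offset=33):
    """Arithmetic mean Phred score of one quality string (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores)


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is skipped
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


# Two tiny made-up reads: 'I' encodes Q40 and '!' encodes Q0 in Phred+33.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n")
means = {name: mean_phred(qual) for name, _, qual in read_fastq(example)}
print(means)
```

<p>Averaging Phred scores arithmetically is a simplification, since Phred is a logarithmic scale, but it is enough to spot reads whose quality collapses.</p>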
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 </code></div>
</div>
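<p>The <code>SLIDINGWINDOW:4:20</code> setting can be pictured as follows: scan the read from the 5' end with a 4-base window and cut as soon as the window's average quality falls below 20. The sketch below is a simplified illustration of that idea, not Trimmomatic's actual implementation; the function name <code>sliding_window_trim</code> is hypothetical.</p>

```python
def sliding_window_trim(quals, window=4, threshold=20):
    """Return how many leading bases to keep.

    Scans 5'->3' and cuts at the start of the first window whose mean
    quality is below the threshold. Reads shorter than the window are
    returned untrimmed in this simplified version.
    """
    for start in range(0, len(quals) - window + 1):
        chunk = quals[start:start + window]
        if threshold > sum(chunk) / window:
            return start
    return len(quals)


# Made-up quality scores: good bases followed by a low-quality tail.
quals = [30, 30, 30, 30, 30, 10, 10, 10, 10]
kept = sliding_window_trim(quals)
survives_minlen = kept >= 36  # mirrors the MINLEN:36 filter
print(kept, survives_minlen)
```

<p>In this toy read only the first four bases survive, so a <code>MINLEN:36</code> filter would discard it entirely.</p>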
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, genome size, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
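<p>For intuition about the statistics QUAST reports, here is a minimal sketch of N50 and GC content computed from contig lengths and sequences. The function names are hypothetical, and QUAST's own numbers may differ slightly (for example, it filters out short contigs by default).</p>

```python
def n50(lengths):
    """Length L such that contigs of length L or longer cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def gc_fraction(sequences):
    """Fraction of G and C bases across all sequences (ambiguous bases count as non-GC)."""
    joined = "".join(sequences).upper()
    if not joined:
        return 0.0
    return (joined.count("G") + joined.count("C")) / len(joined)


# Made-up contig lengths: total is 280, so half the assembly is 140 bases.
print(n50([100, 80, 50, 30, 20]))
print(gc_fraction(["GGCC", "ATAT"]))
```

<p>A higher N50 generally indicates a more contiguous assembly, but it says nothing about correctness, which is why completeness checks such as BUSCO are run alongside it.</p>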
</li>
<li>
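<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by searching for conserved single-copy genes. Choose the lineage dataset (<code>-l</code>) that matches your organism; <code>fungi_odb10</code> below is only an example:</p>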
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />Link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>Opera-LG</strong> if required.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>Maker</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
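<p>Coverage in submission metadata is usually the average sequencing depth: total sequenced bases divided by genome size. A minimal sketch, using a hypothetical helper <code>estimated_coverage</code> and made-up numbers:</p>

```python
def estimated_coverage(read_lengths, genome_size):
    """Average depth = total sequenced bases / genome size (both in bases)."""
    if not genome_size > 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# 1,000 reads of 150 bases over a 5,000-base genome.
depth = estimated_coverage([150] * 1000, 5000)
print(f"{depth:.1f}x")
```

<p>In practice the genome size would be the assembled length or a prior estimate, so the reported value is an approximation.</p>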
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44775/genomic-architecture-surrounding-the-fusion-site-of-human-chromosome-2</guid>
	<pubDate>Tue, 04 Mar 2025 12:26:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44775/genomic-architecture-surrounding-the-fusion-site-of-human-chromosome-2</link>
	<title><![CDATA[Genomic architecture surrounding the fusion site of human chromosome 2]]></title>
	<description><![CDATA[<p>The article <strong>"Genomic Structure and Evolution of the Ancestral Chromosome Fusion Site in 2q13&ndash;2q14.1 and Paralogous Regions on Other Human Chromosomes (https://pmc.ncbi.nlm.nih.gov/articles/PMC187548/)"</strong> explores the genomic architecture surrounding the fusion site of human chromosome 2. This fusion event is a key evolutionary marker distinguishing humans from other great apes, as humans have 46 chromosomes while chimpanzees, gorillas, and orangutans possess 48. The fusion occurred through an end-to-end joining of two ancestral chromosomes, which remain separate in nonhuman primates.</p><h3><strong>Key Findings:</strong></h3><ol>
<li>
<p><strong>Chromosomal Fusion and Its Molecular Signature:</strong></p>
<ul>
<li>The fusion site is located at <strong>2q13&ndash;2q14.1</strong> and is characterized by <strong>degenerate telomeric sequences</strong> appearing interstitially, indicating the historical head-to-head joining of ancestral chromosomes.</li>
<li>Despite being a signature of a past fusion event, these telomeric repeats are no longer functional and have undergone sequence degradation over time.</li>
</ul>
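<p>As a rough illustration of what a head-to-head telomeric signature looks like in sequence, the sketch below searches for a run of the vertebrate telomere repeat <code>TTAGGG</code> followed closely by a run of its reverse complement <code>CCCTAA</code>. It matches exact repeats only, so it is an assumption-laden toy: the real fusion site is degenerate and would need approximate matching. The function name and the <code>max_gap</code> parameter are hypothetical.</p>

```python
import re

FORWARD = re.compile(r"(?:TTAGGG){2,}")  # telomere repeat array, forward strand
REVERSE = re.compile(r"(?:CCCTAA){2,}")  # the same array, reverse complement


def head_to_head_junctions(sequence, max_gap=50):
    """Return (forward_end, reverse_start) pairs where a forward array is
    followed within max_gap bases by a reverse-complement array."""
    sequence = sequence.upper()
    junctions = []
    for fwd in FORWARD.finditer(sequence):
        rev = REVERSE.search(sequence, fwd.end())
        if rev and max_gap >= rev.start() - fwd.end():
            junctions.append((fwd.end(), rev.start()))
    return junctions


# Synthetic example: three forward repeats directly abutting three inverted repeats.
toy = "ACGT" + "TTAGGG" * 3 + "CCCTAA" * 3 + "ACGT"
print(head_to_head_junctions(toy))
```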
</li>
<li>
<p><strong>Extensive Duplications in the Surrounding Genomic Region:</strong></p>
<ul>
<li>The study identifies <strong>large-scale segmental duplications</strong> flanking the fusion site, with several of these regions duplicated and scattered across multiple chromosomes.</li>
<li>These duplications are predominantly located in <strong>subtelomeric and pericentromeric regions</strong>, suggesting their role in genomic instability and chromosomal evolution.</li>
</ul>
</li>
<li>
<p><strong>Paralogous Regions and Their Evolutionary Relationships:</strong></p>
<ul>
<li>A <strong>168-kilobase (kb) segment</strong> near the fusion site has <strong>98%&ndash;99% sequence identity</strong> with three regions on <strong>chromosome 9 (9pter, 9p11.2, and 9q13)</strong>.</li>
<li>Another <strong>67-kb region distal to the fusion site</strong> shows a high degree of homology to sequences in <strong>chromosome 22qter</strong>.</li>
<li>Additionally, a <strong>100-kb segment</strong> exhibits <strong>96% sequence identity</strong> with a region in <strong>chromosome 2q11.2</strong>.</li>
</ul>
</li>
<li>
<p><strong>Comparative Genomics and Evolutionary Implications:</strong></p>
<ul>
<li>By comparing the duplicated sequences and their arrangement in primates, the researchers traced the order of duplication events leading to their present distribution.</li>
<li>The presence of specific repetitive elements within these duplicated segments serves as <strong>evolutionary markers</strong> that help infer their historical rearrangements.</li>
<li>Some of these <strong>duplicated regions are associated with chromosomal inversion breakpoints</strong>, potentially contributing to evolutionary changes in primates.</li>
<li>Recurrent <strong>structural rearrangements</strong> in these regions have been linked to human chromosomal disorders.</li>
</ul>
</li>
</ol><h3><strong>Conclusions and Implications:</strong></h3><ul>
<li>The findings provide valuable insights into <strong>the structural evolution of human chromosome 2</strong>, which played a crucial role in human speciation.</li>
<li>Understanding these <strong>segmental duplications</strong> and their evolutionary trajectories sheds light on <strong>genomic instability</strong>, which may contribute to <strong>human genetic diseases</strong>.</li>
<li>The study highlights how large-scale chromosomal rearrangements, such as fusion and duplication, have influenced the <strong>evolutionary divergence of humans</strong> from other primates.</li>
</ul><p>This research advances our understanding of <strong>human genome evolution</strong> and offers a foundation for studying the effects of <strong>structural variants in genetic disorders</strong>.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/2791/ncbi-psi-blast-tutorial</guid>
	<pubDate>Fri, 23 Aug 2013 02:25:02 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/2791/ncbi-psi-blast-tutorial</link>
	<title><![CDATA[NCBI PSI-BLAST Tutorial]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/T3kHEieyylk" frameborder="0" allowfullscreen></iframe>http://www.biotechnology.jhu.edu/
Tutorial for PSI-BLAST, an extension of BLAST that uses matrix algebra. BLAST, the Basic Local Alignment Search Tool, is a cornerstone bioinformatics tool at NCBI.
It finds protein and DNA sequences that are related to a sequence that the user provides.]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/16685/webinar-blast-in-the-cloud</guid>
	<pubDate>Mon, 15 Sep 2014 17:29:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/16685/webinar-blast-in-the-cloud</link>
	<title><![CDATA[Webinar: BLAST in the Cloud]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/wLm-RfdcvnU" frameborder="0" allowfullscreen></iframe>Presented July 30, 2014 and covering: an NCBI BLAST AMI at Amazon Web Services; introduction to AWS and setting up an instance; running command line BLAST and using the BLAST URL API via the AMI; and answers to attendee questions.]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/27344/orffinder-with-smart-blast</guid>
	<pubDate>Tue, 17 May 2016 01:43:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/27344/orffinder-with-smart-blast</link>
	<title><![CDATA[ORFfinder with smart BLAST]]></title>
	<description><![CDATA[<p><span>ORF Finder</span></p><p><span><a href="http://www.ncbi.nlm.nih.gov/orffinder">ORFfinder</a><span>&nbsp;is a graphical analysis tool for finding open reading frames (ORFs). We&rsquo;ve been working on a few updates, and we&rsquo;d like to find out what you think about them. Read on to find out what you can do with the new ORFfinder.</span></span></p><p>Smart BLAST (https://ncbiinsights.ncbi.nlm.nih.gov/2015/07/29/smartblast/)</p><p>Select one or a group of ORFs and BLAST several databases at once, and use the newly developed&nbsp;<a href="http://blast.ncbi.nlm.nih.gov/smartblast/">SmartBLAST</a>&nbsp;to verify protein names.&nbsp;Looking for the traditional results from&nbsp;<a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi">BLAST</a>? They&rsquo;re there too.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/38226/ncbi-to-assist-in-virus-hunting-data-science-hackathon</guid>
	<pubDate>Thu, 15 Nov 2018 12:55:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/38226/ncbi-to-assist-in-virus-hunting-data-science-hackathon</link>
	<title><![CDATA[NCBI to assist in Virus Hunting Data Science Hackathon]]></title>
	<description><![CDATA[<p>The NCBI Hackathon team is pleased to announce the second installment of the&nbsp;<a href="https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/30/ncbi-southern-california-genomics-hackathon-january/" target="_blank">SoCal Bioinformatics Hackathon</a>. From January 9-11, 2019, the&nbsp;<a href="https://www.ncbi.nlm.nih.gov/" target="_blank">NCBI</a>&nbsp;will help run a bioinformatics hackathon in Southern California hosted by the&nbsp;<a href="http://www.csrc.sdsu.edu/" target="_blank">Computational Sciences Research Center</a>&nbsp;at&nbsp;<a href="http://www.sdsu.edu/" target="_blank">San Diego State University</a>!</p><p>The <span>NCBI Hackathon</span>&nbsp;team is specifically looking for people with experience in computational virus hunting or adjacent fields to identify known, taxonomically definable, and novel viruses from a few hundred thousand metagenomic datasets that will be placed on cloud infrastructure. This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for virological analyses from high-throughput experiments. If this describes you, please&nbsp;<a href="https://goo.gl/forms/kDnSG0IAZD62XQRe2" target="_blank">apply</a>! The event is open to anyone selected for the hackathon and willing to travel to SDSU (see below).</p><p>https://ncbiinsights.ncbi.nlm.nih.gov/2018/11/09/ncbi-sdsu-virus-hunting-data-science-hackathon-january-2019/</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/42370/ncbi-blast-have-added-new-columns-to-the-descriptions</guid>
	<pubDate>Tue, 01 Dec 2020 09:56:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/42370/ncbi-blast-have-added-new-columns-to-the-descriptions</link>
	<title><![CDATA[NCBI BLAST have added new columns to the Descriptions]]></title>
	<description><![CDATA[<p><span>NCBI BLAST has added new columns to the Descriptions Table for web BLAST output. The new columns are Scientific Name, Common Name, Taxid, and Accession Length. Common Name and Accession Length are now part of the default display. You can click 'Select columns' or 'Manage columns' to add or remove columns from the display. Your preferences will be saved for your next visit to BLAST, and when you download your results, whatever columns you have displayed will be saved. See the NCBI Insights post (</span><a href="https://go.usa.gov/x7fPE" target="_blank">https://go.usa.gov/x7fPE</a><span>) for more details.</span></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/2839/look-up-a-biological-numbers</guid>
	<pubDate>Fri, 23 Aug 2013 03:27:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/2839/look-up-a-biological-numbers</link>
	<title><![CDATA[Look up a biological numbers]]></title>
	<description><![CDATA[<p><strong>Did you ever need to look up a number</strong><span>&nbsp;like the volume of a cell or the cellular concentration of ATP, only to find yourself spending much more time than you wanted on the Internet or flipping through textbooks - all without much success?&nbsp;</span><br><br><span>Well, it didn&rsquo;t happen only to you. It is often surprising how difficult it can be to find concrete biological numbers, even for properties that have been measured numerous times. To help solve this for one and all, BioNumbers (</span><strong>the database of key numbers in molecular biology</strong><span>) was created. Along with the numbers, you'll find the relevant&nbsp;</span><strong>references to the original literature</strong><span>, useful comments, and related numbers.&nbsp;</span></p>
<p><span><span>To cite BioNumbers please refer to: Milo et al. Nucl. Acids Res. (2010) 38: D750-D753. When using a specific entry from the database it is highly recommended that you also specify the BioNumbers 6 digit ID, e.g. "BNID 100986, Milo et al 2010".&nbsp;</span></span></p><p>Address of the bookmark: <a href="http://bionumbers.hms.harvard.edu/" rel="nofollow">http://bionumbers.hms.harvard.edu/</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/4209/enzyme-portal</guid>
	<pubDate>Tue, 03 Sep 2013 18:06:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/4209/enzyme-portal</link>
	<title><![CDATA[Enzyme Portal]]></title>
	<description><![CDATA[<p><span>Enzyme Portal-&nbsp;To look for information about the biology of a protein with enzymatic activity.</span></p>
<p><span>The enzyme portal integrates many resources, most of them hosted by EBI and also external ones such as BioPortal. Its main goal is to provide information about enzymes in a suitable format, with a usable interface designed for intended users. Instead of reinventing the wheel, it makes use of available and reliable resources to that end.</span></p>
<p><span><strong>Related Literature</strong>:</span></p>
<p><span><a href="http://nar.oxfordjournals.org/content/41/D1/D773.full">http://nar.oxfordjournals.org/content/41/D1/D773.full</a></span></p>
<p><span><a href="http://www.biomedcentral.com/1471-2105/14/103">http://www.biomedcentral.com/1471-2105/14/103</a></span></p><p>Address of the bookmark: <a href="http://www.ebi.ac.uk/enzymeportal/" rel="nofollow">http://www.ebi.ac.uk/enzymeportal/</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39281/humcfs-a-database-of-fragile-sites-in-human-chromosomes</guid>
	<pubDate>Sun, 21 Apr 2019 20:17:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39281/humcfs-a-database-of-fragile-sites-in-human-chromosomes</link>
	<title><![CDATA[HumCFS: a database of fragile sites in human chromosomes]]></title>
	<description><![CDATA[<p>Fragile sites are specific chromosomal regions that exhibit an increased frequency of chromosomal breakage when cells are exposed to replicative stress. Since the discovery of chromosomal fragile sites/regions (CFS), several lines of evidence have suggested their involvement in human pathologies, and they have been recognized as preferential sites for integration of exogenous oncogenic DNA viruses and as hotspots for chromosomal rearrangement. There is a large gap in our knowledge of human CFS regions because information about them is unevenly distributed across the literature, which makes these regions difficult to study. To address these issues, we developed the HumCFS platform, which provides comprehensive information about experimentally identified CFS in a single source.</p>
<p>https://link.springer.com/epdf/10.1186/s12864-018-5330-5?author_access_token=ICASEpyMAQaxLlKw--fyCG_BpE1tBhCbnbw3BuzI2RMA57KLmXk5bZabRUiDQzRFHXd6hjm4kWSiLV3mU5XVMitqXUwFMSo4x5vbfty0EDQ9PW1sd1h923_TYXkvJ5niSwAyZ7BklJ0ujFAFhcKtjw%3D%3D</p><p>Address of the bookmark: <a href="https://webs.iiitd.edu.in/raghava/humcfs/" rel="nofollow">https://webs.iiitd.edu.in/raghava/humcfs/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>