<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/35923?offset=30</link>
	<atom:link href="https://bioinformaticsonline.com/related/35923?offset=30" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/22770/blast-updated</guid>
	<pubDate>Tue, 16 Jun 2015 16:55:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/22770/blast-updated</link>
	<title><![CDATA[BLAST+ updated !!!]]></title>
	<description><![CDATA[<p>A new version (2.2.31) of the stand-alone BLAST executables (Linux, Windows and Mac OS X, on <a href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST">FTP</a>) is now available. New features include support for the BLAST-XML2 specification (information <a href="ftp://ftp.ncbi.nlm.nih.gov/blast/documents/NEWXML/xml2.pdf">here</a>) and the JSON BLAST output format, as well as several bug fixes and improvements. The BLAST AMI at AWS will also be updated to 2.2.31 (see this BLAST Help page for more <a href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&amp;PAGE_TYPE=BlastDocs&amp;DOC_TYPE=CloudBlast">information</a>). For a full list of improvements, see the <a href="http://www.ncbi.nlm.nih.gov/books/NBK131777">release notes</a>.</p><p>More at http://www.ncbi.nlm.nih.gov/news/06-16-2015-blast-plus-update/?</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31526/sequenceserver</guid>
	<pubDate>Fri, 10 Mar 2017 08:51:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31526/sequenceserver</link>
	<title><![CDATA[sequenceserver]]></title>
	<description><![CDATA[<p><span>SequenceServer lets you rapidly set up a BLAST+ server with an intuitive user interface for use locally or over the web.</span></p>
<p><span><span>More at&nbsp;</span><a href="http://sequenceserver.com/">http://sequenceserver.com</a><span>.</span></span></p><p>Address of the bookmark: <a href="https://github.com/wurmlab/sequenceserver" rel="nofollow">https://github.com/wurmlab/sequenceserver</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/39865/blast-nr-version-5-database-nr-v5</guid>
	<pubDate>Fri, 23 Aug 2019 11:35:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/39865/blast-nr-version-5-database-nr-v5</link>
	<title><![CDATA[BLAST nr version 5 database, (nr_v5)]]></title>
	<description><![CDATA[<p>NCBI has changed the nr version 5 database (nr_v5) used by webBLAST, which is also available to&nbsp;BLAST+ users, to give better search results and improved performance by reducing the number of redundant titles.</p><p>The changes in nr preserve the taxonomic diversity of the entries in the database while reducing the number of titles for identical sequences. GenPept accessions are still accessible via&nbsp;<a href="http://www.ncbi.nlm.nih.gov/protein/$GENBANK_ACCESSION" target="_blank">www.ncbi.nlm.nih.gov/protein/$GENBANK_ACCESSION</a>&nbsp;or the IPG website&nbsp;<a href="https://www.ncbi.nlm.nih.gov/ipg/" target="_blank">https://www.ncbi.nlm.nih.gov/ipg/</a>.</p><p>The "Identical Proteins" link in the alignments section of the webBLAST results takes you to a full list of all accessions associated with a sequence.</p><p>For&nbsp;BLAST+ users downloading nr_v5: the database is now approximately 50% smaller, resulting in faster downloads and&nbsp;BLAST&nbsp;searches, and smaller disk space requirements. The database is downloadable at:&nbsp;<a href="ftp://ftp.ncbi.nlm.nih.gov/blast/db/v5/" target="_blank">ftp://ftp.ncbi.nlm.nih.gov/blast/db/v5/</a></p><p>For&nbsp;BLAST+ there is a cleanup script to help you manage the transition to this smaller database.
The script removes unused database volumes:&nbsp;<a href="ftp://ftp.ncbi.nlm.nih.gov/blast/temp/cleanup-blastdb-volumes.py" target="_blank">ftp://ftp.ncbi.nlm.nih.gov/blast/temp/cleanup-blastdb-volumes.py</a></p><p>Here are the new rules on how we keep titles in nr_v5:</p><ol><li>We keep all RefSeq, Swiss-Prot, PIR and PDB titles.</li><li>We keep any GenPept titles with a TAXID that has not already been seen in the record.</li><li>We keep at least five GenPept titles regardless of whether the TAXIDs have already been seen in this record.</li></ol>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/41586/primer-blast</guid>
	<pubDate>Tue, 28 Apr 2020 00:28:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/41586/primer-blast</link>
	<title><![CDATA[Primer BLAST !]]></title>
	<description><![CDATA[<p>The BLAST team has added a new feature (Max 3' match), shown in Figure 1, to Primer-BLAST that limits the length of 3' exon matches when designing exon-exon spanning primers. This makes it less likely that primers specifically designed to amplify transcripts will also amplify genomic DNA contamination in expression assays. See the NCBI Insights post (<a href="https://go.usa.gov/xvUT4" target="_blank"><span>https://go.usa.gov/xvUT4</span></a>) for more details.</p><p><span>If you have any questions or concerns, please contact&nbsp;<a href="mailto:blast-help@ncbi.nlm.nih.gov" target="_blank">blast-help@ncbi.nlm.nih.gov</a>.</span></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/44515/cleaner-blast-databases-for-more-accurate-results</guid>
	<pubDate>Tue, 23 Apr 2024 01:23:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/44515/cleaner-blast-databases-for-more-accurate-results</link>
	<title><![CDATA[Cleaner BLAST Databases for More Accurate Results]]></title>
	<description><![CDATA[<p>Do you use&nbsp;<a href="https://blast.ncbi.nlm.nih.gov/Blast.cgi?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=blast-cleaner-20240422">BLAST</a><span style="font-size: 12.8px; font-weight: normal;">&nbsp;to identify a sequence or the evolutionary scope of a gene? That can be challenging if contaminated and misclassified sequences are in the BLAST databases and show up in your search results. To address</span><span style="font-size: 12.8px; font-weight: normal;">&nbsp;this problem</span><span style="font-size: 12.8px; font-weight: normal;">, we now use the NCBI quality assurance tools listed below to systematically remove these misleading sequences from the default nucleotide (nt) and protein (nr) BLAST databases.</span><span style="font-size: 12.8px; font-weight: normal;">&nbsp;</span></p><div><ul>
<li><a href="https://github.com/ncbi/fcs">Foreign Contamination Screen tool for genome cross-species screening (FCS-GX)</a>&nbsp;detects contamination from foreign organisms in genomes and other sequences using the genome cross-species aligner (GX)&nbsp;</li>
<li><a href="https://ncbiinsights.ncbi.nlm.nih.gov/2022/05/27/ani-for-assembly-validation?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=blast-cleaner-20240422">Average Nucleotide Identity (ANI)</a>&nbsp;evaluates the taxonomic classification of prokaryotic genome assemblies. Sequences from genomes marked up as &lsquo;unverified source organism&rsquo; are considered suspect and removed.&nbsp;</li>
</ul><p>Ref&nbsp;https://ncbiinsights.ncbi.nlm.nih.gov/2024/04/22/cleaner-blast-databases-more-accurate-results/</p></div>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43424/rest-api</guid>
	<pubDate>Mon, 04 Oct 2021 12:46:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43424/rest-api</link>
	<title><![CDATA[REST API]]></title>
	<description><![CDATA[<h3 id="PSIBLASTHelpandDocumentation-RESTAPI">REST API</h3><p>Sample clients for the&nbsp;<a href="https://www.ebi.ac.uk/seqdb/confluence/pages/viewpage.action?pageId=68165098">Representational State Transfer (REST)</a>&nbsp;service are provided for a number of programming languages. For details of how to use a client,&nbsp;<a href="https://github.com/ebi-wp/webservice-clients">download</a>&nbsp;it and run the program without any arguments.</p><div><table><colgroup><col><col><col></colgroup>
<thead>
<tr><th scope="col">
<div>Language</div>
</th><th scope="col">
<div>Download</div>
</th><th scope="col">
<div>Requirements</div>
</th></tr>
</thead>
<tbody>
<tr><th>Perl</th>
<td><a href="https://raw.githubusercontent.com/ebi-wp/webservice-clients/master/perl/psiblast.pl">psiblast.pl</a></td>
<td><a href="http://search.cpan.org/perldoc?LWP">LWP</a>&nbsp;and&nbsp;<a href="http://search.cpan.org/perldoc?XML::Simple">XML::Simple</a></td>
</tr>
<tr><th colspan="1">
<h4 id="PSIBLASTHelpandDocumentation-Python">Python</h4>
</th>
<td colspan="1">
<p><a href="https://raw.githubusercontent.com/ebi-wp/webservice-clients/master/python/psiblast.py">psiblast.py</a></p>
</td>
<td colspan="1"><a href="https://pypi.python.org/pypi/xmltramp2/3.0.10" title="https://pypi.python.org/pypi/xmltramp2/3.0.10">xmltramp2</a></td>
</tr>
</tbody>
</table></div><p>For details see&nbsp;<a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Environment+setup+for+REST+Web+Services">Environment setup for REST Web Services</a>&nbsp;and&nbsp;<a href="https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Examples+for+Perl+REST+Web+Services+Clients">Examples for Perl REST Web Services Clients</a>&nbsp;pages.</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36893/beap-blast-extension-and-assembly-program</guid>
	<pubDate>Mon, 11 Jun 2018 04:52:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36893/beap-blast-extension-and-assembly-program</link>
	<title><![CDATA[BEAP: Blast Extension and Assembly Program]]></title>
	<description><![CDATA[The Blast Extension and Assembly Program (BEAP) is a computer program that uses a short starting DNA fragment, often an EST or partial gene segment, as a "primer" to recursively BLAST nucleotide databases in an attempt to obtain all sequences that overlap, directly or indirectly, with the "primer". This helps to "extend" the original sequence towards a "full length" sequence for functional analysis, or at least to obtain neighboring regions of the segment for SNP discovery and linkage disequilibrium analysis. Confidence in the assembly of the resulting sequences is achieved by using a known genome, such as the human genome, as a reference.
 
https://www.animalgenome.org/tools/beap/<p>Address of the bookmark: <a href="https://www.animalgenome.org/tools/beap/" rel="nofollow">https://www.animalgenome.org/tools/beap/</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/17176/arvados</guid>
	<pubDate>Sat, 20 Sep 2014 16:54:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/17176/arvados</link>
	<title><![CDATA[Arvados]]></title>
	<description><![CDATA[<p>Arvados is a free and open&nbsp;source bioinformatics&nbsp;platform for genomic and&nbsp;biomedical data. Users can&nbsp;Store | Organize | Compute | Share their data for free.</p>
<p><img src="https://arvados.org/images/dax.png" width="400" height="535" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://arvados.org/" rel="nofollow">https://arvados.org/</a></p>]]></description>
	<dc:creator>Martin Jones</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</guid>
	<pubDate>Fri, 13 Dec 2024 11:35:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44722/step-by-step-guide-to-running-genome-assembly</link>
	<title><![CDATA[Step-by-Step Guide to Running Genome Assembly]]></title>
	<description><![CDATA[<p>Genome assembly is a critical process in bioinformatics, enabling the reconstruction of an organism's genome from short DNA sequence reads. Whether you&rsquo;re working on a new microbial genome or a complex eukaryotic organism, this guide will walk you through the steps of genome assembly using state-of-the-art tools and best practices.</p><h4><strong>What is Genome Assembly?</strong></h4><p>Genome assembly involves piecing together short DNA sequence reads generated by sequencing platforms (e.g., Illumina, PacBio, Oxford Nanopore) into longer, contiguous sequences called contigs. This can be performed as:</p><ul>
<li><strong>De Novo Assembly</strong>: Without a reference genome.</li>
<li><strong>Reference-Guided Assembly</strong>: Using a reference genome to guide the assembly process.</li>
</ul><h4><strong>Step 1: Preparing Your Data</strong></h4><p>Before starting the assembly, ensure that your raw sequencing data is high quality.</p><ol>
<li>
<p><strong>Input Data</strong></p>
<ul>
<li><strong>Short Reads</strong>: Illumina sequencing generates short, accurate reads that are well suited to base-level accuracy and to polishing long-read assemblies.</li>
<li><strong>Long Reads</strong>: PacBio and Nanopore sequencing provide long reads for resolving repetitive regions.</li>
</ul>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use tools like <strong>FastQC</strong> or <strong>MultiQC</strong> to assess the quality of your reads:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq<br />multiqc . </code></div>
</div>
<p>Look for issues like low-quality bases, adapter contamination, or overrepresented sequences.</p>
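<p>To make the "low-quality bases" check concrete, here is a minimal sketch of the kind of per-position quality summary that FastQC plots. It is not FastQC's implementation: the function names are hypothetical, and it assumes uncompressed 4-line FASTQ records with Phred+33 quality encoding.</p>

```python
def read_fastq(lines):
    """Yield (name, sequence, quality_string) from 4-line FASTQ records.

    Assumption: each record is exactly four lines (no wrapped sequences).
    """
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def mean_quality_per_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    Assumption: qualities are Phred+33 encoded (offset=33).
    """
    totals, counts = [], []
    for qual in quality_strings:
        for pos, char in enumerate(qual):
            if pos == len(totals):
                # First read long enough to reach this position.
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]
```

<p>Passing an open file handle to <code>read_fastq</code> and the resulting quality strings to <code>mean_quality_per_position</code> gives one mean score per read position; a drop towards the 3' end is the usual sign that trimming is needed.</p>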
</li>
<li>
<p><strong>Read Trimming and Filtering</strong><br />Trim low-quality bases and adapters using <strong>Trimmomatic</strong> or <strong>Cutadapt</strong>:</p>
<div>
<div dir="ltr"><code>trimmomatic PE reads_R1.fastq reads_R2.fastq trimmed_R1.fastq unpaired_R1.fastq trimmed_R2.fastq unpaired_R2.fastq ILLUMINACLIP:adapters.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 </code></div>
</div>
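<p>The <code>SLIDINGWINDOW:4:20</code> setting in the command above can be pictured with the simplified sketch below: scan a 4-base window from the 5' end and cut the read where the window's mean quality first drops below 20. This is an illustration only, not Trimmomatic's code; the function names are hypothetical, Phred+33 encoding is assumed, and the real tool's handling of read ends may differ.</p>

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumption: Phred+33 encoding (offset=33).
    """
    return [ord(char) - offset for char in quality_string]


def sliding_window_keep(qualities, window=4, min_mean=20):
    """Return the number of leading bases to keep.

    The read is cut at the start of the first window whose mean quality
    is below min_mean. Reads with no failing window (including reads
    shorter than the window) are kept whole in this simplified version.
    """
    for start in range(0, len(qualities) - window + 1):
        chunk = qualities[start:start + window]
        if min_mean > sum(chunk) / window:
            return start
    return len(qualities)
```

<p>A read would then be shortened with <code>sequence[:sliding_window_keep(phred_scores(quality))]</code> and discarded if the result is shorter than the <code>MINLEN</code> value.</p>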
</li>
</ol><h4><strong>Step 2: Choosing an Assembly Strategy</strong></h4><p>Select an assembly strategy based on your data type:</p><ul>
<li>
<p><strong>Short-Read Assemblers</strong>:</p>
<ul>
<li>SPAdes: Popular for microbial genomes.</li>
<li>Velvet: Fast for smaller genomes.</li>
</ul>
</li>
<li>
<p><strong>Long-Read Assemblers</strong>:</p>
<ul>
<li>Canu: Ideal for long-read datasets.</li>
<li>Flye: Versatile for small and large genomes.</li>
</ul>
</li>
<li>
<p><strong>Hybrid Assemblers</strong>:</p>
<ul>
<li>MaSuRCA: Combines short and long reads.</li>
<li>Unicycler: Optimized for bacterial genomes.</li>
</ul>
</li>
</ul><h4><strong>Step 3: Running the Assembly</strong></h4><h5><strong>3.1. SPAdes (Short-Read Assembly)</strong></h5><p>SPAdes is an excellent choice for small genomes, such as bacteria.</p><div><div dir="ltr"><code>spades.py -1 trimmed_R1.fastq -2 trimmed_R2.fastq -o spades_output </code></div></div><p>The output includes assembled contigs (<code>contigs.fasta</code>) and scaffolds (<code>scaffolds.fasta</code>).</p><h5><strong>3.2. Canu (Long-Read Assembly)</strong></h5><p>Canu is designed for high-error long reads from PacBio or Nanopore.</p><div><div dir="ltr"><code>canu -p genome -d canu_output genomeSize=4.7m -nanopore-raw reads.fastq </code></div></div><p>The output will be in <code>canu_output/genome.contigs.fasta</code>.</p><h5><strong>3.3. Hybrid Assembly with Unicycler</strong></h5><p>Unicycler combines short and long reads for improved assemblies.</p><div><div dir="ltr"><code>unicycler -1 trimmed_R1.fastq -2 trimmed_R2.fastq -l long_reads.fastq -o unicycler_output </code></div></div><h4><strong>Step 4: Assessing Assembly Quality</strong></h4><p>After assembly, evaluate its quality using the following tools:</p><ol>
<li>
<p><strong>QUAST</strong><br />QUAST generates assembly statistics, such as N50, genome size, and GC content:</p>
<div>
<div dir="ltr"><code>quast contigs.fasta -o quast_output </code></div>
</div>
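<p>For readers who want to see what the N50 and GC content figures mean, the sketch below computes both from contig sequences. It is a minimal illustration, not QUAST's implementation; the function names are hypothetical and the FASTA parser assumes plain, uncompressed input.</p>

```python
def parse_fasta(lines):
    """Return {contig_name: sequence} from an iterable of FASTA lines."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            records[name] = []
        elif line and name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def n50(lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def gc_content(sequences):
    """Fraction of G and C bases across all sequences (0.0 if empty)."""
    bases = "".join(sequences).upper()
    if not bases:
        return 0.0
    return (bases.count("G") + bases.count("C")) / len(bases)
```

<p>For example, contigs of 100, 200, 300, 400 and 500 bp total 1,500 bp; the two longest contigs together cover 900 bp, which passes the halfway mark of 750 bp, so the N50 is 400.</p>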
</li>
<li>
<p><strong>BUSCO</strong><br />BUSCO checks genome completeness by identifying conserved genes:</p>
<div>
<div dir="ltr"><code>busco -i contigs.fasta -o busco_output -l fungi_odb10 -m genome </code></div>
</div>
</li>
<li>
<p><strong>Assembly Graph Visualization</strong><br />Visualize assembly graphs with <strong>Bandage</strong>:</p>
<div>
<div dir="ltr"><code>Bandage load assembly_graph.gfa </code></div>
</div>
</li>
</ol><hr><h4><strong>Step 5: Post-Assembly Steps</strong></h4><ol>
<li>
<p><strong>Polishing</strong><br />Improve assembly accuracy using tools like <strong>Pilon</strong> (for short reads) or <strong>Racon</strong> (for long reads).</p>
<div>
<div dir="ltr"><code>racon long_reads.fasta mapped_reads.sam contigs.fasta &gt; polished_contigs.fasta </code></div>
</div>
</li>
<li>
<p><strong>Scaffolding</strong><br />If required, link contigs into scaffolds using tools like <strong>SSPACE</strong> or <strong>OPERA-LG</strong>.</p>
</li>
<li>
<p><strong>Annotation</strong><br />Annotate the assembled genome using <strong>Prokka</strong> for prokaryotes or <strong>MAKER</strong> for eukaryotes.</p>
<div>
<div dir="ltr"><code>prokka --outdir annotation_output --prefix genome contigs.fasta </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Sharing and Archiving</strong></h4><ol>
<li>
<p><strong>Submit to Public Repositories</strong><br />Share your assembly in databases like <strong>NCBI GenBank</strong>, <strong>ENA</strong>, or <strong>DDBJ</strong>.</p>
</li>
<li>
<p><strong>Metadata Preparation</strong><br />Include detailed metadata for your submission, such as organism name, sequencing platform, and coverage.</p>
</li>
</ol><h4><strong>Best Practices</strong></h4><ul>
<li>Always perform quality checks at each stage to ensure data integrity.</li>
<li>Use multiple tools to cross-validate results when working with complex genomes.</li>
<li>Document parameters and software versions for reproducibility.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Genome assembly is a powerful process that transforms raw sequencing data into a coherent representation of an organism&rsquo;s genome. By following this step-by-step guide, you can successfully assemble genomes and uncover valuable biological insights. Whether you&rsquo;re assembling a microbial genome or tackling the complexities of a eukaryotic genome, these tools and strategies will set you on the path to success.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40754/understanding-your-reads-and-mapping</guid>
	<pubDate>Wed, 29 Jan 2020 06:29:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40754/understanding-your-reads-and-mapping</link>
	<title><![CDATA[Understanding your reads and mapping !]]></title>
	<description><![CDATA[<p>One of the best tutorials for beginners:</p>
<p>https://bioinformatics-core-shared-training.github.io/cruk-summer-school-2017/Day1/Session4-seqIntro.html</p><p>Address of the bookmark: <a href="https://bioinformatics-core-shared-training.github.io/cruk-summer-school-2017/Day1/Session4-seqIntro.html" rel="nofollow">https://bioinformatics-core-shared-training.github.io/cruk-summer-school-2017/Day1/Session4-seqIntro.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>