<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/42530?offset=0</link>
	<atom:link href="https://bioinformaticsonline.com/related/42530?offset=0" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34246/unicycler-hybrid-assembly-pipeline-for-bacterial-genomes</guid>
	<pubDate>Fri, 10 Nov 2017 03:58:27 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34246/unicycler-hybrid-assembly-pipeline-for-bacterial-genomes</link>
	<title><![CDATA[Unicycler: Hybrid assembly pipeline for bacterial genomes]]></title>
	<description><![CDATA[<p><span>Unicycler is an assembly pipeline for bacterial genomes. It can assemble&nbsp;</span><a href="http://www.illumina.com/">Illumina</a><span>-only read sets where it functions as a&nbsp;</span><a href="http://cab.spbu.ru/software/spades/">SPAdes</a><span>-optimiser. It can also assemble long-read-only sets (</span><a href="http://www.pacb.com/">PacBio</a><span>&nbsp;or&nbsp;</span><a href="https://nanoporetech.com/">Nanopore</a><span>) where it runs a&nbsp;</span><a href="https://github.com/lh3/miniasm">miniasm</a><span>+</span><a href="https://github.com/isovic/racon">Racon</a><span>&nbsp;pipeline. For the best possible assemblies, give it both Illumina reads&nbsp;</span><em>and</em><span>&nbsp;long reads, and it will conduct a hybrid assembly.</span></p><p>Address of the bookmark: <a href="https://github.com/rrwick/Unicycler" rel="nofollow">https://github.com/rrwick/Unicycler</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44561/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</guid>
	<pubDate>Sat, 08 Jun 2024 16:25:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44561/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</link>
	<title><![CDATA[Bactopia: a flexible pipeline for complete analysis of bacterial genomes]]></title>
	<description><![CDATA[<p>Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!</p>
<p>Bactopia was inspired by&nbsp;<a href="https://staphopia.github.io/">Staphopia</a>, a workflow we (Tim Read and myself) released that is targeted towards&nbsp;<em>Staphylococcus aureus</em>&nbsp;genomes. Using what we learned from Staphopia and user feedback, Bactopia was developed from scratch with usability, portability, and speed in mind from the start.</p>
<p>Bactopia uses&nbsp;<a href="https://www.nextflow.io/">Nextflow</a>&nbsp;to manage the workflow, allowing for support of many types of environments (e.g. cluster or cloud). Bactopia allows for the usage of many public datasets as well as your own datasets to further enhance the analysis of your sequencing. Bactopia only uses software packages available from&nbsp;<a href="https://bioconda.github.io/">Bioconda</a>&nbsp;and&nbsp;<a href="https://conda-forge.org/">Conda-Forge</a>&nbsp;to make installation as simple as possible for&nbsp;<em>all</em>&nbsp;users.</p>
<p>To highlight the use of&nbsp;<a href="https://bactopia.github.io/latest/full-guide/">Bactopia</a>&nbsp;and&nbsp;<a href="https://bactopia.github.io/latest/bactopia-tools/">Bactopia Tools</a>, we performed an analysis of 1,664 public&nbsp;<em>Lactobacillus</em>&nbsp;genomes, focusing on&nbsp;<em>Lactobacillus crispatus</em>, a species that is a common part of the human vaginal microbiome. The results from this analysis are published in mSystems under the title:&nbsp;<em><a href="https://doi.org/10.1128/mSystems.00190-20">Bactopia: a flexible pipeline for complete analysis of bacterial genomes</a></em></p>
<p><a href="https://bactopia.github.io/latest/assets/bactopia-workflow.png"><img src="https://bactopia.github.io/latest/assets/bactopia-workflow.png" alt="Bactopia Workflow" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://bactopia.github.io/latest/" rel="nofollow">https://bactopia.github.io/latest/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44539/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</guid>
	<pubDate>Wed, 15 May 2024 14:36:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44539/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</link>
	<title><![CDATA[Bactopia: a Flexible Pipeline for Complete Analysis of Bacterial Genomes]]></title>
	<description><![CDATA[<p dir="auto">Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!</p>
<p dir="auto">Bactopia can be split into two main parts:&nbsp;<a href="https://bactopia.github.io/latest/beginners-guide/">Bactopia Analysis Pipeline</a>, and&nbsp;<a href="https://bactopia.github.io/latest/bactopia-tools/">Bactopia Tools</a>.</p>
<p dir="auto">Bactopia Analysis Pipeline is the main&nbsp;<em>per-isolate</em>&nbsp;workflow in Bactopia. Built with&nbsp;<a href="https://www.nextflow.io/">Nextflow</a>, input FASTQs (local or available from SRA/ENA) are put through numerous analyses including: quality control, assembly, annotation, minmer sketch queries, sequence typing, and more.</p>
<p dir="auto"><a href="https://github.com/bactopia/bactopia/blob/master/data/bactopia-workflow.png" target="_blank"><img src="https://github.com/bactopia/bactopia/raw/master/data/bactopia-workflow.png" alt="Bactopia Overview" style="border: 0px;"></a></p>
<p dir="auto">Bactopia Tools are a set of independent workflows fo</p><p>Address of the bookmark: <a href="https://github.com/bactopia/bactopia" rel="nofollow">https://github.com/bactopia/bactopia</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36800/genomemapper-simultaneous-alignment-of-short-reads-against-multiple-genomes</guid>
	<pubDate>Fri, 25 May 2018 09:29:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36800/genomemapper-simultaneous-alignment-of-short-reads-against-multiple-genomes</link>
	<title><![CDATA[GenomeMapper: Simultaneous alignment of short reads against multiple genomes]]></title>
	<description><![CDATA[GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. It can be used to align against multiple genomes simultaneously or against a single reference. If you are unsure which one is the appropriate GenomeMapper, you might want to use the latter.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2768987/<p>Address of the bookmark: <a href="http://1001genomes.org/software/genomemapper.html" rel="nofollow">http://1001genomes.org/software/genomemapper.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26906/paired-end-assembler-for-dna-sequences</guid>
	<pubDate>Wed, 06 Apr 2016 05:25:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26906/paired-end-assembler-for-dna-sequences</link>
	<title><![CDATA[PAired-eND Assembler for DNA sequences]]></title>
	<description><![CDATA[<p>PANDASEQ is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence.</p>
<p>More at https://github.com/neufeld/pandaseq</p><p>Address of the bookmark: <a href="https://github.com/neufeld/pandaseq" rel="nofollow">https://github.com/neufeld/pandaseq</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>In this protocol, we describe the de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>Genome assembly is the process of taking a large number of short DNA sequences and putting them back together to reconstruct the original chromosomes from which the DNA originated. De novo genome assembly assumes no prior knowledge of the source DNA sequence length, structure or composition. In a genome sequencing experiment, the DNA of the target organism is broken up into millions of small fragments and read on a sequencing machine. Depending on the sequencing technology used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Illumina-style short-read sequencing usually produces reads of 36 - 150 bp. These reads can be either &ldquo;single ended&rdquo; as described above or &ldquo;paired end.&rdquo;</p><p><strong>Why genome assembly?</strong><br />Determining the DNA sequence of an organism is useful in basic research into why and how organisms live, as well as in applied subjects. Because of the importance of DNA to living things, knowledge of a DNA sequence may be useful in virtually any biological research. For example, in medicine it may be used to identify, diagnose and potentially develop treatments for genetic diseases. Similarly, research into pathogens can lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be stored as plain text in a Fasta file or, together with their attributes, in a FastQ file.&nbsp;FastQ is the most common read file format, since this is what the Illumina sequencing pipeline produces. It is therefore the format this protocol focuses on.</p><p><strong>In a nutshell the protocol:</strong> <br />Obtain the sequence read file(s) from the sequencing machine(s). <br />Look at the reads - get an idea of what you have and what their quality is like. 
<br />Clean up or quality-trim the raw data, if required. <br />Choose an appropriate parameter set for the assembly. <br />Assemble the data into contigs/scaffolds. <br />Examine the assembly output and assess the quality of the assembly.</p><p><strong>Read Quality Control:</strong><br />Check the quality with FastQC.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This step trims adapters, barcodes and other contaminants from the reads.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />The aim of this part of the protocol is to describe how to assemble the quality-trimmed reads into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A wide range of short-read assemblers is available, each with its own strengths and weaknesses. <br /><em>Some of the assemblers available include:</em><br />Velvet<br />SOAP-denovo<br />MIRA<br />ALLPATHS</p><p>The next step is to assess the quality of the draft set of contigs and decide what to do with it for the remainder of the analysis.&nbsp;A few things to note about the contigs you just created:&nbsp;They are draft contigs. 
They may contain mis-assemblies.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assembly http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />Gap filler - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequently used commands for the analysis are at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36630/frequent-paired-end-reads-pe-2x100-mapping-command-lines</guid>
	<pubDate>Tue, 15 May 2018 08:59:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36630/frequent-paired-end-reads-pe-2x100-mapping-command-lines</link>
	<title><![CDATA[Frequent Paired-end reads (PE 2x100) mapping command lines]]></title>
	<description><![CDATA[
<p>bowtie2 -x hs37m -X 650 -q -1 r1.fq -2 r2.fq -S r12.bowtie2.sam  </p>

<p>bwa aln hs37m.fa r1.fq &gt; r1.sai &amp;&amp; bwa aln hs37m.fa r2.fq &gt; r2.sai \  <br />    &amp;&amp; bwa sampe hs37m.fa r1.sai r2.sai r1.fq r2.fq &gt; r12.bwa.sam  </p>

<p>bwa bwasw ../index/bwa/hs37m.fa r12.fq &gt; r12.bwasw.sam  </p>

<p>gsnap -A sam -d hs37m r1.fq r2.fq &gt; r12.gsnap.sam  </p>

<p>novoalign -r Random -o SAM -f r1.fq r2.fq -i 500 50 -d hs37m-k14s3.novo &gt; r12.novo.sam  </p>

<p>smalt map -f samsoft -i 650 -o r12.smalt-k20s13.sam hs37m-k20s13 r1.fq r2.fq  </p>

<p>stampy.py -g hs37m -h hs37m -o r12.stampy.sam -M r1.fq,r2.fq  </p>

<p>soap -D hs37m.fa.index -a r1.fq -b r2.fq -l 32 -g 3 -u dummy -2 dummy -o r12.soap</p>
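
<p>Most of the commands above write SAM output. As a rough sanity check on such a file, the FLAG column can be tallied to see how many records are mapped and how many are flagged as properly paired. The following is only a minimal sketch: the function name summarize_sam and the example records are made up for illustration, and it counts every alignment record, including secondary and supplementary ones.</p>

```python
def summarize_sam(lines):
    """Count alignment records, mapped records, and properly paired records."""
    stats = {"records": 0, "mapped": 0, "properly_paired": 0}
    for line in lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        stats["records"] += 1
        if not flag & 0x4:     # bit 0x4 set means the read is unmapped
            stats["mapped"] += 1
        if flag & 0x2:         # bit 0x2 set means the pair is properly aligned
            stats["properly_paired"] += 1
    return stats


# Hypothetical records for illustration only (not real alignments).
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t99\tchr1\t100\t60\t100M\t=\t300\t300\t*\t*",
    "read1\t147\tchr1\t300\t60\t100M\t=\t100\t-300\t*\t*",
    "read2\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read2\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(summarize_sam(example))
```

<p>For a real run, an open file handle can be passed instead of the list, for example the r12.bowtie2.sam file written by the bowtie2 command.</p>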
]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37563/colormap-correcting-long-reads-by-mapping-short-reads</guid>
	<pubDate>Mon, 20 Aug 2018 14:17:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37563/colormap-correcting-long-reads-by-mapping-short-reads</link>
	<title><![CDATA[CoLoRMap: Correcting Long Reads by Mapping short reads]]></title>
	<description><![CDATA[<p><span>Second generation sequencing technologies paved the way to an exceptional increase in the number of sequenced genomes, both prokaryotic and eukaryotic. However, short reads are difficult to assemble and often lead to highly fragmented assemblies. The recent developments in long-read sequencing methods offer a promising way to address this issue. However, so far long reads are characterized by a high error rate, and assembling from long reads requires a high depth of coverage. This motivates the development of hybrid approaches that leverage the high quality of short reads to correct errors in long reads. We introduce CoLoRMap, a hybrid method for correcting noisy long reads, such as the ones produced by PacBio sequencing technology, using high-quality Illumina paired-end reads mapped onto the long reads. Our algorithm is based on two novel ideas: using a classical shortest path algorithm to find a sequence of overlapping short reads that minimizes the edit score to a long read and extending corrected regions by local assembly of unmapped mates of mapped short reads. Our results on bacterial, fungal and insect data sets show that CoLoRMap compares well with existing hybrid correction methods. The source code of CoLoRMap is freely available for non-commercial use at https://github.com/sfu-compbio/colormap</span></p>
<p><span>ehaghshe@sfu.ca or cedric.chauve@sfu.ca</span></p><p>Address of the bookmark: <a href="https://github.com/sfu-compbio/colormap" rel="nofollow">https://github.com/sfu-compbio/colormap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39867/gepard-allows-the-calculation-of-dotplots-even-for-large-sequences-like-chromosomes-or-bacterial-genomes</guid>
	<pubDate>Mon, 26 Aug 2019 11:38:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39867/gepard-allows-the-calculation-of-dotplots-even-for-large-sequences-like-chromosomes-or-bacterial-genomes</link>
	<title><![CDATA[Gepard: allows the calculation of dotplots even for large sequences like chromosomes or bacterial genomes]]></title>
	<description><![CDATA[<p>Gepard (German: "cheetah", Backronym for "GEnome PAir - Rapid Dotter") allows the calculation of dotplots even for large sequences like chromosomes or bacterial genomes. Reference: Krumsiek J, Arnold R, Rattei T. Gepard: A rapid and sensitive tool for creating dotplots on genome scale. Bioinformatics 2007; 23(8): 1026-8. PMID:&nbsp;<a href="http://www.ncbi.nlm.nih.gov/pubmed/17309896" target="_blank">17309896</a></p>
<p><a href="http://cube.univie.ac.at/gepard">http://cube.univie.ac.at/gepard</a></p><p>Address of the bookmark: <a href="https://github.com/univieCUBE/gepard" rel="nofollow">https://github.com/univieCUBE/gepard</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26909/sequence-assembly-with-mira-4</guid>
	<pubDate>Wed, 06 Apr 2016 08:21:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26909/sequence-assembly-with-mira-4</link>
	<title><![CDATA[Sequence assembly with MIRA 4]]></title>
	<description><![CDATA[<p>MIRA is a multi-pass DNA sequence data assembler/mapper for whole genome and EST/RNASeq projects. MIRA assembles/maps reads gained by</p>
<div>
<ul>
<li>
<p>electrophoresis sequencing (aka Sanger sequencing)</p>
</li>
<li>
<p>454 pyro-sequencing (GS20, FLX or Titanium)</p>
</li>
<li>
<p>Ion Torrent</p>
</li>
<li>
<p>Solexa (Illumina) sequencing</p>
</li>
<li>
<p>(in development) Pacific Biosciences sequencing</p>
</li>
</ul>
</div>
<p>into contiguous sequences (called <span><em>contigs</em></span>). One can use the sequences of different sequencing technologies either in a single assembly run (a <span><em>true hybrid assembly</em></span>) or by mapping one type of data to an assembly of another sequencing type (a <span><em>semi-hybrid assembly (or mapping)</em></span>) or by mapping data against consensus sequences of other assemblies (a <span><em>simple mapping</em></span>).</p>
<p>The MIRA acronym stands for <span><strong>M</strong></span>imicking <span><strong>I</strong></span>ntelligent <span><strong>R</strong></span>ead <span><strong>A</strong></span>ssembly and the program pretty well does what its acronym says (well, most of the time anyway). It is the Swiss army knife of sequence assembly that I've used and developed during the past 14 years to get assembly jobs I work on done efficiently - and especially accurately. That is, without me actually putting too much manual work into it.</p>
<p>More at http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html</p><p>Address of the bookmark: <a href="http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html" rel="nofollow">http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html</a></p>]]></description>
	<dc:creator>Priya Singh</dc:creator>
</item>

</channel>
</rss>