<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34528?offset=160</link>
	<atom:link href="https://bioinformaticsonline.com/related/34528?offset=160" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/42559/sample-bandage-input-file-for-visual-analysis</guid>
	<pubDate>Wed, 06 Jan 2021 03:51:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/file/view/42559/sample-bandage-input-file-for-visual-analysis</link>
	<title><![CDATA[Sample Bandage input file for visual analysis]]></title>
	<description><![CDATA[<p>Sample Bandage input file for visual analysis ...</p>]]></description>
	<dc:creator>Jit</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/42559" length="112199" type="text/plain" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42267/hapsolo-an-optimization-approach-for-removing-secondary-haplotigs-during-diploid-genome-assembly-and-scaffolding</guid>
	<pubDate>Mon, 26 Oct 2020 21:23:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42267/hapsolo-an-optimization-approach-for-removing-secondary-haplotigs-during-diploid-genome-assembly-and-scaffolding</link>
	<title><![CDATA[HapSolo: An optimization approach for removing secondary haplotigs during diploid genome assembly and scaffolding]]></title>
	<description><![CDATA[<p><span>Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative contigs that represent highly heterozygous regions. If primary and secondary contigs are not properly identified, the primary assembly will overrepresent both the size and complexity of the genome, which complicates downstream analysis such as scaffolding.</span></p>
<p><span>More at&nbsp;https://github.com/esolares/HapSolo</span></p><p>Address of the bookmark: <a href="https://github.com/esolares/HapSolo" rel="nofollow">https://github.com/esolares/HapSolo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43057/hapsolo-an-optimization-approach-for-removing-secondary-haplotigs-during-diploid-genome-assembly-and-scaffolding</guid>
	<pubDate>Sat, 08 May 2021 21:25:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43057/hapsolo-an-optimization-approach-for-removing-secondary-haplotigs-during-diploid-genome-assembly-and-scaffolding</link>
	<title><![CDATA[HapSolo: An optimization approach for removing secondary haplotigs during diploid genome assembly and scaffolding]]></title>
	<description><![CDATA[<p><span>HapSolo identifies secondary contigs and defines a primary assembly based on multiple pairwise contig alignment metrics. HapSolo evaluates candidate primary assemblies using BUSCO scores and then distinguishes among candidate assemblies using a cost function. The cost function can be defined by the user but by default considers the number of missing, duplicated and single BUSCO genes within the assembly. HapSolo performs hill climbing to minimize cost over thousands of candidate assemblies.</span></p><p>Address of the bookmark: <a href="https://github.com/esolares/HapSolo" rel="nofollow">https://github.com/esolares/HapSolo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</guid>
	<pubDate>Fri, 10 Dec 2021 06:22:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</link>
	<title><![CDATA[Illumina-based assembly pipeline steps!]]></title>
	<description><![CDATA[<h3 id="illumina">Illumina<a href="https://nf-co.re/viralrecon#illumina"><span></span></a></h3><ol>
<li>Merge re-sequenced FastQ files (<a href="http://www.linfo.org/cat.html"><code>cat</code></a>)</li>
<li>Read QC (<a href="https://www.bioinformatics.babraham.ac.uk/projects/fastqc/"><code>FastQC</code></a>)</li>
<li>Adapter trimming (<a href="https://github.com/OpenGene/fastp"><code>fastp</code></a>)</li>
<li>Removal of host reads (<a href="http://ccb.jhu.edu/software/kraken2/"><code>Kraken 2</code></a>; <em>optional</em>)</li>
<li>Variant calling<ol>
<li>Read alignment (<a href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml"><code>Bowtie 2</code></a>)</li>
<li>Sort and index alignments (<a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Primer sequence removal (<a href="https://github.com/andersen-lab/ivar"><code>iVar</code></a>; <em>amplicon data only</em>)</li>
<li>Duplicate read marking (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>; <em>optional</em>)</li>
<li>Alignment-level QC (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>, <a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Genome-wide and amplicon coverage QC plots (<a href="https://github.com/brentp/mosdepth/"><code>mosdepth</code></a>)</li>
<li>Choice of multiple variant calling and consensus sequence generation routes (<a href="https://github.com/andersen-lab/ivar"><code>iVar variants and consensus</code></a>; <em>default for amplicon data</em> <em>||</em> <a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>, <a href="https://github.com/arq5x/bedtools2/"><code>BEDTools</code></a>; <em>default for metagenomics data</em>)
<ul>
<li>Variant annotation (<a href="http://snpeff.sourceforge.net/SnpEff.html"><code>SnpEff</code></a>, <a href="http://snpeff.sourceforge.net/SnpSift.html"><code>SnpSift</code></a>)</li>
<li>Consensus assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
<li>Lineage analysis (<a href="https://github.com/cov-lineages/pangolin"><code>Pangolin</code></a>)</li>
<li>Clade assignment, mutation calling and sequence quality checks (<a href="https://github.com/nextstrain/nextclade"><code>Nextclade</code></a>)</li>
<li>Individual variant screenshots with annotation tracks (<a href="https://asciigenome.readthedocs.io/en/latest/"><code>ASCIIGenome</code></a>)</li>
</ul>
</li>
<li>Intersect variants across callers (<a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>)</li>
</ol></li>
<li><em>De novo</em> assembly<ol>
<li>Primer trimming (<a href="https://cutadapt.readthedocs.io/en/stable/guide.html"><code>Cutadapt</code></a>; <em>amplicon data only</em>)</li>
<li>Choice of multiple assembly tools (<a href="http://cab.spbu.ru/software/spades/"><code>SPAdes</code></a> <em>||</em> <a href="https://github.com/rrwick/Unicycler"><code>Unicycler</code></a> <em>||</em> <a href="https://github.com/GATB/minia"><code>minia</code></a>)
<ul>
<li>Blast to reference genome (<a href="https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch"><code>blastn</code></a>)</li>
<li>Contiguate assembly (<a href="https://www.sanger.ac.uk/science/tools/pagit"><code>ABACAS</code></a>)</li>
<li>Assembly report (<a href="https://github.com/BU-ISCIII/plasmidID"><code>PlasmidID</code></a>)</li>
<li>Assembly assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
</ul>
</li>
</ol></li>
<li>Present QC and visualisation for raw read, alignment, assembly and variant calling results (<a href="http://multiqc.info/"><code>MultiQC</code></a>)</li>
</ol>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43850/merfin-improved-variant-filtering-assembly-evaluation-and-polishing-via-k-mer-validation</guid>
	<pubDate>Sun, 03 Apr 2022 20:35:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43850/merfin-improved-variant-filtering-assembly-evaluation-and-polishing-via-k-mer-validation</link>
	<title><![CDATA[Merfin: improved variant filtering, assembly evaluation and polishing via k-mer validation]]></title>
	<description><![CDATA[<p><span>Merfin is a&nbsp;</span><em>k</em><span>-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected&nbsp;</span><em>k</em><span>-mer multiplicity in the reads, independently of the quality of the read alignment and the variant caller&rsquo;s internal score. Merfin increased the precision of genotyped calls in several benchmarks, improved consensus accuracy and reduced frameshift errors when applied to human and nonhuman assemblies built from Pacific Biosciences HiFi and continuous long reads or Oxford Nanopore reads, including the first complete human genome. The authors also introduce assembly quality and completeness metrics that account for the expected genomic copy numbers.</span></p>
<p><span>More at&nbsp;https://www.nature.com/articles/s41592-022-01445-y</span></p>
<p><img src="https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41592-022-01445-y/MediaObjects/41592_2022_1445_Fig1_HTML.png" alt="image" style="border: 0px; border: 0px;"></p><p>Address of the bookmark: <a href="https://github.com/arangrhie/merfin" rel="nofollow">https://github.com/arangrhie/merfin</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26306/busco</guid>
	<pubDate>Sun, 07 Feb 2016 16:02:39 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26306/busco</link>
	<title><![CDATA[BUSCO]]></title>
	<description><![CDATA[<p>Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs</p>
<p>More at http://busco.ezlab.org/</p><p>Address of the bookmark: <a href="http://busco.ezlab.org/" rel="nofollow">http://busco.ezlab.org/</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27113/picard</guid>
	<pubDate>Fri, 29 Apr 2016 08:21:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27113/picard</link>
	<title><![CDATA[Picard]]></title>
	<description><![CDATA[<p>Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. These file formats are defined in the <a href="http://samtools.github.io/hts-specs/">Hts-specs</a> repository. See especially the <a href="http://samtools.github.io/hts-specs/SAMv1.pdf">SAM specification</a> and the <a href="http://samtools.github.io/hts-specs/VCFv4.3.pdf">VCF specification</a>.</p>
<p>Note that the information on this page is targeted at end-users. For developers, the source code, building instructions and implementation/development resources are available on <a href="https://github.com/broadinstitute/picard">GitHub</a>.</p>
<p>The Picard toolkit is open-source under the <a href="https://tldrlegal.com/license/mit-license">MIT license</a> and free for all uses.</p>
<p>Enjoy!</p><p>Address of the bookmark: <a href="http://broadinstitute.github.io/picard/" rel="nofollow">http://broadinstitute.github.io/picard/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27076/ale-a-generic-assembly-likelihood-evaluation-framework-for-assessing-the-accuracy-of-genome-and-metagenome-assemblies</guid>
	<pubDate>Tue, 26 Apr 2016 03:38:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27076/ale-a-generic-assembly-likelihood-evaluation-framework-for-assessing-the-accuracy-of-genome-and-metagenome-assemblies</link>
	<title><![CDATA[ALE: a Generic Assembly Likelihood Evaluation Framework for Assessing the Accuracy of Genome and Metagenome Assemblies]]></title>
	<description><![CDATA[<p>The Assembly Likelihood Evaluation (ALE) framework systematically evaluates the accuracy of an assembly in a reference-independent manner using rigorous statistical methods. This framework is comprehensive, and integrates read quality, mate pair orientation and insert length (for paired-end reads), sequencing coverage, read alignment and k-mer frequency. ALE pinpoints synthetic errors in both single and metagenomic assemblies, including single-base errors, insertions/deletions, genome rearrangements and chimeric assemblies presented in metagenomes. At the genome level with real-world data, ALE identifies three large misassemblies from the Spirochaeta smaragdinae finished genome, which were all independently validated by Pacific Biosciences sequencing. At the single-base level with Illumina data, ALE recovers 215 of 222 (97%) single nucleotide variants in a training set from a GC-rich Rhodobacter sphaeroides genome. Using real Pacific Biosciences data, ALE identifies 12 of 12 synthetic errors in a Lambda Phage genome, surpassing even Pacific Biosciences' own variant caller, EviCons. In summary, the ALE framework provides a comprehensive, reference-independent and statistically rigorous measure of single genome and metagenome assembly accuracy, which can be used to identify misassemblies or to optimize the assembly process.</p>
<p>More at&nbsp;http://www.ncbi.nlm.nih.gov/pubmed/23303509</p><p>Address of the bookmark: <a href="http://sc932.github.io/ALE/about.html" rel="nofollow">http://sc932.github.io/ALE/about.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29957/record</guid>
	<pubDate>Fri, 25 Nov 2016 08:23:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29957/record</link>
	<title><![CDATA[RECORD]]></title>
	<description><![CDATA[<p>Background. Next-generation sequencing technologies are now producing multiple times the genome size in total reads from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome studied in the experiment and the reference genome of the species. However, in most typical protocols, this information is disregarded and the reference genome is used. Results. We provide a new approach that allows researchers to reconstruct genomes very closely related to the reference genome (e.g., mutants of the same species) directly from the reads used in the experiment. Our approach applies de novo assembly software to experimental reads and so-called pseudoreads and uses the resulting contigs to generate a modified reference sequence. In this way, it can very quickly, and at no additional sequencing cost, generate a new, modified reference sequence that is closer to the actual sequenced genome and has full coverage. In this paper, we describe our approach and test its implementation, called RECORD. We evaluate RECORD on both simulated and real data. We made our software publicly available on SourceForge. Conclusion. Our tests show that on closely related sequences RECORD outperforms more general assisted-assembly software.</p>
<p>More at&nbsp;https://sourceforge.net/projects/record-genome-assembler/files/</p><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/pubmed/26558255" rel="nofollow">https://www.ncbi.nlm.nih.gov/pubmed/26558255</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30076/sga-string-graph-assembler</guid>
	<pubDate>Thu, 08 Dec 2016 05:08:59 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30076/sga-string-graph-assembler</link>
	<title><![CDATA[SGA: String Graph Assembler]]></title>
	<description><![CDATA[<p><span>SGA is a de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads.</span></p>
<p><span>More at&nbsp;https://github.com/jts/sga</span></p>
<p>SGA dependencies:<br> -google sparse hash library (http://code.google.com/p/google-sparsehash/)<br> -the bamtools library (https://github.com/pezmaster31/bamtools)<br> -zlib (http://www.zlib.net/)<br> -(optional but suggested) the jemalloc memory allocator (http://www.canonware.com/jemalloc/download.html)</p><p>Address of the bookmark: <a href="https://github.com/jts/sga" rel="nofollow">https://github.com/jts/sga</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>