<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37957?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/related/37957?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28809/kissplice</guid>
	<pubDate>Tue, 16 Aug 2016 08:34:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28809/kissplice</link>
	<title><![CDATA[KisSplice]]></title>
	<description><![CDATA[<p>KisSplice is software for analysing RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that identifies SNPs, indels and alternative splicing events. It can handle an arbitrary number of biological conditions and quantifies each variant in each condition. It has been tested on Illumina datasets of up to 1G reads. Its memory consumption is around 5Gb for 100M reads.</p>
<p>KisSplice is not a full-length transcriptome assembler. This means that it will output the variable regions of the transcripts, not reconstruct them entirely.</p>
<p>KisSplice comes as a workflow, with several possible post-treatments meant to facilitate the analysis of the results. The choice of the post-treatment depends on the availability of a reference genome/transcriptome and on the need to perform a differential analysis, as summarised in the following table.</p><p>Address of the bookmark: <a href="http://kissplice.prabi.fr/" rel="nofollow">http://kissplice.prabi.fr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36865/perga-a-paired-end-read-guided-de-novo-assembler-for-extending-contigs-using-svm-and-look-ahead-approach</guid>
	<pubDate>Tue, 05 Jun 2018 09:57:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36865/perga-a-paired-end-read-guided-de-novo-assembler-for-extending-contigs-using-svm-and-look-ahead-approach</link>
	<title><![CDATA[PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach]]></title>
	<description><![CDATA[PERGA - Paired End Reads Guided Assembler

PERGA is a novel read-guided de novo assembly approach that adopts a greedy-like prediction strategy for assembling reads into contigs and scaffolds. Instead of using single-end reads to construct contigs, PERGA uses paired-end reads and different read overlap sizes from O ≥ Omax to Omin to resolve gaps and branches. Moreover, by constructing a decision model with a machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. PERGA tries to extend the contigs by all feasible nucleotides and uses a look-ahead technique to determine whether multiple extensions are due to sequencing errors or repeats. It also tries to separate the different repeats of nearby genomic regions to make the assembly longer and more accurate.
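
As a rough illustration of the greedy, overlap-driven extension described above, the following Python sketch extends a contig one base at a time, trying overlap sizes from Omax down to Omin. It is only a toy under stated assumptions: the function name extend_contig, its parameters and the example reads are hypothetical, and it leaves out PERGA's paired-end guidance, SVM decision model and look-ahead step, simply stopping whenever the candidate bases tie.

```python
from collections import Counter


def extend_contig(contig, reads, o_max, o_min, max_len=10_000):
    """Greedily extend `contig` to the right, one base per iteration.

    Hypothetical helper for illustration only (not PERGA's implementation).
    Assumes 1 <= o_min <= o_max. `max_len` guards against endless extension
    through repeats.
    """
    while len(contig) < max_len:
        votes = Counter()
        # Try the largest overlap first and fall back to smaller ones.
        for overlap in range(o_max, o_min - 1, -1):
            if overlap > len(contig):
                continue
            suffix = contig[-overlap:]
            for read in reads:
                # A read supports an extension if it contains the contig
                # suffix and continues at least one base past it.
                pos = read.find(suffix)
                if pos != -1 and pos + overlap < len(read):
                    next_base = read[pos + overlap]
                    votes[next_base] += 1
            if votes:
                break  # the largest overlap with any support decides
        if not votes:
            break  # no read extends the contig: a gap
        ranked = votes.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            break  # tied candidates: an unresolved branch
        contig += ranked[0][0]
    return contig


# Toy example: 8-bp reads sampled every 2 bp from a 16-bp sequence.
reads = ["ATGGCGTA", "GGCGTACC", "CGTACCTG", "TACCTGAA", "CCTGAAGT"]
print(extend_contig("ATGGCGTA", reads, o_max=6, o_min=4))
```

In this toy run the seed read is extended until no read continues past the contig end. A real assembler would resolve ties with additional evidence (for PERGA, paired-end information and the learned decision model) rather than stopping.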

The simulated E. coli paired-end read data were generated using GemSim (KE McElroy, F Luciani, T Thomas. Gemsim: General, Error-Model Based Simulator of Next-Generation Sequencing Data. BMC Genomics 2012, 13:74), with coverage of 50x, 60x and 100x and a read length of 100 bp, and can be downloaded from https://github.com/zhuxiao/data_PERGA.<p>Address of the bookmark: <a href="https://github.com/hitbio/PERGA" rel="nofollow">https://github.com/hitbio/PERGA</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</guid>
	<pubDate>Mon, 11 Jun 2018 03:08:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36890/price-paired-read-iterative-contig-extension-a-de-novo-genome-assembler-implemented-in-c</link>
	<title><![CDATA[PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++.]]></title>
	<description><![CDATA[We are pleased to release PRICE (Paired-Read Iterative Contig Extension), a de novo genome assembler implemented in C++. Its name describes the strategy that it implements for genome assembly: PRICE uses paired-read information to iteratively increase the size of existing contigs. Initially, those contigs can be individual reads from a subset of the paired-read dataset, non-paired reads from sequencing technologies that provide non-paired data, or contigs that were output from a prior run of PRICE or any other assembler.

http://derisilab.ucsf.edu/software/price/<p>Address of the bookmark: <a href="http://derisilab.ucsf.edu/software/price/" rel="nofollow">http://derisilab.ucsf.edu/software/price/</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39903/integrative-meta-assembly-pipeline-imap-chromosome-level-genome-assembler-combining-multiple-de-novo-assemblies</guid>
	<pubDate>Sat, 31 Aug 2019 11:30:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39903/integrative-meta-assembly-pipeline-imap-chromosome-level-genome-assembler-combining-multiple-de-novo-assemblies</link>
	<title><![CDATA[Integrative Meta-Assembly Pipeline (IMAP): Chromosome-level genome assembler combining multiple de novo assemblies]]></title>
	<description><![CDATA[<p><span>Chromosome-level genome assembler combining multiple de novo assemblies</span></p>
<p><span><a href="https://github.com/jkimlab/IMAP">https://github.com/jkimlab/IMAP</a></span></p><p>Address of the bookmark: <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0221858" rel="nofollow">https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0221858</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>This protocol describes a de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>Genome assembly is the process of taking a large number of short DNA sequences and putting them back together to create a representation of the original chromosomes from which the DNA originated. De novo genome assembly assumes no prior knowledge of the length, structure or composition of the source DNA sequence. In a genome sequencing experiment, the DNA of the target organism is broken up into millions of small fragments, which are read on a sequencing machine. Depending on the sequencing technology used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Illumina-style short-read sequencing usually produces reads of 36 - 150 bp. These reads can be either &ldquo;single ended&rdquo; as described above or &ldquo;paired end.&rdquo;</p><p><strong>Why genome assembly?</strong><br />Determining the DNA sequence of an organism is useful in basic research into why and how organisms live, as well as in applied subjects. Because of the importance of DNA to living things, knowledge of a DNA sequence can be useful in practically any biological research. For example, in medicine it can be used to identify, diagnose and potentially develop treatments for genetic diseases. Similarly, research into pathogens may lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be saved as plain text in a FASTA file or together with their attributes in a FASTQ file.&nbsp;FASTQ is the most common read file format since this is what the Illumina sequencing pipeline produces. It is the format used in the rest of this protocol.</p><p><strong>The protocol in a nutshell:</strong> <br />Obtain the sequence read file(s) from the sequencing machine(s). <br />Look at the reads to get an idea of what you have and what the quality is like. 
<br />If required, clean up and quality-trim the raw data. <br />Choose an appropriate parameter set for the assembly. <br />Assemble the data into contigs/scaffolds. <br />Examine the assembly output and assess the quality of the assembly.</p><p><strong>Read Quality Control:</strong><br />Check the quality with FastQC.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This step trims adapters, barcodes and other contaminants from the reads.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />The aim of this part of the protocol is to describe how to assemble the quality-trimmed reads into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A wide range of short-read assemblers is available, each with its own strengths and weaknesses. <br /><em>Some of the available assemblers include:</em><br />Velvet<br />SOAP-denovo<br />MIRA<br />ALLPATHS</p><p>The next step is to assess the draft set of contigs and decide what to do with it for the rest of the analysis.&nbsp;A few things to note about the contigs you have just produced:&nbsp;They are draft contigs. 
They may contain mis-assemblies.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assembly http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />GapFiller - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi-automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequently used commands for the analysis are listed at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38055/ancestral-genomes-a-resource-for-reconstructed-ancestral-genes-and-genomes-across-the-tree-of-life</guid>
	<pubDate>Fri, 02 Nov 2018 08:16:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38055/ancestral-genomes-a-resource-for-reconstructed-ancestral-genes-and-genomes-across-the-tree-of-life</link>
	<title><![CDATA[Ancestral Genomes: a resource for reconstructed ancestral genes and genomes across the tree of life]]></title>
	<description><![CDATA[<p><span>&nbsp;Ancestral Genomes (</span><a href="http://ancestralgenomes.org/" target="">http://ancestralgenomes.org</a><span>) is a resource for comprehensive reconstructions of these &lsquo;fossil genomes&rsquo;. Comprehensive sets of protein-coding genes have been reconstructed for 78 genomes of now-extinct species that were the common ancestors of extant species from across the tree of life.&nbsp;</span></p><p>Address of the bookmark: <a href="http://ancestralgenomes.org/" rel="nofollow">http://ancestralgenomes.org/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35619/tallymer-method-to-compute-k-mer-frequencies-and-its-application-to-annotate-large-repetitive-plant-genomes</guid>
	<pubDate>Thu, 15 Feb 2018 10:21:02 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35619/tallymer-method-to-compute-k-mer-frequencies-and-its-application-to-annotate-large-repetitive-plant-genomes</link>
	<title><![CDATA[Tallymer: method to compute K-mer frequencies and its application to annotate large repetitive plant genomes]]></title>
	<description><![CDATA[<p>Tallymer is based on enhanced suffix arrays. This gives much greater flexibility in the choice of the&nbsp;<span>k</span>-mer size. Tallymer can process large data sizes of several billion bases. We used it in a variety of applications to study the genomes of maize and other plant species. In particular, Tallymer was used to index a set of whole genome shotgun sequences from maize (B73) (total size 10<sup>9</sup>&nbsp;bp).&nbsp;<br>Tallymer was effective in a variety of applications to aid genome annotation in maize, despite limitations imposed by the relatively low coverage of sequence available.</p>
<p>A manual can be found&nbsp;<a href="https://www.zbh.uni-hamburg.de/fileadmin/gi/tallymer/tallymer.pdf" target="_blank" title="tallymer.pdf (111 KB)">here</a>.</p><p>Address of the bookmark: <a href="https://www.zbh.uni-hamburg.de/forschung/arbeitsgruppe-genominformatik/software/tallymer.html" rel="nofollow">https://www.zbh.uni-hamburg.de/forschung/arbeitsgruppe-genominformatik/software/tallymer.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36800/genomemapper-simultaneous-alignment-of-short-reads-against-multiple-genomes</guid>
	<pubDate>Fri, 25 May 2018 09:29:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36800/genomemapper-simultaneous-alignment-of-short-reads-against-multiple-genomes</link>
	<title><![CDATA[GenomeMapper: Simultaneous alignment of short reads against multiple genomes]]></title>
	<description><![CDATA[GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads with either ungapped or gapped alignments. It can be used to align against multiple genomes simultaneously or against a single reference. If you are unsure which one is the appropriate GenomeMapper, you might want to use the latter.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2768987/<p>Address of the bookmark: <a href="http://1001genomes.org/software/genomemapper.html" rel="nofollow">http://1001genomes.org/software/genomemapper.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41592/refka-a-fast-and-efficient-long-read-genome-assembly-approach-for-large-and-complex-genomes</guid>
	<pubDate>Fri, 01 May 2020 03:00:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41592/refka-a-fast-and-efficient-long-read-genome-assembly-approach-for-large-and-complex-genomes</link>
	<title><![CDATA[RefKA: A fast and efficient long-read genome assembly approach for large and complex genomes]]></title>
	<description><![CDATA[<p><span>RefKA is a reference-based approach for long read genome assembly. This approach relies on breaking up a closely related reference genome into bins, aligning k-mers unique to each bin with PacBio reads, and then assembling each bin in parallel followed by a final bin-stitching step.</span></p>
<p>Address of the bookmark: <a href="https://github.com/AppliedBioinformatics/RefKA" rel="nofollow">https://github.com/AppliedBioinformatics/RefKA</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36884/halc-high-throughput-algorithm-for-long-read-error-correction</guid>
	<pubDate>Fri, 08 Jun 2018 10:47:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36884/halc-high-throughput-algorithm-for-long-read-error-correction</link>
	<title><![CDATA[HALC: High throughput algorithm for long read error correction]]></title>
	<description><![CDATA[HALC is a high-throughput algorithm for long read error correction. HALC aligns the long reads to short read contigs from the same species with a relatively low identity requirement, so that a long read region can be aligned to at least one contig region, including repeats of its true genome region in the contigs that are sufficiently similar to it (similar repeat based alignment approach).

HALC was able to obtain 6.7-41.1% higher throughput than the existing algorithms while maintaining comparable accuracy. The HALC corrected long reads can thus result in 11.4-60.7% longer assembled contigs than the existing algorithms.<p>Address of the bookmark: <a href="https://github.com/lanl001/halc" rel="nofollow">https://github.com/lanl001/halc</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>