<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/31568?offset=310</link>
	<atom:link href="https://bioinformaticsonline.com/related/31568?offset=310" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43736/odgi-optimized-dynamic-genomegraph-implementation</guid>
	<pubDate>Tue, 01 Feb 2022 23:42:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43736/odgi-optimized-dynamic-genomegraph-implementation</link>
	<title><![CDATA[odgi: optimized dynamic genome/graph implementation]]></title>
	<description><![CDATA[<p dir="auto"><code>odgi</code>&nbsp;provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.</p>
<p dir="auto">Careful encoding of graph entities allows&nbsp;<code>odgi</code>&nbsp;to efficiently compute and transform&nbsp;<a href="https://pangenome.github.io/">pangenomes</a>&nbsp;with minimal overheads.&nbsp;<code>odgi</code>&nbsp;implements a dynamic data structure that leverages multi-core CPUs and can be updated on the fly.</p>
<p dir="auto">The edges and path steps are recorded as deltas between the current node id and the target node id, where the node id corresponds to the rank in the global array of nodes. Graphs built from biological data sets tend to have local partial order and, when sorted, the deltas tend to be small. This allows them to be compressed with a variable-length integer representation, resulting in a small in-memory footprint at the cost of packing and unpacking.</p>
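<p dir="auto">As a rough illustration of the delta-plus-variable-length idea described above, the following Python sketch packs the edge targets of one node into bytes. It is not <code>odgi</code>'s actual memory layout: the function names, the zigzag mapping for negative deltas and the 7-bits-per-byte varint are all assumptions chosen for illustration.</p>

```python
def zigzag(n):
    """Map a signed delta to an unsigned int so small negatives stay small."""
    return n * 2 if n >= 0 else -n * 2 - 1


def unzigzag(z):
    """Inverse of zigzag()."""
    return z // 2 if z % 2 == 0 else -((z + 1) // 2)


def encode_varint(value):
    """Encode a non-negative int using 7 payload bits per byte."""
    out = bytearray()
    while True:
        low = value % 128
        value //= 128
        if value:
            out.append(low + 128)  # high bit set: more bytes follow
        else:
            out.append(low)        # high bit clear: last byte
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode one varint starting at pos; return (value, next_pos)."""
    value, scale = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte % 128) * scale
        scale *= 128
        if byte // 128 == 0:
            return value, pos


def encode_targets(current_id, target_ids):
    """Store each target as a varint-encoded delta from current_id."""
    return b"".join(encode_varint(zigzag(t - current_id)) for t in target_ids)


def decode_targets(current_id, data):
    """Recover the absolute target ids from the packed deltas."""
    targets, pos = [], 0
    while pos != len(data):
        z, pos = decode_varint(data, pos)
        targets.append(current_id + unzigzag(z))
    return targets


# Hypothetical node 1000 with three nearby targets and one distant target.
packed = encode_targets(1000, [1001, 999, 1003, 1200])
print(len(packed), decode_targets(1000, packed))
```

<p dir="auto">Under these assumptions, any delta between -64 and 63 fits in a single byte, so the more locally ordered the node ranks are, the smaller the encoding becomes.</p>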
<p dir="auto">The RAM and computational savings are substantial. In partially ordered regions of the graph, most deltas will require only a single byte.</p><p>Address of the bookmark: <a href="https://github.com/pangenome/odgi" rel="nofollow">https://github.com/pangenome/odgi</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/19636/google-genomics</guid>
	<pubDate>Thu, 18 Dec 2014 11:05:42 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/19636/google-genomics</link>
	<title><![CDATA[Google Genomics]]></title>
	<description><![CDATA[<ul>
<li>
<p><strong>Explore genetic variation interactively.</strong> Compare entire cohorts in seconds with SQL-like queries. Compute transition/transversion ratios, genome-wide association, allelic frequency and more.</p>
</li>
<li>
<p><strong>Process big genomic data easily.</strong> Run batch analyses like principal component analysis and Hardy-Weinberg equilibrium on as many samples as you like, in minutes or hours, with just a little code.</p>
</li>
<li>
<p><strong>Use Google's infrastructure and big data expertise.</strong> Store one genome or a million using Google Genomics and take advantage of the same infrastructure that powers Search, Maps, YouTube, Gmail and Drive.</p>
</li>
<li>
<p><strong>Support emerging global standards.</strong> Google Genomics is implementing the API defined by the Global Alliance for Genomics and Health for visualization, analysis and more. Compliant software can access Google Genomics, local servers, or any other implementation.</p>
</li>
</ul><p>Address of the bookmark: <a href="https://cloud.google.com/genomics/" rel="nofollow">https://cloud.google.com/genomics/</a></p>]]></description>
	<dc:creator>Tenzin Paul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/19980/seqloc-06</guid>
	<pubDate>Sun, 28 Dec 2014 12:51:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/19980/seqloc-06</link>
	<title><![CDATA[seqloc 0.6]]></title>
	<description><![CDATA[<p>The <code>Bio.SeqLoc</code> modules in <code>seqloc</code> are designed to represent positions and locations (ranges of positions) on sequences, particularly nucleotide sequences. My original motivation for writing these packages was handling the locations of genes in eukaryotic genomes.</p>
<p>Handle sequence locations for bioinformatics http://www.ingolia-lab.org/seqloc-tutorial.html</p><p>Address of the bookmark: <a href="http://www.stackage.org/snapshot/nightly-2014-12-28/package/seqloc-0.6" rel="nofollow">http://www.stackage.org/snapshot/nightly-2014-12-28/package/seqloc-0.6</a></p>]]></description>
	<dc:creator>Gudiya Pal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/20504/chromevol</guid>
	<pubDate>Sun, 25 Jan 2015 00:33:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/20504/chromevol</link>
	<title><![CDATA[ChromEvol]]></title>
	<description><![CDATA[<p>Chromosome number is a remarkably dynamic feature of eukaryotic evolution. Chromosome numbers can change by a duplication of the whole genome (a process termed polyploidy), or by single chromosome changes (ascending dysploidy via, e.g., chromosome fission or descending dysploidy via, e.g., chromosome fusion).<br> Of the various mechanisms of chromosome number change, polyploidy has received significant attention because of the impact such an event may have on the organism.<br> ChromEvol implements a series of likelihood models for the evolution of chromosome numbers. By comparing the fit of the different models to biological data, it may be possible to gain insight regarding the pathways by which the evolution of chromosome number proceeds. For each model, the program estimates the rates for the possible transitions assumed by the model, infers the set of ancestral chromosome numbers, and estimates the locations along the tree at which polyploidy events (and other chromosome number changes) occurred. For further methodological details, see the publications and manual on the Downloads page.</p>
<p>http://www.tau.ac.il/~itaymay/cp/chromEvol/about.html</p><p>Address of the bookmark: <a href="http://www.tau.ac.il/~itaymay/cp/chromEvol/downloads.html" rel="nofollow">http://www.tau.ac.il/~itaymay/cp/chromEvol/downloads.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36478/the-marvel-assembler</guid>
	<pubDate>Fri, 04 May 2018 19:18:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36478/the-marvel-assembler</link>
	<title><![CDATA[The MARVEL assembler]]></title>
	<description><![CDATA[<p><span>MARVEL consists of a set of tools that facilitate the overlapping, patching, correction and assembly of noisy (and not-so-noisy) long reads.</span></p>
<p>The assembly process can be summarized as follows:</p>
<ol>
<li>overlap</li>
<li>patch reads</li>
<li>overlap (again)</li>
<li>scrubbing</li>
<li>assembly graph construction and touring</li>
<li>optional read correction</li>
<li>fasta file creation</li>
</ol><p>Address of the bookmark: <a href="https://github.com/schloi/MARVEL" rel="nofollow">https://github.com/schloi/MARVEL</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36597/gappadder-a-sensitive-approach-for-closing-gaps-on-draft-genomes-with-short-sequence-reads</guid>
	<pubDate>Mon, 14 May 2018 05:25:48 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36597/gappadder-a-sensitive-approach-for-closing-gaps-on-draft-genomes-with-short-sequence-reads</link>
	<title><![CDATA[GAPPadder: A Sensitive Approach for Closing Gaps on Draft Genomes with Short Sequence Reads]]></title>
	<description><![CDATA[<p><span>This software is provided &ldquo;as is&rdquo; without warranty of any kind. In no event shall the author be held responsible for any damage resulting from the use of this software. The program package, including source codes, executables, and this documentation, is distributed free of charge. If you use this program in a publication, please cite the following reference:</span><br><span>Chong Chu, Xin Li, and Yufeng Wu. "GAPPadder: A Sensitive Approach for Closing Gaps on Draft Genomes with Short Sequence Reads." bioRxiv (2017): 125534.</span></p><p>Address of the bookmark: <a href="https://github.com/Reedwarbler/GAPPadder" rel="nofollow">https://github.com/Reedwarbler/GAPPadder</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/37927/you-cant-hide-from-genome-hackers</guid>
	<pubDate>Sat, 13 Oct 2018 14:17:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/37927/you-cant-hide-from-genome-hackers</link>
	<title><![CDATA[You can't hide from Genome Hackers]]></title>
	<description><![CDATA[<p><span>A young computational biologist named Yaniv Erlich shocked the research world by showing it was possible to&nbsp;</span><a href="https://www.wired.com/2013/01/your-genome-could-reveal-your-identity/">unmask the identities</a><span>&nbsp;of people listed in anonymous genetic databases using&nbsp;</span><a href="http://science.sciencemag.org/content/339/6117/321" target="_blank">only an Internet connection</a>.</p><p>Paper: http://science.sciencemag.org/content/early/2018/10/10/science.aau4832</p><p>More at&nbsp;https://www.wired.com/story/genome-hackers-show-no-ones-dna-is-anonymous-anymore/</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>This protocol describes a de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>Genome assembly is the process of taking a large number of short DNA sequences and putting them back together to create a representation of the original chromosomes from which the DNA originated. De novo genome assembly assumes no prior knowledge of the length, structure or composition of the source DNA sequence. In a genome sequencing experiment, the DNA of the target organism is broken up into millions of small pieces and read on a sequencing machine. Depending on the sequencing technology used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Illumina-style short-read sequencing typically produces reads of 36-150 bp. These reads can be either &ldquo;single-ended&rdquo;, as described above, or &ldquo;paired-end&rdquo;.</p><p><strong>Why genome assembly?</strong><br />Determining the DNA sequence of an organism is useful in basic research into why and how organisms live, as well as in applied subjects. Because DNA is central to living things, knowledge of a DNA sequence can be useful in virtually any biological research. For example, in medicine it can be used to identify, diagnose and potentially develop treatments for genetic diseases. Similarly, studying pathogens can lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be saved as plain text in a FASTA file or, together with their quality scores, in a FASTQ file.&nbsp;FASTQ is the most common read file format, since it is what the Illumina sequencing pipeline produces. The rest of this protocol therefore focuses on FASTQ.</p><p><strong>The protocol in a nutshell:</strong> <br />Obtain the sequence read file(s) from the sequencing machine(s). <br />Look at the reads: get an idea of what you have and what the quality is like. 
<br />Clean up and quality-trim the raw data if required. <br />Choose an appropriate parameter set for assembly. <br />Assemble the data into contigs/scaffolds. <br />Examine the assembly output and assess its quality.</p><p><strong>Read Quality Control:</strong><br />Check the quality with FastQC.<br />Script:<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This step trims adapters, barcodes and other contaminants from the reads.<br />Script:<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />This part of the protocol describes how to assemble the quality-trimmed reads into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A wide range of short-read assemblers is available, each with its own strengths and weaknesses. <br /><em>Some of the assemblers available include:</em><br />Velvet<br />SOAPdenovo<br />MIRA<br />ALLPATHS</p><p>The next step is to assess the quality of the draft set of contigs and decide how to use it for the remainder of the study.&nbsp;A few things to note about the contigs you just created:&nbsp;they are draft contigs, 
and they may contain mis-assemblies.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assemblies - http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />Gap filler - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi-automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequently used commands for the analysis are listed at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40516/nextdenovo-string-graph-based-de-novo-assembler-for-tgs-long-reads</guid>
	<pubDate>Sun, 05 Jan 2020 04:08:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40516/nextdenovo-string-graph-based-de-novo-assembler-for-tgs-long-reads</link>
	<title><![CDATA[NextDenovo: string graph-based de novo assembler for TGS long reads]]></title>
	<description><![CDATA[<p>NextDenovo is a string graph-based<span>&nbsp;</span><em>de novo</em><span>&nbsp;</span>assembler for TGS (third-generation sequencing) long reads. It uses a "correct-then-assemble" strategy similar to canu, but requires significantly fewer computing resources and less storage. After assembly, the per-base accuracy is about 97-98%; to further improve single-base accuracy, please use<span>&nbsp;</span><a href="https://github.com/Nextomics/NextPolish">NextPolish</a>.</p>
<p>NextDenovo contains two core modules: NextCorrect and NextGraph. NextCorrect can be used to correct TGS long reads with approximately 15% sequencing errors, and NextGraph can be used to construct a string graph with corrected reads. It also contains a modified version of<span>&nbsp;</span><a href="https://github.com/lh3/minimap2">minimap2</a><span>&nbsp;</span>for adapting input and output and producing more sensitive and accurate dovetail overlaps, and some useful utilities (see<span>&nbsp;</span><a href="https://github.com/Nextomics/NextDenovo/blob/master/doc/UTILITY.md">here</a><span>&nbsp;</span>for more details).</p><p>Address of the bookmark: <a href="https://github.com/Nextomics/NextDenovo" rel="nofollow">https://github.com/Nextomics/NextDenovo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</guid>
	<pubDate>Fri, 19 Oct 2018 07:25:21 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37957/base-a-practical-de-novo-assembler-for-large-genomes-using-long-ngs-reads</link>
	<title><![CDATA[BASE: a practical de novo assembler for large genomes using long NGS reads]]></title>
	<description><![CDATA[<p><span>BASE is a new&nbsp;</span><em>de novo</em><span>&nbsp;assembler. It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have a high probability of appearing uniquely in the genome. Such seeds form the basis for BASE to build extension trees and then to use reverse validation to remove branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of reads sharing the seeds. These consensus sequences are then extended to contigs.</span></p><p>Address of the bookmark: <a href="https://github.com/dhlbh/BASE" rel="nofollow">https://github.com/dhlbh/BASE</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>