<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43728?offset=100</link>
	<atom:link href="https://bioinformaticsonline.com/related/43728?offset=100" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/37905/phased-human-genome-assembly</guid>
	<pubDate>Mon, 08 Oct 2018 09:10:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/37905/phased-human-genome-assembly</link>
	<title><![CDATA[Phased Human Genome Assembly !]]></title>
	<description><![CDATA[<p>The new publicly available assembly (PacBio&nbsp;<a href="https://www.globenewswire.com/Tracker?data=IM2cKfZgtHafORdb9VSstujBjyW-aIzFILCtXNAkcY_yqVmxdjvG01R_FZQC7zLxs-alqquXwsW6MG98G9-g-ym8Nue2pmUZMtkIg3FIat2mYbJ-z2Ra367GlinbO13x" target="_blank" title=""><span style="text-decoration: underline;">HG00733</span></a>) has the fewest gaps of any human genome assembly, with more than half of the genome contained in gapless sequence at least 27 Mb long. The primary contig assembly is 2.89 Gb long and consists of 865 contigs that were assembled with PacBio data generated with the company&rsquo;s Sequel<span>&reg;</span>&nbsp;System. Using the&nbsp;<a href="https://www.globenewswire.com/Tracker?data=jOa6mE1Y5r8VbU1CaCgx1A0HsoVzJ7waxOiDKgvmKL6cwJq_eH4nWrGj2vLkNpxHl1-5CH4htDB4113PXT8WU60hvHQ-KKpvAwQwveEGvz3N4d0q7QHSa_X97LW8_9xEiYqfsc4d24ca-IpVYZsf7Ue-XL7fSIIZw_EHK-F96t1aaQNRcD-z1PP5qvlZbVwX" target="_blank" title=""><span style="text-decoration: underline;">FALCON-Unzip assembler</span></a>, maternal and paternal haplotypes were resolved over more than 80% of the genome. Maternal and paternal haplotype blocks were then further phased using Hi-C technology and the&nbsp;<a href="https://www.globenewswire.com/Tracker?data=jOa6mE1Y5r8VbU1CaCgx1IrQmRcKvNQm83FLTqQE6OGzutM-fEggnm4Z-nsniK0D_YmDKS_UKWE0NHtHbgvbL973Y2-9NhrWhYKizXQ4lpiTvlqPf1UZdjqVs7BDjISgDnovv8foYw8es8jQzAg5Xfq1CH36NOnWQgA_X04XSvyEEEj0q801Im6cV5M5K4eL15vb_ZgUayccOvDY_fc6lxxPAAAyA4h16-zUN44Y81KdujciCrJrv5xynMIXEjRsaIKCf6eCX_Q1j_uZlN5TD0MVr6HulTYG8lGgyL0x-eQ=" target="_blank" title=""><span style="text-decoration: underline;">FALCON-Phase method</span></a>&nbsp;developed in collaboration with Phase Genomics.
The genome was then&nbsp;<em>de novo</em>&nbsp;scaffolded using Phase Genomics&rsquo;&nbsp;<a href="https://www.globenewswire.com/Tracker?data=4wcqEWHJpCHRJARQkC0oVkYT9htT14iVebujxcW1nMpAjmigHGQ46ObCGetRfyaZm1ADIHaV1-30B9izTAhjJ-efhFlxorUxs08kdV-9AAzQyuHJ9S7wxnRRnyegsTZd" target="_blank" title=""><span style="text-decoration: underline;">Proximo Hi-C platform</span></a>, resulting in the first chromosome-scale diploid assembly of a single individual accomplished with only two technologies. More specific details about the assembly are included on the PacBio blog.</p><p>The data are available under the following NCBI accession IDs: BioProject (<a href="https://www.globenewswire.com/Tracker?data=YZtCuhY2wu5H0yIso9jtUufPXbwyHh1QOZ1jBggGpK5NtXaU_JGC9X39F3uHZ96uVmu6hW5OB2Qq805hUEW2OhSNCm630yFiEF6_nsAwYB0=" target="_blank" title=""><span style="text-decoration: underline;">PRJNA483067</span></a>), assembly (<a href="https://www.globenewswire.com/Tracker?data=CEXZ7E56JOsRgfH4Wq3r5LVbv4QH_UIekV9idYBys9l8K7pFft824jmYWNzJqK7lQ9fMbaAtbURpm8gM7zqUbpPUrydFwrkJGGtG-NBHctjyjddiFY-p06xZPm2mHXE2" target="_blank" title=""><span style="text-decoration: underline;">RBJD00000000</span></a>), and sequence data (<a href="https://www.globenewswire.com/Tracker?data=pELP2RpqTqTRaPF9yN1N7GZYlQmTxpY0aW-B8xaNw6iyD-Lylw7X3UzMDK3YS4AIYgLtD13em2XsbzOwKhXuNbI4Ks6-LSyXl1_yVdFoB0U=" target="_blank" title=""><span style="text-decoration: underline;">SRP155659</span></a>).</p><p><span>Additional Resources</span></p><ul>
<li><a href="http://globenewswire.com/Tracker?data=zXpdadphSgIAIEWeq46yRPm5-TU0H7wTkL48ue4I9GsaHd5mJyMb9PgXgAsElREkLOCOdWdJ8uW9DHB-LyQ7xhzbd97Qis6CuAlqD0ubGgY%3D" target="_blank" title=""><span style="text-decoration: underline;">Interactive map</span></a>&nbsp;showcasing global initiatives underway to generate reference-quality human genome assemblies for diverse populations</li>
<li><a href="http://globenewswire.com/Tracker?data=EQ8NIaaa8k1Nw1MPRJYIHYrqgsDy92kU8W0siJdGQhq5IJ0dcb890PFFm-C1SrAlFf0xkxUVRxZefFK5ebhoIzmS-6OjR1G9sTxOkCOwRHCAZWmHL-e7uGSuZYcw1VsDp8AeDWO0RwcepMMB6hAoR6BBCJDiJVVZtdFlWBn2uxs%3D" target="_blank" title=""><span style="text-decoration: underline;">BioReport Podcast</span></a>&nbsp;on the value of ethnic-specific reference genomes</li>
<li><em>Nature Reviews Genetics</em>&nbsp;paper from NHGRI:&nbsp;<a href="http://globenewswire.com/Tracker?data=dffu-wPD_JX1_KVeCA6VFy-kP1tlAUbn7d85saXD59dnnJfT2BE3N_Rbm6kT4BvifA_XEs49ioa75cy4HyFi90RA_LRa2QFF6Y4mr-dcoMucljZw0K4JNDZuwWkWPE51cVC2Lqq3E3C1aZ8un6Bq3i-OO_NiVH0hh23hUw4wC84%3D" target="_blank" title=""><span style="text-decoration: underline;">Prioritizing&nbsp;diversity&nbsp;in human genomics research</span></a></li>
<li>Article in&nbsp;<em>The Journal of Precision Medicine</em>: &ldquo;<a href="http://globenewswire.com/Tracker?data=yokLqO2TCBLCdj6uZl-GYbqcGMWBerBYjSPrLMumNrWF2p5XlXq9yl5p-1b5xx3Ckfn5ZjQWkdhxLttbiNae5gccUCP-9RWPUqvTu9MuU9zgJ1c8e14lAladCuEOiVZ2oVRiqssPtLu9hgQWw4ad5EUxZemevsHE4BHC6IiFmMZ6DS6ApwZu-IonFgCFBIcjWOpitQthDASosfaqkMi9LsKgLU9F0WGVJDDOzHXpddhjfCUdEEJ7xC1p8uh9TSiCZgZV6XPlUJSe8n0C_9TtOw%3D%3D" target="_blank" title=""><span style="text-decoration: underline;">Minority Report &ndash; Ethnic Diversity and the Real Promise for Precision Medicine</span></a>&rdquo;</li>
<li>Article&nbsp;in&nbsp;<em>Bio-IT World</em>: &ldquo;<a href="http://globenewswire.com/Tracker?data=rLp1pKetctTPitNEnRjOVDZ3Cvw3FUdL6_ybXncvhjR4ksOrX3y6HUK8WtLlKHT7XZzq_woUjZ-uw20YNvsP0GZAmy5lVqETt27oBLi02wFtTH_6ubELIHtBu8vfVyKnqKp-YhosFG5K7y0RUtzmNjOAlCYPAeVXabn2a2AiSePxUXA_tSy_g79hjYm63x9dPN9oFQGYedOsyHD_ls8DKw%3D%3D" target="_blank" title=""><span style="text-decoration: underline;">Genomic Data Standards Are a Necessity</span></a>&rdquo;</li>
<li>NHGRI Project Award:&nbsp;<a href="http://globenewswire.com/Tracker?data=FbqTEeRffJ88lFryYX6MiOefXvIXFdZDAyW4nrFoYNHaJyMEYIcb7I4BIcEQmxzsKOjrlf9F8irfRJeJLOqG8KFsl-kvkhakUkg3BfYdKGnpLzKYyWbUFR0aKMeEXirHBi7oDLEUSDO45qxANwxyee-pqZXfzAIwF1Wcuaf7EIzNqRqmBUJ3TyNyI05lwAo9gDKmApMnJo5VxPj5P_6rY8lisuv1PNSAh_kJPOuhVBk%3D" target="_blank" title=""><span style="text-decoration: underline;">High Quality Human and Non-Human Primate Genome Assemblies</span></a></li>
</ul><p>More details are available on the PacBio website:</p><ul>
<li>Blog post:&nbsp;<a href="http://globenewswire.com/Tracker?data=ycj-ujgsKzVyljNa11buVmIS5tk9B733VsFZEw77nBXo-IkBvcoG16dN9vuTiY3nm2G5dJZS5Iva3w_znrEtJVDuU8cVlFpozY2ibinKwrMGxkXZVSqW8_uD8fbySRjM5Q_cjuPU22ARFSSLCc9vHJx9WHnb9Rza-qPbuWgewa0rWWStq2fQY5mLpeaQf5fcDJnyQkvDAMI3fauXdzyThg%3D%3D" target="_blank" title=""><span style="text-decoration: underline;">Data Release: Highest-Quality, Most Contiguous Individual Human Genome Assembly to Date</span></a></li>
<li>Blog post:&nbsp;<a href="http://globenewswire.com/Tracker?data=GlZZ9nyp5mDSjJPPfhVD1-dZ_W2l8s0eAUox3TQs949zyGjzO7dx9xodyvyqerdqPC-G3ZhdPEs9xNhJwflrwgHPYQL3kTofprKHBBq3O4gn9E75YUBweJw9b6tTE89sMLUQzF-vRNNDjero3mibm_uG-fSHoYBTm2ZlyEmwzZ5E9tXVd5_RjG0Xnej2E0scA0SncEItAF6Q7vdOydTV_Yr9yYT2TmKY5jtyAt6ZrNGn3McqfV9mMRkR-8dYJLqrQln9JiEkWTwUae6Blj56HyjyXKl6Dfa_CyNuy4r-EWU%3D" target="_blank" title=""><span style="text-decoration: underline;">For Reference-Grade Human Genome Assemblies, SMRT Sequencing Yields Optimal Results</span></a></li>
<li>Webinar: &nbsp;<a href="http://globenewswire.com/Tracker?data=xlnfDwMNLGZZvtexJYsUgMe-DV8HNrYx2QqjwIjfj40dToVtqrBi-gvhknHZmIe8GV_3WU3_9LIlP6GzG3ZoajnDIpwECzdMV5Vyy8Ast4Y2AiHJckf7rBhZVEU4_mV4JB0k3I9XjN2jHK8Cp5uBxyIWWqPdI6qBBdCYYhYLXUTkKpaZEV98oCfC5ET2Q7OSwUM7NieKa75yzMHwaPEYwg%3D%3D" target="_blank" title=""><span style="text-decoration: underline;">Assembling High-Quality Human Reference Genomes for Global Populations</span></a></li>
<li>FALCON-Phase&nbsp;<a href="http://globenewswire.com/Tracker?data=4Z9LDdRq3w2zYFQXEFGmz6u-Vrbfh96syfzrQMKhegLRo2PUvk7s3Xz_y1o--NuTLoCQMrHsqOEBUHIL1IPeOmhyf6Eqwdp8dv8xYo9gSVI%3D" target="_blank" title=""><span style="text-decoration: underline;">press release</span></a>&nbsp;and article&nbsp;<a href="http://globenewswire.com/Tracker?data=4Z9LDdRq3w2zYFQXEFGmz9Ts_IJqHWWrKd33x_ldJEU9mSKXpcVTTi9ioY0kVqrbrXHeCKDf4TdPnAoPJaGBK3YeZtYp-nXZacgyPESZ1XboSUZEJ9rIhDyW7bTLL5HN" target="_blank" title=""><span style="text-decoration: underline;">preprint</span></a></li>
<li>PacBio research focus webpage about&nbsp;<a href="http://globenewswire.com/Tracker?data=E-zzUkw4N01KR4muPun47qg4HX8ToDvLS4sX953hLM2wRyQZ2upkLR4WidyXTFDRLWQORpqxnkbD-CNzsOJyIfH8mJPbrLwRf04J4yjuNdem-Fulc8QIT3OCi4wx5LpqgC2ymLE0rYX5UOpbFPBgvA%3D%3D" target="_blank" title=""><span style="text-decoration: underline;">Human Population Genetics</span></a></li>
</ul><p>&nbsp;Ref:&nbsp;https://stockguru.com/2018/10/08/pacific-biosciences-releases-highest-quality-most-contiguous-individual-human-genome-assembly-to-date/</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39269/ragoo-fast-reference-guided-scaffolding-of-genome-assembly-contigs</guid>
	<pubDate>Wed, 17 Apr 2019 19:45:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39269/ragoo-fast-reference-guided-scaffolding-of-genome-assembly-contigs</link>
	<title><![CDATA[RaGOO: Fast Reference-Guided Scaffolding of Genome Assembly Contigs]]></title>
	<description><![CDATA[<p>Alonge M, Soyk S, Ramakrishnan S, Wang X, Goodwin S, Sedlazeck FJ, Lippman ZB, Schatz MC:&nbsp;<a href="https://www.biorxiv.org/content/early/2019/01/13/519637">Fast and accurate reference-guided scaffolding of draft genomes</a>.&nbsp;<em>bioRxiv</em>&nbsp;2019.</p>
<p>RaGOO is a tool for coalescing genome assembly contigs into pseudochromosomes via minimap2 alignments to a closely related reference genome. The tool focuses on practicality and therefore has the following features:</p>
<ol>
<li>Good performance. On a MacBook Pro using Arabidopsis data, pseudochromosome construction takes less than a minute and the whole pipeline with SV calling takes ~2 minutes.</li>
<li>Intact ordering and orienting of contigs.</li>
<li><a href="https://github.com/malonge/RaGOO/wiki/Breaking-Chimeric-Contigs">Chimeric contig correction</a></li>
<li><a href="https://github.com/malonge/RaGOO/wiki/GFF-File-Lift-Over">GFF lift-over</a></li>
<li><a href="https://github.com/malonge/RaGOO/wiki/Calling-Structural-Variants">Structural variant calling with an integrated version of Assemblytics</a></li>
<li>Confidence scores associated with the grouping, localization, and orientation for each contig.</li>
</ol><p>Address of the bookmark: <a href="https://github.com/malonge/RaGOO" rel="nofollow">https://github.com/malonge/RaGOO</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44366/mitofinder</guid>
	<pubDate>Tue, 29 Aug 2023 02:13:01 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44366/mitofinder</link>
	<title><![CDATA[MitoFinder]]></title>
	<description><![CDATA[<p dir="auto">Allio, R., Schomaker-Bastos, A., Romiguier, J., Prosdocimi, F., Nabholz, B., &amp; Delsuc, F. (2020) Mol Ecol Resour. 20, 892-905. (<a href="https://doi.org/10.1111/1755-0998.13160">publication link</a>)</p>
<p dir="auto" style="text-align: center;"><a href="https://github.com/RemiAllio/MitoFinder/blob/master/image/logo.png" target="_blank"><img src="https://github.com/RemiAllio/MitoFinder/raw/master/image/logo.png" alt="Drawing" width="250" style="border: 0px;"></a></p>
<p dir="auto"><span>MitoFinder</span>&nbsp;is a pipeline to&nbsp;<span>assemble</span>&nbsp;mitochondrial genomes and&nbsp;<span>annotate</span>&nbsp;mitochondrial genes from trimmed read sequencing data.</p>
<p dir="auto"><span>MitoFinder</span>&nbsp;is also designed to&nbsp;<span>find</span>&nbsp;and&nbsp;<span>annotate</span>&nbsp;mitochondrial sequences in existing genomic assemblies (generated from HiFi/PacBio/Nanopore/Illumina sequencing data...).</p>
<p dir="auto"><span>MitoFinder</span>&nbsp;is distributed under the&nbsp;<a href="https://github.com/RemiAllio/MitoFinder/blob/master/License/LICENSE">license</a>.</p><p>Address of the bookmark: <a href="https://github.com/RemiAllio/MitoFinder" rel="nofollow">https://github.com/RemiAllio/MitoFinder</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43652/peregrine-shimmer-genome-assembly-toolkit</guid>
	<pubDate>Thu, 16 Dec 2021 02:50:19 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43652/peregrine-shimmer-genome-assembly-toolkit</link>
	<title><![CDATA[Peregrine & SHIMMER Genome Assembly Toolkit]]></title>
	<description><![CDATA[<p><span>Peregrine is a fast genome assembler for accurate long reads (length &gt; 10 kb, accuracy &gt; 99%). It can assemble a human genome from 30x reads within 20 CPU hours, from reads to polished consensus. It uses Sparse HIerarchical MiniMizER (SHIMMER) for fast read-to-read overlapping without the quadratic comparisons used in other OLC assemblers.</span></p><p>Address of the bookmark: <a href="https://github.com/cschin/Peregrine" rel="nofollow">https://github.com/cschin/Peregrine</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36635/circlator-automated-circularization-of-genome-assemblies-using-long-sequencing-reads</guid>
	<pubDate>Tue, 15 May 2018 09:42:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36635/circlator-automated-circularization-of-genome-assemblies-using-long-sequencing-reads</link>
	<title><![CDATA[Circlator: automated circularization of genome assemblies using long sequencing reads]]></title>
	<description><![CDATA[A tool to circularize genome assemblies. The algorithm and benchmarks are described in the Genome Biology manuscript. 

Citation: "Circlator: automated circularization of genome assemblies using long sequencing reads", Hunt et al., Genome Biology 2015 Dec 29;16(1):294. doi: 10.1186/s13059-015-0849-0. PMID: 26714481.<p>Address of the bookmark: <a href="http://sanger-pathogens.github.io/circlator/" rel="nofollow">http://sanger-pathogens.github.io/circlator/</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</guid>
	<pubDate>Tue, 14 Jan 2020 06:47:07 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40531/shasta-long-read-assembler</link>
	<title><![CDATA[Shasta long read assembler]]></title>
	<description><![CDATA[<p>The goal of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by&nbsp;<a href="https://nanoporetech.com/">Oxford Nanopore</a>&nbsp;flow cells.</p>
<p>Computational methods used by the Shasta assembler include:</p>
<ul>
<li>Using a&nbsp;<a href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length</a>&nbsp;representation of the read sequence. This makes the assembly process more resilient to errors in homopolymer repeat counts, which are the most common type of errors in Oxford Nanopore reads.</li>
<li>Using, in some phases of the computation, a representation of the read sequence based on&nbsp;<em>markers</em>, a fixed subset of short k-mers (k &asymp; 10).</li>
</ul>
<p>More at&nbsp;<a href="https://chanzuckerberg.github.io/shasta/index.html">https://chanzuckerberg.github.io/shasta/index.html</a></p><p>Address of the bookmark: <a href="https://github.com/chanzuckerberg/shasta" rel="nofollow">https://github.com/chanzuckerberg/shasta</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41896/kad-assessing-genome-assemblies-using-k-mer-copies-in-assemblies-and-k-mer-abundance-in-illumina-reads</guid>
	<pubDate>Fri, 19 Jun 2020 07:34:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41896/kad-assessing-genome-assemblies-using-k-mer-copies-in-assemblies-and-k-mer-abundance-in-illumina-reads</link>
	<title><![CDATA[KAD: Assessing genome assemblies using K-mer copies in assemblies and K-mer abundance in Illumina reads]]></title>
	<description><![CDATA[<p>KAD is designed for evaluating the base-level accuracy of genome assemblies. Briefly, k-mer abundances are quantified for both the sequencing reads and the assembly sequences. Comparing the two values yields a single value per k-mer, the K-mer Abundance Difference (KAD), which indicates how well the assembly matches the read data for that k-mer.</p>
<p><a href="https://render.githubusercontent.com/render/math?math=KAD=log_{2}\begin{pmatrix}\frac{c%2Bm}{m(n%2B1)}\end{pmatrix}" target="_blank"><img src="https://render.githubusercontent.com/render/math?math=KAD=log_{2}\begin{pmatrix}\frac{c%2Bm}{m(n%2B1)}\end{pmatrix}" alt="KAD = log2((c + m) / (m(n + 1)))" style="border: 0px;"></a></p>
<p>where&nbsp;<em>c</em>&nbsp;is the count of a k-mer in the reads,&nbsp;<em>m</em>&nbsp;is the mode of the read k-mer counts, and&nbsp;<em>n</em>&nbsp;is the copy number of the k-mer in the assembly.</p><p>Address of the bookmark: <a href="https://github.com/liu3zhenlab/KAD" rel="nofollow">https://github.com/liu3zhenlab/KAD</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43791/comparative-genomics-visualisation-tools</guid>
	<pubDate>Thu, 17 Feb 2022 05:37:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43791/comparative-genomics-visualisation-tools</link>
	<title><![CDATA[Comparative genomics visualisation tools !]]></title>
	<description><![CDATA[<p>Comparative genomics visualisation tools !</p><p>Address of the bookmark: <a href="https://cmdcolin.github.io/awesome-genome-visualization/?latest=true&amp;selected=%23BRIG&amp;tag=Comparative" rel="nofollow">https://cmdcolin.github.io/awesome-genome-visualization/?latest=true&amp;selected=%23BRIG&amp;tag=Comparative</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</guid>
	<pubDate>Fri, 08 Jun 2018 09:31:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</link>
	<title><![CDATA[Jvarkit : Java utilities for Bioinformatics]]></title>
	<description><![CDATA[A collection of Java toolkits for bioinformatics work:

Jvarkit: Java utilities for Bioinformatics<p>Address of the bookmark: <a href="http://lindenb.github.io/jvarkit/" rel="nofollow">http://lindenb.github.io/jvarkit/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/34707/string-graph-based-genome-assembly-software-and-tools</guid>
	<pubDate>Tue, 19 Dec 2017 17:17:38 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/34707/string-graph-based-genome-assembly-software-and-tools</link>
	<title><![CDATA[String graph based genome assembly software and tools !]]></title>
	<description><![CDATA[<p>In&nbsp;<a href="https://en.wikipedia.org/wiki/Graph_theory" title="Graph theory">graph theory</a>, a&nbsp;<strong>string graph</strong>&nbsp;is an&nbsp;<a href="https://en.wikipedia.org/wiki/Intersection_graph" title="Intersection graph">intersection graph</a>&nbsp;of&nbsp;<a href="https://en.wikipedia.org/wiki/Curve" title="Curve">curves</a>&nbsp;in the plane; each curve is called a "string".&nbsp;In genome assembly the term has a different meaning: a string graph is a graph of read overlaps simplified by removing contained reads and transitively inferable edges. The assembly string graph was first proposed by E. W. Myers in a&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/21/suppl_2/ii79.full.pdf+html">2005 publication</a>.&nbsp;A&nbsp;recent&nbsp;<a href="http://genome.cshlp.org/content/early/2012/01/22/gr.126953.111">Genome Research paper</a>&nbsp;describing an innovative approach for assembling large genomes from NGS data caught our attention for several reasons: i) it gives a different, "string graph" perspective on the long-standing genome assembly problem; ii) the&nbsp;paper is coauthored by Jared Simpson, the developer of the&nbsp;<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2694472/">ABySS assembler</a>, and Richard Durbin; iii) the&nbsp;Simpson-Durbin algorithm does not rely on de Bruijn graphs and instead employs a different graph construction approach called &lsquo;string graph&rsquo;.</p><p>The following genome assembly tools are based on string graphs:</p><p>1. SGA (String Graph Assembler)&nbsp;https://github.com/jts/sga</p><p>Assembles large genomes from high coverage short read data. SGA is designed as a modular set of programs, which are used to form an assembly pipeline. SGA implements a set of assembly algorithms based on the FM-index. As the FM-index is a compressed data structure, the algorithms are very memory efficient. The SGA assembly has three distinct phases. The first phase corrects base calling errors in the reads. The second phase assembles contigs from the corrected reads. The third phase uses paired end and/or mate pair data to build scaffolds from the contigs.
The output of this software is a PDF report that allows the properties of the genome and data quality to be visually explored. By providing more information to the user at the start of an assembly project, this software will help increase awareness of the factors that make a given assembly easy or difficult, assist in the selection of software and parameters and help to troubleshoot an assembly if it runs into problems.</p><p>2.&nbsp;SAGE: String-overlap Assembly of GEnomes&nbsp;https://github.com/lucian-ilie/SAGE2</p><p>SAGE is a tool for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on string-overlap graph and maximum likelihood assembly, bringing an important number of new ideas, such as the efficient computation of the transitive reduction of the string overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers.</p><p>3. FSG: Fast String Graph</p><p>The new integrated assembler has been assessed on a standard benchmark, showing that fast string graph (FSG) is significantly faster than SGA while maintaining a moderate use of main memory, and showing practical advantages in running FSG on multiple threads. Moreover, we have studied the effect of coverage rates on the running times.</p><p>4.&nbsp;BASE&nbsp;https://github.com/dhlbh/BASE</p><p>It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have high probability to appear uniquely in the genome.
Such seeds form the basis for BASE to build extension trees and then to use reverse validation to remove the branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of reads sharing the seeds. Such consensus sequences are then extended to contigs.&nbsp;BASE is a practically efficient tool for constructing contigs, with significant improvement in quality for long NGS reads. It is relatively easy to extend BASE to include scaffolding.</p><p>5.&nbsp;Fermi&nbsp;https://github.com/lh3/fermi/</p><p>Fermi is a de novo assembler with a particular focus on assembling Illumina&nbsp;short sequence reads from a mammal-sized genome. In addition to the role of a&nbsp;typical assembler, fermi also aims to preserve heterozygotes which are often&nbsp;collapsed by other assemblers. Its ultimate goal is to find a minimal set of&nbsp;unitigs to represent all the information in raw reads.</p><p>If you want to learn about string graph assemblers, please read the following papers:</p><p>i)&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/21/suppl_2/ii79.full.pdf+html">The Fragment Assembly String Graph - E. W. Myers</a></p><p>This paper describes the String Graph concept.</p><p>ii)&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/26/12/i367.full#ref-20">Efficient construction of an assembly string graph using the FM-index - Jared T. Simpson and Richard Durbin</a></p><p>This is the earlier paper from Simpson and Durbin.</p><p>iii)&nbsp;<a href="http://genome.cshlp.org/content/early/2012/01/22/gr.126953.111">Efficient de novo assembly of large genomes using compressed data structures - Jared T. Simpson and Richard Durbin</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>