<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/18653?offset=60</link>
	<atom:link href="https://bioinformaticsonline.com/related/18653?offset=60" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41937/merqury-evaluate-genome-assemblies-with-k-mers</guid>
	<pubDate>Fri, 03 Jul 2020 19:29:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41937/merqury-evaluate-genome-assemblies-with-k-mers</link>
	<title><![CDATA[merqury: Evaluate genome assemblies with k-mers]]></title>
	<description><![CDATA[<p><span>Genome assembly projects often have Illumina whole-genome sequencing reads available for the assembled individual. The k-mer spectrum of this read set can be used to evaluate assembly quality independently, without needing a high-quality reference. Merqury provides a set of tools for this purpose.</span></p>
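<p>As a rough illustration of the idea (not Merqury's own implementation), the Python sketch below enumerates k-mers from a few toy reads and reports the fraction of distinct assembly k-mers that also occur in the reads. The function names, the choice of k = 3, and all sequences are assumptions made for this example. Canonical (strand-independent) k-mers are ignored to keep it short.</p>
<div><pre>
```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, sliding the window one base at a time."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(seqs, k):
    """Count k-mers across a collection of sequences."""
    counts = Counter()
    for seq in seqs:
        counts.update(kmers(seq, k))
    return counts


def read_supported_fraction(assembly, read_counts, k):
    """Fraction of distinct assembly k-mers that occur in the reads."""
    asm = set(kmers(assembly, k))
    if not asm:
        return 0.0
    supported = sum(1 for km in asm if km in read_counts)
    return supported / len(asm)


# Toy data (hypothetical): three overlapping, error-free reads.
reads = ["ACGTAC", "GTACGG", "ACGGTT"]
read_counts = count_kmers(reads, k=3)

good = "ACGTACGGTT"  # consistent with the reads
bad = "ACGTAAGGTT"   # one substituted base (C to A at position 6)

print(read_supported_fraction(good, read_counts, k=3))  # 1.0
print(read_supported_fraction(bad, read_counts, k=3))   # 0.625
```
</pre></div>
<p>A single substituted base introduces up to k assembly k-mers that the reads do not support, which is why the second fraction drops.</p>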
<p><span>More at&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2020.03.15.992941v1.full">https://www.biorxiv.org/content/10.1101/2020.03.15.992941v1.full</a></span></p><p>Address of the bookmark: <a href="https://github.com/marbl/merqury" rel="nofollow">https://github.com/marbl/merqury</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42267/hapsolo-an-optimization-approach-for-removing-secondary-haplotigs-during-diploid-genome-assembly-and-scaffolding</guid>
	<pubDate>Mon, 26 Oct 2020 21:23:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42267/hapsolo-an-optimization-approach-for-removing-secondary-haplotigs-during-diploid-genome-assembly-and-scaffolding</link>
	<title><![CDATA[HapSolo: An optimization approach for removing secondary haplotigs during diploid genome assembly and scaffolding.]]></title>
	<description><![CDATA[<p><span>Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative contigs that represent highly heterozygous regions. If primary and secondary contigs are not properly identified, the primary assembly will overrepresent both the size and complexity of the genome, which complicates downstream analysis such as scaffolding.</span></p>
<p><span>More at&nbsp;https://github.com/esolares/HapSolo</span></p><p>Address of the bookmark: <a href="https://github.com/esolares/HapSolo" rel="nofollow">https://github.com/esolares/HapSolo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43055/infogenomer-integrative-reconstruction-of-cancer-genome-karyotypes</guid>
	<pubDate>Wed, 05 May 2021 01:02:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43055/infogenomer-integrative-reconstruction-of-cancer-genome-karyotypes</link>
	<title><![CDATA[InfoGenomeR: Integrative reconstruction of cancer genome karyotypes]]></title>
	<description><![CDATA[<p>InfoGenomeR is the Integrative Framework for Genome Reconstruction that uses a breakpoint graph to model the connectivity among genomic segments at the genome-wide scale. InfoGenomeR integrates cancer purity and ploidy, total CNAs, allele-specific CNAs, and haplotype information to identify the optimal breakpoint graph representing cancer genomes.</p>
<p><img src="https://github.com/YeonghunL/InfoGenomeR/raw/master/doc/overview.png" alt="image" style="border: 0px; border: 0px;"></p>
<p>More at&nbsp;https://www.nature.com/articles/s41467-021-22671-6</p><p>Address of the bookmark: <a href="https://github.com/dmcblab/InfoGenomeR" rel="nofollow">https://github.com/dmcblab/InfoGenomeR</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43273/understanding-kmer</guid>
	<pubDate>Wed, 18 Aug 2021 04:27:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43273/understanding-kmer</link>
	<title><![CDATA[Understanding kmer !]]></title>
	<description><![CDATA[<p><a href="https://en.wikipedia.org/wiki/k-mer">What is a&nbsp;<em>k-mer</em>&nbsp;anyway?</a><span>&nbsp;A&nbsp;</span><em>k-mer</em><span>&nbsp;is just a sequence of&nbsp;</span><em>k</em><span>&nbsp;characters in a string (or nucleotides in a DNA sequence). Now, it is important to remember that to get&nbsp;</span><em>all k-mers</em><span>&nbsp;from a sequence you need to get the first&nbsp;</span><em>k</em><span>&nbsp;characters, then move just a single character for the start of the next&nbsp;</span><em>k-mer</em><span>&nbsp;and so on. Effectively, this will create sequences that overlap in&nbsp;</span><code>k-1</code><span>&nbsp;positions.</span></p><p>Address of the bookmark: <a href="https://bioinfologics.github.io/post/2018/09/17/k-mer-counting-part-i-introduction/" rel="nofollow">https://bioinfologics.github.io/post/2018/09/17/k-mer-counting-part-i-introduction/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43614/mitoz-a-toolkit-for-animal-mitochondrial-genome-assembly-annotation-and-visualization</guid>
	<pubDate>Tue, 30 Nov 2021 23:23:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43614/mitoz-a-toolkit-for-animal-mitochondrial-genome-assembly-annotation-and-visualization</link>
	<title><![CDATA[MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization]]></title>
	<description><![CDATA[<p>MitoZ consists of independent modules for <em>de novo</em> assembly, findMitoScaf (find Mitochondrial Scaffolds), annotation and visualization, and can generate a mitogenome assembly together with annotation and visualization results from raw high-throughput sequencing (HTS) reads.</p>
<p>https://academic.oup.com/nar/article/47/11/e63/5377471</p><p>Address of the bookmark: <a href="https://github.com/linzhi2013/MitoZ" rel="nofollow">https://github.com/linzhi2013/MitoZ</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43661/maftools</guid>
	<pubDate>Fri, 17 Dec 2021 03:18:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43661/maftools</link>
	<title><![CDATA[maftools]]></title>
	<description><![CDATA[<p>With advances in cancer genomics, <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a> (MAF) is widely accepted and used to store detected somatic variants. <a href="http://cancergenome.nih.gov">The Cancer Genome Atlas</a> Project has sequenced over 30 different cancers, with a sample size of over 200 for each cancer type. <a href="https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files">The resulting data</a>, consisting of somatic variants, are stored in <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a>. This package attempts to summarize, analyze, annotate and visualize MAF files efficiently, from either TCGA sources or in-house studies, as long as the data are in MAF format.</p>
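<p>To make "summarize" concrete, here is a minimal Python sketch that tallies variants per gene and per variant classification from a tab-delimited MAF excerpt. It is not the maftools API (maftools is an R package). The three rows and the sample barcodes are made up for illustration, and only the standard MAF columns <code>Hugo_Symbol</code>, <code>Variant_Classification</code> and <code>Tumor_Sample_Barcode</code> are assumed to be present.</p>
<div><pre>
```python
import csv
import io
from collections import Counter

# Hypothetical three-variant MAF excerpt; real MAF files have many more columns.
MAF_TEXT = (
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\n"
    "TP53\tMissense_Mutation\tSAMPLE-01\n"
    "TP53\tNonsense_Mutation\tSAMPLE-02\n"
    "KRAS\tMissense_Mutation\tSAMPLE-01\n"
)


def summarize_maf(handle):
    """Count variants per gene and per variant classification."""
    per_gene = Counter()
    per_class = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = row["Hugo_Symbol"]
        variant_class = row["Variant_Classification"]
        per_gene[gene] += 1
        per_class[variant_class] += 1
    return per_gene, per_class


per_gene, per_class = summarize_maf(io.StringIO(MAF_TEXT))
print(per_gene.most_common())
print(per_class.most_common())
```
</pre></div>
<p>Real MAF files may begin with comment lines such as a version header, which this sketch does not skip.</p>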
<p>https://www.bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/maftools.html</p><p>Address of the bookmark: <a href="https://github.com/PoisonAlien/maftools" rel="nofollow">https://github.com/PoisonAlien/maftools</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</guid>
	<pubDate>Mon, 31 Jan 2022 07:18:16 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43728/short-read-assembly-using-spades</link>
	<title><![CDATA[Short-read assembly using SPAdes !]]></title>
	<description><![CDATA[<h2 id="short-read-assembly-a-comparison">If we only had Illumina reads, we could also assemble them with SPAdes.</h2><p>You can try this here, or try it later on your own data.</p><h2 id="get-data">Get data</h2><p>We will use the same Illumina data as above:</p><ul>
<li>illumina_R1.fastq.gz: the Illumina forward reads</li>
<li>illumina_R2.fastq.gz: the Illumina reverse reads</li>
</ul><h2 id="assemble">Assemble</h2><p>Run SPAdes:</p><div><pre>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
</pre></div><ul>
<li><code>-1</code>&nbsp;is the input file of forward reads</li>
<li><code>-2</code>&nbsp;is the input file of reverse reads</li>
<li><code>--careful</code>&nbsp;tries to reduce the number of mismatches and short indels</li>
<li><code>--cov-cutoff auto</code>&nbsp;lets SPAdes compute the coverage cutoff automatically (the default setting is &ldquo;off&rdquo;)</li>
<li><code>-o</code>&nbsp;is the output directory</li>
</ul><h2 id="results">Results</h2><p>Move into the output directory and look at the contigs:</p><div><pre>infoseq contigs.fasta</pre></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43801/smudgeplot-inference-of-ploidy-and-heterozygosity-structure-using-whole-genome-sequencing-data</guid>
	<pubDate>Fri, 25 Feb 2022 04:42:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43801/smudgeplot-inference-of-ploidy-and-heterozygosity-structure-using-whole-genome-sequencing-data</link>
	<title><![CDATA[Smudgeplot: Inference of ploidy and heterozygosity structure using whole genome sequencing data]]></title>
	<description><![CDATA[<p dir="auto">This tool extracts heterozygous kmer pairs from kmer count databases and performs gymnastics with them. We are able to disentangle genome structure by comparing the sum of kmer pair coverages (CovA + CovB) to their relative coverage (CovB / (CovA + CovB)). Such an approach also allows us to analyze obscure genomes with duplications, various ploidy levels, etc.</p>
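<p dir="auto">The two quantities above can be sketched in a few lines of Python. This is only an illustration, not Smudgeplot's code: the function name and the coverage values are hypothetical, and the pair is assumed to be ordered so that CovB is the lower of the two coverages.</p>
<div><pre>
```python
def pair_coordinates(cov_a, cov_b):
    """Return (total coverage, relative coverage) for one k-mer pair.

    Assumption for this sketch: CovB is the lower of the two coverages,
    so the relative coverage CovB / (CovA + CovB) is at most 0.5.
    """
    low, high = sorted((cov_a, cov_b))
    total = low + high
    return total, low / total


# Hypothetical k-mer pair coverages, not real data.
pairs = [(50, 50), (100, 50), (150, 50)]
for cov_a, cov_b in pairs:
    total, ratio = pair_coordinates(cov_a, cov_b)
    print(total, round(ratio, 2))
# prints:
# 100 0.5
# 150 0.33
# 200 0.25
```
</pre></div>
<p dir="auto">If the haploid coverage were about 50, ratios near 1/2, 1/3 and 1/4 at totals of two, three and four times that coverage would be consistent with AB, AAB and AAAB structures.</p>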
<p dir="auto">Smudgeplots are computed from raw or, even better, trimmed reads and show the haplotype structure using heterozygous kmer pairs. For example:</p>
<p dir="auto"><a href="https://user-images.githubusercontent.com/8181573/45959760-f1032d00-c01a-11e8-8576-ff0512c33da9.png" target="_blank"><img src="https://user-images.githubusercontent.com/8181573/45959760-f1032d00-c01a-11e8-8576-ff0512c33da9.png" alt="smudgeexample" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://github.com/KamilSJaron/smudgeplot" rel="nofollow">https://github.com/KamilSJaron/smudgeplot</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43867/genomeqc-a-quality-assessment-tool-for-genome-assemblies-and-gene-structure-annotations</guid>
	<pubDate>Thu, 19 May 2022 04:29:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43867/genomeqc-a-quality-assessment-tool-for-genome-assemblies-and-gene-structure-annotations</link>
	<title><![CDATA[GenomeQC: a quality assessment tool for genome assemblies and gene structure annotations]]></title>
	<description><![CDATA[<p><span>The GenomeQC web application is implemented in R/Shiny version 1.5.9 and Python 3.6 and is freely available at&nbsp;</span><a href="https://genomeqc.maizegdb.org/">https://genomeqc.maizegdb.org/</a><span>&nbsp;under the GPL license. All source code and a containerized version of the GenomeQC pipeline are available in the GitHub repository&nbsp;</span><a href="https://github.com/HuffordLab/GenomeQC">https://github.com/HuffordLab/GenomeQC</a><span>.</span></p>
<p>https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-6568-2</p><p>Address of the bookmark: <a href="https://github.com/HuffordLab/GenomeQC" rel="nofollow">https://github.com/HuffordLab/GenomeQC</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44352/bioinformatics-tools-for-genome-assembly</guid>
	<pubDate>Mon, 24 Jul 2023 07:04:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44352/bioinformatics-tools-for-genome-assembly</link>
	<title><![CDATA[Bioinformatics tools for genome assembly !]]></title>
	<description><![CDATA[<p>Numerous genome assembly tools are available, each with its own strengths and weaknesses. Here is a list of some widely used genome assembly tools as of September 2021:</p><ol>
<li>
<p><span>SPAdes:</span> An assembler specifically designed for single-cell and multi-cell bacterial genomes, as well as small eukaryotic genomes.</p>
</li>
<li>
<p><span>ABySS:</span> A parallelized assembler for large genomes that uses de Bruijn graphs.</p>
</li>
<li>
<p><span>Velvet:</span> Another de Bruijn graph-based assembler optimized for short-read sequencing data.</p>
</li>
<li>
<p><span>SOAPdenovo:</span> A de Bruijn graph-based assembler designed for short reads, widely used for assembling large and complex genomes.</p>
</li>
<li>
<p><span>MaSuRCA:</span> A hybrid assembler that combines data from multiple sequencing technologies, such as Illumina and PacBio.</p>
</li>
<li>
<p><span>Canu:</span> A long-read assembler optimized for PacBio and Oxford Nanopore sequencing data.</p>
</li>
<li>
<p><span>Flye:</span> A long-read assembler suitable for bacterial and small eukaryotic genomes.</p>
</li>
<li>
<p><span>SMARTdenovo:</span> An assembler designed for long reads, particularly suited for PacBio data.</p>
</li>
<li>
<p><span>hybridSPAdes:</span> The hybrid mode of SPAdes, which uses long reads such as those from PacBio or Nanopore alongside short reads.</p>
</li>
<li>
<p><span>Minia:</span> An assembler optimized for low memory consumption, suitable for small and medium-sized genomes.</p>
</li>
<li>
<p><span>Unicycler:</span> A hybrid assembler that combines short and long reads for circular bacterial genome assembly.</p>
</li>
<li>
<p><span>wtdbg2:</span> A long-read assembler based on a fuzzy Bruijn graph, efficient for very large genomes.</p>
</li>
<li>
<p><span>Shasta:</span> A long-read assembler that uses the Overlap-Layout-Consensus approach, suitable for PacBio and Nanopore data.</p>
</li>
<li>
<p><span>Sparc:</span> A consensus tool for polishing assemblies built from noisy long reads, such as PacBio and Nanopore data.</p>
</li>
<li>
<p><span>CANA:</span> An assembler for metagenomic data, particularly for complex and diverse microbial communities.</p>
</li>
<li>
<p><span>Ra Assembler:</span> A metagenome assembler for long reads, designed for highly complex metagenomic samples.</p>
</li>
</ol><p>Please note that the field of bioinformatics is constantly evolving, and new assembly tools may have emerged since this list was compiled. Additionally, the performance of these tools can vary depending on the characteristics of the sequencing data and the genome being assembled. When selecting an assembly tool, consider the specific requirements of your project, the available data types, and the computational resources at your disposal. Always refer to the respective tool's documentation and publications for the most up-to-date information and recommendations.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

</channel>
</rss>