<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/39213?offset=380</link>
	<atom:link href="https://bioinformaticsonline.com/related/39213?offset=380" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42271/mcclintock-meta-pipeline-to-identify-transposable-element-insertions-using-next-generation-sequencing-data</guid>
	<pubDate>Tue, 27 Oct 2020 00:21:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42271/mcclintock-meta-pipeline-to-identify-transposable-element-insertions-using-next-generation-sequencing-data</link>
	<title><![CDATA[McClintock: Meta-pipeline to identify transposable element insertions using next generation sequencing data]]></title>
	<description><![CDATA[<p><span>McClintock (</span><a href="https://github.com/bergmanlab/mcclintock">https://github.com/bergmanlab/mcclintock</a><span>) is an integrated bioinformatics pipeline for detecting transposable element (TE) insertions in whole-genome shotgun data. It automatically runs multiple TE detection methods and standardizes their output. The authors demonstrate its utility by evaluating six TE detection methods using simulated and real genome data from the model microbial eukaryote,&nbsp;</span><em>Saccharomyces cerevisiae</em><span>.</span></p><p>Address of the bookmark: <a href="https://github.com/bergmanlab/mcclintock" rel="nofollow">https://github.com/bergmanlab/mcclintock</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34463/single-cell-rnaseq-data-analysis-tutorial</guid>
	<pubDate>Mon, 27 Nov 2017 16:24:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34463/single-cell-rnaseq-data-analysis-tutorial</link>
	<title><![CDATA[Single Cell RNAseq data analysis tutorial !!]]></title>
	<description><![CDATA[<p>Bulk RNA-seq:</p><ul>
<li>A major breakthrough in the late 2000s that replaced microarrays and has been widely used since</li>
<li>Measures the&nbsp;average expression level&nbsp;for each gene across a large population of input cells</li>
<li>Useful for comparative transcriptomics, e.g.&nbsp;samples of the same tissue from different species</li>
<li>Useful for quantifying expression signatures from ensembles, e.g.&nbsp;in disease studies</li>
<li>Insufficient&nbsp;for studying heterogeneous systems, e.g.&nbsp;early development studies, complex tissues (brain)</li>
<li>Does&nbsp;not&nbsp;provide insights into the stochastic nature of gene expression</li>
</ul><p>The following links are useful:</p><p><a href="http://hemberg-lab.github.io/scRNA.seq.course/scRNA-seq-course.pdf" target="_blank">Single Cell RNAseq data analysis Tutorial</a></p><p><a href="https://f1000research.com/articles/5-2122/v2" target="_blank">A step-by-step workflow for low-level analysis of single-cell RNA-seq data</a></p><p><a href="https://www.bioconductor.org/help/workflows/simpleSingleCell/" target="_blank">A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor</a></p><p>SCell: single-cell RNA-seq analysis software</p><p><a href="https://github.com/diazlab/SCell">https://github.com/diazlab/SCell</a></p><p>Beta-Poisson model for single-cell RNA-seq data analyses</p><p><a href="https://github.com/nghiavtr/BPSC">https://github.com/nghiavtr/BPSC</a></p><p>Sincera: A Computational Pipeline for Single Cell RNA-Seq Profiling Analysis</p><p><a href="https://research.cchmc.org/pbge/sincera.html">https://research.cchmc.org/pbge/sincera.html</a></p><p>SC3 &ndash; consensus clustering of single-cell RNA-Seq data</p><p><a href="http://biorxiv.org/content/early/2016/09/02/036558">http://biorxiv.org/content/early/2016/09/02/036558</a></p><p>Citrus: A toolkit for single cell sequencing analysis</p><p><a href="http://biorxiv.org/content/early/2016/09/14/045070">http://biorxiv.org/content/early/2016/09/14/045070</a></p><p>Single-Cell Resolution of Temporal Gene Expression during Heart Development</p><p><a href="http://www.cell.com/developmental-cell/fulltext/S1534-5807%2816%2930682-7">http://www.cell.com/developmental-cell/fulltext/S1534-5807(16)30682-7</a></p><p>Scalable latent-factor models applied to single-cell RNA-seq data separate biological drivers from confounding effects</p><p><a href="http://biorxiv.org/content/early/2016/11/15/087775">http://biorxiv.org/content/early/2016/11/15/087775</a></p><p>Single cell transcriptomes identify human islet cell signatures and reveal cell-type-specific expression changes in 
type 2 diabetes</p><p><a href="http://genome.cshlp.org/content/early/2016/11/18/gr.212720.116.abstract">http://genome.cshlp.org/content/early/2016/11/18/gr.212720.116.abstract</a></p><p>SCODE: An efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation</p><p><a href="http://biorxiv.org/content/early/2016/11/21/088856">http://biorxiv.org/content/early/2016/11/21/088856</a></p><p>SCOUP is a probabilistic model to analyze single-cell expression data during differentiation</p><p><a href="https://github.com/hmatsu1226/SCOUP">https://github.com/hmatsu1226/SCOUP</a></p><p>scLVM is a modelling framework for single-cell RNA-seq data</p><p><a href="https://github.com/PMBio/scLVM">https://github.com/PMBio/scLVM</a></p><p>Selective Locally linear Inference of Cellular Expression Relationships (SLICER) algorithm for inferring cell trajectories</p><p><a href="https://github.com/jw156605/SLICER">https://github.com/jw156605/SLICER</a></p><p>SinQC: A Method and Tool to Control Single-cell RNA-seq Data Quality</p><p><a href="http://www.morgridge.net/SinQC.html">http://www.morgridge.net/SinQC.html</a></p><p>TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis</p><p><a href="https://github.com/zji90/TSCAN">https://github.com/zji90/TSCAN</a></p><p>Visualization and cellular hierarchy inference of single-cell data using SPADE</p><p><a href="http://www.nature.com/nprot/journal/v11/n7/full/nprot.2016.066.html">http://www.nature.com/nprot/journal/v11/n7/full/nprot.2016.066.html</a></p><p>OEFinder: Identify ordering effect genes in single cell RNA-seq data</p><p><a href="https://github.com/lengning/OEFinder">https://github.com/lengning/OEFinder</a></p>]]></description>
	<dc:creator>Robert M Willioms</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/40497/artificial-intelligence-is-more-accurate-than-doctors-in-diagnosing-breast-cancer</guid>
	<pubDate>Wed, 01 Jan 2020 22:12:34 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/40497/artificial-intelligence-is-more-accurate-than-doctors-in-diagnosing-breast-cancer</link>
	<title><![CDATA[Artificial intelligence is more accurate than doctors in diagnosing breast cancer]]></title>
	<description><![CDATA[<p>Artificial intelligence is more accurate than doctors in diagnosing breast cancer from mammograms, a study in the journal Nature suggests.</p><p>An international team, including researchers from&nbsp;<a href="https://health.google/" target="_blank">Google Health</a>&nbsp;and&nbsp;<a href="https://www.imperial.ac.uk/news/183293/research-collaboration-aims-improve-breast-cancer/" target="_blank">Imperial College London</a>, designed and trained a computer model on X-ray images from nearly 29,000 women.</p><p>The algorithm&nbsp;<a href="https://nature.com/articles/s41586-019-1799-6" target="_blank">outperformed six radiologists</a>&nbsp;in reading mammograms.</p><p>AI was still as good as two doctors working together.</p><p>Unlike humans, AI is tireless. Experts say it could improve detection. Read More:&nbsp;<a href="https://www.bbc.com/news/health-50857759" target="_blank">https://www.bbc.com/news/health-50857759</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44894/dna2bit-an-ultra-fast-and-accurate-genomic-distance-estimation-software</guid>
	<pubDate>Sun, 31 Aug 2025 06:24:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44894/dna2bit-an-ultra-fast-and-accurate-genomic-distance-estimation-software</link>
	<title><![CDATA[dna2bit: an ultra-fast and accurate genomic distance estimation software]]></title>
	<description><![CDATA[<p><span>dna2bit is a software tool developed in C++11, leveraging the capabilities of OpenMP for parallel computing and the popcount technique for efficient bit manipulation. It has been thoroughly tested using the g++ and clang compilers on both Linux and macOS platforms.</span></p><p>Address of the bookmark: <a href="https://github.com/lijuzeng/dna2bit" rel="nofollow">https://github.com/lijuzeng/dna2bit</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/44515/cleaner-blast-databases-for-more-accurate-results</guid>
	<pubDate>Tue, 23 Apr 2024 01:23:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/44515/cleaner-blast-databases-for-more-accurate-results</link>
	<title><![CDATA[Cleaner BLAST Databases for More Accurate Results]]></title>
	<description><![CDATA[<p>Do you use&nbsp;<a href="https://blast.ncbi.nlm.nih.gov/Blast.cgi?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=blast-cleaner-20240422">BLAST</a><span style="font-size: 12.8px; font-weight: normal;">&nbsp;to identify a sequence or the evolutionary scope of a gene? That can be challenging if contaminated and misclassified sequences are in the BLAST databases and show up in your search results. To address</span><span style="font-size: 12.8px; font-weight: normal;">&nbsp;this problem</span><span style="font-size: 12.8px; font-weight: normal;">, we now use the NCBI quality assurance tools listed below to systematically remove these misleading sequences from the default nucleotide (nt) and protein (nr) BLAST databases.</span><span style="font-size: 12.8px; font-weight: normal;">&nbsp;</span></p><div><ul>
<li><a href="https://github.com/ncbi/fcs">Foreign Contamination Screen tool for genome cross-species screening (FCS-GX)</a>&nbsp;detects contamination from foreign organisms in genomes and other sequences using the genome cross-species aligner (GX)&nbsp;</li>
<li><a href="https://ncbiinsights.ncbi.nlm.nih.gov/2022/05/27/ani-for-assembly-validation?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=blast-cleaner-20240422">Average Nucleotide Identity (ANI)</a>&nbsp;evaluates the taxonomic classification of prokaryotic genome assemblies. Sequences from genomes marked up as &lsquo;unverified source organism&rsquo; are considered suspect and removed.&nbsp;</li>
</ul><p>Ref&nbsp;https://ncbiinsights.ncbi.nlm.nih.gov/2024/04/22/cleaner-blast-databases-more-accurate-results/</p></div>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40895/tadpole-an-assembler-error-corrector-and-read-extender</guid>
	<pubDate>Tue, 04 Feb 2020 23:35:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40895/tadpole-an-assembler-error-corrector-and-read-extender</link>
	<title><![CDATA[Tadpole: an assembler, error-corrector, and read-extender]]></title>
	<description><![CDATA[<p><span>Tadpole is a k-mer-based assembler, with additional capabilities of error-correcting and extending reads. It does not do any complicated graph analysis or scaffolding and is therefore not particularly good for diploid organisms.&nbsp;</span><span>Tadpole is very conservative and optimized for correctness rather than length; that is, it stops at every branch and condenses every repeat.</span></p>
<p>&nbsp;</p>
<p><span><span>To error-correct reads:</span><br><strong>tadpole.sh in=reads.fq out=corrected.fq mode=correct</strong><br><br><span>To extend reads by 50bp in each direction:</span><br><strong>tadpole.sh in=reads.fq out=extended.fq mode=extend el=50 er=50</strong><br><br><span>To error-correct and extend at the same time, using a kmer length of 62:</span><br><strong>tadpole.sh in=reads.fq out=extended.fq mode=extend el=50 er=50 k=62 ecc=t</strong></span></p>
<p>&nbsp;</p>
<p>More at&nbsp;<a href="http://seqanswers.com/forums/showthread.php?t=61445">http://seqanswers.com/forums/showthread.php?t=61445</a></p><p>Address of the bookmark: <a href="https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/tadpole-guide/" rel="nofollow">https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/tadpole-guide/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33847/omega2-metagenome-assembly-pipeline</guid>
	<pubDate>Mon, 10 Jul 2017 05:56:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33847/omega2-metagenome-assembly-pipeline</link>
	<title><![CDATA[Omega2: metagenome assembly pipeline]]></title>
	<description><![CDATA[<p><span>Omega found overlaps between reads using a prefix/suffix hash table. The overlap graph of reads was simplified by removing transitive edges and trimming short branches. Unitigs were generated based on minimum cost flow analysis of the overlap graph and then merged to contigs and scaffolds using mate-pair information. In comparison with three de Bruijn graph assemblers (SOAPdenovo, IDBA-UD and MetaVelvet), Omega provided comparable overall performance on a HiSeq 100-bp dataset and superior performance on a MiSeq 300-bp dataset. In comparison with Celera on the MiSeq dataset, Omega provided more continuous assemblies overall using a fraction of the computing time of existing overlap-layout-consensus assemblers. This indicates Omega can more efficiently assemble longer Illumina reads, and at deeper coverage, for metagenomic datasets.</span></p><p>Address of the bookmark: <a href="http://omega.omicsbio.org/" rel="nofollow">http://omega.omicsbio.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/34707/string-graph-based-genome-assembly-software-and-tools</guid>
	<pubDate>Tue, 19 Dec 2017 17:17:38 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/34707/string-graph-based-genome-assembly-software-and-tools</link>
	<title><![CDATA[String graph based genome assembly software and tools !]]></title>
	<description><![CDATA[<p>In&nbsp;<a href="https://en.wikipedia.org/wiki/Graph_theory" title="Graph theory">graph theory</a>, a&nbsp;<strong>string graph</strong>&nbsp;is an&nbsp;<a href="https://en.wikipedia.org/wiki/Intersection_graph" title="Intersection graph">intersection graph</a>&nbsp;of&nbsp;<a href="https://en.wikipedia.org/wiki/Curve" title="Curve">curves</a>&nbsp;in the plane; each curve is called a "string". The string graph used in genome assembly is a different structure, first proposed by E. W. Myers in a&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/21/suppl_2/ii79.full.pdf+html">2005 publication</a>. A recent&nbsp;<a href="http://genome.cshlp.org/content/early/2012/01/22/gr.126953.111">Genome Research paper</a>&nbsp;describing an innovative approach for assembling large genomes from NGS data caught our attention for several reasons: i) it gives a different, "string graph" perspective on the long-standing genome assembly problem; ii) the paper is coauthored by Jared Simpson, the developer of the&nbsp;<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2694472/">ABySS assembler</a>, and Richard Durbin; iii) the Simpson-Durbin algorithm does not rely on de Bruijn graphs and instead employs a different graph construction approach called a &lsquo;string graph&rsquo;.</p><p>The following genome assembly tools are based on string graphs:</p><p>1. SGA (String Graph Assembler)&nbsp;https://github.com/jts/sga</p><p>Assembles large genomes from high-coverage short-read data. SGA is designed as a modular set of programs, which are used to form an assembly pipeline. SGA implements a set of assembly algorithms based on the FM-index. As the FM-index is a compressed data structure, the algorithms are very memory efficient. The SGA assembly has three distinct phases. The first phase corrects base-calling errors in the reads. The second phase assembles contigs from the corrected reads. The third phase uses paired-end and/or mate-pair data to build scaffolds from the contigs. 
The output of this software is a PDF report that allows the properties of the genome and data quality to be visually explored. By providing more information to the user at the start of an assembly project, this software will help increase awareness of the factors that make a given assembly easy or difficult, assist in the selection of software and parameters and help to troubleshoot an assembly if it runs into problems.</p><p>2.&nbsp;SAGE: String-overlap Assembly of GEnomes&nbsp;https://github.com/lucian-ilie/SAGE2</p><p>SAGE is a tool for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on string-overlap graphs and maximum likelihood assembly, bringing a number of important new ideas, such as the efficient computation of the transitive reduction of the string overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers.</p><p>3. FSG: Fast String Graph</p><p>The new integrated assembler has been assessed on a standard benchmark, showing that fast string graph (FSG) is significantly faster than SGA while maintaining a moderate use of main memory, and showing practical advantages in running FSG on multiple threads. Moreover, the authors studied the effect of coverage rates on the running times.</p><p>4.&nbsp;BASE&nbsp;https://github.com/dhlbh/BASE</p><p>It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have a high probability of appearing uniquely in the genome. 
Such seeds form the basis for BASE to build extension trees and then to use reverse validation to remove the branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of reads sharing the seeds. Such consensus sequences are then extended to contigs.&nbsp;BASE is a practically efficient tool for constructing contigs, with significant improvement in quality for long NGS reads. It is relatively easy to extend BASE to include scaffolding.</p><p>5.&nbsp;Fermi&nbsp;https://github.com/lh3/fermi/</p><p>Fermi is a de novo assembler with a particular focus on assembling Illumina&nbsp;short sequence reads from a mammal-sized genome. In addition to the role of a&nbsp;typical assembler, fermi also aims to preserve heterozygotes which are often&nbsp;collapsed by other assemblers. Its ultimate goal is to find a minimal set of&nbsp;unitigs to represent all the information in raw reads.</p><p>If you want to learn about string graph assemblers, please read the following papers:</p><p>i)&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/21/suppl_2/ii79.full.pdf+html">The Fragment Assembly String Graph - E. W. Myers</a></p><p>This paper describes the string graph concept.</p><p>ii)&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/26/12/i367.full#ref-20">Efficient construction of an assembly string graph using the FM-index - Jared T. Simpson and Richard Durbin</a></p><p>This is the earlier paper from Simpson and Durbin.</p><p>iii)&nbsp;<a href="http://genome.cshlp.org/content/early/2012/01/22/gr.126953.111">Efficient de novo assembly of large genomes using compressed data structures - Jared T. Simpson and Richard Durbin</a></p><p>&nbsp;</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38801/genome-assembly-forensics-finding-the-elusive-mis-assembly</guid>
	<pubDate>Sat, 26 Jan 2019 18:02:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38801/genome-assembly-forensics-finding-the-elusive-mis-assembly</link>
	<title><![CDATA[Genome assembly forensics: finding the elusive mis-assembly]]></title>
	<description><![CDATA[<p><span>We present the first collection of tools aimed at automated genome assembly validation. This work formalizes several mechanisms for detecting mis-assemblies, and describes their implementation in our automated validation pipeline, called&nbsp;</span><em>amosvalidate</em><span>. We demonstrate the application of our pipeline in both bacterial and eukaryotic genome assemblies, and highlight several assembly errors in both draft and finished genomes. The software described is compatible with common assembly formats and is released, open-source, at&nbsp;</span><a href="http://amos.sourceforge.net/" target="_blank">http://amos.sourceforge.net</a><span>.</span></p>
<p>https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2397507/&nbsp;</p>
<p>http://amos.sourceforge.net/wiki/index.php/AMOS</p><p>Address of the bookmark: <a href="http://amos.sourceforge.net/wiki/index.php/AMOS" rel="nofollow">http://amos.sourceforge.net/wiki/index.php/AMOS</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39253/gmass-a-novel-measure-for-genomeassembly-structural-similarity</guid>
	<pubDate>Sun, 14 Apr 2019 20:35:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39253/gmass-a-novel-measure-for-genomeassembly-structural-similarity</link>
	<title><![CDATA[GMASS: a novel measure for genome assembly structural similarity]]></title>
	<description><![CDATA[<div id="Abstract">
<div id="ASec3">
<p id="Par3">The GMASS score is a novel measure for representing structural similarity between two assemblies. It will contribute to the understanding of assembly output and developing de novo assemblers.</p>
<p><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2710-z">https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2710-z</a></p>
</div>
</div><p>Address of the bookmark: <a href="http://bioinfo.konkuk.ac.kr/GMASS/htdocs/syncircos.php" rel="nofollow">http://bioinfo.konkuk.ac.kr/GMASS/htdocs/syncircos.php</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>