<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44896?offset=130</link>
	<atom:link href="https://bioinformaticsonline.com/related/44896?offset=130" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36621/hapcut2-robust-and-accurate-haplotype-assembly-for-diverse-sequencing-technologies</guid>
	<pubDate>Tue, 15 May 2018 07:35:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36621/hapcut2-robust-and-accurate-haplotype-assembly-for-diverse-sequencing-technologies</link>
	<title><![CDATA[HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies]]></title>
	<description><![CDATA[HapCUT2 is a maximum-likelihood-based tool for assembling haplotypes from DNA sequence reads, designed to "just work" with excellent speed and accuracy. We found that previously described haplotype assembly methods are specialized for specific read technologies or protocols, with slow or inaccurate performance on others. With this in mind, HapCUT2 is designed for speed and accuracy across diverse sequencing technologies, including but not limited to:

NGS short reads (Illumina HiSeq)
clone-based sequencing (Fosmid or BAC clones)
SMRT reads (PacBio)
Oxford Nanopore reads
10X Genomics Linked-Reads
proximity-ligation (Hi-C) reads
high-coverage sequencing (&gt;40x coverage-per-SNP) using the above technologies
combinations of the above technologies (e.g. scaffold long reads with Hi-C reads)
See the project repository for specific examples of command-line options and best practices for some of these technologies.

NOTE: At this time HapCUT2 is for diploid organisms only. VCF input should contain diploid variants.

If you use HapCUT2 in your research, please cite:

Edge, P., Bafna, V. &amp; Bansal, V. HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies. Genome Res. gr.213462.116 (2016). doi:10.1101/gr.213462.116<p>Address of the bookmark: <a href="https://github.com/vibansal/HapCUT2" rel="nofollow">https://github.com/vibansal/HapCUT2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/40497/artificial-intelligence-is-more-accurate-than-doctors-in-diagnosing-breast-cancer</guid>
	<pubDate>Wed, 01 Jan 2020 22:12:34 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/40497/artificial-intelligence-is-more-accurate-than-doctors-in-diagnosing-breast-cancer</link>
	<title><![CDATA[Artificial intelligence is more accurate than doctors in diagnosing breast cancer]]></title>
	<description><![CDATA[<p>Artificial intelligence is more accurate than doctors in diagnosing breast cancer from mammograms, a study in the journal Nature suggests.</p><p>An international team, including researchers from&nbsp;<a href="https://health.google/" target="_blank">Google Health</a>&nbsp;and&nbsp;<a href="https://www.imperial.ac.uk/news/183293/research-collaboration-aims-improve-breast-cancer/" target="_blank">Imperial College London</a>, designed and trained a computer model on X-ray images from nearly 29,000 women.</p><p>The algorithm&nbsp;<a href="https://nature.com/articles/s41586-019-1799-6" target="_blank">outperformed six radiologists</a>&nbsp;in reading mammograms.</p><p>AI was still as good as two doctors working together.</p><p>Unlike humans, AI is tireless. Experts say it could improve detection. Read More:&nbsp;<a href="https://www.bbc.com/news/health-50857759" target="_blank">https://www.bbc.com/news/health-50857759</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41501/hicanu-accurate-assembly-of-segmental-duplications-satellites-and-allelic-variants-from-high-fidelity-long-reads</guid>
	<pubDate>Fri, 27 Mar 2020 22:49:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41501/hicanu-accurate-assembly-of-segmental-duplications-satellites-and-allelic-variants-from-high-fidelity-long-reads</link>
	<title><![CDATA[HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads]]></title>
	<description><![CDATA[<p><span>HiCanu is a significant modification of the Canu assembler, designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering.&nbsp;</span></p>
<p>More at&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2020.03.14.992248v3">https://www.biorxiv.org/content/10.1101/2020.03.14.992248v3</a></p><p>Address of the bookmark: <a href="https://github.com/marbl/canu" rel="nofollow">https://github.com/marbl/canu</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42477/hifiasm-a-haplotype-resolved-assembler-for-accurate-hifi-reads</guid>
	<pubDate>Thu, 24 Dec 2020 10:03:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42477/hifiasm-a-haplotype-resolved-assembler-for-accurate-hifi-reads</link>
	<title><![CDATA[Hifiasm: a haplotype-resolved assembler for accurate Hifi reads]]></title>
	<description><![CDATA[<p><span>Hifiasm is a fast haplotype-resolved de novo assembler for PacBio HiFi reads. It can assemble a human genome in several hours and works with the California redwood genome, one of the most complex genomes sequenced so far. Hifiasm can produce primary/alternate assemblies of quality competitive with the best assemblers. It also introduces a new graph binning algorithm and achieves the best haplotype-resolved assembly given trio data.</span></p><p>Address of the bookmark: <a href="https://github.com/chhylp123/hifiasm" rel="nofollow">https://github.com/chhylp123/hifiasm</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44894/dna2bit-an-ultra-fast-and-accurate-genomic-distance-estimation-software</guid>
	<pubDate>Sun, 31 Aug 2025 06:24:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44894/dna2bit-an-ultra-fast-and-accurate-genomic-distance-estimation-software</link>
	<title><![CDATA[dna2bit: an ultra-fast and accurate genomic distance estimation software]]></title>
	<description><![CDATA[<p><span>dna2bit is a software tool developed in C++11, leveraging the capabilities of OpenMP for parallel computing and the popcount technique for efficient bit manipulation. It has been thoroughly tested using the g++ and clang compilers on both Linux and macOS platforms.</span></p><p>Address of the bookmark: <a href="https://github.com/lijuzeng/dna2bit" rel="nofollow">https://github.com/lijuzeng/dna2bit</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41452/apollo-a-sequencing-technology-independent-scalable-and-accurate-assembly-polishing-algorithm</guid>
	<pubDate>Mon, 16 Mar 2020 10:09:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41452/apollo-a-sequencing-technology-independent-scalable-and-accurate-assembly-polishing-algorithm</link>
	<title><![CDATA[Apollo: A Sequencing-Technology-Independent, Scalable, and Accurate Assembly Polishing Algorithm]]></title>
	<description><![CDATA[<p><span>Apollo is an assembly polishing algorithm that attempts to correct the errors in an assembly. It can take multiple sets of reads in a single run and polish the assemblies of genomes of any size. Described by Firtina et al. (preliminary version at&nbsp;</span><a href="https://arxiv.org/pdf/1902.04341.pdf">https://arxiv.org/pdf/1902.04341.pdf</a>).</p>
<p>More at&nbsp;<a href="https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa179/5804978?rss=1">https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa179/5804978?rss=1</a></p><p>Address of the bookmark: <a href="https://github.com/CMU-SAFARI/Apollo" rel="nofollow">https://github.com/CMU-SAFARI/Apollo</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38625/croco-a-program-to-detect-potential-cross-contaminations-in-hts-assembled-transcriptomes-using-expression-level-quantification</guid>
	<pubDate>Mon, 07 Jan 2019 18:17:44 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38625/croco-a-program-to-detect-potential-cross-contaminations-in-hts-assembled-transcriptomes-using-expression-level-quantification</link>
	<title><![CDATA[CroCo: A program to detect potential cross contaminations in HTS assembled transcriptomes using expression level quantification]]></title>
	<description><![CDATA[<p>CroCo is a program to detect cross contamination events in assembled transcriptomes, using sequencing reads to determine the true origin of every transcript.<br>Such cross contaminations can be expected if several RNA-Seq experiments were prepared during the same period at the same lab, or by the same people, or if they were processed or sequenced by the same sequencing service facility.<br>Our approach first determines a subset of transcripts that are suspiciously similar across samples using a pairwise BLAST procedure. CroCo then combines all transcriptomes into a metatranscriptome and quantifies the "expression level" of all transcripts successively using the read data of every sample (e.g. several species sequenced by the same lab for a particular study) while allowing read multi-mappings.<br>Several mapping tools implemented in CroCo can be used to estimate expression level (default is RapMap).<br>This information is then used to assign each transcript to one of the following five categories:</p>
<p><br>clean: the transcript origin is from the focal sample.</p>
<p>cross contamination: the transcript origin is from an alien sample of the same experiment.</p>
<p>dubious: expression levels are too close between focal and alien samples to determine the true origin of the transcript.</p>
<p>low coverage: expression levels are too low in all samples, thus hampering our procedure (which relies on differential expression) to confidently assign it to any category.</p>
<p>over expressed: expression levels are very high in at least 3 samples and CroCo will not try to categorize it. Indeed, such a pattern does not correspond to expectations for cross contaminations, but often reflects highly conserved genes such as ribosomal genes, or external contamination shared by several samples (e.g. Escherichia coli contaminations).</p><p>Address of the bookmark: <a href="https://gitlab.mbb.univ-montp2.fr/mbb/CroCo" rel="nofollow">https://gitlab.mbb.univ-montp2.fr/mbb/CroCo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34565/fogsaa-fast-optimal-global-sequence-alignment-algorithm</guid>
	<pubDate>Fri, 08 Dec 2017 14:41:08 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34565/fogsaa-fast-optimal-global-sequence-alignment-algorithm</link>
	<title><![CDATA[FOGSAA: Fast Optimal Global Sequence Alignment Algorithm]]></title>
	<description><![CDATA[<p>Sequence alignment algorithms are widely used to infer similarity and the points of difference between a pair of sequences. FOGSAA is a fast global alignment algorithm. It is a branch-and-bound approach that begins branch expansion greedily, taking symbols from the given pair of sequences (protein or nucleotide), and yields an optimal alignment faster than conventional dynamic programming techniques. It also outperforms heuristic methods in alignment quality.</p><p>Address of the bookmark: <a href="http://www.isical.ac.in/~bioinfo_miu/FOGSAA.htm" rel="nofollow">http://www.isical.ac.in/~bioinfo_miu/FOGSAA.htm</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</guid>
	<pubDate>Tue, 12 Dec 2017 17:23:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</link>
	<title><![CDATA[MashMap: a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s)]]></title>
	<description><![CDATA[<p><span>MashMap is a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s). It maps a query sequence against a reference region if and only if its estimated alignment identity is above a specified threshold. It does not compute the alignments explicitly, but rather estimates a&nbsp;</span><em>k</em><span>-mer based&nbsp;</span><a href="https://en.wikipedia.org/wiki/Jaccard_index">Jaccard similarity</a><span>&nbsp;using a combination of&nbsp;</span><a href="http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/p76-schleimer.pdf">Winnowing</a><span>&nbsp;and&nbsp;</span><a href="https://en.wikipedia.org/wiki/MinHash">MinHash</a><span>. This is then converted to an estimate of sequence identity using the&nbsp;</span><a href="http://mash.readthedocs.org/">Mash</a><span>&nbsp;distance. An appropriate&nbsp;</span><em>k</em><span>-mer sampling rate is automatically determined given minimum local alignment length and identity thresholds. The efficiency of the algorithm improves as both of these thresholds are increased.</span></p><p>Address of the bookmark: <a href="https://github.com/marbl/MashMap" rel="nofollow">https://github.com/marbl/MashMap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36512/hisat2-a-fast-and-sensitive-alignment-program-for-mapping-next-generation-sequencing-reads</guid>
	<pubDate>Tue, 08 May 2018 04:27:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36512/hisat2-a-fast-and-sensitive-alignment-program-for-mapping-next-generation-sequencing-reads</link>
	<title><![CDATA[HISAT2: a fast and sensitive alignment program for mapping next-generation sequencing reads]]></title>
	<description><![CDATA[<p><strong>HISAT2</strong><span>&nbsp;is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome). Based on an extension of BWT for graphs&nbsp;</span><a href="http://dl.acm.org/citation.cfm?id=2674828">[Sir&eacute;n et al. 2014]</a><span>, we designed and implemented a graph FM index (GFM), an original approach and its first implementation to the best of our knowledge. In addition to using one global GFM index that represents a population of human genomes, HISAT2 uses a large set of small GFM indexes that collectively cover the whole genome (each index representing a genomic region of 56 Kbp, with 55,000 indexes needed to cover the human population). These small indexes (called local indexes), combined with several alignment strategies, enable rapid and accurate alignment of sequencing reads. This new indexing scheme is called a Hierarchical Graph FM index (HGFM).&nbsp;</span></p>
<p><span>More at&nbsp;https://ccb.jhu.edu/software/hisat2/index.shtml</span></p><p>Address of the bookmark: <a href="https://github.com/infphilo/hisat2" rel="nofollow">https://github.com/infphilo/hisat2</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>