<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/40946?offset=130</link>
	<atom:link href="https://bioinformaticsonline.com/related/40946?offset=130" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</guid>
	<pubDate>Wed, 15 Mar 2017 14:31:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/31566/software-and-tools-to-detect-structure-variation-with-long-reads</link>
	<title><![CDATA[Software and Tools to detect structure variation with long reads !!]]></title>
	<description><![CDATA[<p>Uncovering the connection between genetics and heritable diseases requires an approach that looks at all the variant bases and types in a genome. While a PacBio&nbsp;<em>de novo</em>&nbsp;assembly resolves the most novel SV variants. 8-10X PacBio coverage of single genomes or trios reveals triple the SVs detectable by short-read data.</p><p>With&nbsp;<span style="text-decoration: underline;"><a href="http://www.pacb.com/smrt-science/">Single Molecule, Real-Time (SMRT) Sequencing</a></span>, you can access structural variations having a broad range of sizes, types, and GC content with the ability to:</p><ul>
<li>Uncover missing heritability linked to structural variation</li>
<li>Unambiguously identify genomic context and variant breakpoints at the sequence level to unravel the genetic etiology of disease</li>
<li>Resolve structural variation across the complete size spectrum with basepair resolution</li>
</ul><p>Following are the SV tools, which can assist you to achieve your goal.</p><p><strong>Sniffles:</strong>&nbsp;Structural variation caller using third generation sequencing</p><p>Sniffles is a structural variation caller using third generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs using evidence from split-read alignments, high-mismatch regions, and coverage analysis. Please note the current version of Sniffles requires sorted output from BWA-MEM (use -M and -x parameter) or NGM-LR with the optional SAM attributes enabled!&nbsp;</p><p>More at&nbsp;https://github.com/fritzsedlazeck/Sniffles</p><p><strong style="font-size: 12.8px;"><br />MultiBreak-SV:</strong> It identifies structural variants from next-generation paired end data, third-generation long read data, or data from a combination of sequencing platforms.</p><p>There are two pieces of software in this release: (1) a pre-processor that takes machineformat (.m5) BLASR files, and (2) MultiBreak-SV. For installation and usage instructions, see doc/MultiBreakSV-Manual.txt.</p><p>More at&nbsp;https://github.com/raphael-group/multibreak-sv</p><p><strong style="font-size: 12.8px;"><br />Parliament:</strong>&nbsp;A Structural Variation Tool. Why ask a single sv-detection approach to find every variant when you can have a parliament of tools deciding?</p><p>Publication about the algorithm and &ldquo;&hellip;the first long-read characterization of structural variation in a diploid human personal genome&hellip;&rdquo; (HS1011) -&nbsp;<a href="http://www.biomedcentral.com/1471-2164/16/286">&ldquo;Assessing structural variation in a personal genome&mdash;towards a human reference diploid genome&rdquo;</a></p><p>More at&nbsp;https://sourceforge.net/projects/parliamentsv/</p><p>https://www.dnanexus.com/papers/Parliament_Info_Sheet.pdf</p><p><br /><strong>PBHoney:</strong>&nbsp;the structural variation discovery tool&nbsp;<br /><br />PBHoney is an implementation of two variant-identification approaches designed to exploit the high mappability of long reads (i.e., greater than 10,000 bp). PBHoney considers both intra-read discordance and soft-clipped tails of long reads to identify structural variants.</p><p>Read The Paper&nbsp;<a href="http://www.biomedcentral.com/1471-2105/15/180/abstract" target="_blank">http://www.biomedcentral.com/1471-2105/15/180/abstract</a></p><p>More at&nbsp;https://sourceforge.net/projects/pb-jelly/</p><p><strong><br />SMRT-SV:</strong> Structural variant and indel caller for PacBio reads</p><p>Structural variant (SV) and indel caller for PacBio reads based on methods from&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>.</p><p>SMRT-SV provides an official software package for tools described in&nbsp;<a href="http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html">Chaisson et al. 2014</a>&nbsp;and adds several key features including the following.</p><ul>
<li>Unified variant calling user interface with built-in cluster compute support</li>
<li>Small indel calling (2-49 bp)</li>
<li>Improved inversion calling (<code>screenInversions</code>)</li>
<li>Quality metric for SV calls based on number of local assemblies supporting each call</li>
<li>Higher sensitivity for SV calls using tiled local assemblies across the entire genome instead of "signature" regions</li>
<li>Genotyping of SVs with Illumina paired-end reads from WGS samples</li>
</ul><p>More at&nbsp;https://github.com/EichlerLab/pacbio_variant_caller</p>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</guid>
	<pubDate>Fri, 05 Jan 2018 03:58:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</link>
	<title><![CDATA[Jabba: Hybrid Error Correction for Long Sequencing Reads]]></title>
	<description><![CDATA[<p>Jabba is a hybrid error correction tool to correct third generation (PacBio / ONT) sequencing data, using second generation (Illumina) data.</p>
<p>Input</p>
<p>Jabba takes as input a concatenated de Bruijn graph and a set of sequences:</p>
<p>the de Bruijn graph should appear in fasta format with 1 entry per node, the meta information should be in the format:<br>&gt;NODE <br>the set of sequences should be in fasta or fastq format. These sequences will be corrected (e.g. PacBio reads). The corrections will be written to a file Jabba fasta.<br>The output is a file in fasta format with corrections of the long reads, and additionally a file in the input format containing uncorrected reads.</p>
<p>https://github.com/biointec/jabba/wiki</p>
<p>https://almob.biomedcentral.com/articles/10.1186/s13015-016-0075-7</p><p>Address of the bookmark: <a href="https://github.com/biointec/jabba" rel="nofollow">https://github.com/biointec/jabba</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43828/understanding-hifi-reads</guid>
	<pubDate>Thu, 24 Mar 2022 19:48:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43828/understanding-hifi-reads</link>
	<title><![CDATA[Understanding HiFi Reads !]]></title>
	<description><![CDATA[<p><span>While little public data is available for either of the new synthetic long read approaches, Illumina showed an example comparison earlier this year at the&nbsp;</span><a href="https://www.festivalofgenomics.com/rami-mehio" target="_blank">Festival of Genomics &amp; Biodata conference</a><span>&nbsp;(FoG 2022). In the IGV screenshot presented (below), synthetic Infinity reads &ndash; labeled &ldquo;Longas&rdquo; &ndash; are at the top, followed by standard Illumina short reads, and PacBio HiFi reads labeled &ldquo;CCS&rdquo; depicted at the bottom:</span></p><p>Address of the bookmark: <a href="http://pacb.com/blog/the-hifi-difference-true-long-reads-vs-synthetic-long-reads/" rel="nofollow">http://pacb.com/blog/the-hifi-difference-true-long-reads-vs-synthetic-long-reads/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</guid>
	<pubDate>Wed, 04 Jun 2025 00:07:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</link>
	<title><![CDATA[Trust But Verify: Sequencing Your Cell Lines Might Reveal an Uninvited Guest]]></title>
	<description><![CDATA[<p>High-throughput sequencing has become indispensable in cell biology, enabling detailed insights into chromatin structure, gene expression, and regulatory dynamics. Yet, when faced with unexpectedly low mapping rates to the human genome, researchers often rush to troubleshoot technical parameters&mdash;sequencer quality, adapter trimming, or aligner settings.</p><p>Before you go down that path, consider this critical biological question:<br /> <strong>Are you sequencing human cells&mdash;or bacterial contamination?</strong></p><h2>The Silent Saboteur: Mycoplasma in Cell Cultures</h2><p><em>Mycoplasma</em> contamination remains one of the most widespread and underdiagnosed issues in tissue culture work. Studies suggest that <strong>15&ndash;35% of cell lines in use may be contaminated</strong>, often without visible signs. Unlike other microbial infections, <em>Mycoplasma</em> does not produce cloudiness, odor, or a change in pH. Many researchers won&rsquo;t detect it unless they specifically test for it.</p><p>The consequences, however, are profound. <em>Mycoplasma</em> can significantly alter:</p><ul>
<li>
<p>Host gene expression patterns</p>
</li>
<li>
<p>Cell proliferation rates</p>
</li>
<li>
<p>Epigenetic profiles and chromatin accessibility</p>
</li>
<li>
<p>Cytokine signaling and immune responses</p>
</li>
</ul><p>In short, it can skew your results, compromise your biological conclusions, and invalidate weeks or months of research.</p><h2>A Simple Diagnostic Step: Map Against <em>Mycoplasma</em> Genomes</h2><p>If you encounter poor alignment rates to the human genome, consider mapping your reads to a <em>Mycoplasma</em> reference genome&mdash;or better yet, use a <strong>combined human + <em>Mycoplasma</em></strong> reference. There have been cases where over half of all reads, initially assumed to be from human cells, were in fact bacterial in origin. This check is fast, easy, and could save your project.</p><h2>How Contamination Happens&mdash;and Persists</h2><p><em>Mycoplasma</em> is small (0.1&ndash;0.3 &mu;m), lacks a cell wall, and can pass through standard filters undetected. Common sources include:</p><ul>
<li>
<p>Contaminated reagents (e.g., FBS)</p>
</li>
<li>
<p>Infected cell lines obtained from other labs</p>
</li>
<li>
<p>Poor aseptic technique or shared equipment</p>
</li>
</ul><p>Once present, it spreads quickly between cultures and can persist for months, silently affecting results.</p><h2>Why Treatment Is Difficult</h2><p>While antibiotics such as Plasmocin or BM-Cyclin are sometimes used, they often offer only partial resolution and may themselves alter cell behavior. In many cases, the best course of action is to <strong>discard the contaminated culture</strong> and start with a fresh, verified stock.</p><h2>Practical Recommendations for Researchers</h2><ul>
<li>
<p><strong>Routinely test for <em>Mycoplasma</em></strong> using PCR, qPCR, or fluorescence-based assays</p>
</li>
<li>
<p><strong>Incorporate contamination screens into your sequencing QC pipeline</strong></p>
</li>
<li>
<p><strong>Use combined reference genomes</strong> when mapping ambiguous reads</p>
</li>
<li>
<p><strong>Practice strict aseptic technique</strong> and monitor all incoming cell lines</p>
</li>
<li>
<p><strong>Don&rsquo;t ignore unexplained data anomalies</strong>&mdash;they might point to contamination</p>
</li>
</ul><h2>Closing Thought: Contamination Is a Biological Variable</h2><p>It&rsquo;s easy to view poor mapping as a technical issue, but sometimes the problem lies deeper&mdash;in the biology itself. <em>Mycoplasma</em> contamination doesn&rsquo;t just interfere with sequencing; it interferes with science. As a research community, we must treat contamination not as an afterthought, but as a key variable to control.</p><p>So next time your reads won&rsquo;t align, don&rsquo;t just tune the aligner. Ask if your cells are telling the truth&mdash;or if they're hiding something.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33909/itol-interactive-tree-of-life</guid>
	<pubDate>Tue, 18 Jul 2017 05:36:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33909/itol-interactive-tree-of-life</link>
	<title><![CDATA[iTOL: Interactive Tree Of Life]]></title>
	<description><![CDATA[<p><strong>Interactive Tree Of Life</strong>&nbsp;is an online tool for the display, annotation and management of phylogenetic trees.</p>
<p>Explore your trees directly in the browser, and annotate them with various types of data.</p>
<p><span>iTOL can easily visualize trees with 50'000 or more leaves. With advanced search capabilities and display of unrooted, circular and regular cladograms or phylograms, exploring and navigating trees of any size is simple.</span></p>
<p>https://itol.embl.de/</p><p>Address of the bookmark: <a href="https://itol.embl.de/" rel="nofollow">https://itol.embl.de/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/42327/blaxter-lab</guid>
  <pubDate>Thu, 19 Nov 2020 08:05:28 -0600</pubDate>
  <link></link>
  <title><![CDATA[Blaxter Lab]]></title>
  <description><![CDATA[
<p>Using these high quality genomes we explore</p>

<p>the evolutionary history of genes and species, building phylogenetic trees of life<br />the contrasting roles of horizontal gene transfer and introgression in shaping evolution<br />the biology of symbioses, especially symbioses between eukaryotes and bacteria, and between parasites and their hosts<br />the processes that drive the evolution of pattern in the structure of chromosomes<br />the diversity of meiofauna, particularly tardigrades, nematodes and other Ecdysozoa<br />the genomics of extremophilia</p>

<p>More at https://www.sanger.ac.uk/group/blaxter-group/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38055/ancestral-genomes-a-resource-for-reconstructed-ancestral-genes-and-genomes-across-the-tree-of-life</guid>
	<pubDate>Fri, 02 Nov 2018 08:16:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38055/ancestral-genomes-a-resource-for-reconstructed-ancestral-genes-and-genomes-across-the-tree-of-life</link>
	<title><![CDATA[Ancestral Genomes: a resource for reconstructed ancestral genes and genomes across the tree of life]]></title>
	<description><![CDATA[<p><span>&nbsp;Ancestral Genomes (</span><a href="http://ancestralgenomes.org/" target="">http://ancestralgenomes.org</a><span>) is a resource for comprehensive reconstructions of these &lsquo;fossil genomes&rsquo;. Comprehensive sets of protein-coding genes have been reconstructed for 78 genomes of now-extinct species that were the common ancestors of extant species from across the tree of life.&nbsp;</span></p><p>Address of the bookmark: <a href="http://ancestralgenomes.org/" rel="nofollow">http://ancestralgenomes.org/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43916/understanding-dump-files-from-ncbi-taxonomy-database</guid>
	<pubDate>Fri, 15 Jul 2022 04:29:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43916/understanding-dump-files-from-ncbi-taxonomy-database</link>
	<title><![CDATA[Understanding DUMP files from NCBI Taxonomy database !]]></title>
	<description><![CDATA[<p>*.dmp files are bcp-like dump from GenBank taxonomy database</p><p>General information.</p><p>Field terminator is "\t|\t"</p><p>Row terminator is "\t|\n"</p><p>&nbsp;</p><p>nodes.dmp file consists of taxonomy nodes. The description for each node includes the following</p><p>fields:</p><p>tax_id -- node id in GenBank taxonomy database</p><p>&nbsp; parent tax_id -- parent node id in GenBank taxonomy database</p><p>&nbsp; rank -- rank of this node (superkingdom, kingdom, ...)&nbsp;</p><p>&nbsp; embl code -- locus-name prefix; not unique</p><p>&nbsp; division id -- see division.dmp file</p><p>&nbsp; inherited div flag&nbsp; (1 or 0) -- 1 if node inherits division from parent</p><p>&nbsp; genetic code id -- see gencode.dmp file</p><p>&nbsp; inherited GC&nbsp; flag&nbsp; (1 or 0) -- 1 if node inherits genetic code from parent</p><p>&nbsp; mitochondrial genetic code id -- see gencode.dmp file</p><p>&nbsp; inherited MGC flag&nbsp; (1 or 0) -- 1 if node inherits mitochondrial gencode from parent</p><p>&nbsp; GenBank hidden flag (1 or 0)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- 1 if name is suppressed in GenBank entry lineage</p><p>&nbsp; hidden subtree root flag (1 or 0) &nbsp; &nbsp; &nbsp; -- 1 if this subtree has no sequence data yet</p><p>&nbsp; comments -- free-text comments and citations</p><p>&nbsp;</p><p>Taxonomy names file (names.dmp):</p><p>tax_id -- the id of node associated with this name</p><p>name_txt -- name itself</p><p>unique name -- the unique variant of this name if name not unique</p><p>name class -- (synonym, common name, ...)</p><p>&nbsp;</p><p>Divisions file (division.dmp):</p><p>division id -- taxonomy database division id</p><p>division cde -- GenBank division code (three characters)</p><p>division name -- e.g. BCT, PLN, VRT, MAM, PRI...</p><p>comments</p><p>&nbsp;</p><p>Genetic codes file (gencode.dmp):</p><p>genetic code id -- GenBank genetic code id</p><p>abbreviation -- genetic code name abbreviation</p><p>name -- genetic code name</p><p>cde -- translation table for this genetic code</p><p>starts -- start codons for this genetic code</p><p>&nbsp;</p><p>Deleted nodes file (delnodes.dmp):</p><p>tax_id -- deleted node id</p><p>&nbsp;</p><p>Merged nodes file (merged.dmp):</p><p>old_tax_id&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- id of nodes which has been merged</p><p>new_tax_id&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- id of nodes which is result of merging</p><p>Citations file (citations.dmp):</p><p>cit_id -- the unique id of citation</p><p>cit_key -- citation key</p><p>pubmed_id -- unique id in PubMed database (0 if not in PubMed)</p><p>medline_id -- unique id in MedLine database (0 if not in MedLine)</p><p>url -- URL associated with citation</p><p>text -- any text (usually article name and authors).</p><p>-- The following characters are escaped in this text by a backslash:</p><p>-- newline (appear as "\n"),</p><p>-- tab character ("\t"),</p><p>-- double quotes ('\"'),</p><p>-- backslash character ("\\").</p><p>taxid_list -- list of node ids separated by a single space</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33847/omega2-metagenome-assembly-pipeline</guid>
	<pubDate>Mon, 10 Jul 2017 05:56:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33847/omega2-metagenome-assembly-pipeline</link>
	<title><![CDATA[Omega2: metagenome assembly pipeline]]></title>
	<description><![CDATA[<p><span>Omega found overlaps between reads using a prefix/suffix hash table. The overlap graph of reads was simplified by removing transitive edges and trimming short branches. Unitigs were generated based on minimum cost flow analysis of the overlap graph and then merged to contigs and scaffolds using mate-pair information. In comparison with three de Bruijn graph assemblers (SOAPdenovo, IDBA-UD and MetaVelvet), Omega provided comparable overall performance on a HiSeq 100-bp dataset and superior performance on a MiSeq 300-bp dataset. In comparison with Celera on the MiSeq dataset, Omega provided more continuous assemblies overall using a fraction of the computing time of existing overlap-layout-consensus assemblers. This indicates Omega can more efficiently assemble longer Illumina reads, and at deeper coverage, for metagenomic datasets.</span></p><p>Address of the bookmark: <a href="http://omega.omicsbio.org/" rel="nofollow">http://omega.omicsbio.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35345/rgfa-powerful-and-convenient-handling-of-assembly-graphs</guid>
	<pubDate>Thu, 25 Jan 2018 05:47:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35345/rgfa-powerful-and-convenient-handling-of-assembly-graphs</link>
	<title><![CDATA[RGFA: powerful and convenient handling of assembly graphs]]></title>
	<description><![CDATA[<p><span>RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to conveniently parse, edit and write GFA files. Complex operations such as the separation of the implicit instances of repeats and the merging of linear paths can be performed. A typical application of RGFA is the editing of a graph, to finish the assembly of a sequence, using information not available to the assembler. We illustrate a use case, in which the assembly of a repetitive metagenomic fosmid insert was completed using a script based on RGFA.</span></p>
<p><span>https://github.com/ggonnella/rgfa</span></p><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103826/" rel="nofollow">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103826/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>