<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34618?offset=250</link>
	<atom:link href="https://bioinformaticsonline.com/related/34618?offset=250" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</guid>
	<pubDate>Wed, 22 Aug 2018 10:40:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</link>
	<title><![CDATA[SimLoRD: A read simulator for third generation sequencing reads]]></title>
	<description><![CDATA[<p>SimLoRD is a read simulator for third generation sequencing reads and is currently focused on the Pacific Biosciences SMRT error model.</p>
<p>Reads are simulated from both strands of a provided or randomly generated reference sequence.</p>
<div id="rst-header-features">
<ul>
<li>The reference can be read from a FASTA file or randomly generated with a given GC content. It can consist of several chromosomes, whose structure is respected when drawing reads. (Simulation of genome rearrangements may be incorporated at a later stage.)</li>
<li>The read lengths can be determined in four ways: drawing from a log-normal distribution (typical for genomic DNA), sampling from an existing FASTQ file (typical for RNA), sampling from a text file with integers (RNA), or using a fixed length.</li>
<li>Quality values and number of passes depend on fragment length.</li>
<li>Provided subread error probabilities are modified according to the number of passes.</li>
<li>Outputs reads in FASTQ format and alignments in SAM format.</li>
</ul>
</div><p>Address of the bookmark: <a href="https://bitbucket.org/genomeinformatics/simlord/" rel="nofollow">https://bitbucket.org/genomeinformatics/simlord/</a></p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38481/arcs-scaffolding-genome-drafts-with-linked-reads</guid>
	<pubDate>Mon, 17 Dec 2018 17:40:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38481/arcs-scaffolding-genome-drafts-with-linked-reads</link>
	<title><![CDATA[ARCS: scaffolding genome drafts with linked reads]]></title>
	<description><![CDATA[<p>ARCS requires two input files:</p>
<ul>
<li>Draft assembly FASTA file</li>
<li>Interleaved linked reads file (the barcode sequence is expected in the BX tag of the read header or in the form "@readname_barcode"; run&nbsp;<a href="https://support.10xgenomics.com/genome-exome/software/pipelines/latest/what-is-long-ranger">Long Ranger basic</a>&nbsp;on raw Chromium reads to produce this interleaved file)</li>
</ul><p>Address of the bookmark: <a href="https://github.com/bcgsc/ARCS/" rel="nofollow">https://github.com/bcgsc/ARCS/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41030/slr-superscaffolder-a-scaffold-assemble-pipeline-for-stlfr-reads</guid>
	<pubDate>Fri, 14 Feb 2020 14:23:30 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41030/slr-superscaffolder-a-scaffold-assemble-pipeline-for-stlfr-reads</link>
	<title><![CDATA[SLR-superscaffolder: A scaffold assemble pipeline for stLFR reads.]]></title>
	<description><![CDATA[<p>This is a scaffold assembler designed for stLFR reads[1]. It uses the linked-read information from stLFR reads to assemble contigs into scaffolds.</p>
<p>Here is an illustration of this pipeline:</p>
<p>&nbsp;<img src="https://github.com/BGI-Qingdao/SLR-superscaffolder/raw/master/image.png" alt="image" style="border: 0px;"></p><p>Address of the bookmark: <a href="https://github.com/BGI-Qingdao/SLR-superscaffolder" rel="nofollow">https://github.com/BGI-Qingdao/SLR-superscaffolder</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42139/mixtures-a-novel-tool-for-bacterial-strain-reconstruction-from-reads</guid>
	<pubDate>Fri, 21 Aug 2020 08:23:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42139/mixtures-a-novel-tool-for-bacterial-strain-reconstruction-from-reads</link>
	<title><![CDATA[mixtureS: a novel tool for bacterial strain reconstruction from reads]]></title>
	<description><![CDATA[<div>
<p>mixtureS is a tool that can identify bacterial strains de novo from shotgun reads of a clonal or metagenomic sample, without prior knowledge of the strains or their variations. Tested on 243 simulated datasets and 195 experimental datasets, mixtureS reliably identified the strains, their number, and their abundance. Compared with three tools, mixtureS showed better performance in almost all simulated datasets and the vast majority of experimental datasets.</p>
</div>
<div>
<div>Availability</div>
<p>The mixtureS source code and tool are available at&nbsp;<a href="http://www.cs.ucf.edu/~xiaoman/mixtureS/" target="_blank">http://www.cs.ucf.edu/~xiaoman/mixtureS/</a>.</p>
</div><p>Address of the bookmark: <a href="http://www.cs.ucf.edu/~xiaoman/mixtureS/" rel="nofollow">http://www.cs.ucf.edu/~xiaoman/mixtureS/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43273/understanding-kmer</guid>
	<pubDate>Wed, 18 Aug 2021 04:27:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43273/understanding-kmer</link>
	<title><![CDATA[Understanding kmer !]]></title>
	<description><![CDATA[<p><a href="https://en.wikipedia.org/wiki/k-mer">What is a&nbsp;<em>k-mer</em>&nbsp;anyway?</a><span>&nbsp;A&nbsp;</span><em>k-mer</em><span>&nbsp;is just a sequence of&nbsp;</span><em>k</em><span>&nbsp;characters in a string (or nucleotides in a DNA sequence). Now, it is important to remember that to get&nbsp;</span><em>all k-mers</em><span>&nbsp;from a sequence you need to get the first&nbsp;</span><em>k</em><span>&nbsp;characters, then move just a single character for the start of the next&nbsp;</span><em>k-mer</em><span>&nbsp;and so on. Effectively, this will create sequences that overlap in&nbsp;</span><code>k-1</code><span>&nbsp;positions.</span></p><p>Address of the bookmark: <a href="https://bioinfologics.github.io/post/2018/09/17/k-mer-counting-part-i-introduction/" rel="nofollow">https://bioinfologics.github.io/post/2018/09/17/k-mer-counting-part-i-introduction/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26975/trimmomatic-a-flexible-read-trimming-tool-for-illumina-ngs-data</guid>
	<pubDate>Fri, 15 Apr 2016 05:58:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26975/trimmomatic-a-flexible-read-trimming-tool-for-illumina-ngs-data</link>
	<title><![CDATA[Trimmomatic: A flexible read trimming tool for Illumina NGS data]]></title>
	<description><![CDATA[<h4>Paired End:</h4>
<p><code>java -jar trimmomatic-0.35.jar PE -phred33 input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36</code></p>
<p>This will perform the following:</p>
<ul>
<li>Remove adapters (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10)</li>
<li>Remove leading low quality or N bases (below quality 3) (LEADING:3)</li>
<li>Remove trailing low quality or N bases (below quality 3) (TRAILING:3)</li>
<li>Scan the read with a 4-base wide sliding window, cutting when the average quality per base drops below 15 (SLIDINGWINDOW:4:15)</li>
<li>Drop reads shorter than 36 bases (MINLEN:36)</li>
</ul>
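<p>The quality-based steps listed above can be sketched in a few lines of Python. This is a simplified illustration only, not Trimmomatic's actual implementation: adapter removal (ILLUMINACLIP) is left out, the function names <code>trim_leading_trailing</code>, <code>sliding_window_cut</code> and <code>trim_read</code> are hypothetical, and qualities are assumed to be already decoded into a list of integer Phred scores.</p>
<pre>
```python
# Simplified sketch of LEADING, TRAILING, SLIDINGWINDOW and MINLEN.
# Assumption: this only mirrors the behaviour described in the list above;
# the real tool may differ in edge cases.


def trim_leading_trailing(quals, min_leading=3, min_trailing=3):
    """Return (start, end) after dropping low-quality bases from both ends."""
    start, end = 0, len(quals)
    # LEADING: skip bases at the start whose quality is below min_leading.
    while end > start and min_leading > quals[start]:
        start += 1
    # TRAILING: skip bases at the end whose quality is below min_trailing.
    while end > start and min_trailing > quals[end - 1]:
        end -= 1
    return start, end


def sliding_window_cut(quals, window=4, min_avg=15):
    """Return the number of bases to keep.

    The read is cut at the start of the first window whose mean quality
    is below min_avg; if no window fails, the whole read is kept.
    """
    for i in range(len(quals) - window + 1):
        # Compare sums instead of means to avoid a division.
        if min_avg * window > sum(quals[i:i + window]):
            return i
    return len(quals)


def trim_read(seq, quals, min_len=36):
    """Apply the steps in order; return None if the read ends up too short."""
    start, end = trim_leading_trailing(quals)
    seq, quals = seq[start:end], quals[start:end]
    keep = sliding_window_cut(quals)
    seq, quals = seq[:keep], quals[:keep]
    # MINLEN: discard reads shorter than min_len bases.
    if min_len > len(seq):
        return None
    return seq, quals
```
</pre>
<p>In this sketch, <code>trim_read</code> applies the steps in the same order as the command line, so a read whose quality collapses early is first cut by the sliding window and then discarded by the length filter.</p>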
<p>More at http://www.usadellab.org/cms/?page=trimmomatic</p><p>Address of the bookmark: <a href="http://www.usadellab.org/cms/?page=trimmomatic" rel="nofollow">http://www.usadellab.org/cms/?page=trimmomatic</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27973/wgsim</guid>
	<pubDate>Thu, 23 Jun 2016 07:26:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27973/wgsim</link>
	<title><![CDATA[WgSim]]></title>
	<description><![CDATA[<p>Reads simulator</p>
<p>Wgsim is a small tool for simulating sequence reads from a reference genome. It is able to simulate diploid genomes with SNPs and insertion/deletion (INDEL) polymorphisms, and simulate reads with uniform substitution sequencing errors. It does not generate INDEL sequencing errors, but this can be partly compensated by simulating INDEL polymorphisms.<br><br>Wgsim outputs the simulated polymorphisms, and writes the true read coordinates as well as the number of polymorphisms and sequencing errors in read names. One can evaluate the accuracy of a mapper or a SNP caller with wgsim_eval.pl that comes with the package.</p><p>Address of the bookmark: <a href="https://github.com/lh3/wgsim" rel="nofollow">https://github.com/lh3/wgsim</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30557/speedseq</guid>
	<pubDate>Fri, 20 Jan 2017 06:05:43 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30557/speedseq</link>
	<title><![CDATA[SpeedSeq]]></title>
	<description><![CDATA[<p>A flexible framework for rapid genome analysis and interpretation</p>
<p>C Chiang, R M Layer, G G Faust, M R Lindberg, D B Rose, E P Garrison, G T Marth, A R Quinlan, and I M Hall. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Meth (2015). doi:10.1038/nmeth.3505.</p>
<p><a href="http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html">http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html</a></p><p>Address of the bookmark: <a href="https://github.com/hall-lab/speedseq" rel="nofollow">https://github.com/hall-lab/speedseq</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40893/quorum-an-error-corrector-for-illumina-reads</guid>
	<pubDate>Tue, 04 Feb 2020 23:26:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40893/quorum-an-error-corrector-for-illumina-reads</link>
	<title><![CDATA[QuorUM: An Error Corrector for Illumina Reads]]></title>
	<description><![CDATA[<p><span>QuorUM produces trimmed and error-corrected reads that result in assemblies with longer contigs and fewer errors. The authors compared QuorUM against several published error correctors and found that it is the best performer in most of the metrics they use. QuorUM is efficiently implemented, making use of current multi-core computing architectures, and is suitable for large data sets (1 billion bases checked and corrected per day per core).</span></p><p>Address of the bookmark: <a href="http://www.genome.umd.edu/" rel="nofollow">http://www.genome.umd.edu/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</guid>
	<pubDate>Wed, 04 Jun 2025 00:07:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</link>
	<title><![CDATA[Trust But Verify: Sequencing Your Cell Lines Might Reveal an Uninvited Guest]]></title>
	<description><![CDATA[<p>High-throughput sequencing has become indispensable in cell biology, enabling detailed insights into chromatin structure, gene expression, and regulatory dynamics. Yet, when faced with unexpectedly low mapping rates to the human genome, researchers often rush to troubleshoot technical parameters&mdash;sequencer quality, adapter trimming, or aligner settings.</p><p>Before you go down that path, consider this critical biological question:<br /> <strong>Are you sequencing human cells&mdash;or bacterial contamination?</strong></p><h2>The Silent Saboteur: Mycoplasma in Cell Cultures</h2><p><em>Mycoplasma</em> contamination remains one of the most widespread and underdiagnosed issues in tissue culture work. Studies suggest that <strong>15&ndash;35% of cell lines in use may be contaminated</strong>, often without visible signs. Unlike other microbial infections, <em>Mycoplasma</em> does not produce cloudiness, odor, or a change in pH. Many researchers won&rsquo;t detect it unless they specifically test for it.</p><p>The consequences, however, are profound. <em>Mycoplasma</em> can significantly alter:</p><ul>
<li>
<p>Host gene expression patterns</p>
</li>
<li>
<p>Cell proliferation rates</p>
</li>
<li>
<p>Epigenetic profiles and chromatin accessibility</p>
</li>
<li>
<p>Cytokine signaling and immune responses</p>
</li>
</ul><p>In short, it can skew your results, compromise your biological conclusions, and invalidate weeks or months of research.</p><h2>A Simple Diagnostic Step: Map Against <em>Mycoplasma</em> Genomes</h2><p>If you encounter poor alignment rates to the human genome, consider mapping your reads to a <em>Mycoplasma</em> reference genome&mdash;or better yet, use a <strong>combined human + <em>Mycoplasma</em></strong> reference. There have been cases where over half of all reads, initially assumed to be from human cells, were in fact bacterial in origin. This check is fast, easy, and could save your project.</p><h2>How Contamination Happens&mdash;and Persists</h2><p><em>Mycoplasma</em> is small (0.1&ndash;0.3 &mu;m), lacks a cell wall, and can pass through standard filters undetected. Common sources include:</p><ul>
<li>
<p>Contaminated reagents (e.g., FBS)</p>
</li>
<li>
<p>Infected cell lines obtained from other labs</p>
</li>
<li>
<p>Poor aseptic technique or shared equipment</p>
</li>
</ul><p>Once present, it spreads quickly between cultures and can persist for months, silently affecting results.</p><h2>Why Treatment Is Difficult</h2><p>While antibiotics such as Plasmocin or BM-Cyclin are sometimes used, they often offer only partial resolution and may themselves alter cell behavior. In many cases, the best course of action is to <strong>discard the contaminated culture</strong> and start with a fresh, verified stock.</p><h2>Practical Recommendations for Researchers</h2><ul>
<li>
<p><strong>Routinely test for <em>Mycoplasma</em></strong> using PCR, qPCR, or fluorescence-based assays</p>
</li>
<li>
<p><strong>Incorporate contamination screens into your sequencing QC pipeline</strong></p>
</li>
<li>
<p><strong>Use combined reference genomes</strong> when mapping ambiguous reads</p>
</li>
<li>
<p><strong>Practice strict aseptic technique</strong> and monitor all incoming cell lines</p>
</li>
<li>
<p><strong>Don&rsquo;t ignore unexplained data anomalies</strong>&mdash;they might point to contamination</p>
</li>
</ul><h2>Closing Thought: Contamination Is a Biological Variable</h2><p>It&rsquo;s easy to view poor mapping as a technical issue, but sometimes the problem lies deeper&mdash;in the biology itself. <em>Mycoplasma</em> contamination doesn&rsquo;t just interfere with sequencing; it interferes with science. As a research community, we must treat contamination not as an afterthought, but as a key variable to control.</p><p>So next time your reads won&rsquo;t align, don&rsquo;t just tune the aligner. Ask if your cells are telling the truth&mdash;or if they're hiding something.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>

</channel>
</rss>