<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36630?offset=110</link>
	<atom:link href="https://bioinformaticsonline.com/related/36630?offset=110" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38563/hecil-a-hybrid-error-correction-algorithm-for-long-reads-with-iterative-learning</guid>
	<pubDate>Tue, 01 Jan 2019 12:01:00 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38563/hecil-a-hybrid-error-correction-algorithm-for-long-reads-with-iterative-learning</link>
	<title><![CDATA[HECIL: A Hybrid Error Correction Algorithm for Long Reads with Iterative Learning]]></title>
	<description><![CDATA[<p><span>HECIL&mdash;Hybrid Error Correction with Iterative Learning&mdash;a hybrid error correction framework that determines a correction policy for erroneous long reads, based on optimal combinations of decision weights obtained from short read alignments.&nbsp;</span></p>
<p><span><span>HECIL&rsquo;s core algorithm by introducing an iterative learning paradigm that enhances the correction policy at each iteration by incorporating knowledge gathered from previous iterations via data-driven confidence metrics assigned to prior corrections.</span></span></p><p>Address of the bookmark: <a href="https://github.com/NDBL/HECIL" rel="nofollow">https://github.com/NDBL/HECIL</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</guid>
	<pubDate>Sat, 16 Jan 2021 21:42:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42633/protocol-for-de-novo-genome-assembly-using-illumina-reads</link>
	<title><![CDATA[Protocol for De novo Genome Assembly using Illumina Reads]]></title>
	<description><![CDATA[<p>In this protocol, we address and describe the de novo assembly method for small to medium-sized genomes.</p><p><strong>What is de novo genome assembly?<br /></strong>The method of taking a large number of short DNA sequences and placing them back together to create a reflection of the original chromosomes from which the DNA originated relates to genome assembly. No previous knowledge of the source DNA sequence length, structure or composition is inferred by De novo genome assemblies. The DNA of the target organism is split up into millions of tiny parts and read on a sequencing computer in a genome sequencing experiment. Depending on the sequencing system used, these "reads" range from 20 to 1000 nucleotide base pairs (bp) in length. Usually, length reads of 36 - 150 bp are produced for Illumina style short read sequencing. These reads can be either &ldquo;single ended&rdquo; as described above or &ldquo;paired end.&rdquo;</p><p><strong>Why genome assembly?</strong><br />In basic research into why and how they live, as well as in applied topics, identifying the DNA sequence of an organism is useful. Awareness of a DNA sequence may be useful in virtually any biological research because of the relevance of DNA to living things. For example, it may be used in medicine to classify, diagnose and eventually improve genetic disorder therapies. Similarly, pathogens study can lead to treatments for infectious diseases.</p><p><strong>Raw NGS data</strong><br />Reads can be saved as a Fasta file as text or in a FastQ file with their attributes.&nbsp;FastQ is the most common read file format since this is what the Illumina sequencing pipeline creates. This will henceforth be the subject of our conversation.</p><p><strong>In a nutshell the protocol:</strong> <br />Get the sequence file(s) read from the sequencing machine (s). <br />Look at the readings - have an idea of what you have and what the standard is like. <br />If required, raw data cleanup/quality trimming. <br />Choose an adequate parameter set for assembly. <br />Assemble the data into scaffolds/contigs. <br />Examine the assembly performance and determine the efficiency of the assembly.</p><p><strong>Read Quality Control:</strong><br />Check the qualiy with fastQC.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42540/install-fastqc-using-conda</p><p>Quality trimming/cleanup of read files.<br />This function trims adapters, barcodes and other contaminants from the reads.<br />Script<br />https://bioinformaticsonline.com/snippets/view/42542/trimmomatic-command</p><p><strong>Genome Assembly:</strong><br />The object of this portion of the protocol is to explain the method of assembling the reads trimmed by quality into draft contigs.</p><blockquote><p>spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o result_of_spades_assembly_all_illumina</p></blockquote><p>A significant range of short-read assemblers are available. Everyone with strengths and disadvantages of their own. <br /><em>Some of the assemblers available include:</em><br />Velvet<br />SOAP-denovo<br />MIRA<br />ALLPATHS</p><p>Next step is to assess the suitability and what to do with a draft package of contiguous details for the remainder of the study now.&nbsp;Few stuff you can note about the contigs you just created:&nbsp;They're the draft Contigs. Any mis-assemblies can occur.</p><p><strong>Mis-assembly checking and assembly metric tools:</strong><br />QUAST - Quality assessment tool for genome assembly http://bioinf.spbau.ru/quast<br />Mauve assembly metrics - http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve<br />InGAP-SV - https://sites.google.com/site/nextgengenomics/ingap and http://ingap.sourceforge.net/<br />inGAP is also useful for finding structural variants between genomes from read mappings.</p><p><strong>Genome finishing tools:</strong><br />Semi-automated gap fillers:<br />Gap filler - http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/gapfiller/</p><p>IMAGE (V2) - http://sourceforge.net/apps/mediawiki/image2/index.php?title=Main_Page</p><p><strong>Genome visualisers and editors:</strong><br />Artemis - http://www.sanger.ac.uk/resources/software/artemis/<br />IGV - http://www.broadinstitute.org/igv/</p><p><strong>Automated and semi automated annotation tools:</strong><br />Prokka - https://github.com/tseemann/prokka<br />RAST - http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer<br />JCVI Annotation Service - http://www.jcvi.org/cms/research/projects/annotation-service/</p><p><strong>Frequent command use for the analysis are at:</strong></p><p>https://bioinformaticsonline.com/blog/view/38765/list-of-tools-frequently-used-while-genome-assembly<br />https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</guid>
	<pubDate>Wed, 04 Jun 2025 00:07:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</link>
	<title><![CDATA[Trust But Verify: Sequencing Your Cell Lines Might Reveal an Uninvited Guest]]></title>
	<description><![CDATA[<p>High-throughput sequencing has become indispensable in cell biology, enabling detailed insights into chromatin structure, gene expression, and regulatory dynamics. Yet, when faced with unexpectedly low mapping rates to the human genome, researchers often rush to troubleshoot technical parameters&mdash;sequencer quality, adapter trimming, or aligner settings.</p><p>Before you go down that path, consider this critical biological question:<br /> <strong>Are you sequencing human cells&mdash;or bacterial contamination?</strong></p><h2>The Silent Saboteur: Mycoplasma in Cell Cultures</h2><p><em>Mycoplasma</em> contamination remains one of the most widespread and underdiagnosed issues in tissue culture work. Studies suggest that <strong>15&ndash;35% of cell lines in use may be contaminated</strong>, often without visible signs. Unlike other microbial infections, <em>Mycoplasma</em> does not produce cloudiness, odor, or a change in pH. Many researchers won&rsquo;t detect it unless they specifically test for it.</p><p>The consequences, however, are profound. <em>Mycoplasma</em> can significantly alter:</p><ul>
<li>
<p>Host gene expression patterns</p>
</li>
<li>
<p>Cell proliferation rates</p>
</li>
<li>
<p>Epigenetic profiles and chromatin accessibility</p>
</li>
<li>
<p>Cytokine signaling and immune responses</p>
</li>
</ul><p>In short, it can skew your results, compromise your biological conclusions, and invalidate weeks or months of research.</p><h2>A Simple Diagnostic Step: Map Against <em>Mycoplasma</em> Genomes</h2><p>If you encounter poor alignment rates to the human genome, consider mapping your reads to a <em>Mycoplasma</em> reference genome&mdash;or better yet, use a <strong>combined human + <em>Mycoplasma</em></strong> reference. There have been cases where over half of all reads, initially assumed to be from human cells, were in fact bacterial in origin. This check is fast, easy, and could save your project.</p><h2>How Contamination Happens&mdash;and Persists</h2><p><em>Mycoplasma</em> is small (0.1&ndash;0.3 &mu;m), lacks a cell wall, and can pass through standard filters undetected. Common sources include:</p><ul>
<li>
<p>Contaminated reagents (e.g., FBS)</p>
</li>
<li>
<p>Infected cell lines obtained from other labs</p>
</li>
<li>
<p>Poor aseptic technique or shared equipment</p>
</li>
</ul><p>Once present, it spreads quickly between cultures and can persist for months, silently affecting results.</p><h2>Why Treatment Is Difficult</h2><p>While antibiotics such as Plasmocin or BM-Cyclin are sometimes used, they often offer only partial resolution and may themselves alter cell behavior. In many cases, the best course of action is to <strong>discard the contaminated culture</strong> and start with a fresh, verified stock.</p><h2>Practical Recommendations for Researchers</h2><ul>
<li>
<p><strong>Routinely test for <em>Mycoplasma</em></strong> using PCR, qPCR, or fluorescence-based assays</p>
</li>
<li>
<p><strong>Incorporate contamination screens into your sequencing QC pipeline</strong></p>
</li>
<li>
<p><strong>Use combined reference genomes</strong> when mapping ambiguous reads</p>
</li>
<li>
<p><strong>Practice strict aseptic technique</strong> and monitor all incoming cell lines</p>
</li>
<li>
<p><strong>Don&rsquo;t ignore unexplained data anomalies</strong>&mdash;they might point to contamination</p>
</li>
</ul><h2>Closing Thought: Contamination Is a Biological Variable</h2><p>It&rsquo;s easy to view poor mapping as a technical issue, but sometimes the problem lies deeper&mdash;in the biology itself. <em>Mycoplasma</em> contamination doesn&rsquo;t just interfere with sequencing; it interferes with science. As a research community, we must treat contamination not as an afterthought, but as a key variable to control.</p><p>So next time your reads won&rsquo;t align, don&rsquo;t just tune the aligner. Ask if your cells are telling the truth&mdash;or if they're hiding something.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40214/gooey-turn-almost-any-python-command-line-program-into-a-full-gui-application-with-one-line</guid>
	<pubDate>Fri, 01 Nov 2019 00:29:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40214/gooey-turn-almost-any-python-command-line-program-into-a-full-gui-application-with-one-line</link>
	<title><![CDATA[Gooey: Turn (almost) any Python command line program into a full GUI application with one line]]></title>
	<description><![CDATA[<p><span>Turn (almost) any Python command line program into a full GUI application with one line</span></p>
<p>The easiest way to install Gooey is via&nbsp;<code>pip</code></p>
<pre><code>pip install Gooey 
</code></pre>
<p>Alternatively, you can install Gooey by cloning the project to your local directory</p>
<pre><code>git clone https://github.com/chriskiehl/Gooey.git
</code></pre>
<p>run&nbsp;<code>setup.py</code></p>
<pre><code>python setup.py install</code></pre><p>Address of the bookmark: <a href="https://github.com/chriskiehl/Gooey" rel="nofollow">https://github.com/chriskiehl/Gooey</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/4590/tigers-genome-sequenced</guid>
	<pubDate>Tue, 17 Sep 2013 16:48:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/4590/tigers-genome-sequenced</link>
	<title><![CDATA[Tigers genome sequenced]]></title>
	<description><![CDATA[<p>Fifteen scientists led by Dr Jong Bhak of Genome Research Foundation, South Korea, decoded as many as 3 billion nucleotides (organic molecules that form the basic building blocks of nucleic acids, such as DNA). They identified 20,000 genes related to various functions of the tiger.&nbsp;</p><p>The biggest and perhaps most fearsome of the world's big cats, the tiger, shares 95.6 percent of its DNA with humans' cute and furry companions, domestic cats.</p><p>The new research showed that big cats have genetic mutations that enabled them to be carnivores. The team also identified mutations that allow snow leopards to thrive at high altitudes.</p><p>Reference:</p><p><a href="http://www.nbcnews.com/science/your-cat-ferocious-tigers-share-lot-95-6-percent-their-4B11182690">http://www.nbcnews.com/science/your-cat-ferocious-tigers-share-lot-95-6-percent-their-4B11182690</a></p><p><a href="http://timesofindia.indiatimes.com/home/environment/flora-fauna/Gene-mapping-of-tiger-completed/articleshow/22671681.cms">http://timesofindia.indiatimes.com/home/environment/flora-fauna/Gene-mapping-of-tiger-completed/articleshow/22671681.cms</a></p><p>Paper:</p><p><a href="http://www.nature.com/ncomms/2013/130917/ncomms3433/full/ncomms3433.html">http://www.nature.com/ncomms/2013/130917/ncomms3433/full/ncomms3433.html</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38012/cosine-non-seeding-method-for-mapping-long-noisy-sequences</guid>
	<pubDate>Fri, 26 Oct 2018 00:41:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38012/cosine-non-seeding-method-for-mapping-long-noisy-sequences</link>
	<title><![CDATA[COSINE: non-seeding method for mapping long noisy sequences]]></title>
	<description><![CDATA[<p><span>Third generation sequencing (TGS) are highly promising technologies but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors.</span></p><p>Address of the bookmark: <a href="https://github.com/SUwonglab/COSINE" rel="nofollow">https://github.com/SUwonglab/COSINE</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/11249/how-to-sequence-the-human-genome-mark-j-kiel</guid>
	<pubDate>Fri, 30 May 2014 13:24:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/11249/how-to-sequence-the-human-genome-mark-j-kiel</link>
	<title><![CDATA[How to sequence the human genome - Mark J. Kiel]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/MvuYATh7Y74" frameborder="0" allowfullscreen></iframe>View full lesson: http://ed.ted.com/lessons/how-to-sequence-the-human-genome-mark-j-kiel

Your genome, every human's genome, consists of a unique DNA sequence of A's, T's, C's and G's that tell your cells how to operate. Thanks to technological advances, scientists are now able to know the sequence of letters that makes up an individual genome relatively quickly and inexpensively. Mark J. Kiel takes an in-depth look at the science behind the sequence.

Lesson by Mark J. Kiel, animation by Marc Christoforidis.]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27430/mosaik-a-hash-based-algorithm-for-accurate-next-generation-sequencing-short-read-mapping</guid>
	<pubDate>Fri, 20 May 2016 18:53:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27430/mosaik-a-hash-based-algorithm-for-accurate-next-generation-sequencing-short-read-mapping</link>
	<title><![CDATA[MOSAIK: A Hash-Based Algorithm for Accurate Next-Generation Sequencing Short-Read Mapping]]></title>
	<description><![CDATA[<p><span>MOSAIK is a stable, sensitive and open-source program for mapping second and third-generation sequencing reads to a reference genome. Uniquely among current mapping tools, MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific BioSciences SMRT. Indeed, MOSAIK was the only aligner to provide consistent mappings for all the generated data (sequencing technologies, low-coverage and exome) in the 1000 Genomes Project. To provide highly accurate alignments, MOSAIK employs a hash clustering strategy coupled with the Smith-Waterman algorithm. This method is well-suited to capture mismatches as well as short insertions and deletions. To support the growing interest in larger structural variant (SV) discovery, MOSAIK provides explicit support for handling known-sequence SVs, e.g. mobile element insertions (MEIs) as well as generating outputs tailored to aid in SV discovery.</span></p><p>Address of the bookmark: <a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0090581" rel="nofollow">http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0090581</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32018/tmap-torrent-mapping-alignment-program-general-notes</guid>
	<pubDate>Sun, 02 Apr 2017 15:53:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32018/tmap-torrent-mapping-alignment-program-general-notes</link>
	<title><![CDATA[TMAP - torrent mapping alignment program General Notes]]></title>
	<description><![CDATA[<p>TMAP - torrent mapping alignment program <a href="https://github.com/iontorrent/TS/tree/master/Analysis/TMAP#general-notes"></a>General Notes</p>
<p>TMAP is a fast and accurate alignment software for short and long nucleotide sequences produced by next-generation sequencing technologies.</p>
<ul>
<li>
<p>The latest TMAP is unsupported. To use a supported version, please see the TMAP version associated with a Torrent Suite release below.</p>
</li>
<li>
<p>Get the latest source code:</p>
<div>
<pre>git clone git://github.com/iontorrent/TMAP.git
 <span>cd</span> TMAP
 git submodule init
 git submodule update</pre>
</div>
</li>
</ul>
<p>https://github.com/iontorrent/TS/tree/master/Analysis/TMAP</p><p>Address of the bookmark: <a href="https://github.com/iontorrent/TS/tree/master/Analysis/TMAP" rel="nofollow">https://github.com/iontorrent/TS/tree/master/Analysis/TMAP</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26906/paired-end-assembler-for-dna-sequences</guid>
	<pubDate>Wed, 06 Apr 2016 05:25:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26906/paired-end-assembler-for-dna-sequences</link>
	<title><![CDATA[PAired-eND Assembler for DNA sequences]]></title>
	<description><![CDATA[<p>PANDASEQ is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence.</p>
<p>&nbsp;</p>
<p>More at https://github.com/neufeld/pandaseq</p><p>Address of the bookmark: <a href="https://github.com/neufeld/pandaseq" rel="nofollow">https://github.com/neufeld/pandaseq</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>