<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/38452?offset=80</link>
	<atom:link href="https://bioinformaticsonline.com/related/38452?offset=80" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</guid>
	<pubDate>Wed, 04 Jun 2025 00:07:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44848/trust-but-verify-sequencing-your-cell-lines-might-reveal-an-uninvited-guest</link>
	<title><![CDATA[Trust But Verify: Sequencing Your Cell Lines Might Reveal an Uninvited Guest]]></title>
	<description><![CDATA[<p>High-throughput sequencing has become indispensable in cell biology, enabling detailed insights into chromatin structure, gene expression, and regulatory dynamics. Yet, when faced with unexpectedly low mapping rates to the human genome, researchers often rush to troubleshoot technical parameters&mdash;sequencer quality, adapter trimming, or aligner settings.</p><p>Before you go down that path, consider this critical biological question:<br /> <strong>Are you sequencing human cells&mdash;or bacterial contamination?</strong></p><h2>The Silent Saboteur: Mycoplasma in Cell Cultures</h2><p><em>Mycoplasma</em> contamination remains one of the most widespread and underdiagnosed issues in tissue culture work. Studies suggest that <strong>15&ndash;35% of cell lines in use may be contaminated</strong>, often without visible signs. Unlike other microbial infections, <em>Mycoplasma</em> does not produce cloudiness, odor, or a change in pH. Many researchers won&rsquo;t detect it unless they specifically test for it.</p><p>The consequences, however, are profound. <em>Mycoplasma</em> can significantly alter:</p><ul>
<li>
<p>Host gene expression patterns</p>
</li>
<li>
<p>Cell proliferation rates</p>
</li>
<li>
<p>Epigenetic profiles and chromatin accessibility</p>
</li>
<li>
<p>Cytokine signaling and immune responses</p>
</li>
</ul><p>In short, it can skew your results, compromise your biological conclusions, and invalidate weeks or months of research.</p><h2>A Simple Diagnostic Step: Map Against <em>Mycoplasma</em> Genomes</h2><p>If you encounter poor alignment rates to the human genome, consider mapping your reads to a <em>Mycoplasma</em> reference genome&mdash;or better yet, use a <strong>combined human + <em>Mycoplasma</em></strong> reference. There have been cases where over half of all reads, initially assumed to be from human cells, were in fact bacterial in origin. This check is fast, easy, and could save your project.</p><h2>How Contamination Happens&mdash;and Persists</h2><p><em>Mycoplasma</em> is small (0.1&ndash;0.3 &mu;m), lacks a cell wall, and can pass through standard filters undetected. Common sources include:</p><ul>
<li>
<p>Contaminated reagents (e.g., FBS)</p>
</li>
<li>
<p>Infected cell lines obtained from other labs</p>
</li>
<li>
<p>Poor aseptic technique or shared equipment</p>
</li>
</ul><p>Once present, it spreads quickly between cultures and can persist for months, silently affecting results.</p><h2>Why Treatment Is Difficult</h2><p>While antibiotics such as Plasmocin or BM-Cyclin are sometimes used, they often offer only partial resolution and may themselves alter cell behavior. In many cases, the best course of action is to <strong>discard the contaminated culture</strong> and start with a fresh, verified stock.</p><h2>Practical Recommendations for Researchers</h2><ul>
<li>
<p><strong>Routinely test for <em>Mycoplasma</em></strong> using PCR, qPCR, or fluorescence-based assays</p>
</li>
<li>
<p><strong>Incorporate contamination screens into your sequencing QC pipeline</strong></p>
</li>
<li>
<p><strong>Use combined reference genomes</strong> when mapping ambiguous reads</p>
</li>
<li>
<p><strong>Practice strict aseptic technique</strong> and monitor all incoming cell lines</p>
</li>
<li>
<p><strong>Don&rsquo;t ignore unexplained data anomalies</strong>&mdash;they might point to contamination</p>
</li>
</ul><h2>Closing Thought: Contamination Is a Biological Variable</h2><p>It&rsquo;s easy to view poor mapping as a technical issue, but sometimes the problem lies deeper&mdash;in the biology itself. <em>Mycoplasma</em> contamination doesn&rsquo;t just interfere with sequencing; it interferes with science. As a research community, we must treat contamination not as an afterthought, but as a key variable to control.</p><p>So next time your reads won&rsquo;t align, don&rsquo;t just tune the aligner. Ask if your cells are telling the truth&mdash;or if they're hiding something.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/38551/gupta-lab</guid>
  <pubDate>Sat, 29 Dec 2018 13:18:31 -0600</pubDate>
  <link>https://bioinformaticsonline.com/researchlabs/view/38551/gupta-lab</link>
  <title><![CDATA[Gupta Lab]]></title>
  <description><![CDATA[
<p>Work includes (i) understanding the evolutionary relationships among different prokaryotic and eukaryotic organisms; (ii) understanding the cellular functions of lineage-specific signature proteins, as well as of lineage-specific conserved inserts and deletions (indels) in important housekeeping proteins, through genetic and biochemical studies; (iii) development of novel diagnostic methods (PCR-based and immunological) for identifying different groups of organisms based on these signature proteins and conserved indels; (iv) the use of these lineage-specific probes, which have predictive ability, to identify or explore the presence of different groups of organisms in metagenomic sequences from various environments.</p>

<p>https://fhs.mcmaster.ca/gupta-lab/index.html</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/1178/r-package-for-visualising-go-enrichment</guid>
	<pubDate>Mon, 22 Jul 2013 12:25:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/1178/r-package-for-visualising-go-enrichment</link>
	<title><![CDATA[R package for visualising GO enrichment]]></title>
	<description><![CDATA[<p>An R package that visualizes GO enrichment results as word clouds and arranges them together with figures of experimental data. This allows us to draw informative summary plots for analyses such as differential expression or clustering, where for each gene list we display its behaviour in the experiment alongside its GO annotations.</p><p>Links @ http://raivokolde.github.io/GOsummaries/</p><p>Lab @ http://biit.cs.ut.ee/about/main</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27845/cnidaria-fast-reference-free-phylogenomic-clustering</guid>
	<pubDate>Thu, 16 Jun 2016 17:55:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27845/cnidaria-fast-reference-free-phylogenomic-clustering</link>
	<title><![CDATA[CNIDARIA: fast, reference-free phylogenomic clustering]]></title>
	<description><![CDATA[<p>Motivation: Identification of biological specimens is a major requirement for a range of applications. Reference-free methods analyse unprocessed sequencing data without relying on prior knowledge, but these do not scale to arbitrarily large genomes and arbitrarily large phylogenetic distances.</p>
<p>Results: We present Cnidaria, a practical tool for clustering genomic and transcriptomic data with no limitation on genome size or phylogenetic distances. We successfully simultaneously clustered 169 genomic and transcriptomic datasets from 4 kingdoms, achieving 100% accuracy at supra-species level and 78% accuracy for species level.</p>
<p>Availability and Implementation: Cnidaria is written in C++ and Python and is available at http://www.ab.wur.nl/cnidaria.</p>
<p>Contact: Saulo Aflitos - sauloal@gmail.com</p>
<p>Supplementary information: Supplementary data are available at Bioinformatics online.</p><p>Address of the bookmark: <a href="https://github.com/sauloal/cnidaria/wiki" rel="nofollow">https://github.com/sauloal/cnidaria/wiki</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42645/mmseqs2-ultra-fast-and-sensitive-sequence-search-and-clustering-suite</guid>
	<pubDate>Mon, 18 Jan 2021 10:47:56 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42645/mmseqs2-ultra-fast-and-sensitive-sequence-search-and-clustering-suite</link>
	<title><![CDATA[MMseqs2: ultra fast and sensitive sequence search and clustering suite]]></title>
	<description><![CDATA[<p><span>MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, MacOS, and (as beta version, via cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times its speed it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times its speed.</span></p><p>Address of the bookmark: <a href="https://github.com/soedinglab/MMseqs2" rel="nofollow">https://github.com/soedinglab/MMseqs2</a></p>]]></description>
	<dc:creator>Manisha Mishra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42038/pyparanoid-a-pipeline-for-rapid-identification-of-homologous-gene-families-in-a-set-of-genomes</guid>
	<pubDate>Thu, 13 Aug 2020 10:06:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42038/pyparanoid-a-pipeline-for-rapid-identification-of-homologous-gene-families-in-a-set-of-genomes</link>
	<title><![CDATA[PyParanoid: a pipeline for rapid identification of homologous gene families in a set of genomes]]></title>
	<description><![CDATA[<p>PyParanoid is a pipeline for rapid identification of homologous gene families in a set of genomes - a central task of any comparative genomics analysis. The "gold standard" for identifying homologs is to use reciprocal best hits (RBHs), which depends on performing an all-vs-all sequence comparison, usually using BLAST, to determine homology. However, these methods are computationally expensive, requiring&nbsp;O(n<sup>2</sup>)&nbsp;resources to identify RBHs. This is problematic, as the modern deluge of sequencing data means that comparative genomics analyses could be performed on datasets of thousands of strains.</p><p>Address of the bookmark: <a href="https://github.com/ryanmelnyk/PyParanoid" rel="nofollow">https://github.com/ryanmelnyk/PyParanoid</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/27459/tools-for-searching-repeats-and-palindromic-sequences</guid>
	<pubDate>Sat, 21 May 2016 22:32:25 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/27459/tools-for-searching-repeats-and-palindromic-sequences</link>
	<title><![CDATA[Tools for Searching Repeats And Palindromic Sequences]]></title>
	<description><![CDATA[<p>What are genomic interspersed repeats?</p><p>In the mid-1960s scientists discovered that many genomes contain stretches of highly repetitive DNA sequences (see Reassociation Kinetics Experiments and the C-Value Paradox). These sequences were later characterized and placed into five categories:</p><p><strong>Simple Repeats</strong> - Duplications of simple sets of DNA bases (typically 1-5 bp) such as A, CA, CGG etc.<br /><strong>Tandem Repeats</strong> - Typically found at the centromeres and telomeres of chromosomes, these are duplications of more complex 100-200 base sequences.<br /><strong>Segmental Duplications</strong> - Large blocks of 10-300 kilobases that have been copied to another region of the genome.<br /><strong>Interspersed Repeats</strong><br />Processed Pseudogenes, Retrotranscripts, SINES - Non-functional copies of RNA genes which have been reintegrated into the genome with the assistance of a reverse transcriptase.<br />DNA Transposons<br />Retrovirus Retrotransposons<br />Non-Retrovirus Retrotransposons ( LINES )</p><p>Currently, up to 50% of the human genome is known to be repetitive, and this number is expected to increase as detection methods improve.</p><p>In genetics, by contrast, the term palindrome refers to a sequence of nucleotides along a DNA (deoxyribonucleic acid) or RNA (ribonucleic acid) strand that contains the same series of nitrogenous bases regardless of the direction from which the strand is analyzed. Akin to a language palindrome, in which a word or phrase is spelled the same left-to-right as right-to-left (e.g., the word RADAR or the phrase "able was I ere I saw elba"), a genetic palindrome reads the same whether it is read 5' (five prime) to 3' (three prime) on one strand or 5' to 3' on the complementary strand.</p><p>Recent research on palindromes centers on understanding palindrome formation during gene amplification. 
Other studies have attempted to relate palindrome formation to molecular mechanisms involved in double-stranded breaks and in the formation of inverted repeats. Assisted by high-speed computers, other groups of scientists link palindrome formation to the conservation of genetic information.</p><p>Related to the direction of transcription by RNA polymerase, DNA strands have upstream and downstream termini defined by differing chemical groups at each end. The ends of each strand of DNA or RNA are termed the 5' (phosphate group on the 5' carbon) and 3' (hydroxyl group on the 3' carbon) ends to indicate a polarity within the molecule. Using the letters A, T, C, G to represent the nitrogenous bases adenine, thymine, cytosine, and guanine found in DNA, and the letters A, U, C, G to represent the nitrogenous bases adenine, uracil, cytosine, and guanine found in RNA (note that uracil in RNA replaces the thymine found in DNA), geneticists usually represent DNA by a series of base codes (e.g., 5' AATCGGATTGCA 3'). The base codes are usually arranged from the 5' end to the 3' end.</p><p>Because of specific base pairing in DNA (i.e., adenine (A) always bonds with thymine (T) and cytosine (C) always bonds with guanine (G)), the complementary strand to the sequence 5' AATCGGATTGCA 3' would be 3' TTAGCCTAACGT 5'.</p><p>With palindromes the sequences on the complementary strands read the same in either direction. For example, a sequence of 5' GAATTC 3' on one strand would be complemented by a 3' CTTAAG 5' strand. In either case, when either strand is read from the 5' end the sequence is GAATTC. By contrast, the sequence 5' CGAAGC 3' reads the same when its letters are simply reversed on a single strand, but its reverse complement is 5' GCTTCG 3', so it is a mirror repeat rather than a palindrome in the base-pairing sense.</p><p>Palindromes are important sequences within nucleic acids. 
Often they are the binding sites for specific enzymes (e.g., restriction endonucleases) that cut the DNA strands at specific locations (i.e., at palindromes).</p><p>Palindromes may arise from breakage and chromosomal inversions that form inverted repeats that complement each other. When a palindrome results from an inversion, it is often referred to as an inverted repeat. For example, the sequence 5' GAATTC 3', if inverted (rotated 180&deg; so that the complementary strand is read 5' to 3'), still reads GAATTC.</p><p>The <a href="http://emboss.open-bio.org/">European Molecular Biology Open Software Suite (EMBOSS)</a> includes some basic tools for finding tandem repeats and inverted repeats (see <a href="http://emboss.open-bio.org/html/use/apbs06.html#GroupsAppsTableNucleicrepeatsR6">B.6.22. Applications in group Nucleic:repeats</a>). Many online services provide the EMBOSS tools, for example:</p><ul>
<li>Wageningen Bioinformatics Webportal <a href="http://emboss.bioinformatics.nl/">EMBOSS explorer</a></li>
<li><a href="http://mobyle.pasteur.fr/">Mobyle@Pasteur</a></li>
<li><a href="http://wsembnet.vital-it.ch/">Soaplab2 Web Services at Vital-IT</a></li>
</ul><p>For more sophisticated repeat finding you will want to look at tools using <a href="http://www.girinst.org/repbase/">Repbase</a> for example:</p><ul>
<li>CENSOR
<ul>
<li><a href="http://www.girinst.org/censor/">CENSOR@GIRI</a></li>
<li><a href="http://www.ebi.ac.uk/Tools/so/censor/">CENSOR@EMBL-EBI</a></li>
</ul>
</li>
<li><a href="http://www.repeatmasker.org/">RepeatMasker</a></li>
<li><a href="http://mummer.sourceforge.net/">MUMmer</a>&nbsp;(scan_for_match)</li>
<li><a href="http://emboss.bioinformatics.nl/cgi-bin/emboss/palindrome">Emboss Palindrome</a></li>
</ul><p>Other nucleotide repeat finding methods found by a couple of web searches:</p><ul>
<li><a href="http://tandem.bu.edu/trf/trf.html">Tandem Repeats Finder</a></li>
<li><a href="http://selab.janelia.org/recon.html">RECON</a></li>
<li><a href="http://www.yandell-lab.org/software/repeatrunner.html">RepeatRunner</a></li>
<li><a href="http://bibiserv.techfak.uni-bielefeld.de/reputer/">REPuter</a></li>
<li><a href="http://210.212.215.200/IMEX/index.html">Imperfect Microsatellite Extractor (IMEx)</a></li>
<li><a href="http://www.imtech.res.in/raghava/srf/">Spectral Repeat Finder (SRF)</a></li>
<li><a href="http://zlab.bu.edu/repfind/form.html">REPFIND</a></li>
<li><a href="http://crispr.u-psud.fr/Server/CRISPRfinder.php">CRISPRfinder</a></li>
<li><a href="http://grail.lsd.ornl.gov/grailexp/">GrailEXP</a></li>
<li><a href="http://alggen.lsi.upc.edu/recerca/search/frame-search.html">CONREPP</a></li>
<li><a href="http://www.biophp.org/minitools/find_palindromes/demo.php"><span>find_palindromes</span></a></li>
<li><a href="http://insilico.ehu.eus/palindromes/"><span>Palindrome</span></a></li>
<li><a href="http://emboss.bioinformatics.nl/cgi-bin/emboss/palindrome">EMBOSS Palindrome</a></li>
<li><a href="http://bioinfo.cs.technion.ac.il/projects/Engel-Freund/new.html">Palindrome Search</a></li>
</ul>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38668/gvolante-completeness-assessment-of-genometranscriptome-sequences</guid>
	<pubDate>Sun, 13 Jan 2019 07:03:25 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38668/gvolante-completeness-assessment-of-genometranscriptome-sequences</link>
	<title><![CDATA[gVolante: Completeness Assessment of Genome/Transcriptome Sequences]]></title>
	<description><![CDATA[<p><span>gVolante is a web server that provides an online tool for (i) on-demand completeness assessment of sequence sets by means of the previously developed pipelines CEGMA and BUSCO and (ii) browsing pre-computed completeness scores for publicly available data in its database section.</span></p><p>Address of the bookmark: <a href="https://gvolante.riken.jp/analysis.html" rel="nofollow">https://gvolante.riken.jp/analysis.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/44364/genbank-release-2570-is-now-available</guid>
	<pubDate>Wed, 23 Aug 2023 00:23:23 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/44364/genbank-release-2570-is-now-available</link>
	<title><![CDATA[GenBank release 257.0 is now available!]]></title>
	<description><![CDATA[<p><span>GenBank release 257.0 is now available! Learn more:&nbsp;https://ncbiinsights.ncbi.nlm.nih.gov/2023/08/21/genbank-release-257/</span></p><p><a href="https://www.ncbi.nlm.nih.gov/genbank/?utm_source=ncbi_insights&amp;utm_medium=referral&amp;utm_campaign=genbank-release-20230821">GenBank</a>&nbsp;release 257.0 (8/15/2023) is now available on the&nbsp;<a href="https://ftp.ncbi.nlm.nih.gov/genbank/">NCBI FTP site</a>. This release has 25.10 trillion bases and 3.69 billion records.</p><p><strong>The current release has:</strong></p><ul>
<li>246,119,175 traditional records containing 2,112,058,517,945 base pairs of sequence data</li>
<li>2,631,493,489 WGS records containing 22,294,446,104,543 base pairs of sequence data</li>
<li>686,271,945 bulk-oriented TSA records containing 646,176,166,908 base pairs of sequence data</li>
<li>124,421,006 bulk-oriented TLS records containing 48,289,699,026 base pairs of sequence data</li>
</ul>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36758/pbalign-maps-pacbio-reads-to-reference-sequences-and-saves-alignments-to-a-bam-file</guid>
	<pubDate>Thu, 24 May 2018 10:06:52 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36758/pbalign-maps-pacbio-reads-to-reference-sequences-and-saves-alignments-to-a-bam-file</link>
	<title><![CDATA[pbalign: maps PacBio reads to reference sequences and saves alignments to a BAM file]]></title>
	<description><![CDATA[<p>pbalign aligns PacBio reads to reference sequences, filters aligned reads according to user-specified filtering criteria, and converts the output to either the SAM format or the PacBio Compare HDF5 (e.g., .cmp.h5) format. The output Compare HDF5 file will be compatible with Quiver if the --forQuiver option is specified.</p><p>Address of the bookmark: <a href="https://github.com/PacificBiosciences/pbalign" rel="nofollow">https://github.com/PacificBiosciences/pbalign</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>