<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/41881?offset=60</link>
	<atom:link href="https://bioinformaticsonline.com/related/41881?offset=60" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/44724/step-by-step-guide-to-detect-pirnas-using-bioinformatics</guid>
	<pubDate>Fri, 13 Dec 2024 11:41:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/44724/step-by-step-guide-to-detect-pirnas-using-bioinformatics</link>
	<title><![CDATA[Step-by-Step Guide to Detect piRNAs Using Bioinformatics]]></title>
	<description><![CDATA[<p>Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNAs that play crucial roles in silencing transposable elements and regulating gene expression, particularly in germline cells. Detecting piRNAs involves identifying their unique characteristics, such as size, sequence motifs, and association with Piwi proteins, from high-throughput RNA sequencing data.</p><p>This blog provides a comprehensive step-by-step guide to detect piRNAs using bioinformatics tools and workflows.</p><h4><strong>Step 1: Prepare Your Data</strong></h4><ol>
<li>
<p><strong>Obtain RNA Sequencing Data</strong><br />Acquire raw small RNA-seq data in FASTQ format. Datasets can be sourced from repositories like <strong>NCBI SRA</strong>, <strong>EMBL-EBI</strong>, or specific small RNA sequencing projects.</p>
</li>
<li>
<p><strong>Quality Control (QC)</strong><br />Use <strong>FastQC</strong> to assess the quality of raw reads:</p>
<div>
<div dir="ltr"><code>fastqc reads.fastq </code></div>
</div>
<p>Evaluate the per-base quality, adapter content, and overrepresented sequences.</p>
</li>
<li>
<p><strong>Trimming and Adapter Removal</strong><br />Use tools like <strong>Cutadapt</strong> or <strong>Trim Galore!</strong> to remove adapters and low-quality bases:</p>
<div>
<div dir="ltr"><code>cutadapt -a TGGAATTCTCGGGTGCCAAGG -o trimmed_reads.fastq reads.fastq </code></div>
</div>
<p>Ensure the remaining reads are of high quality for downstream analysis.</p>
</li>
</ol><h4><strong>Step 2: Map Reads to the Genome</strong></h4><p>Mapping reads to the reference genome is crucial for identifying piRNA loci.</p><ol>
<li>
<p><strong>Reference Genome Preparation</strong><br />Download the genome assembly of your organism from databases like <strong>Ensembl</strong>, <strong>UCSC Genome Browser</strong>, or <strong>NCBI</strong>.</p>
</li>
<li>
<p><strong>Align Reads</strong><br />Use <strong>Bowtie</strong> or <strong>STAR</strong> for small RNA alignment:</p>
<div>
<div dir="ltr"><code>bowtie -v 1 -k 1 --best genome_index trimmed_reads.fastq -S aligned_reads.sam </code></div>
</div>
<ul>
<li><code>-v 1</code>: Allows one mismatch.</li>
<li><code>-k 1</code>: Reports at most one alignment per read; combined with <code>--best</code>, that is the best-scoring one.</li>
</ul>
</li>
<li>
<p><strong>Convert SAM to BAM</strong><br />Convert and sort alignments using <strong>SAMtools</strong>:</p>
<div>
<div dir="ltr"><code>samtools view -Sb aligned_reads.sam | samtools sort -o sorted_reads.bam </code></div>
</div>
</li>
</ol><h4><strong>Step 3: Identify Small RNAs</strong></h4><p>piRNAs are characterized by their size (24&ndash;32 nt) and strand bias.</p><ol>
<li>
<p><strong>Extract Reads by Size</strong><br />Use tools like <strong>BEDtools</strong> or custom scripts to filter reads between 24 and 32 nt:</p>
<div>
<div dir="ltr"><code>bedtools bamtofastq -i sorted_reads.bam -fq all_reads.fastq<br />seqkit seq -m 24 -M 32 all_reads.fastq &gt; piRNA_size_reads.fastq </code></div>
</div>
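<p>If seqkit is not available, the size filter can be written as a short custom script. The following is a minimal sketch, assuming plain four-line FASTQ records; the function names <code>parse_fastq</code> and <code>filter_by_length</code> are illustrative and not part of any tool mentioned above.</p>

```python
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from four-line FASTQ records.

    Assumes no line wrapping inside a record; an incomplete trailing
    record is silently dropped.
    """
    it = iter(lines)
    # zip over the same iterator consumes four lines per record
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.strip(), seq.strip(), qual.strip()


def filter_by_length(
    records: Iterable[FastqRecord], min_len: int = 24, max_len: int = 32
) -> Iterator[FastqRecord]:
    """Keep only reads whose length lies in [min_len, max_len]."""
    for header, seq, qual in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq, qual


# Hypothetical usage with the file names from this guide:
# with open("all_reads.fastq") as src, open("piRNA_size_reads.fastq", "w") as dst:
#     for header, seq, qual in filter_by_length(parse_fastq(src)):
#         dst.write(f"{header}\n{seq}\n+\n{qual}\n")
```

<p>The 24 and 32 nt defaults mirror the size range quoted above; adjust them if your organism's piRNAs fall in a narrower window.</p>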
</li>
<li>
<p><strong>Check for Sequence Bias</strong><br />piRNAs often have a strong bias for a uridine at the 5&rsquo; end (1U bias). Use tools like <strong>WebLogo</strong> to visualize sequence motifs.</p>
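<p>Before drawing a logo, the 1U bias can be checked numerically. This is a rough sketch, assuming the reads are already loaded as a list of sequence strings; note that sequencing reads are reported as DNA, so a 5&rsquo; uridine appears as <code>T</code>. The function name <code>first_base_frequencies</code> is made up for this example.</p>

```python
from collections import Counter
from typing import Dict, Iterable


def first_base_frequencies(sequences: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of reads starting with each base.

    Empty sequences are skipped; bases are upper-cased before counting.
    """
    counts = Counter(seq[0].upper() for seq in sequences if seq)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}


# Toy reads only; real input would come from the size-filtered FASTQ.
example = ["TGACCTA", "TTTTACG", "AGGGCTA", "tACGGTA"]
freqs = first_base_frequencies(example)
# freqs["T"] is the 1U fraction, because U is sequenced as T
```

<p>A <code>T</code> fraction well above the other three bases is consistent with a 1U bias, although it is not proof on its own that the reads are piRNAs.</p>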
</li>
</ol><h4><strong>Step 4: Detect Ping-Pong Signature</strong></h4><p>The ping-pong amplification loop is a hallmark of piRNA biogenesis, characterized by a 10 nt overlap between piRNAs on opposite strands.</p><ol>
<li>
<p><strong>Generate Overlap Statistics</strong><br />Use the <strong>piPipes</strong> tool or custom scripts to calculate overlap:</p>
<div>
<div dir="ltr"><code>python ping_pong_overlap.py sorted_reads.bam </code></div>
</div>
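<p>The script name above is a placeholder. As an illustration of what such a custom script might compute, here is a minimal sketch that counts 5&rsquo;-to-5&rsquo; overlaps between reads on opposite strands. It assumes the alignments have already been reduced to <code>(chrom, start, end, strand)</code> tuples in 0-based, half-open coordinates, and that reads were size-filtered so every read is longer than <code>max_overlap</code>. The function name and defaults are assumptions for this example, not the piPipes implementation.</p>

```python
from collections import Counter
from typing import Iterable, Tuple

Read = Tuple[str, int, int, str]  # (chrom, start, end, strand), 0-based half-open


def ping_pong_overlaps(reads: Iterable[Read], max_overlap: int = 20) -> Counter:
    """Count plus/minus read pairs by the overlap of their 5' ends.

    A plus-strand read has its 5' end at `start`; a minus-strand read has
    its 5' end at `end - 1`. The two 5' ends overlap by k nt exactly when
    minus.end == plus.start + k.
    """
    plus_starts: Counter = Counter()
    minus_ends: Counter = Counter()
    for chrom, start, end, strand in reads:
        if strand == "+":
            plus_starts[(chrom, start)] += 1
        elif strand == "-":
            minus_ends[(chrom, end)] += 1

    overlaps: Counter = Counter()
    for (chrom, start), n_plus in plus_starts.items():
        for k in range(1, max_overlap + 1):
            n_minus = minus_ends.get((chrom, start + k), 0)
            if n_minus:
                # every plus read pairs with every minus read at this offset
                overlaps[k] += n_plus * n_minus
    return overlaps


# Toy example: one plus read and one minus read whose 5' ends overlap by 10 nt
toy = [("chr1", 100, 126, "+"), ("chr1", 84, 110, "-")]
result = ping_pong_overlaps(toy)
```

<p>Indexing the reads by 5&rsquo; position keeps the work proportional to the number of distinct positions times <code>max_overlap</code>, rather than comparing every read against every other read.</p>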
</li>
<li>
<p><strong>Visualize Overlap Distribution</strong><br />Plot the distribution of overlaps to confirm the presence of the 10 nt ping-pong signature.</p>
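<p>Alongside the plot, the 10 nt peak is often summarized as a z-score against the other overlap lengths. The sketch below assumes you already have a mapping from overlap length to pair count, however it was computed; the function name, the 1 to 20 nt background window, and the use of the population standard deviation are choices made for this example rather than a fixed standard.</p>

```python
from statistics import mean, pstdev
from typing import Mapping, Optional


def ping_pong_zscore(
    overlap_counts: Mapping[int, int], target: int = 10, max_overlap: int = 20
) -> Optional[float]:
    """Z-score of the count at `target` relative to all other overlap lengths.

    Returns None when the background has zero variance, because the
    z-score is undefined in that case.
    """
    background = [
        overlap_counts.get(k, 0)
        for k in range(1, max_overlap + 1)
        if k != target
    ]
    sd = pstdev(background)
    if sd == 0:
        return None
    return (overlap_counts.get(target, 0) - mean(background)) / sd


# Toy counts: a flat background with a pronounced peak at 10 nt
toy_counts = {k: 10 for k in range(1, 21)}
toy_counts.update({5: 12, 15: 8, 10: 100})
z10 = ping_pong_zscore(toy_counts)
```

<p>A large positive value indicates that 10 nt overlaps are enriched relative to the background; what counts as "large" depends on library depth, so treat any threshold as a judgment call.</p>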
</li>
</ol><h4><strong>Step 5: Annotate piRNA Clusters</strong></h4><p>piRNAs are often generated from genomic clusters.</p><ol>
<li>
<p><strong>Cluster Identification</strong><br />Use tools like <strong>proTRAC</strong> or <strong>PIRANHA</strong> to identify piRNA-producing clusters:</p>
<div>
<div dir="ltr"><code>proTRAC.pl -s sorted_reads.bam -g genome.fa -o clusters </code></div>
</div>
</li>
<li>
<p><strong>Annotate Genomic Regions</strong><br />Annotate the identified clusters using gene annotation files (GTF/GFF). Tools like <strong>BEDtools intersect</strong> can help associate piRNA clusters with genes or transposable elements:</p>
<div>
<div dir="ltr"><code>bedtools intersect -wa -wb -a clusters.bed -b genome_annotation.gtf &gt; annotated_clusters.bed </code></div>
</div>
</li>
</ol><h4><strong>Step 6: Functional Analysis</strong></h4><p>Functional analysis of piRNAs can uncover their targets and regulatory roles.</p><ol>
<li>
<p><strong>Predict piRNA Targets</strong><br />Use tools like <strong>IntaRNA</strong> or <strong>RNAhybrid</strong> to predict interactions between piRNAs and potential target mRNAs:</p>
<div>
<div dir="ltr"><code>RNAhybrid -t target_transcripts.fa -q piRNAs.fa &gt; piRNA_targets.txt </code></div>
</div>
</li>
<li>
<p><strong>Enrichment Analysis</strong><br />Perform GO or KEGG enrichment analysis of target genes using tools like <strong>g:Profiler</strong> or <strong>DAVID</strong>.</p>
</li>
</ol><h4><strong>Step 7: Validation and Visualization</strong></h4><ol>
<li>
<p><strong>Validate piRNA Candidates</strong><br />Cross-check the identified piRNAs against known piRNA databases, such as <strong>piRBase</strong> or <strong>piRNAdb</strong>.</p>
</li>
<li>
<p><strong>Visualize Results</strong></p>
<ul>
<li>Use <strong>IGV</strong> (Integrative Genomics Viewer) to visualize piRNA alignment and clusters on the genome.</li>
<li>Generate heatmaps or circos plots to present piRNA distributions.</li>
</ul>
</li>
</ol><h4><strong>Step 8: Share and Publish Findings</strong></h4><ol>
<li>
<p><strong>Archive Data</strong><br />Submit sequencing data to public repositories like <strong>SRA</strong> or <strong>GEO</strong> with metadata specifying piRNA-related experiments.</p>
</li>
<li>
<p><strong>Publish Results</strong><br />Share findings in journals or conferences, emphasizing novel piRNA candidates, target genes, or regulatory mechanisms.</p>
</li>
</ol><h4><strong>Conclusion</strong></h4><p>Detecting piRNAs involves a combination of computational and analytical methods to identify these unique small RNAs and their roles in gene regulation and transposable element suppression. By following this step-by-step guide, you can confidently navigate the complexities of piRNA detection and contribute to the growing understanding of their biological significance.</p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</guid>
	<pubDate>Tue, 03 Jul 2018 04:09:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</link>
	<title><![CDATA[ASplice: a scalable and memory-efficient algorithm for de novo transcriptome assembly]]></title>
	<description><![CDATA[With increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries.

Texas A&amp;M University researchers developed a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while using as many RNA-Seq libraries as possible, even when they contain hundreds of gigabases of data. New techniques were developed so that the computations can run on a computing cluster with a moderate amount of physical memory.

Availability – A software program that implements the algorithm is available at: http://faculty.cse.tamu.edu/shsze/asplice.

Sze SH, Pimsler ML, Tomberlin JK, Jones CD, Tarone AM. (2017) A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms. BMC Genomics 18(Suppl 4):387.<p>Address of the bookmark: <a href="http://faculty.cse.tamu.edu/shsze/asplice/" rel="nofollow">http://faculty.cse.tamu.edu/shsze/asplice/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38452/silix-implements-an-ultra-efficient-algorithm-for-the-clustering-of-homologous-sequences</guid>
	<pubDate>Wed, 12 Dec 2018 09:22:41 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38452/silix-implements-an-ultra-efficient-algorithm-for-the-clustering-of-homologous-sequences</link>
	<title><![CDATA[SiLiX: implements an ultra-efficient algorithm for the clustering of homologous sequences]]></title>
	<description><![CDATA[<p>The software package SiLiX implements<strong>&nbsp;an ultra-efficient algorithm for the clustering of homologous sequences</strong>, based on single transitive links (<em>single linkage</em>) with alignment coverage constraints.</p>
<p>SiLiX adopts a graph-theoretical framework to interpret similarity pairs as edges of a network. A very efficient algorithm, based on the&nbsp;<em>Disjoint Sets Data Structure</em>, allows the computation of sequence families with&nbsp;<strong>low time and space requirements</strong>.</p>
<p><strong>A parallel version</strong>&nbsp;of SiLiX, based on MPI, is also available in this package and has been shown to scale, so that it allows the study of&nbsp;<strong>very large datasets</strong>.</p>
<p>SiLiX is already included in the analysis pipeline for&nbsp;<a href="http://pbil.univ-lyon1.fr/databases/hogenom/acceuil.php">HOGENOM</a>.</p><p>Address of the bookmark: <a href="http://lbbe.univ-lyon1.fr/SiLiX?lang=fr" rel="nofollow">http://lbbe.univ-lyon1.fr/SiLiX?lang=fr</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27430/mosaik-a-hash-based-algorithm-for-accurate-next-generation-sequencing-short-read-mapping</guid>
	<pubDate>Fri, 20 May 2016 18:53:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27430/mosaik-a-hash-based-algorithm-for-accurate-next-generation-sequencing-short-read-mapping</link>
	<title><![CDATA[MOSAIK: A Hash-Based Algorithm for Accurate Next-Generation Sequencing Short-Read Mapping]]></title>
	<description><![CDATA[<p><span>MOSAIK is a stable, sensitive and open-source program for mapping second and third-generation sequencing reads to a reference genome. Uniquely among current mapping tools, MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific BioSciences SMRT. Indeed, MOSAIK was the only aligner to provide consistent mappings for all the generated data (sequencing technologies, low-coverage and exome) in the 1000 Genomes Project. To provide highly accurate alignments, MOSAIK employs a hash clustering strategy coupled with the Smith-Waterman algorithm. This method is well-suited to capture mismatches as well as short insertions and deletions. To support the growing interest in larger structural variant (SV) discovery, MOSAIK provides explicit support for handling known-sequence SVs, e.g. mobile element insertions (MEIs) as well as generating outputs tailored to aid in SV discovery.</span></p><p>Address of the bookmark: <a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0090581" rel="nofollow">http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0090581</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41452/apollo-a-sequencing-technology-independent-scalable-and-accurate-assembly-polishing-algorithm</guid>
	<pubDate>Mon, 16 Mar 2020 10:09:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41452/apollo-a-sequencing-technology-independent-scalable-and-accurate-assembly-polishing-algorithm</link>
	<title><![CDATA[Apollo: A Sequencing-Technology-Independent, Scalable, and Accurate Assembly Polishing Algorithm]]></title>
	<description><![CDATA[<p><span>Apollo is an assembly polishing algorithm that attempts to correct the errors in an assembly. It can take multiple sets of reads in a single run and polish the assemblies of genomes of any size. Described by Firtina et al. (preliminary version at&nbsp;</span><a href="https://arxiv.org/pdf/1902.04341.pdf">https://arxiv.org/pdf/1902.04341.pdf</a>).</p>
<p>More at&nbsp;<a href="https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa179/5804978?rss=1">https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa179/5804978?rss=1</a></p><p>Address of the bookmark: <a href="https://github.com/CMU-SAFARI/Apollo" rel="nofollow">https://github.com/CMU-SAFARI/Apollo</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41872/autodock-vina-an-open-source-program-for-doing-molecular-docking</guid>
	<pubDate>Sat, 13 Jun 2020 07:55:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41872/autodock-vina-an-open-source-program-for-doing-molecular-docking</link>
	<title><![CDATA[AutoDock Vina: an open-source program for doing molecular docking.]]></title>
	<description><![CDATA[<p><span>AutoDock Vina is an open-source program for doing&nbsp;</span><a href="http://en.wikipedia.org/wiki/Docking_(molecular)">molecular docking</a><span>. It was designed and implemented by&nbsp;</span><a href="http://olegtrott.com/">Dr. Oleg Trott</a><span>&nbsp;in the Molecular Graphics Lab at The Scripps Research Institute.</span>&nbsp;It is especially effective for protein-ligand docking. AutoDock 4 is available under the GNU General Public License. AutoDock is one of the most cited docking software applications in the research community.</p>
<p><img src="http://vina.scripps.edu/img/accuracy.png" width="352" height="264" alt="image" style="border: 0px;"></p>
<p><a href="http://vina.scripps.edu/">http://vina.scripps.edu/</a></p><p>Address of the bookmark: <a href="http://vina.scripps.edu/" rel="nofollow">http://vina.scripps.edu/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34475/oxford-nanopore-sequencing-hybrid-error-correction-and-de-novo-assembly-of-a-eukaryotic-genome</guid>
	<pubDate>Wed, 29 Nov 2017 05:08:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34475/oxford-nanopore-sequencing-hybrid-error-correction-and-de-novo-assembly-of-a-eukaryotic-genome</link>
	<title><![CDATA[Oxford Nanopore Sequencing, Hybrid Error Correction, and de novo Assembly of a Eukaryotic Genome]]></title>
	<description><![CDATA[<p><span>Monitoring the progress of DNA molecules through a membrane pore has been postulated as a method for sequencing DNA for several decades. Recently, a nanopore-based sequencing instrument, the Oxford Nanopore MinION, has become available, which we used for sequencing the S. cerevisiae genome. To make use of these data, we developed a novel open-source hybrid error correction algorithm Nanocorr (</span><a href="https://github.com/jgurtowski/nanocorr">https://github.com/jgurtowski/nanocorr</a><span>) specifically for Oxford Nanopore reads, as existing packages were incapable of assembling the long read lengths (5-50 kbp) at such high error rates (between ~5 and 40% error). With this new method we were able to perform a hybrid error correction of the nanopore reads using complementary MiSeq data and produce a de novo assembly that is highly contiguous and accurate: the contig N50 length is more than ten times greater than an Illumina-only assembly (678 kbp versus 59.9 kbp), and has greater than 99.88% consensus identity when compared to the reference. Furthermore, the assembly with the long nanopore reads presents a much more complete representation of the features of the genome and correctly assembles gene cassettes, rRNAs, transposable elements, and other genomic features that were almost entirely absent in the Illumina-only assembly.</span></p><p>Address of the bookmark: <a href="http://schatzlab.cshl.edu/data/nanocorr/" rel="nofollow">http://schatzlab.cshl.edu/data/nanocorr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36867/cerulean-a-hybrid-assembly-using-high-throughput-short-and-long-reads</guid>
	<pubDate>Tue, 05 Jun 2018 10:10:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36867/cerulean-a-hybrid-assembly-using-high-throughput-short-and-long-reads</link>
	<title><![CDATA[Cerulean: A hybrid assembly using high throughput short and long reads]]></title>
	<description><![CDATA[Cerulean takes contigs assembled from short-read datasets, such as Illumina paired-end reads, and extends them using long reads, such as PacBio RS reads.

Cerulean v0.1 has been implemented with bacterial genomes in mind.

The method is fully described in Deshpande, V., Fung, E. D., Pham, S., &amp; Bafna, V. (2013). Cerulean: A hybrid assembly using high throughput short and long reads. arXiv preprint arXiv:1307.7933.
http://arxiv.org/abs/1307.7933<p>Address of the bookmark: <a href="https://sourceforge.net/projects/ceruleanassembler/" rel="nofollow">https://sourceforge.net/projects/ceruleanassembler/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/8317/new-version-of-modeller-913</guid>
	<pubDate>Thu, 13 Feb 2014 09:07:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/8317/new-version-of-modeller-913</link>
	<title><![CDATA[New version of Modeller, 9.13]]></title>
	<description><![CDATA[<p>The new version of Modeller, 9.13, is now available for download! Please see the download page at <a href="http://salilab.org/modeller/" target="_blank">http://salilab.org/modeller/</a> for more information.</p><p><img src="http://salilab.org/modeller/gifs/modeller.jpg" alt="image" width="848" height="272" style="border: 0px;"><br /> <br /> If you have a license key for Modeller 8 or 9, there is no need to reregister for Modeller 9.13; the same license key will work. (It won't <span>do any harm to reregister if you want to, though!)<br /> <br /> 9.13 is primarily a bugfix release relative to the last public release (9.12). Major user-visible changes include:<br /> <br /> # Modeller now includes a variety of SOAP (statistically optimized atomic potential) scores for assessing proteins, loops, and interfaces.<br /> <br /> # The Lennard-Jones interaction energy is now artificially truncated at very short distance; this makes simulations with poor starting conditions much less likely to 'blow up'.<br /> <br /> # model.get_insertions(), model.get_deletions() and model.loops() now have an include_termini option; if False, residue ranges that include chain termini are excluded from the output.<br /> <br /> See the Modeller manual for a full change log: <a href="http://salilab.org/modeller/9.13/manual/node39.html" target="_blank">http://salilab.org/modeller/9.13/manual/node39.html</a><br /> <br /> If you encounter bugs in Modeller 9.13, please see <a href="http://salilab.org/modeller/9.13/manual/node10.html" target="_blank">http://salilab.org/modeller/9.13/manual/node10.html</a> for information on how to report them.</span></p><p><span>Reference:</span></p><p><span>http://salilab.org/modeller/</span></p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/4183/320000-viruses-in-mammals-yet-to-sequenced-in-future</guid>
	<pubDate>Tue, 03 Sep 2013 08:35:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/4183/320000-viruses-in-mammals-yet-to-sequenced-in-future</link>
	<title><![CDATA[320,000 viruses in mammals are yet to be sequenced!]]></title>
	<description><![CDATA[<p>With recent improvements in biological techniques, it is finally possible to examine millions of unknown viruses at the genomic level and understand their mechanisms. According to available data, close to 70 per cent of emerging viral diseases, such as HIV/AIDS, West Nile, Ebola, SARS, and influenza, are zoonoses: infections of animals that cross into humans.</p><p>To address the challenges of describing and estimating virodiversity, a team of investigators from the Center for Infection and Immunity (CII) and EcoHealth Alliance began their work in the jungles of Bangladesh, home to the flying fox.</p><p>Reference:</p><p><a href="http://economictimes.indiatimes.com/news/news-by-industry/et-cetera/mammals-harbour-at-least-320000-new-viruses/articleshow/22253268.cms">http://economictimes.indiatimes.com/news/news-by-industry/et-cetera/mammals-harbour-at-least-320000-new-viruses/articleshow/22253268.cms</a></p><p><a href="http://www.bbc.co.uk/news/science-environment-23932400">http://www.bbc.co.uk/news/science-environment-23932400</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>