<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/28051?offset=150</link>
	<atom:link href="https://bioinformaticsonline.com/related/28051?offset=150" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40715/mutatrix-a-population-genome-simulator-which-generates-simulated-genomes</guid>
	<pubDate>Tue, 28 Jan 2020 04:06:58 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40715/mutatrix-a-population-genome-simulator-which-generates-simulated-genomes</link>
	<title><![CDATA[mutatrix: a population genome simulator which generates simulated genomes.]]></title>
	<description><![CDATA[<p><span>genome simulation across a population with zeta-distributed allele frequency, snps, insertions, deletions, and multi-nucleotide polymorphisms</span></p>
<p><span>More at&nbsp;<a href="https://github.com/ekg/mutatrix">https://github.com/ekg/mutatrix</a></span></p>
<pre>./mutatrix -S sample -P test/ -p 2 -n 10 reference.fasta</pre><p>Address of the bookmark: <a href="https://github.com/ekg/mutatrix" rel="nofollow">https://github.com/ekg/mutatrix</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/42921/run-bash-script-in-perl-program</guid>
	<pubDate>Sat, 27 Feb 2021 01:42:23 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/42921/run-bash-script-in-perl-program</link>
	<title><![CDATA[Run bash script in Perl program !]]></title>
	<description><![CDATA[<p>BioPerl is a compilation of Perl modules that can be used to build bioinformatics-related Perl scripts. It is used, for example, in the development of source codes, standalone software/tools, and algorithms in bioinformatics programmes. Different modules are easy to instal and include, making it easier to perform different functions. Despite the fact that Python is commonly favoured over Perl, some bioinformatics software, such as the standalone version of 'alienomics', is written in Perl. Often it's a major problem for beginners to execute certain Unix/shell commands in Perl script, so it's hard to determine which feature is unique to a situation.</p><div style="background-color: white;">Perl provides various features and operators for the execution of external commands (described as follows), which are unique in their own way.</div><div style="background-color: white;">&nbsp;</div><div style="background-color: white;">They vary slightly from one another, making it difficult for Perl beginners to choose between them.</div><div style="background-color: white;">&nbsp;</div><div style="background-color: white;"><strong>1. IPC::Open2</strong></div><p>More at https://bioinformaticsonline.com/snippets/view/42919/perl-ipcopen2-module</p><p><strong>2. exec&rdquo;&rdquo;</strong></p><p><em>&nbsp;syntax:&nbsp;</em><code>exec "command";</code></p><div style="background-color: #edfbff;">It's a Perl function (perlfunc) that executes a command but never returns, similar to a function's return statement.</div><div style="background-color: white;">While running the command, it keeps processing the script and does not wait until it finishes first, returns false when the command is not found, but never returns true.</div><p><strong>3. Backticks &ldquo; or qx//</strong></p><p><em>syntax:&nbsp;</em><code>`command`;</code></p><p><em>syntax:&nbsp;</em><code>qx/command/;</code></p><div style="background-color: white;">It's a Perl operator (perlop) that executes a command and then resumes the Perl script once the command has ended, but the return value is the command's STDOUT.</div><p><strong>4. IPC::Open3</strong></p><p><em>syntax:&nbsp;</em><code>$output =&nbsp;open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR,&nbsp;'command arg1 arg2', 'optarg',...);</code></p><p style="text-align: justify;"><code></code>It is very similar to <code>IPC::Open2</code> with the capability to capture all three file handles of the process, i.e., STDIN, STDOUT, and STDERR. It can also be used with or without the shell. You can read about it more in the documentation: <a href="https://perldoc.perl.org/IPC/Open3.html" target="_blank">IPC::Open3</a>.</p><p><code>$output = open3(my $o ut,&nbsp;my $in, 'command arg1 arg2');</code></p><p>OR without using the shell</p><p><code>$output = open3(my $out,&nbsp;my $in, 'command', 'arg1', 'arg2');</code></p><p><strong>5.a2p</strong></p><p><em>syntax:&nbsp;</em><code>a2p [options] [awkscript]</code></p><p>There is a Perl utility known as <code>a2p</code> which translates <code>awk</code> command to Perl. It takes awk script as input and generates a comparable Perl script as the output. Suppose, you want to execute the following <code>awk</code> statement</p><p><code>awk '{$1 = ""; $2 = ""; print}' f1.txt</code></p><p>This statement gives error sometimes even after escaping the variables (\$1, \$2) but by using <code>a2p</code> it can be easily converted to Perl script:</p><p>For further information, you can read it&rsquo;s documentation: <code><a href="https://perldoc.perl.org/a2p.html" target="_blank">a2p</a></code></p><p>More help at https://bioinformaticsonline.com/snippets/view/42920/perl-script-to-run-awk-inside-perl</p><p><strong>6. system()</strong></p><p><em>syntax:&nbsp;</em><code>system( "command" );</code></p><p>It is also a Perl function (<a href="https://perldoc.perl.org/functions/system.html" target="_blank">perlfunc</a>) that executes a command and waits for it to get finished first and then resume the Perl script. The return value is the exit status of the command. It can be called in two ways:</p><p><code>system( "command arg1 arg2" );</code></p><p>OR</p><p><code>system( "command", "arg1", "arg2" );</code></p><p>HELP</p><p>Here are some useful Perl cheat sheets that can be used as a quick reference guide.--&nbsp;<a href="https://www.pcwdld.com/perl-cheat-sheet" target="_blank">https://www.pcwdld.com/perl-cheat-sheet</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37666/ensembl-variation-calculated-variant-consequences</guid>
	<pubDate>Sun, 09 Sep 2018 19:17:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37666/ensembl-variation-calculated-variant-consequences</link>
	<title><![CDATA[Ensembl Variation - Calculated variant consequences]]></title>
	<description><![CDATA[<p><span>For each variant that is mapped to the reference genome, we identify all overlapping Ensembl transcripts. We then use a rule-based approach to predict the effects that each allele of the variant may have on each transcript. The set of consequence terms, defined by the&nbsp;</span><a href="http://www.sequenceontology.org/">Sequence Ontology</a><span>&nbsp;(SO), that can be currently assigned to each combination of an allele and a transcript is shown in the table below. Note that each allele of each variant may have a different effect in different transcripts.</span></p>
<p><span><img src="https://www.ensembl.org/info/genome/variation/prediction/consequences.jpg" width="1280" height="570" alt="image" style="border: 0px;"></span></p><p>Address of the bookmark: <a href="https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html" rel="nofollow">https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/view/2044</guid>
	<pubDate>Mon, 12 Aug 2013 12:19:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/view/2044</link>
	<title><![CDATA[Does anyone have Nanopore latest updates?]]></title>
	<description><![CDATA[<p>There was a lot of buzz about&nbsp;<span>Oxford Nanopore Technologies&reg; is developing the GridION&trade; system and miniaturised MinION&trade; device. These are a new generation of electronic molecular analysis system for use in scientific research, personalised medicine, crop science, security/defence and more. The platform technology uses nanopores to analyse single molecules including DNA/RNA and proteins. With a broad patent portfolio, the Oxford Nanopore pipeline includes biological nanopores and solid-state nanopores.</span></p><p>Is this available, or still under trial mode?&nbsp;</p><p><a href="https://www.nanoporetech.com/">https://www.nanoporetech.com/</a></p><p><a href="https://www.nanoporetech.com/technology/the-minion-device-a-miniaturised-sensing-system/the-minion-device-a-miniaturised-sensing-system">https://www.nanoporetech.com/technology/the-minion-device-a-miniaturised-sensing-system/the-minion-device-a-miniaturised-sensing-system</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/3918/the-human-genome-project-video-3d-animation-introduction-low</guid>
	<pubDate>Sat, 24 Aug 2013 19:01:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/3918/the-human-genome-project-video-3d-animation-introduction-low</link>
	<title><![CDATA[The Human Genome Project Video   3D Animation Introduction Low)]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/YxoQFSBwyms" frameborder="0" allowfullscreen></iframe>]]></description>
	
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/7913/the-genome-factory</guid>
	<pubDate>Thu, 16 Jan 2014 02:09:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/7913/the-genome-factory</link>
	<title><![CDATA[The genome factory !!!]]></title>
	<description><![CDATA[<p>Illumina, Inc. announced Tuesday that its new HiSeq X Ten Sequencing System has broken the &ldquo;sound barrier&rdquo; of human genomics by enabling the $1,000 genome. &ldquo;This platform includes dramatic technology breakthroughs that enable researchers to undertake studies of unprecedented scale by providing the throughput to sequence tens of thousands of human whole genomes in a single year in a single lab,&rdquo; Illumina stated.</p><p>Initial customers for the HiSeq X Ten System, which will ship in Q1 2014, include Macrogen, based in Seoul, South Korea and its CLIA laboratory in Rockville, Maryland, the Broad Institute in Cambridge, Massachusetts, and the Garvan Institute of Medical Research in Sydney, Australia.</p><p>&ldquo;For the first time, it looks like it will be possible to deliver the $1,000 genome, which is tremendously exciting,&rdquo; said Eric Lander, founding director of the Broad Institute and a professor of biology at MIT. &ldquo;The HiSeq X Ten should give us the ability to analyze complete genomic information from huge sample populations. Over the next few years, we have an opportunity to learn as much about the genetics of human disease as we have learned in the history of medicine.&rdquo;</p><p>&ldquo;The HiSeq X Ten is an ideal platform for scientists and institutions focused on the discovery of genotypic variation to enable a deeper understanding of human biology and genetic disease,&rdquo; Illumina stated. &ldquo;It can sequence tens of thousands of samples annually with high-quality, high-coverage sequencing, delivering a comprehensive catalog of human variation within and outside coding regions.&rdquo;</p><p>HiSeq X Ten utilizes a number of advanced design features to generate massive throughput. Patterned flow cells, which contain billions of nanowells at fixed locations, combined with a new clustering chemistry deliver a significant increase in data density (6 billion clusters per run). Using state-of-the art optics and faster chemistry, HiSeq X Ten can process sequencing flow cells more quickly than ever before &mdash; generating a 10x increase in daily throughput when compared to current HiSeq 2500 performance.</p><p>The HiSeq X Ten is sold as a set of 10 or more ultra-high throughput sequencing systems, each generating up to 1.8 terabases (Tb) of sequencing data in less than three days or up to 600 gigabases (Gb) per day, per system, providing the throughput to sequence tens of thousands of high-quality, high-coverage genomes per year. Illumina says the $1,000 includes typical instrument depreciation, DNA extraction, library preparation, and estimated labor.</p>]]></description>
	<dc:creator>Madhvan Reddy</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/11175/next-generation-sequencingngs-books</guid>
	<pubDate>Fri, 30 May 2014 04:48:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/11175/next-generation-sequencingngs-books</link>
	<title><![CDATA[Next generation sequencing(NGS) books]]></title>
	<description><![CDATA[<p>Employing different technologies, the purpose of NGS platform is to decode the identity or modification on the nucleotides. NGS platforms evolve quickly and capture the main stream.</p>
<p>This bookmark is created to provide NGS online books links.</p><p>Address of the bookmark: <a href="http://en.wikibooks.org/wiki/Next_Generation_Sequencing_%28NGS%29/Print_version" rel="nofollow">http://en.wikibooks.org/wiki/Next_Generation_Sequencing_%28NGS%29/Print_version</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</guid>
	<pubDate>Mon, 02 Jun 2014 18:03:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/11399/next-generation-sequencing-in-r-or-bioconductor-environment</link>
	<title><![CDATA[Next generation sequencing in R or bioconductor environment]]></title>
	<description><![CDATA[<p>There are many R software and bioconductor packages for NGS data analysis, some of them are as follows</p><h3><a name="TOC-Biostrings" id="TOC-Biostrings"></a>Biostrings</h3><p>The Biostrings package from Bioconductor provides an advanced environment for efficient sequence management and analysis in R. It contains many speed and memory effective string containers, string matching algorithms, and other utilities, for fast manipulation of large sets of biological sequences. The objects and functions provided by Biostrings form the basis for many other sequence analysis packages. <a href="http://bioconductor.org/packages/release/bioc/html/Biostrings.html">Documentation</a></p><div><div style="text-align: left;"><div style="color: #000000;"><h4><a name="TOC-IRanges-Overview" id="TOC-IRanges-Overview"></a>IRanges Overview</h4><p>IRanges provides the low-level infrastructure and containers for handling sets of integer ranges within Bioconductor's BioC-Seq domain. Its classes and methods provide support for many more high-level packages like GenomicRanges, ShortRead, Rsamtools, etc. <a href="http://bioconductor.org/packages/release/bioc/html/IRanges.html">Documentation</a></p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-GenomicRanges-Overview" id="TOC-GenomicRanges-Overview"></a>GenomicRanges Overview</h4><p>The <em>GenomicRanges</em> package serves as the foundation for representing genomic locations within the Bioconductor project. It is built upon the <em>IRanges</em> infrastructure and defines three major data containers - <em>GRanges, GRangesList</em> and <em>GappedAlignments</em> - which are supporting other important BioC-Seq packages including <em>ShortRead, Rsamtools, rtracklayer, GenomicFeatures</em> and <em>BSgenome</em>.&nbsp; Compared to the IRanges container, the GRanges/<em>GRangesList</em> classes are more flexible and extensible to store additional information about sequence ranges, such as chromosome identifiers (sequence space), strand information and annotation data. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p></div></div></div></div><h3><a name="TOC-Motif-Discovery" id="TOC-Motif-Discovery"></a>Motif Discovery</h3><h4><a name="TOC-cosmo" id="TOC-cosmo"></a>cosmo</h4><p>The cosmo package allows to search a set of unaligned DNA sequences for a shared motif that may function as transcription factor binding site. The algorithm extends the popular motif discovery tool MEME (Bailey and Elkan, 1995) in that it allows the search to be supervised by specifying a set of constraints that the motif to be discovered must satisfy. <a href="http://bioconductor.org/packages/release/bioc/html/cosmo.html">Documentation</a></p></div><div>
<p><span></span><span></span></p>
<div style="color: #0000ff;"><h4><a name="TOC-BCRANK" id="TOC-BCRANK"></a>BCRANK</h4><p>BCRANK is a method that takes a ranked list of genomic regions as input and outputs short DNA sequences that are overrepresented in some part of the list. The algorithm was developed for detecting transcription factor (TF) binding sites in a large number of enriched regions from high-throughput ChIP-chip or ChIP-seq experiments, but it can be applied to any ranked list of DNA sequences. Documentation</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/BCRANK.html"></a></p>
<p>rGADEM: <a href="http://bioconductor.org/packages/devel/bioc/html/rGADEM.html">Documentation</a></p><p>MotIV: <a href="http://bioconductor.org/packages/devel/bioc/html/MotIV.html">Documentation</a></p></div><h3><a name="TOC-ShortRead" id="TOC-ShortRead"></a>ShortRead</h3><p>The ShortRead package provides input, quality control, filtering, parsing, and manipulation functionality for short read sequences produced by high throughput sequencing technologies. While support is provided for many sequencing technologies, this package is primairly focused on Solexa/Illumina reads. <a href="http://bioconductor.org/packages/release/bioc/html/ShortRead.html">Documentation</a></p><h3><a name="TOC-Rsamtools" id="TOC-Rsamtools"></a>Rsamtools</h3><p>Rsamtools provides functions for parsing and inspecting samtools BAM formatted binary alignment data. SAM/BAM is quickly becoming a universal standard alignment format, and is now supported by a wide variety of alignment tools. <a href="http://bioconductor.org/help/bioc-views/2.7/bioc/html/Rsamtools.html">Documentation</a></p>
<p><a href="http://samtools.sourceforge.net/">Samtools Website</a><br /> <a href="http://bio-bwa.sourceforge.net/">BWA (Burrows-Wheeler Alignment) Website</a><br /><span style="color: #0000ff;"></span></p>
<div style="color: #000000;">&nbsp;</div></div><div>
<p><span style="color: #000000;">Additional tools for SNP analysis:&nbsp;</span></p>
<p><a href="http://bioconductor.org/help/bioc-views/release/bioc/html/snpMatrix.html">snpMatrix</a></p><h3><a name="TOC-BSgenome" id="TOC-BSgenome"></a>BSgenome</h3><p>BSgenome provides an object oriented infrastructure for interacting with a Biostring based genome sequence. BSgenome packages exist for many common genomes, and can be created to represent custom genomes. See the "How to forge a BSgenome data package" Vignette for instructions to create a new BSgenome package if a prebuilt package does not exist for your organism. <a href="http://bioconductor.org/packages/release/bioc/html/BSgenome.html">Documentation</a></p><h3><a name="TOC-rtracklayer" id="TOC-rtracklayer"></a>rtracklayer</h3><p>rtracklayer provides an interface for exporting annotation feature data to various genome browsers and file formats (such as GFF). See the Small RNA Profiling exercise for an example of using rtracklayer to visualize alignment coverage. <a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">Documentation</a></p><h3><a name="TOC-biomaRt" id="TOC-biomaRt"></a>biomaRt</h3><p>The biomaRt package, provides an interface to a growing collection of databases implementing the BioMart software suite (http:// www.biomart.org). The package enables online retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas. This data is retrieved automatically via the Internet, so it's recommended that you cache the data locally, or check versions if your code will be adversely affected by updates to these data. <a href="http://bioconductor.org/packages/release/bioc/html/biomaRt.html">Documentation</a></p><h3><a name="TOC-ChIP-Seq-Analysis-Packages" id="TOC-ChIP-Seq-Analysis-Packages"></a>ChIP-Seq Analysis Packages</h3><p>Bioconductor provides various packages for analyzing and visualizing ChIP-Seq data. Only a small selection of these packages is introduced here. Additional useful introductions to this topic are: <a href="http://www.bioconductor.org/workshops/2009/SeattleJan09/ChIP-seq/">BioC ChIP-seq Case Study</a> and BioC <a href="http://www.bioconductor.org/help/course-materials/2009/SeattleNov09/ChIP-seq/">ChIP-Seq</a>.</p><h4><a name="TOC-chipseq" id="TOC-chipseq"></a>chipseq</h4><p>The chipseq package combines a variety of HT-Seq packages to a pipeline for ChIP-Seq data analysis. <a href="http://bioconductor.org/packages/release/bioc/html/chipseq.html">Documentation</a></p><h4><a name="TOC-BayesPeak" id="TOC-BayesPeak"></a>BayesPeak</h4><p>BayesPeak is a peak calling package for identifying DNA binding sites of proteins in ChIP-Seq experiments. Its algorithm uses hidden Markov models (HMM) and Bayesian statistical methods. The following sample code introduces the identification of peaks with the BayesPeak package as well as the incorporation of read coverage information obtained by the chipseq package. <a href="http://bioconductor.org/packages/release/bioc/html/BayesPeak.html">Documentation</a> [ <a href="http://www.biomedcentral.com/1471-2105/10/299">Publication</a> ]</p><h4><a name="TOC-PICS" id="TOC-PICS"></a>PICS</h4><p>The PICS package applies probabilistic inference to aligned-read ChIP-Seq data in order to identify regions bound by transcription factors. PICS identifies enriched regions by modeling local concentrations of directional reads, and uses DNA fragment length prior information to discriminate closely adjacent binding events via a Bayesian hierarchical t-mixture model. The following sample code uses the test data set from the above BayesPeak package in order to compare the results from both methods by identifying their consensus peak set. <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">Documentation</a> [ <a href="http://www.hubmed.org/display.cgi?uids=20528864">Publication</a> ]</p><h4><a name="TOC-ChIPpeakAnno" id="TOC-ChIPpeakAnno"></a>ChIPpeakAnno</h4><p>The ChIPpeakAnno package provides. batch annotation of the peaks identified from either ChIP-seq or ChIP-chip experiments. It includes functions to retrieve the sequences around peaks, obtain enriched Gene Ontology (GO) terms, find the nearest gene, exon, miRNA or custom features such as most conserved elements and other transcription factor binding sites supplied by users. The package leverages the biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest and stat packages. <a href="http://bioconductor.org/packages/release/bioc/html/ChIPpeakAnno.html">Documentation</a></p><h4><a name="TOC-Additional-ChIP-Seq-Packages" id="TOC-Additional-ChIP-Seq-Packages"></a>Additional ChIP-Seq Packages</h4><p>DiffBind: <a href="http://www.bioconductor.org/packages/release/bioc/html/DiffBind.html">Documentation</a></p><p>MOSAICS: <a href="http://bioconductor.org/packages/devel/bioc/html/mosaics.html">Documentation</a></p><p>iSeq: <a href="http://bioconductor.org/packages/release/bioc/html/iSeq.html">Documentation</a></p><p>ChIPseqR: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPseqR.html">Documentation</a></p><p>ChiPsim: <a href="http://bioconductor.org/packages/release/bioc/html/ChIPsim.html">Documentation</a></p><p>CSAR: <a href="http://www.bioconductor.org/packages/devel/bioc/html/CSAR.html">Documentation</a></p><p>ChIP-Seq Pipeline: <a href="http://www.bioconductor.org/packages/release/bioc/html/PICS.html">PICS</a>, rGADEM and MotIV (<a href="http://www.rglab.org/pics-and-bioconductor/">developer web site</a>)</p><p>SPP: <a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/">ChIP-seq processing pipeline</a></p><p><a href="http://compbio.med.harvard.edu/Supplements/ChIP-seq/tutorial.html">SPP Tutorial</a></p><p><a href="http://liulab.dfci.harvard.edu/MACS/index.html">MACS</a></p><p><a href="http://gmdd.shgmo.org/Computational-Biology/ChIP-Seq/download/SIPeS">SIPeS</a></p><h3><a name="TOC-RNA-Seq-Analysis" id="TOC-RNA-Seq-Analysis"></a>RNA-Seq Analysis</h3><h4><a name="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-" id="TOC-Counting-Reads-that-Overlap-with-Annotation-Ranges-"></a>Counting Reads that Overlap with Annotation Ranges&nbsp;</h4><p>The GenomicRanges package provides support for importing into R short read alignment data in BAM format (via Rsamtools) and associating them with genomic feature ranges, such as exons or genes. This way one can quantify the number of reads aligning to annotated genomic regions. The package defines general purpose containers for storing genomic intervals as well as more specialized containers for storing alignments against a reference genome. The two main functions for read counting provided by this infrastructure are <span>countOverlaps <span style="color: #000000;"><span>and</span></span> summarizeOverlaps</span>. For their proper usage, it is important to read the corresponding <a href="http://www.bioconductor.org/packages/devel/bioc/vignettes/GenomicRanges/inst/doc/summarizeOverlaps.pdf">PDF manual</a>. <a href="http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-DESeq" id="TOC-Differential-Gene-Expression-Analysis-with-DESeq"></a>Differential Gene Expression Analysis with DESeq</h4><p>The DESeq package contains functions to call differentially expressed genes (DEGs) in count tables based on a model using the negative binomial distribution. It expects as input a data frame with the raw read counts per region/gene of interest (rows) for each test sample (columns).&nbsp; Such a count table can be imported into R or generated from BAM alignment files using the <span>countOverlaps</span> function as introduced above. <a href="http://www.bioconductor.org/packages/release/bioc/html/DESeq.html">Documentation</a></p><h4><a name="TOC-Differential-Gene-Expression-Analysis-with-edgeR" id="TOC-Differential-Gene-Expression-Analysis-with-edgeR"></a>Differential Gene Expression Analysis with edgeR</h4><p>The edgeR package uses empirical Bayes estimation and exact tests based on the negative binomial distribution to call differentially expressed genes (DEGs) in count data.&nbsp;</p>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/edgeR.html">Documentation</a></p>
<p><span style="color: #000000;">A variety of additional R packages are available for normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG): <br /> </span></p><p><a href="http://bioconductor.org/packages/devel/bioc/html/easyRNASeq.html">easyRNASeq</a> (simplifies read counting per genome feature)</p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/DEXSeq.html">DEXSeq</a> (Inference of differential exon usage);&nbsp;<a href="http://www.bioconductor.org/packages/release/data/experiment/html/parathyroidSE.html">parathyroidSE</a> explains how to generate exon read counts in R</p><p><a href="http://bioconductor.org/packages/release/bioc/html/DEGseq.html">DEGseq</a></p><p><a href="http://www.bioconductor.org/packages/release/bioc/html/baySeq.html">baySeq</a> (also see: <a href="http://www.bioconductor.org/packages/release/bioc/html/segmentSeq.html">segmentSeq</a>)</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a> (<a href="http://www.hubmed.org/display.cgi?uids=20167110">Bullard et al. 2010</a>)</p><div style="text-align: right;"><div style="text-align: left;"><h4><a name="TOC-Detection-of-Alternative-Splice-Junctions" id="TOC-Detection-of-Alternative-Splice-Junctions"></a>Detection of Alternative Splice Junctions</h4>
<p><span style="color: #000000;">Another utility of RNA-Seq experiments is the analysis of splice junctions. The following software suggestions provide this utility:</span></p>
<p><a href="http://woldlab.caltech.edu/rnaseq/">ERANGE<br /> </a><a href="http://tophat.cbcb.umd.edu/">TopHat</a></p><p><a href="http://biogibbs.stanford.edu/%7Ekinfai/SpliceMap/">SpliceMap</a></p><p><a href="http://solidsoftwaretools.com/gf/project/splitseek/">SplitSeek</a></p><h3><a name="TOC-DNA-Methylation-Data-Analysis" id="TOC-DNA-Methylation-Data-Analysis"></a>DNA-Methylation Data Analysis</h3><div><ul>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/help/course-materials/2012/BiocEurope2012/mattia_pelizzola_methylPipe.pdf">methylPipe</a></span></li>
<li><span style="font-size: 10pt;"><a href="http://www.bioconductor.org/packages/devel/bioc/html/bsseq.html">bsseq</a></span></li>
<li><a href="http://www.bioconductor.org/packages/devel/bioc/html/BiSeq.html">BiSeq</a></li>
<li>Much more under <a href="http://www.bioconductor.org/packages/devel/BiocViews.html#___DNAMethylation">BiocViews</a></li>
</ul></div></div></div><h3><a name="TOC-HT-Seq-Data-Visualization" id="TOC-HT-Seq-Data-Visualization"></a>HT-Seq Data Visualization</h3>
<p><a href="http://www.bioconductor.org/packages/release/bioc/html/ggbio.html">ggbio</a>: ggplot2 extension for genomics data (<a href="http://tengfei.github.com/ggbio/">online manual</a>) <a href="http://www.bioconductor.org/packages/devel/bioc/html/Gviz.html">Gviz</a>:&nbsp;Plotting data and annotation information along genomic coordinates <a href="http://bioconductor.org/packages/release/bioc/html/HilbertVis.html">HilbertVis</a>: Hilbert genome plots</p>
<p><a href="http://bioconductor.org/packages/release/bioc/html/GenomeGraphs.html">GenomeGraphs</a>: Plotting genomic information from Ensembl</p><p><a href="http://www.hubmed.org/display.cgi?uids=18507856">TileQC</a>: Flow Cell Quality Visualization</p><p><a href="http://bioconductor.org/packages/release/bioc/html/rtracklayer.html">rtracklayer</a>: R interface to genome browsers</p><p><a href="http://genoplotr.r-forge.r-project.org/">genoPlotR</a>: Plotting maps of genes and genomes</p><p><a href="http://bioconductor.org/packages/release/bioc/html/Genominator.html">Genominator</a>: Tools for storing, accessing, analyzing and visualizing genomic data.</p><p>&nbsp;</p><p>To install all packages</p><blockquote><p>source("http://bioconductor.org/biocLite.R")<br />biocLite()<br />biocLite(c("ShortRead", "Biostrings", "IRanges", "BSgenome", "rtracklayer", "biomaRt", "chipseq", "ChIPpeakAnno", "Rsamtools", "BayesPeak", "PICS", "GenomicRanges", "DESeq", "edgeR", "leeBamViews", "GenomicFeatures", "BSgenome.Celegans.UCSC.ce2"))</p></blockquote></div>]]></description>
	<dc:creator>John Parker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/12944/orione-%E2%80%93-a-web-based-framework-for-ngs-analysis-in-microbiology</guid>
	<pubDate>Wed, 23 Jul 2014 06:43:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/12944/orione-%E2%80%93-a-web-based-framework-for-ngs-analysis-in-microbiology</link>
	<title><![CDATA[Orione – a web-based framework for NGS analysis in microbiology]]></title>
	<description><![CDATA[<p>End-to-end NGS microbiology data analysis requires a diversity of tools covering bacterial resequencing, de novo assembly, scaffolding, bacterial RNA-Seq, gene annotation and metagenomics. However, the construction of computational pipelines that use different software packages is difficult due to a lack of interoperability, reproducibility, and transparency. To overcome these limitations researchers at <a href="http://www.crs4.it/" target="_blank">CRS4</a>, Italy have developed Orione, a Galaxy-based framework consisting of publicly available research software and specifically designed pipelines to build complex, reproducible workflows for NGS microbiology data analysis. Enabling microbiology researchers to conduct their own custom analysis and data manipulation without software installation or programming, Orione provides new opportunities for data-intensive computational analyses in microbiology and metagenomics.</p>
<p>Reference</p>
<p>Cuccuru G1, Orsini M, Pinna A, Sbardellati A, Soranzo N, Travaglione A, Uva P, Zanetti G, Fotia G. (2014)<strong> Orione, a web-based framework for NGS analysis in microbiology.</strong> <em>Bioinformatics</em> [Epub ahead of print]. [<a href="http://bioinformatics.oxfordjournals.org/content/early/2014/03/10/bioinformatics.btu135.long" target="_blank">article</a>]</p><p>Address of the bookmark: <a href="http://orione.crs4.it/" rel="nofollow">http://orione.crs4.it/</a></p>]]></description>
	<dc:creator>Martin Jones</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/poll/view/15000/which-mathstatistics-programming-languageapplication-do-you-most-frequently-use-in-bioinformatics</guid>
	<pubDate>Thu, 04 Sep 2014 17:46:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/poll/view/15000/which-mathstatistics-programming-languageapplication-do-you-most-frequently-use-in-bioinformatics</link>
	<title><![CDATA[Which math/statistics programming language/application do you most frequently use in bioinformatics?]]></title>
	<description><![CDATA[<p>I'm doing a bit more statistical analysis on some bioinformatics things lately, and I'm curious if there are any programming languages that are particularly good for this NGS computation. What suggestions do you guys have? Are there any languages that have exceptionally good libraries?</p>]]></description>
	<dc:creator>John Parker</dc:creator>
</item>

</channel>
</rss>