<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/19921?offset=30</link>
	<atom:link href="https://bioinformaticsonline.com/related/19921?offset=30" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/19980/seqloc-06</guid>
	<pubDate>Sun, 28 Dec 2014 12:51:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/19980/seqloc-06</link>
	<title><![CDATA[seqloc 0.6]]></title>
	<description><![CDATA[<p>The <code>Bio.SeqLoc</code> modules in <code>seqloc</code> are designed to represent positions and locations (ranges of positions) on sequences, particularly nucleotide sequences. My original motivation for writing these packages was handling the locations of genes in eukaryotic genomes.</p>
<p>Handle sequence locations for bioinformatics http://www.ingolia-lab.org/seqloc-tutorial.html</p><p>Address of the bookmark: <a href="http://www.stackage.org/snapshot/nightly-2014-12-28/package/seqloc-0.6" rel="nofollow">http://www.stackage.org/snapshot/nightly-2014-12-28/package/seqloc-0.6</a></p>]]></description>
	<dc:creator>Gudiya Pal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/21312/r-for-microsoft-excel</guid>
	<pubDate>Wed, 18 Feb 2015 00:43:27 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/21312/r-for-microsoft-excel</link>
	<title><![CDATA[R for Microsoft Excel]]></title>
	<description><![CDATA[<div><p>If you currently use a spreadsheet like Microsoft Excel for data analysis, you might be interested in taking a look at this <a href="https://districtdatalabs.silvrback.com/intro-to-r-for-microsoft-excel-users" target="_blank">tutorial on how to transition from Excel to R</a>&nbsp;by Tony Ojeda. The tutorial explains how to use R functions in place of Excel formulas, including tools like =AVERAGE and =VLOOKUP. For the most part, it uses modern R packages to keep the R code clear and concise.</p><p>You'll likely still be using Excel as a data source, though, so you'll also want to check out this <a href="http://www.milanor.net/blog/?p=779" target="_blank">guide to importing data from Excel to R</a> from MilanoR.</p></div><p>Reference http://www.r-bloggers.com/an-r-tutorial-for-microsoft-excel-users/</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21367/a-guide-for-complete-r-beginners-r-syntax</guid>
	<pubDate>Fri, 20 Feb 2015 23:41:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21367/a-guide-for-complete-r-beginners-r-syntax</link>
	<title><![CDATA[A guide for complete R beginners :- R Syntax]]></title>
	<description><![CDATA[<p>R is a function-based language: the inputs to a function, including options, go in brackets. Note that all data and options are separated by commas.</p><ul>
<li>Function(data, options)</li>
</ul><p>Even quit is a function</p><ul>
<li>q()</li>
</ul><p>So is help:</p><blockquote><p><strong>help(read.table)</strong></p></blockquote><p>Provides the help page for the function &lsquo;read.table&rsquo;.</p><blockquote><p><strong>help.search("t test")</strong></p></blockquote><p>Searches for help pages that might relate to the phrase &lsquo;t test&rsquo;.</p><p><strong>NOTE</strong>: quotes are needed for search strings; they are not needed when referring to data objects or function names.</p><p>There are shortcuts for help:</p><p>? shows the help page for a function name, same as <em>help(function)</em></p><blockquote><p><strong>?read.table</strong></p></blockquote><p>?? searches for help pages on functions, same as <em>help.search("phrase")</em></p><blockquote><p><strong>??"t test"</strong></p></blockquote><p>Information is usually returned from a function; by default it is printed to the screen.</p><blockquote><p><strong>read.table("data.tsv")</strong></p></blockquote><p>The result can always be stored; we call what it is stored in an &lsquo;object&rsquo;.</p><p><strong>mydata &lt;- read.table("data.tsv")</strong></p><p>Here <strong>mydata</strong> is an object of type <span style="text-decoration: underline;">data frame</span>.</p><p><strong>Reminder:</strong></p><ul>
<li>Vector: a sequence of values (for example numbers), equivalent to a column in a table</li>
<li>Data frame: a collection of vectors, equivalent to a table</li>
</ul><p><strong>Hint</strong>:</p><ul>
<li>The Up/Down arrow keys can be used to cycle through previous commands</li>
</ul>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/22807/software-packages-for-next-gen-sequence-analysis</guid>
	<pubDate>Fri, 19 Jun 2015 21:07:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/22807/software-packages-for-next-gen-sequence-analysis</link>
	<title><![CDATA[Software packages for next gen sequence analysis]]></title>
	<description><![CDATA[<p><strong>Integrated solutions</strong><br /> * <a href="http://www.clcbio.com/index.php?id=1240" target="_blank">CLCbio Genomics Workbench</a> - <em>de novo</em> and reference assembly of Sanger, Roche FLX, Illumina, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection, CHiP-seq, browser and other features. Commercial. Windows, Mac OS X and Linux.<br /> * <a href="http://g2.trac.bx.psu.edu/" target="_blank">Galaxy</a> - Galaxy = interactive and reproducible genomics. A job webportal.<br /> * <a href="http://www.genomatix.de/products/index.html" target="_blank">Genomatix</a> - Integrated Solutions for Next Generation Sequencing data analysis.<br /> * <a href="http://www.jmp.com/software/genomics/" target="_blank">JMP Genomics</a> - Next gen visualization and statistics tool from SAS. They are <a href="http://www.marketwatch.com/news/story/JMPR-Genomics-NCGR-Partnership-Foster/story.aspx?guid=%7B7AC9DE36-B6AA-4EDE-9CD5-633B29FE6154%7D" target="_blank">working with NCGR</a> to refine this tool and produce others.<br /> * <a href="http://softgenetics.com/NextGENe.html" target="_blank">NextGENe</a> - <em>de novo</em> and reference assembly of Illumina, SOLiD and Roche FLX data. Uses a novel Condensation Assembly Tool approach where reads are joined via "anchors" into mini-contigs before assembly. Includes SNP detection, CHiP-seq, browser and other features. Commercial. Win or MacOS.<br /> * <a href="http://www.dnastar.com/products/SMGA.php" target="_blank">SeqMan Genome Analyser</a> - Software for Next Generation sequence assembly of Illumina, Roche FLX and Sanger data integrating with Lasergene Sequence Analysis software for additional analysis and visualization capabilities. Can use a hybrid templated/de novo approach. Commercial. 
Win or Mac OS X.<br /> * <a href="http://1001genomes.org/downloads/shore.html" target="_blank">SHORE</a> - SHORE, for Short Read, is a mapping and analysis pipeline for short DNA sequences produced on an Illumina Genome Analyzer. A suite created by the 1001 Genomes project. Source for POSIX.<br /> * <a href="http://www.realtimegenomics.com/" target="_blank">SlimSearch</a> - Fledgling commercial product.<br /> <br /> <strong>Align/Assemble to a reference</strong><br /> * <a href="https://secure.genome.ucla.edu/index.php/BFAST" target="_blank">BFAST</a> - Blat-like Fast Accurate Search Tool. Written by Nils Homer, Stanley F. Nelson and Barry Merriman at UCLA.<br /> * <a href="http://bowtie-bio.sourceforge.net/" target="_blank">Bowtie</a> - Ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of 25 million reads per hour on a typical workstation with 2 gigabytes of memory. Uses a Burrows-Wheeler-Transformed (BWT) index. <a href="http://seqanswers.com/forums/showthread.php?t=706" target="_blank">Link to discussion thread here</a>. Written by Ben Langmead and Cole Trapnell. Linux, Windows, and Mac OS X.<br /> * <a href="http://maq.sourceforge.net/" target="_blank">BWA</a> - Heng Li's BWT Alignment program - a progression from Maq. BWA is a fast light-weighted tool that aligns short sequences to a sequence database, such as the human reference genome. By default, BWA finds an alignment within edit distance 2 to the query sequence. C++ source.<br /> * <a href="http://bioinfo.cgrb.oregonstate.edu/docs/solexa/" target="_blank">ELAND</a> - Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome. Written by Illumina author Anthony J. Cox for the Solexa 1G machine.<br /> * <a href="http://www.ebi.ac.uk/%7Eguy/exonerate/" target="_blank">Exonerate</a> - Various forms of pairwise alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. 
Authors are Guy St C Slater and Ewan Birney from EMBL. C for POSIX.<br /> * <a href="http://1001genomes.org/downloads/genomemapper.html" target="_blank">GenomeMapper</a> - GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. A tool created by the 1001 Genomes project. Source for POSIX.<br /> * <a href="http://www.gene.com/share/gmap/" target="_blank">GMAP</a> - GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. Developed by Thomas Wu and Colin Watanabe at Genentech. C/Perl for Unix.<br /> * <a href="http://dna.cs.byu.edu/gnumap/" target="_blank">gnumap</a> - The Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. It seeks to align reads from nonunique repeats using statistics. From authors at Brigham Young University. C source/Unix.<br /> * <a href="http://sourceforge.net/projects/maq/" target="_blank">MAQ</a> - Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data. Written by Heng Li from the Sanger Centre. Features extensive supporting tools for DIP/SNP detection, etc. C++ source<br /> * <a href="http://bioinformatics.bc.edu/marthlab/Mosaik" target="_blank">MOSAIK</a> - MOSAIK produces gapped alignments using the Smith-Waterman algorithm. Features a number of support tools. Support for Roche FLX, Illumina, SOLiD, and Helicos. Written by Michael Str&ouml;mberg at Boston College. Win/Linux/MacOSX<br /> * <a href="http://mrfast.sourceforge.net/" target="_blank">MrFAST and MrsFAST</a> - mrFAST &amp; mrsFAST are designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner. 
Robust to INDELs and MrsFAST has a bisulphite mode. Authors are from the University of Washington. C as source.<br /> * <a href="http://mummer.sourceforge.net/" target="_blank">MUMmer</a> - MUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Released as a package providing an efficient suffix tree library, seed-and-extend alignment, SNP detection, repeat detection, and visualization tools. Version 3.0 was developed by Stefan Kurtz, Adam Phillippy, Arthur L Delcher, Michael Smoot, Martin Shumway, Corina Antonescu and Steven L Salzberg - most of whom are at The Institute for Genomic Research in Maryland, USA. POSIX OS required.<br /> * <a href="http://www.novocraft.com/index.html" target="_blank">Novocraft</a> - Tools for reference alignment of paired-end and single-end Illumina reads. Uses a Needleman-Wunsch algorithm. Can support Bis-Seq. Commercial. Available free for evaluation, educational use and for use on open not-for-profit projects. Requires Linux or Mac OS X.<br /> * <a href="http://pass.cribi.unipd.it/cgi-bin/pass.pl" target="_blank">PASS</a> - It supports Illumina, SOLiD and Roche-FLX data formats and allows the user to modulate very finely the sensitivity of the alignments. Spaced seed initial filter, then NW dynamic algorithm to a SW(like) local alignment. Authors are from CRIBI in Italy. Win/Linux.<br /> * <a href="http://rulai.cshl.edu/rmap/" target="_blank">RMAP</a> - Assembles 20 - 64 bp Illumina reads to a FASTA reference genome. By Andrew D. Smith and Zhenyu Xuan at CSHL. (published in BMC Bioinformatics). POSIX OS required.<br /> * <a href="http://biogibbs.stanford.edu/%7Ejiangh/SeqMap/" target="_blank">SeqMap</a> - Supports up to 5 or more bp mismatches/INDELs. Highly tunable. Written by Hui Jiang from the Wong lab at Stanford. Builds available for most OS's.<br /> * <a href="http://compbio.cs.toronto.edu/shrimp/" target="_blank">SHRiMP</a> - Assembles to a reference sequence. 
Developed with Applied Biosystems' colourspace genomic representation in mind. Authors are Michael Brudno and Stephen Rumble at the University of Toronto. POSIX.<br /> * <a href="http://www.bcgsc.ca/platform/bioinfo/software/slider" target="_blank"><span style="text-decoration: underline;">Slider</span></a>- An application for the Illumina Sequence Analyzer output that uses the probability files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. Authors are from BCGSC. Paper is <a href="http://seqanswers.com/forums/showthread.php?t=740" target="_blank">here</a>.<br /> * <a href="http://soap.genomics.org.cn/" target="_blank">SOAP</a> - SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The updated version uses a BWT. Can call SNPs and INDELs. Author is Ruiqiang Li at the Beijing Genomics Institute. C++, POSIX.<br /> * <a href="http://www.sanger.ac.uk/Software/analysis/SSAHA/" target="_blank">SSAHA</a> - SSAHA (Sequence Search and Alignment by Hashing Algorithm) is a tool for rapidly finding near exact matches in DNA or protein databases using a hash table. Developed at the Sanger Centre by Zemin Ning, Anthony Cox and James Mullikin. C++ for Linux/Alpha.<br /> * <a href="http://socs.biology.gatech.edu/" target="_blank">SOCS</a> - Aligns SOLiD data. SOCS is built on an iterative variation of the Rabin-Karp string search algorithm, which uses hashing to reduce the set of possible matches, drastically increasing search speed. Authors are Ondov B, Varadarajan A, Passalacqua KD and Bergman NH.<br /> * <a href="http://bibiserv.techfak.uni-bielefeld.de/swift/welcome.html" target="_blank">SWIFT</a> - The SWIFT suite is a software collection for fast index-based sequence comparison. It contains: SWIFT &mdash; fast local alignment search, guaranteeing to find epsilon-matches between two sequences. 
SWIFT BALSAM &mdash; a very fast program to find semiglobal non-gapped alignments based on k-mer seeds. Authors are Kim Rasmussen (SWIFT) and Wolfgang Gerlach (SWIFT BALSAM)<br /> * <a href="http://synasite.mgrc.com.my:8080/sxog/NewSXOligoSearch.php" target="_blank">SXOligoSearch</a> - SXOligoSearch is a commercial platform offered by the Malaysian based <a href="http://www.synamatix.com/" target="_blank">Synamatix</a>. Will align Illumina reads against a range of Refseq RNA or NCBI genome builds for a number of organisms. Web Portal. OS independent.<br /> * <a href="http://www.vmatch.de/" target="_blank">Vmatch</a> - A versatile software tool for efficiently solving large scale sequence matching tasks. Vmatch subsumes the software tool REPuter, but is much more general, with a very flexible user interface, and improved space and time requirements. Essentially a large string matching toolbox. POSIX.<br /> * <a href="http://www.bioinformaticssolutions.com/products/zoom/index.php" target="_blank">Zoom</a> - ZOOM (Zillions Of Oligos Mapped) is designed to map millions of short reads, emerged by next-generation sequencing technology, back to the reference genomes, and carry out post-analysis. ZOOM is developed to be highly accurate, flexible, and user-friendly with speed being a critical priority. Commercial. Supports Illumina and SOLiD data.<br /> <br /> <strong><em>De novo</em> Align/Assemble</strong><br /> * <a href="http://www.bcgsc.ca/platform/bioinfo/software/abyss" target="_blank">ABySS</a> - Assembly By Short Sequences. ABySS is a de novo sequence assembler that is designed for very short reads. The single-processor version is useful for assembling genomes up to 40-50 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes. By Simpson JT and others at the Canada's Michael Smith Genome Sciences Centre. C++ as source. 
<br /> * <a href="http://www.broad.mit.edu/science/programs/genome-biology/computational-rd/computational-research-and-development" target="_blank">ALLPATHS</a> - ALLPATHS: De novo assembly of whole-genome shotgun microreads. ALLPATHS is a whole genome shotgun assembler that can generate high quality assemblies from short reads. Assemblies are presented in a graph form that retains ambiguities, such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. Broad Institute.<br /> * <a href="http://www.genomic.ch/edena.php" target="_blank">Edena</a> - Edena (Exact DE Novo Assembler) is an assembler dedicated to process the millions of very short reads produced by the Illumina Genome Analyzer. Edena is based on the traditional overlap layout paradigm. By D. Hernandez, P. Fran&ccedil;ois, L. Farinelli, M. Osteras, and J. Schrenzel. Linux/Win.<br /> * <a href="http://euler-assembler.ucsd.edu/portal/" target="_blank">EULER-SR</a> - Short read <em>de novo</em> assembly. By Mark J. Chaisson and Pavel A. Pevzner from UCSD (published in Genome Research). Uses a de Bruijn graph approach.<br /> * <a href="http://chevreux.org/projects_mira.html" target="_blank">MIRA2</a> - MIRA (Mimicking Intelligent Read Assembly) is able to perform true hybrid de-novo assemblies using reads gathered through 454 sequencing technology (GS20 or GS FLX). Compatible with 454, Solexa and Sanger data. Linux OS required.<br /> * <a href="http://www.seqan.de/projects/consensus.html" target="_blank">SEQAN</a> - A Consistency-based Consensus Algorithm for De Novo and Reference-guided Sequence Assembly of Short Reads. By Tobias Rausch and others. C++, Linux/Win.<br /> * <a href="http://sharcgs.molgen.mpg.de/" target="_blank">SHARCGS</a> - De novo assembly of short reads. Authors are Dohm JC, Lottaz C, Borodina T and Himmelbauer H. 
from the Max-Planck-Institute for Molecular Genetics.<br /> * <a href="http://www.bcgsc.ca/platform/bioinfo/software/ssake" target="_blank">SSAKE</a> - The Short Sequence Assembly by K-mer search and 3' read Extension (SSAKE) is a genomics application for aggressively assembling millions of short nucleotide sequences by progressively searching for perfect 3'-most k-mers using a DNA prefix tree. Authors are Ren&eacute; Warren, Granger Sutton, Steven Jones and Robert Holt from the Canada's Michael Smith Genome Sciences Centre. Perl/Linux.<br /> * <a href="http://soap.genomics.org.cn/" target="_blank">SOAPdenovo</a> - Part of the SOAP suite. See above. <br /> * <a href="https://sourceforge.net/projects/vcake" target="_blank">VCAKE</a> - De novo assembly of short reads with robust error correction. An improvement on early versions of SSAKE.<br /> * <a href="http://www.ebi.ac.uk/%7Ezerbino/velvet/" target="_blank">Velvet</a> - Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Need about 20-25X coverage and paired reads. Developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI). <br /> <br /> <strong>SNP/Indel Discovery</strong><br /> * <a href="http://www.sanger.ac.uk/Software/analysis/ssahaSNP/" target="_blank">ssahaSNP</a> - ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence. Highly repetitive elements are filtered out by ignoring those kmer words with high occurrence numbers. More tuned for ABI Sanger reads. Developers are Adam Spargo and Zemin Ning from the Sanger Centre. Compaq Alpha, Linux-64, Linux-32, Solaris and Mac<br /> * <a href="http://bioinformatics.bc.edu/marthlab/PbShort" target="_blank">PolyBayesShort</a> - A re-incarnation of the PolyBayes SNP discovery tool developed by Gabor Marth at Washington University. 
This version is specifically optimized for the analysis of large numbers (millions) of high-throughput next-generation sequencer reads, aligned to whole chromosomes of model organism or mammalian genomes. Developers at Boston College. Linux-64 and Linux-32.<br /> * <a href="http://bioinformatics.bc.edu/marthlab/PyroBayes" target="_blank">PyroBayes</a> - PyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines. It was designed to assign more accurate base quality estimates to the 454 pyrosequences. Developers at Boston College. <br /> <br /> <strong>Genome Annotation/Genome Browser/Alignment Viewer/Assembly Database</strong><br /> * <a href="http://bioinformatics.bc.edu/marthlab/EagleView" target="_blank">EagleView</a> - An information-rich genome assembler viewer. EagleView can display a dozen different types of information including base quality and flowgram signal. Developers at Boston College.<br /> * <a href="http://www.sanger.ac.uk/Software/analysis/lookseq/" target="_blank">LookSeq</a> - LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data. LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualization of putative single nucleotide and structural variation. From the Sanger Centre.<br /> * <a href="http://evolution.sysu.edu.cn/mapview/" target="_blank">MapView</a> - MapView: visualization of short reads alignment on desktop computer. From the Evolutionary Genomics Lab at Sun-Yat Sen University, China. Linux.<br /> * <a href="http://www.bcgsc.ca/platform/bioinfo/software/sam" target="_blank">SAM</a> - Sequence Assembly Manager. Whole Genome Assembly (WGA) Management and Visualization Tool. It provides a generic platform for manipulating, analyzing and viewing WGA data, regardless of input type. 
Developers are Rene Warren, Yaron Butterfield, Asim Siddiqui and Steven Jones at Canada's Michael Smith Genome Sciences Centre. MySQL backend and Perl-CGI web-based frontend/Linux. <br /> * <a href="http://staden.sourceforge.net/" target="_blank">STADEN</a> - Includes GAP4. GAP5 once completed will handle next-gen sequencing data. A partially implemented test version is available <a href="https://sourceforge.net/project/show...kage_id=256957" target="_blank">here</a><br /> * <a href="http://www.bcgsc.ca/platform/bioinfo/software/xmatchview" target="_blank">XMatchView</a> - A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada's Michael Smith Genome Sciences Centre. Python/Win or Linux.<br /> <br /> <strong>Counting e.g. CHiP-Seq, Bis-Seq, CNV-Seq</strong><br /> * <a href="http://epigenomics.mcdb.ucla.edu/BS-Seq/download.html" target="_blank">BS-Seq</a> - The source code and data for the "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning" Nature paper by <a href="http://www.ncbi.nlm.nih.gov/sites/entrez?holding=&amp;db=pubmed&amp;cmd=search&amp;term=Shotgun%20Bisulphite%20Sequencing" target="_blank">Cokus et al.</a> (Steve Jacobsen's lab at UCLA). POSIX.<br /> * <a href="http://woldlab.caltech.edu/chipseq/" target="_blank">CHiPSeq</a> - Program used by Johnson et al. (2007) in their Science publication<br /> * <a href="http://tiger.dbs.nus.edu.sg/cnv-seq/" target="_blank">CNV-Seq</a> - CNV-seq, a new method to detect copy number variation using high-throughput sequencing. Chao Xie and Martti T Tammi at the National University of Singapore. Perl/R.<br /> * <a href="http://www.bcgsc.ca/platform/bioinfo/software/findpeaks" target="_blank">FindPeaks</a> - perform analysis of ChIP-Seq experiments. 
It uses a naive algorithm for identifying regions of high coverage, which represent Chromatin Immunoprecipitation enrichment of sequence fragments, indicating the location of a bound protein of interest. Original algorithm by Matthew Bainbridge, in collaboration with Gordon Robertson. Current code and implementation by Anthony Fejes. Authors are from the Canada's Michael Smith Genome Sciences Centre. JAVA/OS independent. Latest versions available as part of the <a href="http://vancouvershortr.sourceforge.net/" target="_blank">Vancouver Short Read Analysis Package</a><br /> * <a href="http://liulab.dfci.harvard.edu/MACS/" target="_blank">MACS</a> - Model-based Analysis for ChIP-Seq. MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. Written by Yong Zhang and Tao Liu from Xiaole Shirley Liu's Lab. <br /> * <a href="http://www.gersteinlab.org/proj/PeakSeq/" target="_blank">PeakSeq</a> - PeakSeq: Systematic Scoring of ChIP-Seq Experiments Relative to Controls. a two-pass approach for scoring ChIP-Seq data relative to controls. The first pass identifies putative binding sites and compensates for variation in the mappability of sequences across the genome. The second pass filters out sites that are not significantly enriched compared to the normalized input DNA and computes a precise enrichment and significance. By Rozowsky J et al. C/Perl.<br /> * <a href="http://mendel.stanford.edu/sidowlab/downloads/quest/" target="_blank">QuEST</a> - Quantitative Enrichment of Sequence Tags. Sidow and Myers Labs at Stanford. 
From the 2008 publication <a href="http://www.ncbi.nlm.nih.gov/pubmed/18711362" target="_blank">Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data</a>. (C++)<br /> * <a href="http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/" target="_blank">SISSRs</a> - Site Identification from Short Sequence Reads. BED file input. Raja Jothi @ NIH. Perl.<br /> **See also <a href="http://seqanswers.com/forums/showthread.php?t=742" target="_blank">this thread</a> for ChIP-Seq, until I get time to update this list.<br /> <br /> <strong>Alternate Base Calling</strong><br /> * <a href="http://svitsrv25.epfl.ch/R-doc/library/Rolexa/html/00Index.html" target="_blank">Rolexa</a> - R-based framework for base calling of Solexa data. Project <a href="http://www.biomedcentral.com/1471-2105/9/431" target="_blank">publication</a><br /> * <a href="http://hannonlab.cshl.edu/Alta-Cyclic/main.html" target="_blank">Alta-cyclic</a> - "a novel Illumina Genome-Analyzer (Solexa) base caller"<br /> <br /> <strong>Transcriptomics</strong><br /> * <a href="http://woldlab.caltech.edu/rnaseq/" target="_blank">ERANGE</a> - Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq. Supports Bowtie, BLAT and ELAND. From the Wold lab.<br /> * <a href="http://www.genoscope.cns.fr/externe/gmorse/" target="_blank">G-Mo.R-Se</a> - G-Mo.R-Se is a method aimed at using RNA-Seq short reads to build de novo gene models. First, candidate exons are built directly from the positions of the reads mapped on the genome (without any ab initio assembly of the reads), and all the possible splice junctions between those exons are tested against unmapped reads. From CNS in France.<br /> * <a href="http://evolution.sysu.edu.cn/english/software/mapnext.htm" target="_blank">MapNext</a> - MapNext: A software tool for spliced and unspliced alignments and SNP detection of short sequence reads. 
From the Evolutionary Genomics Lab at Sun-Yat Sen University, China.<br /> * <a href="http://www.fml.tuebingen.mpg.de/raetsch/suppl/qpalma" target="_blank">QPalma</a> - Optimal Spliced Alignments of Short Sequence Reads. Authors are Fabio De Bona, Stephan Ossowski, Korbinian Schneeberger, and Gunnar R&auml;tsch. A paper is <a href="http://www.fml.tuebingen.mpg.de/raetsch/suppl/qpalma/qpalma-final.pdf" target="_blank">available</a>.<br /> * <a href="http://biogibbs.stanford.edu/%7Ejiangh/rsat/" target="_blank">RSAT</a> - RSAT: RNA-Seq Analysis Tools. RNASAT is developed and maintained by Hui Jiang at Stanford University.<br /> * <a href="http://tophat.cbcb.umd.edu/" target="_blank">TopHat</a> - TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. TopHat is a collaborative effort between the University of Maryland and the University of California, Berkeley</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/24264/cancer-research-database</guid>
	<pubDate>Tue, 01 Sep 2015 17:36:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/24264/cancer-research-database</link>
	<title><![CDATA[Cancer research database]]></title>
	<description><![CDATA[<p>Researchers in Andhra Pradesh have developed a database to identify genes that are common in tumours to provide their colleagues with easy access to insights into the genetic alterations in cancer.<br /> &nbsp;<br /> The database, hosted at the Sri Venkateswara University (SVU) in Tirupati, will integrate information on cancer genes and markers with experimental data.<br /> &nbsp;<br /> The <a href="http://cgmd.in/" target="_blank">Cancer Gene Markers Database</a> (CGMD) is meant to help scientists better understand tumour genes and markers at a molecular level by combining data with literature on treatment regimens and recent advances in cancer therapy.<br /> <br /> The database is free to access, and already includes 309 genes and 206 markers that correspond to 40 different human cancers. Accompanying literature comes from databases such as the United States&rsquo; <a href="http://www.ncbi.nlm.nih.gov/" target="_blank">National Center for Biotechnology Information</a> and the <a href="http://www.genome.jp/kegg/" target="_blank">Kyoto Encyclopedia of Genes and Genomes</a>. It also includes experimental data from <a href="http://www.ncbi.nlm.nih.gov/pubmed" target="_blank">PubMed</a>.<br /> <br /> In a paper <a href="http://dx.doi.org/10.1038/srep12035" target="_blank">published</a> last month in <em>Nature Scientific Reports</em>, the researchers from SVU&rsquo;s department of animal biotechnology describe the need for a database of different genes and markers along with their molecular characteristics and pathway associations.</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26303/maker</guid>
	<pubDate>Sun, 07 Feb 2016 15:59:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26303/maker</link>
	<title><![CDATA[MAKER]]></title>
	<description><![CDATA[<p>MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values.</p>
<p>More at http://www.yandell-lab.org/software/maker.html</p><p>Address of the bookmark: <a href="http://www.yandell-lab.org/software/maker.html" rel="nofollow">http://www.yandell-lab.org/software/maker.html</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/26322/liftover</guid>
	<pubDate>Mon, 08 Feb 2016 15:45:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/26322/liftover</link>
	<title><![CDATA[liftover]]></title>
	<description><![CDATA[<p><span>Convenient conversions between genome assemblies.&nbsp;The liftover package makes it easy to remap genomic coordinates to a different genome assembly. </span></p>
<p><span>More at https://github.com/aaronwolen/liftover<br></span></p>
<p><span>https://www.bioconductor.org/help/workflows/liftOver/</span></p><p>Address of the bookmark: <a href="https://github.com/aaronwolen/liftover" rel="nofollow">https://github.com/aaronwolen/liftover</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/21443/a-guide-for-complete-r-beginners-getting-data-into-r</guid>
	<pubDate>Tue, 24 Feb 2015 20:15:08 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/21443/a-guide-for-complete-r-beginners-getting-data-into-r</link>
	<title><![CDATA[A guide for complete R beginners :- Getting data into R]]></title>
	<description><![CDATA[<p>For a beginner this can be the hardest part; it is also the most important to get right.</p><p>It is possible to create a vector by typing data directly into R using the combine function &lsquo;c&rsquo;</p><blockquote><p><strong>x &lt;- c(1, 2, 3, 4, 5)</strong></p></blockquote><p>same as</p><blockquote><p><strong>x &lt;- 1:5</strong></p></blockquote><p>creates the vector x with the numbers from 1 to 5.</p><p>You can see what is in an object at any time by typing its name:</p><blockquote><p><strong>x</strong></p></blockquote><p>will produce the output<strong> &lsquo;[1] 1 2 3 4 5&rsquo;</strong></p><p>Note that text values, such as names, need to be quoted</p><blockquote><p><strong>daysofweek &lt;- c('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')</strong></p></blockquote><p>Usually, however, you want to read your data from a file. We have touched on the &lsquo;read.table&rsquo; function already.</p><blockquote><p><strong>mydata </strong></p></blockquote><p>Now <strong>mydata</strong> is a data frame made up of multiple vectors (columns).</p><p>Each vector can be identified by its default name</p><p># typing any of these prints that column to the screen</p><blockquote><p><strong>mydata$V1 mydata$V2 mydata$V3 </strong></p></blockquote><p>By default the function assumes certain things about the file</p><ul>
<li>The file is a plain text file (there are functions to read Excel files: <em>not covered here</em>)</li>
<li>columns are separated by any number of tabs or spaces</li>
<li>there is the same number of data points in each column</li>
<li>there is no header row (labels for the columns)</li>
<li>there is no column with names for the rows** [I&rsquo;ll explain].</li>
</ul><p><span style="text-decoration: underline;">If any of these are false, we need to tell the function</span></p><p>If it has a header row</p><blockquote><p><strong>mydata <em>header=T also works</em></strong></p></blockquote><p>Note that the different arguments of a function are separated by commas.</p><p>If the header row has one column fewer than the data, R assumes that the 1<sup>st</sup> column of data after the header holds the row names.</p><p>Now the vectors (columns) are identified by their names</p><p># typing any of these prints that column to the screen</p><blockquote><p><strong>mydata$A mydata$B mydata$C </strong></p></blockquote><p># Summary of the whole data frame</p><blockquote><p><strong>summary(mydata)</strong></p></blockquote><p># Summary information for column A</p><blockquote><p><strong>summary(mydata$A) </strong></p></blockquote><p>We can avoid typing the data frame name each time by attaching it</p><blockquote><p><strong>attach(mydata)</strong></p></blockquote><p># summary of column B, as &lsquo;mydata&rsquo; is attached</p><blockquote><p><strong>summary(B)</strong></p></blockquote><p><span style="text-decoration: underline;">Two other important options for </span><em><span style="text-decoration: underline;">read.table</span></em></p><p>If the file is separated only by tabs and has a header</p><blockquote><p><strong>mydata </strong></p></blockquote><p>This is really useful if the contents of some columns include spaces, so that R does not misread the columns.
However, if the columns are of uneven length, it will tell you.</p><p>If you know that the file has uneven columns</p><blockquote><p><strong>mydata </strong></p></blockquote><p>This causes R to fill empty spaces in a column with &lsquo;NA&rsquo;.</p><p>The last two examples will still work with our file and give the same result as with only header=T</p><p><span style="text-decoration: underline;">Graphs</span></p><p>To get an idea of what R is capable of, type</p><blockquote><p><strong>demo(graphics)</strong></p></blockquote><p>This steps through the examples, and the code is printed to the screen.</p><p>We will work with simpler examples that have immediate use to biologists.</p><p>Remember, to get more information about the options of a function, type &lsquo;?function&rsquo;</p><p><span style="text-decoration: underline;">Histogram of A</span></p><blockquote><p><strong>hist(mydata$A)</strong></p></blockquote><p>If there were more data, we could increase the number of bars with the option breaks=50 (or another relevant number).</p><blockquote><p><strong>boxplot(mydata)</strong></p></blockquote><p>We can get rid of the need to type the data frame name each time by using the <strong>attach</strong> function</p><p># if not already done so</p><blockquote><p><strong>attach(mydata) </strong></p><p><strong>boxplot(mydata$A, mydata$B, names=c("Value A", "Value B"), ylab="Count of Something")</strong></p></blockquote><p>same as</p><blockquote><p><strong>boxplot(A, B, names=c("Value A", "Value B"), ylab="Count of Something")</strong></p></blockquote><p><span style="text-decoration: underline;">Scatter plot</span></p><p># if not already done so</p><blockquote><p><strong>attach(mydata) </strong></p><p><strong>plot(A,B) # or plot(mydata$A, mydata$B)</strong></p></blockquote><p><strong><span style="text-decoration: underline;">SAVING an 
image</span></strong></p><p>Windows users (Rgui): right-click on the image and select the option you want.</p><p><span style="text-decoration: underline;">These instructions work for everyone.</span></p><p>You need to create a new device of the type of file you need, then send the plot to that device</p><p>To save as a PNG file (easy to load into the likes of PowerPoint, and also great for web applications):</p><blockquote><p><strong>png('filename') </strong></p><p><strong>boxplot(A, B, names=c("Value A", "Value B"), ylab="Count of Something")</strong></p></blockquote><p>or to save as a PDF</p><blockquote><p><strong>pdf('filename') </strong></p><p><strong>boxplot(A, B, names=c("Value A", "Value B"), ylab="Count of Something")</strong></p></blockquote><p><span style="text-decoration: underline;">Note</span></p><ul>
<li>Nothing will appear on screen; the output is going to the file</li>
<li>Also, the file may not be saved immediately, but it will be once the device is turned off (or R is quit).</li>
</ul><p>To quit R, type</p><p><strong>q() # </strong>If you save your session, next time you start R, you will have your data preloaded.</p><p>Or if you want to remain in R</p><blockquote><pre><strong>dev.off() #</strong>turns off the png (or pdf etc.) device, which forces the data to be saved</pre></blockquote>]]></description>
	<dc:creator>Archana Malhotra</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/22432/walk-ins-for-jrf-ans-srf-post-in-assam-agricultural-university</guid>
  <pubDate>Thu, 28 May 2015 19:16:47 -0500</pubDate>
  <link></link>
  <title><![CDATA[Walk-ins for JRF and SRF posts in Assam Agricultural University]]></title>
  <description><![CDATA[
<p>Distributed Information Centre<br />Department of Agricultural Biotechnology<br />Assam Agricultural University<br />Jorhat – 785 013<br />Walk-in interview</p>

<p>(ABT/DIC/01/2014 (No. AAU/ABT/DIT/Advt. 01/2015/111 Dtd. 19-05-2015)</p>

<p>Walk-in interviews for the following positions will be held on 6th June, 2015 at 10.00 AM in the Office Chamber of the undersigned. Candidates may appear for the interview with bio-data, reprints / publication / thesis etc. and passport size photographs, original and attested copies of all testimonials etc., which must be presented at the time of interview. The applicants may submit their resume in advance to mkmodi@aau.ac.in.</p>

<p>Research Associate</p>

<p>    Ph.D. in Biotechnology/ Bioinformatics. Or</p>

<p>    Master's degree in Biotechnology/Bioinformatics with minimum 3 (three) years' research experience</p>

<p>    Desirable : Experience in Bioinformatics as evidenced from published research</p>

<p>    Rs 36,000+HRA for the 1st two years and 38,000+HRA for the 3rd year.</p>

<p>Senior Research Fellow</p>

<p>    Master's degree in Biotechnology/Bioinformatics with 2 (two) years' experience in Bioinformatics as evidenced from course work/diploma/published research</p>

<p>    Rs 28,000+HRA for NET qualified candidate/Professional degree holder</p>

<p>    Rs 18,000+HRA for non-NET qualified general degree holder</p>

<p>Junior Research Fellow</p>

<p>    Master Degree in Biotechnology/ Bioinformatics/Computer Science/Computer Application</p>

<p>    Desirable: Experience in Bioinformatics as evident from Course work/ Diploma/Published research</p>

<p>    Rs 25,000+HRA for NET qualified candidate/Professional degree holder</p>

<p>    Rs 16,000+HRA for non-NET qualified general degree holder</p>

<p>Note: Terms and conditions will be as per the DBT, Govt of India guidelines.</p>

<p>Advertisement: http://14.139.222.145/classified/biotech46.html</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/23160/opencpu</guid>
	<pubDate>Sun, 05 Jul 2015 18:34:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/23160/opencpu</link>
	<title><![CDATA[OpenCPU]]></title>
	<description><![CDATA[<p>OpenCPU is a system for embedded scientific computing and reproducible research. The OpenCPU server provides a reliable and interoperable <a href="https://www.opencpu.org/api.html">HTTP API</a> for data analysis based on R.</p><p>The OpenCPU <a href="https://www.opencpu.org/jslib.html">JavaScript client library</a> provides the most seamless integration of R and JavaScript available today.</p><p>OpenCPU uses standard R packaging to develop, ship and deploy web applications. Several open source <a href="https://www.opencpu.org/apps.html">example apps</a> are available from GitHub.</p><p>Installing your own OpenCPU server is <a href="https://www.opencpu.org/download.html">super easy</a> and only takes a few minutes.</p><p>More at https://www.opencpu.org/</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>