<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/27971?offset=120</link>
	<atom:link href="https://bioinformaticsonline.com/related/27971?offset=120" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32481/sspace</guid>
	<pubDate>Fri, 05 May 2017 05:42:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32481/sspace</link>
	<title><![CDATA[SSPACE]]></title>
	<description><![CDATA[<p>SSPACE standard is a stand-alone program for scaffolding pre-assembled contigs using NGS paired-read data. It is unique in offering the possibility to manually control the scaffolding process. By using the distance information of paired-end and/or matepair data, SSPACE is able to assess the order, distance and orientation of your contigs and combine them into scaffolds. Currently we offer this as a command-line tool in Perl. The input data is given by pre-assembled contig sequences (FASTA) and NGS paired-read data (Illumina/454/Solid FASTA or FASTQ). The final scaffolds are provided in FASTA format.</p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE" rel="nofollow">https://www.baseclear.com/genomics/bioinformatics/basetools/SSPACE</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/1161/genomics-for-bioinformatician</guid>
	<pubDate>Sat, 20 Jul 2013 07:03:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/1161/genomics-for-bioinformatician</link>
	<title><![CDATA[Genomics for Bioinformatician]]></title>
	<description><![CDATA[<p>Genomics is the study of the genomes of organisms. The field includes intensive efforts to determine the entire DNA sequence of organisms and fine-scale genetic mapping efforts. The field also includes studies of intragenomic phenomena such as heterosis, epistasis, pleiotropy and other interactions between loci and alleles within the genome. In contrast, the investigation of the roles and functions of single genes is a primary focus of molecular biology or genetics and is a common topic of modern medical and biological research. Research of single genes does not fall into the definition of genomics unless the aim of this genetic, pathway, and functional information analysis is to elucidate its effect on, place in, and response to the entire genome's networks.<br /><br />Genomics was established by Fred Sanger when he first sequenced the complete genomes of a virus and a mitochondrion. His group established techniques of sequencing, genome mapping, data storage, and bioinformatic analyses in the 1970-1980s. A major branch of genomics is still concerned with sequencing the genomes of various organisms, but the knowledge of full genomes has created the possibility for the field of functional genomics, mainly concerned with patterns of gene expression during various conditions. The most important tools here are microarrays and bioinformatics. Study of the full set of proteins in a cell type or tissue, and the changes during various conditions, is called proteomics. A related concept is materiomics, which is defined as the study of the material properties of biological materials (e.g. hierarchical protein structures and materials, mineralized biological tissues, etc.) and their effect on the macroscopic function and failure in their biological context, linking processes, structure and properties at multiple scales through a materials science approach. The actual term 'genomics' is thought to have been coined by Dr. 
Tom Roderick, a geneticist at the Jackson Laboratory (Bar Harbor, ME) over beer at a meeting held in Maryland on the mapping of the human genome in 1986.<br /><br />The outcome of almost two years of intense discussions with literally hundreds of scientists and members of the public, has three major areas of focus: Genomics to Biology, Genomics to Health, and Genomics to Society.<br /><br /><strong><em>Genomics to Biology:</em></strong>&nbsp;<br />The human genome sequence provides foundational information that now will allow development of a comprehensive catalog of all of the genome's components, determination of the function of all human genes, and deciphering of how genes and proteins work together in pathways and networks.<br /><br /><strong><em>Genomics to Health:<br /></em></strong>Completion of the human genome sequence offers a unique opportunity to understand the role of genetic factors in health and disease, and to apply that understanding rapidly to prevention, diagnosis, and treatment. This opportunity will be realized through such genomics-based approaches as identification of genes and pathways and determining how they interact with environmental factors in health and disease, more precise prediction of disease susceptibility and drug response, early detection of illness, and development of entirely new therapeutic approaches.<br /><br /><strong><em>Genomics to Society:</em>&nbsp;<br /></strong>Just as the HGP has spawned new areas of research in basic biology and in health, it has created new opportunities in exploring the ethical, legal, and social implications (ELSI) of such work. These include defining policy options regarding the use of genomic information in both medical and non-medical settings and analysis of the impact of genomics on such concepts as race, ethnicity, kinship, individual and group identity, health, disease, and "normality" for traits and behaviors.<br /><br />This vision for the future of genomics is not just about the NHGRI. 
It encompasses the whole field of genomics, including the work of all the other Institutes and Centers at the NIH and of a number of other federal agencies. All of the NIH Institutes are already taking full advantage of the sequence and will apply its data to the better understanding of both rare and common diseases, almost all of which have a genetic component. A recent example of the way that the HGP and the knowledge and new technologies it has spawned are already facilitating science is the extremely rapid sequencing by groups in Canada and at the Centers for Disease Control and Prevention (CDC) in Atlanta of the genome of the virus that causes Severe Acute Respiratory Syndrome (SARS). The sequencing of the SARS virus genome provides insight into this new and deadly disease at a speed never before possible in science. In turn, this should lead to the rapid development of diagnostic tests and, in time, vaccines and effective treatments.<br /><br /><strong>Links to additional material available on the Net</strong></p><p><a href="http://pevsnerlab.kennedykrieger.org/bioinformatics/bioinf10_genomes.htm">Genomes and genomics:</a></p><p><a href="http://www.123genomics.com/learning.html">Bioinformatics and Genomics:</a></p><p><a href="http://www.ebi.ac.uk/pdbe/docs/roadshow_tutorial/strgenomics/tutorial.html">Structural genomics tutorial:</a></p><p><a href="http://www.hgu.mrc.ac.uk/Users/Philippe.Gautier/tutorial/index.html">Comparative Genomics Tutorial:</a></p><p><a href="http://www.scfbio-iitd.res.in/tutorial/genomics.html">GENOME TUTORIAL:</a></p><p><a href="http://genomebiology.com/content/pdf/gb-2001-3-1-reviews2001.pdf">Tools and resources for identifying protein families, domains and motifs</a></p><p><a href="http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/tools.shtml">Bioinformatics Tools</a><a href="http://www.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/tools.shtml">&nbsp;<br />Tips, Tutorials, and Terminology for Using 
Selected Resources in Genome Database Guide:</a></p><p><a href="http://www.doe-mbi.ucla.edu/Reprints/R31%20Strong%20A%20Web-based%20Comparative%20Genomics%20tutorial%20Microbiology%20Eduction%202004.pdf">A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes:</a></p><p><a href="http://www.genome.gov/27530225">Free Online Tutorials Teach Anyone How to Use Genome Databases:</a></p><p><a href="http://mkweb.bcgsc.ca/circos/?tutorials">Circos to create concise, explanatory, unique and print-ready visualizations of your data:</a></p><p><a href="http://www.igd.cornell.edu/Comparative%20Genomics/Comparative%20Genomics%20Proj.html">Genomics and Comparative Genomics</a><a href="http://www.igd.cornell.edu/Comparative%20Genomics/Comparative%20Genomics%20Proj.html">&nbsp;Learning Module:</a></p><p><a href="http://psb.stanford.edu/psb10/conference-materials/tutorials/compgen-notes.pdf">Computational Challenges in Comparative Genomics</a></p><p><a href="http://psb.stanford.edu/psb10/conference-materials/tutorials/compgen-notes.pdf">A Tutorial:</a></p><p><a href="http://gramene.agrinome.org/tutorials/modules_tutorial.pdf">A Comparative Genomics Resource for Grains</a>:</p><p><a href="http://www.plantcell.org/cgi/content/full/21/12/3718">PLAZA: A Comparative Genomics Resource to Study Gene and Genome Evolution in Plants:</a></p><p><a href="http://en.wikipedia.org/wiki/VISTA_(comparative_genomics)">VISTA</a><a href="http://en.wikipedia.org/wiki/VISTA_(comparative_genomics)">:</a></p><p>Software for Genomics</p><ol>
<li><strong>Artemis</strong>&nbsp;Artemis is a free genome viewer and annotation tool that allows visualization of sequence features and the results of analyses within the context of the sequence, and its six-frame translation.</li>
<li><strong>Chromas&nbsp;</strong>Chromas displays and prints chromatogram files from ABI automated DNA sequencers, as well as Staden SCF files, which the analysis programs for ALF, Li-Cor and Visible Genetics OpenGene sequencers can create.</li>
<li><strong>Glimmer</strong>&nbsp;A system for finding genes in microbial DNA, especially the genomes of bacteria and archaea. Glimmer (Gene Locator and Interpolated Markov Modeler) uses interpolated Markov models (IMMs) to identify the coding regions and distinguish them from noncoding DN</li>
<li><strong>Glimmer</strong>&nbsp;HMM&nbsp;A fast and accurate gene finder based on a GHMM architecture, developed specifically for eukaryotes. It incorporates splice site models adapted from the GeneSplicer program and uses interpolated Markov models for evaluating the coding regions.</li>
<li><strong>Glimmer</strong>&nbsp;M&nbsp;A gene finder derived from Glimmer, but developed specifically for eukaryotes. It is based on a dynamic programming algorithm that considers all combinations of possible exons for inclusion in a gene model and chooses the best of these combinations. The d</li>
<li><strong>MUMmer</strong>&nbsp;MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form.</li>
<li><strong>pDRAW</strong>&nbsp;pDRAW32 is being developed as a free time hobby project. It is far from finished, but as it has reached a point where it could be helpful for many labs, it is now available to the scientific community.</li>
<li><strong>Sequin</strong>&nbsp;Sequin is a stand-alone software tool developed by the NCBI for submitting and updating entries to the GenBank, EMBL, or DDBJ sequence databases. It is capable of handling simple submissions that contain a single short mRNA sequence, and complex submissio</li>
<li><strong>Staden&nbsp;</strong>The Staden Package consists of a series of tools for DNA sequence preparation (pregap4), assembly (gap4), editing (gap4) and DNA/protein sequence analysis (spin).</li>
</ol><p>For more software @&nbsp;<a href="http://bioinformaticsonline.com/bookmarks/view/926/list-of-popular-bioinformatics-softwaretools">http://bioinformaticsonline.com/bookmarks/view/926/list-of-popular-bioinformatics-softwaretools</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32709/cabog-celera-assembler-with-best-overlap-graph</guid>
	<pubDate>Mon, 15 May 2017 05:04:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32709/cabog-celera-assembler-with-best-overlap-graph</link>
	<title><![CDATA[CABOG: Celera Assembler with Best Overlap Graph]]></title>
	<description><![CDATA[<p>CABOG (Celera Assembler with Best Overlap Graph) is scientific software for&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/24/24/2818.abstract">DNA research</a>. CABOG has been a critical component of many genome sequencing projects. CABOG operates on small genomes, such as bacterial ones, as well as large genomes, such as mammalian ones. CABOG is an extension of the Celera Assembler software that was originally developed at&nbsp;<a href="http://www.celera.com/">Celera</a>&nbsp;for the 2001 publication of the first draft human genome sequence. The software was released to the public domain in 2004. Its open source&nbsp;<a href="http://wgs-assembler.sf.net/">repository</a>&nbsp;on SourceForge is an internet resource for scientists around the world.&nbsp;</p>
<p>CABOG is one of many software programs called genome assemblers. These programs exist to overcome the fundamental limitation of all sequencing machines, namely, that they read out very few DNA letters at a time. These programs reconstruct genomes that are billions of letters long from the hundreds of letters per read that modern sequencers provide. What these programs do is often described as a scaled-up version of a family solving a jigsaw puzzle.</p>
<p>The CABOG software was the first to accomplish many scientific goals. It was the first to assemble the genome of a multicellular organism (<em>Drosophila melanogaster</em>, 2000). It was the first to assemble both parental haplotypes of one human genome (J. Craig Venter, 2007). It was the first to assemble environmental sequence from the oceans (Sargasso Sea in 2004 and Global Ocean Sampling in 2007). It was the first to combine reads from first-generation Sanger sequencing machines and second-generation pyrosequencing machines (marine microbes, 2006). Today, CABOG is one of the leading assembly programs for data sets that include paired-end data from the Roche 454 line of sequencing machines.</p><p>Address of the bookmark: <a href="http://www.jcvi.org/cms/research/projects/cabog/overview/" rel="nofollow">http://www.jcvi.org/cms/research/projects/cabog/overview/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33741/diya-a-bacterial-annotation-pipeline-for-any-genomics-lab</guid>
	<pubDate>Fri, 30 Jun 2017 08:48:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33741/diya-a-bacterial-annotation-pipeline-for-any-genomics-lab</link>
	<title><![CDATA[DIYA: a bacterial annotation pipeline for any genomics lab]]></title>
	<description><![CDATA[<p><span>DIY Genomics is an open source bioinformatics consortium intended to bring a collection of tools and libraries into the hands of small scale genomics labs for the process of sequence assembly and annotation. Projects include DIYA, MGAP, CRISPR, and DIYGV</span></p>
<p><span>http://gmod.org/wiki/Diya</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/diyg/" rel="nofollow">https://sourceforge.net/projects/diyg/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/34368/srbioinformatics-analyst-ngs-at-ocimum</guid>
  <pubDate>Fri, 17 Nov 2017 07:50:44 -0600</pubDate>
  <link></link>
  <title><![CDATA[Sr.Bioinformatics Analyst (NGS) at Ocimum]]></title>
  <description><![CDATA[
<p>JOB FUNCTION: Bio Tech/R&amp;D/Scientist<br />INDUSTRY: Biotechnology/Pharmaceutical/Medicine<br />SPECIALIZATION: Basic Research, Bio-Statistician, Clinical Research<br />QUALIFICATION:<br />Any Post Graduate<br />BA (Arts), B.Com. (Commerce), BE/ B.Tech (Engineering), B.Pharm. (Pharmacy), B.Sc. (Science), BL/LLB, BDS (Dental Surgery), B.Ed. (Education), BHM (Hotel Management), BBA/ BBM/ BBS, B.Arch. (Architecture), BCA (Computer Application), Diploma-Other Diploma, B.Plan. (Planning), BGL, B.V.Sc. (Veterinary Science), Other School/ Graduation, BHMS (Homeopathy), BAMS (Ayurveda)<br />Job Description:</p>

<p>1. Must have a basic understanding of molecular biology and genomics.<br />2. Experience in application development, or expertise in programming in Perl or Python.<br />3. Experience in statistical programming using R/Bioconductor/Matlab.<br />4. Strong grounding in statistical and mathematical modelling.<br />5. Experience in designing and developing bioinformatics pipelines.<br />6. Must have at least 2 years of hands-on experience in NGS data analysis such as RNA-Seq, Exome-Seq, and ChIP-Seq, and in downstream analysis.<br />7. Knowledge of WGS, WES, targeted re-sequencing, GWAS, and population genomics is preferred.<br />8. Must have experience working with open-source software/frameworks and commercial software for NGS data analysis and reporting.<br />9. Should be capable of handling big data and guiding team members on multiple projects simultaneously.<br />10. Should have experience coordinating with different groups of clinical research scientists for various project requirements.<br />11. Ability to work in a team as well as independently with minimal support.</p>

<p>More at http://www3.ocimumbio.com/</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/7674/useful-publications-and-websites-for-deep-sequencing-data-analysis</guid>
	<pubDate>Sun, 29 Dec 2013 22:30:45 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/7674/useful-publications-and-websites-for-deep-sequencing-data-analysis</link>
	<title><![CDATA[Useful Publications and Websites for Deep Sequencing Data Analysis]]></title>
	<description><![CDATA[<h3>Global overview papers</h3><p>Next generation quantitative genetics in plants. Jim&eacute;nez-G&oacute;mez, Frontiers in Plant Science 2:77, 2011 <span style="text-decoration: underline;"><a href="http://www.frontiersin.org/Plant_Physiology/10.3389/fpls.2011.00077/full">Full Text</a> </span><em>[equally relevant to animal and microbial systems]</em></p><p>Sense from sequence reads: methods for alignment and assembly. Flicek &amp; Birney, Nat Methods 6(11 Suppl):S6-S12, 2009. <a href="http://www.nature.com/nmeth/journal/v6/n11s/full/nmeth.1376.html"><span style="text-decoration: underline;">Full Text</span></a></p><h3>Library construction and experimental design</h3><p>Statistical design and analysis of RNA sequencing data. Auer &amp; Doerge, Genetics 185(2):405-16, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881125"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Biases in Illumina transcriptome sequencing caused by random hexamer priming. Hansen et al., Nucleic Acids Res. 38(12): e131, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2896536"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries. Aird et al, Genome Biology 12:R18, 2011 <a href="http://genomebiology.com/2011/12/2/R18"><span style="text-decoration: underline;">Full Text</span></a></p><p>Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of GC-biased genomes. Kozarewa et al, Nature Methods 6(4):291-5, 2009 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2664327/"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Rohland &amp; Reich, Genome Research 22(5): 939&ndash;946. 
<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3337438/"><span style="text-decoration: underline;">PubMedCentral</span></a></p><h3>Data formats, data management, and alignment software tools<span style="text-decoration: underline;"> </span></h3><p>The Sequence Alignment/Map format and SAMtools. Li et al, Bioinformatics 25(16):2078-9, 2009 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723002"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>SAM format specification <a href="http://samtools.sourceforge.net/SAM1.pdf"><span style="text-decoration: underline;">file</span></a></p><p>Efficient storage of high throughput sequencing data using reference-based compression. Fritz et al, Genome Res 21(5):734-40, 2011. <a href="http://genome.cshlp.org/content/21/5/734.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Compression of DNA sequence reads in FASTQ format. Deorowicz &amp; Grabowski, Bioinformatics 27(6):860-2, 2011. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21252073"><span style="text-decoration: underline;">PubMed</span></a></p><p>Fast and accurate short read alignment with Burrows-Wheeler transform. Li &amp; Durbin, Bioinformatics 25(14):1754-60, 2009. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2705234"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Improving SNP discovery by base alignment quality. Li H, Bioinformatics 27(8):1157-8, 2011. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21320865"><span style="text-decoration: underline;">PubMed</span></a></p><p>BEDTools: a flexible suite of utilities for comparing genomic features. Quinlan and Hall, Bioinformatics 26:841-842, 2010. 
<a href="http://bioinformatics.oxfordjournals.org/content/26/6/841.full.pdf+html"><span style="text-decoration: underline;">Publisher Website</span></a></p><h3>Data quality assessment, filtering, and correction</h3><p>SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. Cox et al, BMC Bioinformatics 11:485, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2956736"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>TileQC: a system for tile-based quality control of Solexa data. Dolan &amp; Denver, BMC Bioinformatics 9:250, 2008 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2443380"><span style="text-decoration: underline;">PubMedCentral</span></a> <em>[requires a reference sequence]</em></p><p>Quake: quality-aware detection and correction of sequencing errors. Kelley et al, Genome Biol 11(11):R116, 2010. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21114842"> <span style="text-decoration: underline;">PubMed</span></a></p><p>FastQC: a quality control tool for high-throughput sequence data. <a href="http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/"><span style="text-decoration: underline;">Home Page</span></a></p><p>FASTX-toolkit: FASTQ/A short-reads pre-processing tools <a href="http://hannonlab.cshl.edu/fastx_toolkit/"><span style="text-decoration: underline;">Home Page</span></a></p><p>Reference-free validation of short read data. Schr&ouml;der et al, PLoS One 5(9):e12681, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2943903"> <span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Correction of sequencing errors in a mixed set of reads. Salmela, Bioinformatics 26(10):1284, 2010. <a href="http://bioinformatics.oxfordjournals.org/content/26/10/1284.long"><span style="text-decoration: underline;">Full Text</span></a> <em>[includes error correction of SOLiD reads in colorspace]</em></p><p>Repeat-aware modeling and correction of short read errors. 
Yang et al, BMC Bioinformatics 12(Supp1):S52, 2011 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044310"> <span style="text-decoration: underline;">PubMedCentral</span></a> <em>[requires a reference sequence]</em></p><p>HiTEC: accurate error correction in high-throughput sequencing data. Ilie et al, Bioinformatics 27(3):295, 2011 <a href="http://bioinformatics.oxfordjournals.org/content/27/3/295.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Error correction of high-throughput sequencing datasets with non-uniform coverage. Medvedev et al., Bioinformatics 27(13):i137-41, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117386"><span style="text-decoration: underline;">PubMedCentral</span></a></p><h3>De novo assembly<span style="text-decoration: underline;"> </span></h3><p>Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Zerbino &amp; Birney, Genome Res 18(5):821-9, 2008. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2336801"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Assembly of large genomes using second-generation sequencing. Schatz et al, Genome Res 20(9):1165-73, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Gnerre et al, PNAS 108(4): 1513-18, 2011 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3029755"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Genome assembly has a major impact on gene content: a comparison of annotation in two <em>Bos taurus </em> assemblies. Florea&nbsp; et al., PLoS One 6(6):e21400, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3120881/"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. 
Carver et al, Bioinformatics 28(4):464 - 469, 2012 <span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278759/">PubMedCentral</a></span></p><p>Efficient de novo assembly of large genomes using compressed data structures. Simpson &amp; Durbin, Genome Research 22:549-556, 2012 <span style="text-decoration: underline;"><a href="http://genome.cshlp.org/content/22/3/549.full">Full Text</a></span> <em>[Describes the String Graph Assembler (SGA), which assembled a human genome in less than 6 days using 54 Gb of RAM and a 123-processor compute cluster for calculation of an FM-index of the 1.2 billion reads]</em></p><p>Readjoiner: a fast and memory efficient string graph-based sequence assembler. Gonnella &amp; Kurtz, BMC Bioinformatics 13: 82, 2012 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3507659"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Assemblathon 1: A competitive assessment of de novo short read assembly methods. Earl et al, Genome Research 21:2224-2241, 2011 <span style="text-decoration: underline;"><a href="http://genome.cshlp.org/content/early/2011/09/16/gr.126599.111.full.pdf+html">Full Text</a></span></p><h3>Chromatin immunoprecipitation analysis: ChIP-seq</h3><p>ChIP-seq: advantages and challenges of a maturing technology. Park, Nat Rev Genet. 10:669-80, 2009 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3191340/"><span style="text-decoration: underline;">PubMed</span></a></p><p>ChIP-seq and Beyond: new and improved methodologies to detect and characterize protein-DNA interactions. Furey, Nat Rev Genet 13: 840&ndash;852, 2012 <a href="http://www.nature.com/nrg/journal/v13/n12/full/nrg3306.html"> <span style="text-decoration: underline;">Publisher Web Site</span></a></p><p>MuMoD: a Bayesian approach to detect multiple modes of protein&ndash;DNA binding from genome-wide ChIP data. 
Narlikar, Nucleic Acids Res 41:21&ndash;32, 2013 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3592440/"><span style="text-decoration: underline;">PubMed</span></a></p><h3>Transcriptome analysis</h3><h3>Assembly and comparison to genome</h3><p>Full-length transcriptome assembly from RNA-Seq data without a reference genome. Grabherr et al, Nature Biotechnology 29:644 - 652, 2011. <a href="http://www.ncbi.nlm.nih.gov/pubmed/21572440"><span style="text-decoration: underline;">PubMed</span></a> <em>[The software is called <a href="http://trinityrnaseq.sourceforge.net/"><span style="text-decoration: underline;">Trinity</span></a>, and is available on Sourceforge.]</em></p><p>Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Peng et al, Nature Biotechnology 30:253 - 260, 2012. <span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pubmed/22327324">PubMed</a></span> <em>[Several comments on this paper question whether the reported differences are in fact evidence of editing or are simply sequencing errors - the authors stand by their conclusions, but the controversy demonstrates the importance of robust data analysis methods.] </em></p><p>Optimization of de novo transcriptome assembly from next-generation sequencing data. Surget-Groba &amp; Montoya-Burgos, Genome Res 20(10):1432-40, 2010. <a href="http://genome.cshlp.org/content/20/10/1432.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Rnnotator: an automated <em>de novo</em> transcriptome assembly pipeline from stranded RNA-Seq reads. Martin et al, BMC Genomics 11:663, 2010 <a href="http://www.biomedcentral.com/1471-2164/11/663"><span style="text-decoration: underline;">Full Text</span></a></p><p><em>De novo</em> assembly and analysis of RNA-seq data. 
Robertson et al, Nature Methods 7:909-912, 2010 <a href="http://www.nature.com/nmeth/journal/v7/n11/full/nmeth.1517.html"><span style="text-decoration: underline;">Full Text</span></a> <em>[describes Trans-ABySS, a pipeline to use the ABySS parallel assembler for de novo transcriptome analysis]</em></p><h3>Differential expression analysis</h3><p>R-SAP: a multi-threading computational pipeline for the characterization of high-throughput RNA-sequencing data. Mittal &amp; McDonald, Nucleic Acids Res, 2012 <span style="text-decoration: underline;"><a href="http://nar.oxfordjournals.org/content/early/2012/01/28/nar.gks047.long">Full Text</a></span></p><p>Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Mercer et al, Nature Biotechnology 30:99 - 104, 2012 <span style="text-decoration: underline;"><a href="http://www.nature.com/nbt/journal/v30/n1/full/nbt.2024.html"> Publisher Website</a></span></p><p>Differential gene and transcript expression analysis of RNA-Seq experiments with TopHat and Cufflinks. Trapnell et al, Nature Protocols 7:562 - 578, 2012 <span style="text-decoration: underline;"><a href="http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html"> Publisher Website</a></span></p><p>Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling. Łabaj et al, Bioinformatics 27:i383 - i391, 2011 <span style="text-decoration: underline;"><a href="http://bioinformatics.oxfordjournals.org/content/27/13/i383.full.pdf+html"> Full Text</a></span></p><p>Improving RNA-Seq expression estimates by correcting for fragment bias. Roberts et al, Genome Biol 12:R22, 2011 <span style="text-decoration: underline;"><a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3129672/">PubMed Central</a></span></p><p>Cloud-scale RNA-sequencing differential expression analysis with Myrna. 
Langmead et al, Genome Biol 11:R83, 2010 <a href="http://genomebiology.com/2010/11/8/R83"><span style="text-decoration: underline;">Full Text</span></a></p><p>From RNA-seq reads to differential expression results. Oshlack et al, Genome Biol 11(12):220, 2010 <a href="http://genomebiology.com/content/11/12/220"><span style="text-decoration: underline;">Full Text</span></a></p><p>DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Wang et al., Bioinformatics 26(1):136-8, 2010 <a href="http://www.ncbi.nlm.nih.gov/pubmed/19855105"><span style="text-decoration: underline;"> PubMed</span></a></p><p>DEseq: Differential expression analysis for sequence count data. Anders and Huber, Genome Biology 11:R106, 2010 <a href="http://genomebiology.com/2010/11/10/R106"><span style="text-decoration: underline;">Full Text</span></a></p><p>edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Robinson et al., Bioinformatics 26(1):139-40, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796818"> <span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Two-stage Poisson model for testing RNA-seq data. Auer and Doerge, SAGMB 10(1), article 26 <a href="http://www.bepress.com/sagmb/vol10/iss1/art26/"><span style="text-decoration: underline;">Full Text</span></a></p><p>Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments. McCormick et al., Silence 2(1):2, 2011 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3055805"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>RNA-Seq gene expression estimation with read mapping uncertainty. 
Li et al, Bioinformatics 26:493-500, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820677">PubMedCentral</a> <em>[describes the RSEM software package]</em></p><h3>Comparing genomes and assemblies; variant detection</h3><p>Versatile and open software for comparing large genomes. Kurtz et al, Genome Biol 5(2):R12, 2004. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC395750"><span style="text-decoration: underline;">PubMedCentral</span></a> <em>[describes the MUMmer software for full-genome alignment &amp; comparisons]</em></p><p>Searching for SNPs with cloud computing. Langmead et al, Genome Biol 10(11):R134, 2009 <a href="http://genomebiology.com/content/10/11/R134"><span style="text-decoration: underline;">Full Text</span></a></p><p>Calling SNPs without a reference sequence. Ratan et al, BMC Bioinformatics 11:130, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851604"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Microindel detection in short-read sequence data. Krawitz et al, Bioinformatics 26(6):722-9, 2010. <a href="http://bioinformatics.oxfordjournals.org/content/26/6/722.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>vipR: variant identification in pooled DNA using R. Altmann et al., Bioinformatics 27:i77-i84, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3117388"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Geoseq: a tool for dissecting deep-sequencing datasets. Gurtowski et al, BMC Bioinformatics 11:506, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2972303/"><span style="text-decoration: underline;">PubMedCentral</span></a> <em>[Geoseq is a web service that allows searching deep sequencing datasets with a reference sequence of a gene of interest]</em></p><p>Detecting and annotating genetic variations using the HugeSeq pipeline. 
Lam et al, Nature Biotechnology 30:226 - 229, 2012 <span style="text-decoration: underline;"><a href="http://www.nature.com/nbt/journal/v30/n3/full/nbt.2134.html">Publisher Website</a></span>, <span style="text-decoration: underline;"><a href="http://hugeseq.snyderlab.org/">Home Page</a></span></p><p>Genome-wide LORE1 retrotransposon mutagenesis and high-throughput insertion detection in <em>Lotus japonicus</em>. Urbański et al, Plant J 64:731-741, 2012. <span style="text-decoration: underline;"><a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1365-313X.2011.04827.x/abstract">Publisher Website</a></span> <em>[This paper describes a 2-dimensional pooling strategy with barcoding to allow use of Illumina sequencing to screen for retrotransposon insertion mutations, and includes a software package called FSTpoolit for analysis of the resulting sequence reads.]</em></p><h3>Genotyping by sequencing</h3><p>Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Davey et al., Nat Rev Genet 12(7):499-510, 2011 <a href="http://www.ncbi.nlm.nih.gov/pubmed/21681211"><span style="text-decoration: underline;">PubMed</span></a> <em>[A review of methods available at the time]</em></p><p>A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. Elshire et al., PLoS One 6(5):e19379, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3087801"><span style="text-decoration: underline;">Full Text</span></a></p><p>Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. Poland et al., PLoS One 7(2): e32253, 2012. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3289635/"><span style="text-decoration: underline;">Full Text</span></a></p><p>Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. Peterson et al, PLoS One 7(5):e37135, 2012. 
<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3365034/"><span style="text-decoration: underline;">Full Text</span></a></p><p>Imputation of unordered markers and the impact on genomic selection accuracy. Rutkowski et al, G3 3(3):427-39, 2013. <a href="http://www.g3journal.org/content/3/3/427.long"><span style="text-decoration: underline;">Full Text</span></a></p><p>Diversity Arrays Technology (DArT) and next-generation sequencing combined: genome-wide, high-throughput, highly informative genotyping for molecular breeding of <em>Eucalyptus</em>. Sansaloni et al., BMC Proceedings 5(Suppl 7):P54, 2011 <span style="text-decoration: underline;"><a href="http://www.biomedcentral.com/1753-6561/5/S7/P54">Full Text</a></span></p><p>High-throughput genotyping by whole-genome resequencing. Huang et al., Genome Res 19(6):1068-76, 2009. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2694477"><span style="text-decoration: underline;">Full Text</span></a></p><p>Multiplexed shotgun genotyping for rapid and efficient genetic mapping. Andolfatto et al. Genome Res 21(4):610-7, 2011. <a href="http://genome.cshlp.org/content/21/4/610.long"><span style="text-decoration: underline;">Full Text</span></a></p><h3>Restriction-site Associated DNA (RAD) markers</h3><p>Rapid SNP discovery and genetic mapping using sequenced RAD markers. Baird et al, PLoS One 3(10):e3376, 2008 <span style="text-decoration: underline;"><a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0003376">Full Text</a></span></p><p>Linkage mapping and comparative genomics using next-generation RAD sequencing of a non-model organism. Baxter et al., PLoS One 6(4):e19315, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3082572"><span style="text-decoration: underline;">Full Text</span></a></p><p>Genome evolution and meiotic maps by massively parallel DNA sequencing: spotted gar, an outgroup for the teleost genome duplication. Amores et al, Genetics 188(4):799-808, 2011. 
<a href="http://www.ncbi.nlm.nih.gov/pubmed/21828280"><span style="text-decoration: underline;"> PubMed</span></a></p><p>Construction and application for QTL analysis of a Restriction-site Associated DNA (RAD) linkage map in barley. Chutimanitsakun et al, BMC Genomics 12:4, 2011. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3023751"><span style="text-decoration: underline;">Full Text</span></a></p><p>RAD tag sequencing as a source of SNP markers in <em>Cynara cardunculus </em>L. Scaglione et al., BMC Genomics 13:3, 2012. <span style="text-decoration: underline;"><a href="http://www.biomedcentral.com/1471-2164/13/3">Full Text</a></span></p><p>Paired-end RAD-seq for de novo assembly and marker design without available reference. Willing et al., Bioinformatics 27(16):2187-93, 2011. <a href="http://bioinformatics.oxfordjournals.org/content/27/16/2187.long"><span style="text-decoration: underline;">Publisher Website</span></a></p><p>Local de novo assembly of RAD paired-end contigs using short sequencing reads. Etter et al., PLOS ONE 6(4): e18561, 2011. <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0018561"><span style="text-decoration: underline;">Full Text</span></a></p><p>Stacks: building and genotyping loci de novo from short-read sequences. Catchen et al., G3: Genes, Genomes, Genetics, 1:171-182, 2011. <span style="text-decoration: underline;"> Full Text</span>, <a href="http://creskolab.uoregon.edu/stacks/"><span style="text-decoration: underline;">Home Page</span></a></p><p>Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads. Chong et al, Bioinformatics 28(21):2732-7, 2012. 
<a href="http://bioinformatics.oxfordjournals.org/content/28/21/2732.long"> <span style="text-decoration: underline;">Publisher Website</span></a></p><p>UK RAD Sequencing Wiki page, with bibliography and RADTools software download <a href="https://www.wiki.ed.ac.uk/display/RADSequencing/Home"><span style="text-decoration: underline;">Home Page</span></a></p><h3>Workspace environments</h3><p><span style="text-decoration: underline;">Papers</span></p><p>Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Goecks et al, Genome Biol 11(8):R86, 2010 <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2945788"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>Galaxy Cloudman: delivering cloud compute clusters. Afgan et al, BMC Bioinformatics 11(Suppl 12):S4, 2010 <a href="http://www.biomedcentral.com/content/pdf/1471-2105-11-S12-S4.pdf"><span style="text-decoration: underline;">Full Text</span></a></p><p><a href="http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit"><span style="text-decoration: underline;">The Genome Analysis Toolkit</span></a>: a MapReduce framework for analyzing next-generation DNA sequencing data. McKenna et al, Genome Res 20(9):1297-303, 2010. <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928508"><span style="text-decoration: underline;">PubMedCentral</span></a></p><p>A framework for variation discovery and genotyping using next-generation DNA sequencing data. DePristo et al., Nat Genet 43(5):491-8, 2011. 
<a href="http://www.ncbi.nlm.nih.gov/pubmed/21478889"><span style="text-decoration: underline;"> PubMed</span></a></p><p><span style="text-decoration: underline;">Online resources</span></p><p>The <a href="http://cran.r-project.org/"><span style="text-decoration: underline;">R statistical computing</span></a> environment includes<a href="http://www.bioconductor.org/"><span style="text-decoration: underline;"> Bioconductor</span></a>, a specialized set of tools for analysis of microarray and high-throughput sequencing data. Introductory materials from on-line or short workshops are widely available online; examples are <span style="text-decoration: underline;"><a href="http://bioconductor.org/help/course-materials/2012/Evomics2012/Bioconductor-tutorial.pdf">Evomics2012 Bioconductor-tutorial.pdf</a></span>, and <a href="http://bcb.dfci.harvard.edu/%7Eaedin/courses/Bioconductor/"><span style="text-decoration: underline;">Intro to Bioconductor</span></a>. Materials from an advanced course on high-throughput genetic data analysis are at <span style="text-decoration: underline;"><a href="http://bioconductor.org/help/course-materials/2012/SeattleFeb2012/">Seattle 2012 materials</a></span>. Thomas Girke of UC-Riverside has written a very complete set of manuals describing the use of R and Bioconductor for analysis of genomic datasets, available at <a href="http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual">R and Bioconductor Manuals</a>. <br /> <a href="http://cran.r-project.org/manuals.html"><span style="text-decoration: underline;">Manuals</span></a> and contributed <a href="http://cran.r-project.org/other-docs.html"><span style="text-decoration: underline;">documentation</span></a> for R are available at the R-project.org website, and video tutorials are also available on Youtube; those posted by Tutorlol are brief, clear, and to the point. <br /> Materials from a series of mini-courses in R taught in 2010 at UCLA are available:</p><ul>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0141/10S-basicR.pdf">Intro to programming and graphics</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0143/S10_RProgII.pdf">Data manipulation and functions</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0185/Graphics_course.pdf">Graphics for exploratory data analysis</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0147/20100503_IntroStats.pdf">Introductory statistics</a></li>
<li><a href="http://scc.stat.ucla.edu/page_attachments/0000/0188/reg_R_1_09S_slides.pdf">Linear regression</a></li>
</ul><p><a href="http://a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/"> <span style="text-decoration: underline;">A Little Book of R for Bioinformatics</span></a> is an on-line resource with information and exercises to provide practice in bioinformatics analysis of DNA sequences and other biological data in R. <br /> Many books on specific topics in R programming are also available through Amazon or other vendors.</p><h3>Cloud computing resources</h3><p>The case for cloud computing in genome informatics. Lincoln Stein, Genome Biol. 11(5):207, 2010 <a href="http://www.ncbi.nlm.nih.gov/pubmed/20441614"><span style="text-decoration: underline;">Pubmed</span></a></p><p>Galaxy Cloudman: delivering cloud compute clusters. Afgan et al, BMC Bioinformatics <span style="text-decoration: underline;">11</span>(Suppl 12):S4, 2010 <a href="http://www.biomedcentral.com/1471-2105/11/S12/S4"><span style="text-decoration: underline;">Full Text</span></a></p><p><a href="http://cloudbiolinux.com/">CloudBioLinux</a> is an open-source project that provides a bioinformatics Linux system for cloud computing, pre-configured with a variety of software tools installed and ready to use.</p><p>A <a href="https://github.com/chapmanb/cloudbiolinux/blob/master/doc/intro/gettingStarted_CloudBioLinux.pdf?raw=true"><span style="text-decoration: underline;">tutorial</span></a> on getting started with CloudBioLinux on the Amazon Web Services Elastic Compute Cloud (EC2)</p><p><a href="http://userwww.service.emory.edu/%7Eeafgan/content/ppt/EnisAfgan_BOSC_2010.pdf"><span style="text-decoration: underline;">Deploying Galaxy on the Cloud</span></a>  slides from a presentation by Enis Afgan (Emory University) at the <br /> &nbsp;Bioinformatics Open Source Conference in Boston, July 2010</p><p>A <a href="http://screencast.g2.bx.psu.edu/cloud/"><span style="text-decoration: underline;"> screencast</span></a> that provides a step-by-step guide to starting a Galaxy cluster in the EC2 
environment</p><p>A <a href="https://bitbucket.org/galaxy/galaxy-central/wiki/cloud"><span style="text-decoration: underline;">webpage</span></a> that has the same information in text form, and is the basis for the screencast</p><p>The iPlant Collaborative, an NSF-funded project to create computational resources for plant biology research, provides access to cloud computing resources through <span style="text-decoration: underline;"><a href="http://www.iplantcollaborative.org/discover/atmosphere">Atmosphere</a></span></p><p>SeqWare Query Engine: storing and searching sequence data in the cloud. O'Connor et al, BMC Bioinformatics 11(Suppl 12):S2, 2010 <a href="http://www.biomedcentral.com/1471-2105/11/S12/S2"><span style="text-decoration: underline;">Full Text</span></a></p><p>An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics. Taylor, BMC Bioinformatics 11(Suppl 12):S1, 2010 <a href="http://www.biomedcentral.com/1471-2105/11/S12/S1"><span style="text-decoration: underline;">Full Text</span></a></p><h3>Links to Linux command-line tutorials and resources</h3><p>Tutorials for AWK, a powerful tool for handling data tables</p><ul>
<li>A set of <a href="http://people.bu.edu/scottm/AWK.NOTES"><span style="text-decoration: underline;">awk notes</span></a> from Boston University</li>
<li>Bruce Barnett's <a href="http://www.grymoire.com/Unix/Awk.html"><span style="text-decoration: underline;">awk tutorial</span></a></li>
<li>Greg Goebel's <a href="http://www.vectorsite.net/tsawk.html"><span style="text-decoration: underline;">awk tutorial</span></a></li>
<li><a href="http://teaching.software-carpentry.org/2013/01/16/1433/"><span style="text-decoration: underline;">Executing an awk command from R</span></a> to simplify data exploratory analysis, from Lex Nederbragt</li>
</ul><p>Tutorials for bash shell scripting</p><ul>
<li>A <a href="http://www.linuxconfig.org/bash-scripting-tutorial"><span style="text-decoration: underline;">tutorial</span></a> at linuxconfig.org</li>
<li>A <a href="http://www.hypexr.org/bash_tutorial.php"><span style="text-decoration: underline;">Getting Started With Bash</span></a> tutorial at hypexr.org</li>
<li>Mendel Cooper's <a href="http://tldp.org/LDP/abs/html/"><span style="text-decoration: underline;">Advanced Bash Shell-Scripting Guide</span></a></li>
</ul><p>Tutorials for sed, the command-line stream editor</p><ul>
<li>A <a href="http://www.panix.com/%7Eelflord/unix/sed.html"><span style="text-decoration: underline;">tutorial</span></a> at Rutgers</li>
<li>Peteris Krumins claims to have the <a href="http://www.catonmat.net/blog/worlds-best-introduction-to-sed/"><span style="text-decoration: underline;"> World's Best Introduction to Sed</span></a>; take a look and judge for yourself.</li>
<li>Bruce Barnett's <a href="http://www.grymoire.com/Unix/Sed.html"><span style="text-decoration: underline;">sed tutorial</span></a>.</li>
</ul><h3>Links to other useful sites</h3><p>The<a href="http://seqanswers.com/"><span style="text-decoration: underline;"> SEQanswers</span></a> online community has forums on several topics related to sequencing; the bioinformatics forum is the most active.</p><p>The SEQanswers <span style="text-decoration: underline;"><a href="http://seqanswers.com/wiki/Software">Software Wiki</a></span> is a list of software for analysis of sequencing data</p><p><a href="http://biostar.stackexchange.com/">Biostar</a> is another online community for questions and answers on bioinformatics and computational genomics.</p><p>Information on file formats used by the University of California - Santa Cruz Genome Browser is on the <a href="http://genome.ucsc.edu/FAQ/FAQformat"><span style="text-decoration: underline;"> FAQ list</span></a></p><p>A manual for the Integrated Genome Browser visualization tool is <a href="http://wiki.transvar.org/confluence/display/igbman/Home"><span style="text-decoration: underline;">here</span></a></p><p>Course materials for a short course entitled <a href="http://bioconductor.org/help/course-materials/2010/SeattleIntro/"><span style="text-decoration: underline;">Introduction to R and Bioconductor</span></a>, held in Seattle in Dec 2010</p><p><a href="http://great.stanford.edu/"><span style="text-decoration: underline;">Genomic Regions Enrichment of Annotations Tool</span></a> - A web service to test for over-representation of specific ontology categories among genes near ChIP-seq peaks</p><p><a href="http://www.animalgenome.org/bioinfo/resources/nextgensoft.html"><span style="text-decoration: underline;">Next-gen-seq software</span></a> - a list of software packages, both commercial and open-source, related to analysis of deep sequencing datasets</p><p><a href="http://www.cbcb.umd.edu/software/"><span style="text-decoration: underline;">Software</span></a> from the Center for Bioinformatics and Computational Biology, University of Maryland - many useful 
programs, all open-source</p><p><a href="http://bioinformatics.psb.ugent.be/plaza/"><span style="text-decoration: underline;"> PLAZA</span></a>: a comparative genomics resource to study gene and genome evolution in plants; described by Proost et al, Plant Cell 21:3718, 2009 <a href="http://www.plantcell.org/content/21/12/3718.full"><span style="text-decoration: underline;">Full Text</span></a></p><p>The European Bioinformatics Institute provides the tools <a href="http://www.ebi.ac.uk/Tools/rcloud/"><span style="text-decoration: underline;">ArrayExpressHTS and R-Cloud</span></a> for analysis of transcriptome data</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/40228/bioinformatics-services-cro-services</guid>
	<pubDate>Wed, 06 Nov 2019 00:33:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/40228/bioinformatics-services-cro-services</link>
	<title><![CDATA[Bioinformatics Services / CRO Services]]></title>
	<description><![CDATA[<p>RASA is set to provide premium technical and scientific services in the form of solutions, product development and training. We are also proficient in providing high-quality Research &amp; Development services in life science informatics fields such as Next Generation Sequencing (NGS) data analysis, computational drug discovery, bioinformatics, cheminformatics and bio-IT.</p><p>RASA offers faster, better and cost-effective cutting-edge technology solutions to chemical and life science research and industry. We provide our customers with a seamless model of wide expertise and comprehensive platforms. Our Value is to take our customers</p>]]></description>
	<dc:creator>RASA Life Sciences</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/39606/amity-university-bioinformatics-summer-program-kolkata</guid>
	<pubDate>Tue, 11 Jun 2019 21:27:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/39606/amity-university-bioinformatics-summer-program-kolkata</link>
	<title><![CDATA[Amity University Bioinformatics Summer Program - Kolkata]]></title>
	<description><![CDATA[<p>Registrations are now open for the 2019 Summer Bioinformatics Training program at Amity University, Kolkata. The program will focus on introductory topics for life science students. We will review important history, topics and challenges bioinformatics can help address in the context of basic research, discovery and industry.</p><p>Read more: https://edu.t-bio.info/amity-university-summer-bioinformatics-program-registrations-are-open/</p>]]></description>
	<dc:creator>eliabrodsky</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/10394/bioinformatics-protocols</guid>
	<pubDate>Mon, 05 May 2014 10:21:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/10394/bioinformatics-protocols</link>
	<title><![CDATA[Bioinformatics Protocols]]></title>
	<description><![CDATA[<h2><span> RNA Seq </span></h2>
<p><strong> Basic Galaxy Tutorial </strong></p>
<ul>
<li><a href="https://docs.google.com/document/pub?id=1KbTiBHtvHLfPRZ39AY3uriazrINA8TJzgjjwn1zPP7Y">RNA-Seq tutorial</a> based on <a href="http://www.nature.com/protocolexchange/protocols/2327">Trapnell et al. (2012)</a> <em>Nature Protocols</em></li>
</ul>
<dl><dd>In this tutorial we cover the concepts of <a href="http://en.wikipedia.org/wiki/RNA-Seq">RNA-Seq</a> differential gene expression (DGE) analysis using a very small synthetic dataset from a well studied organism.</dd></dl>
<p><strong> Advanced Galaxy Tutorial </strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1fQ1XfeOKhezJUDTzMXtZVY20c3RGoHe-HLvFOGzqU4s/pub">RNA-Seq (Advanced) Tutorial</a></li>
</ul>
<dl><dd>In this tutorial we compare the performance of three statistically-based differential expression tools:</dd><dd>* Cuffdiff</dd><dd>* edgeR</dd><dd>* DESeq2</dd></dl>
<p><strong> Advanced Command Line Tutorial </strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1ayJXtgBP1OXtnV7o7lq4QHKMNk5SdPHFq4hGkqndBtI/pub">Graphical Output with CummeRbund</a> introduces some basic commands using the cummeRbund package of the R programming language</li>
</ul>
<dl><dd>You will need to install R, RStudio and cummeRbund on your PC (explained in the Tutorial). You will learn how to produce graphical output from RNA-Seq analysis previously done using a Cuffdiff analysis.</dd></dl>
<h2><span> Variant Detection </span></h2>
<p><strong> Basic Galaxy Tutorial </strong></p>
<ul>
<li><a href="https://docs.google.com/document/pub?id=1ZRzrjjOCvtAu3m-IKL-rbJ1f4On60dDL_IEwG7oejdI">Variant Detection tutorial</a></li>
</ul>
<dl><dd>In this tutorial we cover the concepts of detecting small variants (SNVs and indels) in human genomic DNA using a small set of reads from chromosome 22.</dd></dl>
<p><strong>Advanced Galaxy Tutorial</strong></p>
<ul>
<li><a href="https://docs.google.com/document/pub?id=1CuKkKylVDb03tnN7RSWl5EUzleetn0ctjmvaidPKLxM">Variant Detection (Advanced) Tutorial</a></li>
</ul>
<dl><dd>In this tutorial we compare the performance of three statistically-based variant detection tools:</dd><dd>* SAMtools: Mpileup</dd><dd>* GATK: Unified Genotyper</dd><dd>* FreeBayes</dd><dd>Each of these tools takes as its input a BAM file of aligned reads and generates a list of likely variants in VCF format</dd></dl>
<p><strong>Pipelines</strong> are for those who are comfortable using the UNIX command line, and they often allow more control over branching and iteration logic.</p>
<ul>
<li><a href="https://github.com/claresloggett/variant_calling_pipeline">WGS/exome GATK-based variant calling pipeline</a></li>
</ul>
<dl><dd>This is a basic variant-calling and annotation pipeline developed at the Victorian Life Sciences Computation Initiative (VLSCI), University of Melbourne. It is based around BWA, GATK and Ensembl and was originally designed for human (or similar) data. The master branch is configured for WGS data; there is an exome branch configured for variant calling in exome data.</dd><dd>To run the pipeline you will need Rubra: <a href="https://github.com/bjpop/rubra">https://github.com/bjpop/rubra</a>. Rubra uses the Python Ruffus library: <a href="http://www.ruffus.org.uk/">http://www.ruffus.org.uk/</a>.</dd></dl>
<p><strong>Protocols</strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1lfDYNzHjfDA1pHTHd-0w3xHhg7L4TipT1gRfzgiV8es/pub">Familial Variant Calling</a></li>
</ul>
<dl><dd>In this protocol we discuss and outline the process of calling variants in related family members.</dd></dl>
<ul>
<li><a href="https://docs.google.com/document/d/1PIhm8NrFGaSK0hxpDcp8wUOz11ZkOaHIrpnJshMgDec/pub">Somatic Variant Calling</a></li>
</ul>
<dl><dd>In this protocol we discuss and outline the process of identifying somatic variants or mutations.</dd></dl>
<h2><span> Assembly </span></h2>
<p><strong> Basic Galaxy Tutorial </strong></p>
<ul>
<li><a href="https://docs.google.com/document/pub?id=1N3AB9ptISUu4zULqe1kXpVF0BDyGb5f5yzxWSJd_WNM">Genome assembly tutorial</a></li>
</ul>
<dl><dd>In this tutorial we carry out de novo assembly of a microbial genome. We have also written a <a href="https://docs.google.com/document/d/1xs-TI5MejQARqo0pcocGlymsXldwJbJII890gnmjI0o/pub">De novo Genome Assembly for Illumina Data</a> Protocol for a more generic description of the method.</dd></dl>
<p><strong> Protocol </strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1xs-TI5MejQARqo0pcocGlymsXldwJbJII890gnmjI0o/pub">De novo Genome Assembly for Illumina Data</a></li>
</ul>
<dl><dd>In this protocol we discuss and outline the process of de novo assembly for small to medium sized genomes. Use our <a href="https://docs.google.com/document/pub?id=1N3AB9ptISUu4zULqe1kXpVF0BDyGb5f5yzxWSJd_WNM">Genome assembly tutorial</a> to learn a specific case of using Galaxy to carry out de novo assembly of a microbial genome.</dd></dl>
<h2><span> Small RNAs </span></h2>
<p><strong> Basic Galaxy Tutorial </strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1WAObJr7M0m8U-2ku-0Y0Sdt_IHmqd1h8WaJHPhnJ1lM/pub">Quality control for small RNA</a></li>
</ul>
<dl><dd>This tutorial covers the initial steps of the workflow for analysis of short RNA expression: quality control of the raw reads, processing of the raw reads for subsequent analysis, and initial quality assessment of the library.</dd></dl>
<h2><span> ChIP Seq </span></h2>
<p><strong> Protocol </strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1UPJC8dsiDeP5R9MH9U0IvoDgPF2Q3EOstAuzS3e6WCE/pub">ChIP-Seq</a></li>
</ul>
<dl><dd>In this protocol we discuss ChIP-Seq: a method to analyze the interaction between proteins and DNA.</dd></dl>
<h2><span> Amplicons </span></h2>
<p><strong>Protocol</strong></p>
<ul>
<li><a href="https://docs.google.com/document/d/1uW7JzxG86QzS92hTyeuNsLhX_d1XFbaZPSjh7jWxcSg/pub">Amplicon Alignment</a></li>
</ul>
<dl><dd>In this protocol we discuss and outline the process of aligning custom amplicons using primers for high precision.</dd></dl>
<h2><span> Learn Galaxy </span></h2>
<p><a href="https://docs.google.com/document/d/1wsdJDYfjZVg2uJxm9AHi_j0mY3X1M1F4gB-elkuYL7c/pub">Introduction to Galaxy,</a> for those who are very new to Galaxy.</p>
<p><a href="https://docs.google.com/document/d/1t7vVqa3mdeZYPv5-8hiHBFBYhNiynV_3mWByno9-wUM/pub">Using Histories and Workflows,</a> for those with some Galaxy knowledge.</p>
<p>The Galaxy project website has many <a href="http://wiki.galaxyproject.org/Learn">tutorials</a> and <a href="http://wiki.galaxyproject.org/Learn/Screencasts">screencasts</a> about using Galaxy and the tools, and developing new tools.</p><p>Address of the bookmark: <a href="https://genome.edu.au/wiki/Learn" rel="nofollow">https://genome.edu.au/wiki/Learn</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/10741/managing-and-analyzing-next-generation-sequence-data</guid>
	<pubDate>Sat, 10 May 2014 06:28:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/10741/managing-and-analyzing-next-generation-sequence-data</link>
	<title><![CDATA[Managing and Analyzing Next-Generation Sequence Data]]></title>
	<description><![CDATA[<p>Centralized Bioinformatics Core Facilities provide shared resources for the computational and IT requirements of the investigators in their department or institution. As such, they must be able to effectively react to new types of experimental technology. Recently faced with an unprecedented flood of data generated by the next generation of DNA sequencers, these groups found it necessary to respond quickly and efficiently to the informatics and infrastructure demands. Centralized Facilities newly facing this challenge need to anticipate time and design considerations of necessary components, including infrastructure upgrades, staffing, and tools for data analyses and management ...</p>
<p>More at http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000369</p><p>Address of the bookmark: <a href="http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000369" rel="nofollow">http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000369</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>