<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37411?offset=340</link>
	<atom:link href="https://bioinformaticsonline.com/related/37411?offset=340" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27847/anvio</guid>
	<pubDate>Thu, 16 Jun 2016 18:15:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27847/anvio</link>
	<title><![CDATA[Anvio]]></title>
	<description><![CDATA[<p>In a nutshell</p>
<p>Anvi&rsquo;o is an analysis and visualization platform for &lsquo;omics data.</p>
<p>Please find the methods paper here: https://peerj.com/articles/1319/</p>
<p>Anvi&rsquo;o would not have been possible without the help of many people who directly or indirectly contributed to its development. Here is the acknowledgements section of our methods paper</p>
<p><span>An analysis and visualization platform for 'omics data</span><span>&nbsp;</span><span><a href="http://merenlab.org/projects/anvio">http://merenlab.org/projects/anvio</a></span></p>
<p><span>Paper&nbsp;https://peerj.com/articles/1839/</span></p><p>Address of the bookmark: <a href="https://github.com/meren/anvio" rel="nofollow">https://github.com/meren/anvio</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/28112/ngs-glossary</guid>
	<pubDate>Mon, 27 Jun 2016 08:56:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/28112/ngs-glossary</link>
	<title><![CDATA[NGS Glossary !!]]></title>
	<description><![CDATA[<p><strong>alignment</strong>: the mapping of a raw sequence read to a location within a reference genome. The mapping occurs because the sequences within the raw read match or align to sequences within the reference genome. Alignment information is stored in the <strong>SAM</strong> or <strong>BAM</strong> file formats.</p><p><strong>bcftools</strong>: a set of companion tools, currently bundled with SAMtools, for identifying and filtering genomic variants.</p><p><strong>bowtie</strong>: widely used, open source alignment software for aligning raw sequence reads to a reference genome.</p><p><strong>BAM Format</strong>: binary, compressed format for storing <strong>SAM</strong> data.</p><p><strong>BCF Format</strong>: Binary call format. Binary, compressed format for storing <strong>VCF</strong> data.</p><p><strong>CIGAR String</strong>: Compact Idiosyncratic Gapped Alignment Report. A compact string that (partially) summarizes the alignment of a raw sequence read to the reference genome. Three core abbreviations are used: M for alignment match; I for insertion; and D for deletion. For example, a CIGAR string of 5M2I63M indicates that the first 5 bases of the read align to the reference, followed by 2 bases that are present in the read but not in the reference genome, followed by an additional 63 bases of alignment.</p><p><strong>FASTA Format</strong>: text format for storing raw sequence data. For example, the FASTA file at <a href="http://www.ncbi.nlm.nih.gov/nuccore/NC_008253">http://www.ncbi.nlm.nih.gov/nuccore/NC_008253</a> contains the entire genome of Escherichia coli 536.</p><p><strong>FASTQ Format</strong>: text format for storing raw sequence data along with quality scores for each base; usually generated by sequencing machines.</p><p><strong>genotype likelihood</strong>: the probability that a specific genotype is present in the sample of interest. 
Genotype likelihoods are usually expressed as a <strong>Phred-scaled probability</strong>, where P = 10<sup>-Q/10</sup>. For example, if the Phred-scaled likelihood of the genotype TT (both alleles are T) at position 1,299,132 in human chromosome 12 (reference G) is 37, this translates to a probability of 10<sup>-37/10</sup> = 0.0001995, meaning that there is very low probability that the reads in your sample support a TT genotype. On the other hand, a genotype of AA at the same position with a score of 0 translates into a probability of 10<sup>-0</sup> = 1, indicating extremely high probability that your sample contains a homozygous mutation of G to A.</p><p><strong>mate-pair</strong>: in paired-end sequencing, both ends of a single DNA or RNA fragment are sequenced, but the intermediate region is not. The two ends which are sequenced form a pair, and are frequently referred to as mate-pairs.</p><p><strong>QNAME</strong>: unique identifier of a raw sequence read (also known as the Query Name). Used in <strong>FASTQ</strong> and <strong>SAM</strong> files.</p><p><strong>paired-end sequencing</strong>: sequencing process where both ends of a single DNA or RNA fragment are sequenced, but the intermediate region is not. Particularly useful for identifying structural rearrangements, including gene fusions.</p><p><strong>Phred-scaled probability</strong>: a scaled value (Q) used to compactly summarize a probability, where P = 10<sup>-Q/10</sup>. For example, a Phred Q score of 10 translates to probability (P) = 10<sup>-10/10</sup> = 0.1. Phred-scaled probabilities are common in next-generation sequencing, and are used to represent multiple types of quality metrics, including quality of base calls, quality of mappings, and probabilities associated with specific genotypes. 
The name Phred refers to the original Phred base-calling software, which first used and developed the scale.</p><p><strong>Phred quality score</strong>: a score assigned to each base within a sequence, quantifying the probability that the base was called incorrectly. Scores use a <strong>Phred-scaled probability</strong> metric. For example, a Phred Q score of 10 translates to P = 10<sup>-10/10</sup> = 0.1, indicating that the base has a 0.1 probability of being incorrect. Higher Phred scores correspond to higher accuracy. In the <strong>FASTQ format</strong>, Phred scores are represented as single ASCII characters. For details on translating between Phred scores and ASCII values, refer to <a href="http://www.somewhereville.com/?p=1508">Table 1 of this useful blog post from Damian Gregory Allis</a>.</p><p><strong>read-length</strong>: the number of base pairs that are sequenced in an individual sequence read.</p><p><strong>read-depth</strong>: the number of sequence reads that pile up at the same genomic location. For example, 30X read-depth coverage indicates that the genomic location is covered by 30 independent sequencing reads. Increased read-depth translates into higher confidence for calling genomic variants.</p><p><strong>RNAME</strong>: reference genome identifier (also known as the Reference Name). Within a SAM formatted file, the RNAME identifies the reference genome where the raw read aligns.</p><p><strong>SAM Flag</strong>: a single integer value (e.g. 16), which encodes multiple elements of meta-data regarding a read and its alignment. Elements include: whether the read is one part of a paired-end read, whether the read aligns to the genome, and whether the read aligns to the forward or reverse strand of the genome. 
A <a href="http://picard.sourceforge.net/explain-flags.html">useful online utility</a> decodes a single SAM flag value into plain English.</p><p><strong>SAM Format</strong>: Text file format for storing sequence alignments against a reference genome. See also <strong>BAM</strong> Format.</p><p><strong>SAMtools</strong>: widely used, open source command line tool for manipulating SAM/BAM files. Includes options for converting, sorting, indexing and viewing SAM/BAM files. The SAMtools distribution also includes bcftools, a set of command line tools for identifying and filtering genomic variants. Created by <a href="http://lh3lh3.users.sourceforge.net/">Heng Li</a>, currently of the Broad Institute.</p><p><strong>single-read sequencing</strong>: sequencing process where only one end of a DNA or RNA fragment is sequenced. Contrast with <strong>paired-end</strong> sequencing.</p><p><strong>VCF Format</strong>: Variant call format. Text file format for storing genomic variants, including single nucleotide polymorphisms, insertions, deletions and structural rearrangements. See also <strong>BCF</strong> format.</p><p><strong>Next Generation Sequencing</strong><br /> A high-throughput sequencing method which parallelizes the sequencing process, producing thousands or millions of sequences at once.</p><p><strong>Deep Sequencing</strong><br /> Techniques of nucleotide sequence analysis that increase the range, complexity, sensitivity, and accuracy of results by greatly increasing the scale of operations and thus the number of nucleotides, and the number of copies of each nucleotide sequenced.</p><p><strong>Paired-End Sequencing</strong><br /> Sequence both ends of the same fragment and keep track of the paired data.</p><p><strong>Adapter</strong><br /> Short oligonucleotides which are attached to the DNA to be sequenced. 
An adapter can provide a priming site for both amplification and sequencing of the adjoining, unknown nucleic acid.</p><p><strong>Library</strong><br /> A collection of DNA fragments with adapters ligated to each end.</p><p><strong>Bridge Amplification</strong><br /> Generation of in situ copies of a specific DNA molecule on an oligo-decorated solid support.</p><p><strong>Emulsion PCR</strong><br /> A method for bead-based amplification of a library. A single adapter-bound fragment is attached to the surface of a bead, and an oil emulsion containing necessary amplification reagents is formed around the bead/fragment component. Parallel amplification of millions of beads with millions of single-stranded fragments produces a sequencer-ready library.</p><p><strong>Alignment</strong><br /> Mapping of sequence reads to a known reference sequence.</p><p><strong>Reference sequence / genome</strong><br /> A fully assembled version of a genome that can be used for mapping short DNA sequence reads for comparisons of genomes from various individuals.</p><p><strong>Coverage Depth</strong><br /> The number of nucleotides from reads that are mapped to a given position of the reference genome.</p><p><strong>Specificity</strong><br /> The percentage of sequences that map to the intended targets out of total bases per run.</p><p><strong>Uniformity</strong><br /> The variability in sequence coverage across target regions.</p><p><strong>Homopolymer</strong><br /> Uninterrupted stretch of a single nucleotide type (e.g., TTT or GGGGGG).</p><p><strong>InDel</strong><br /> InDel stands for insertion or deletion. A form of structural variation in which a DNA segment is either deleted or inserted.</p><p><strong>SNP</strong><br /> SNP stands for Single Nucleotide Polymorphism. 
A single base difference found when comparing the same DNA sequence from two different individuals.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28290/bioinformatics-tools-and-software</guid>
	<pubDate>Tue, 05 Jul 2016 10:02:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28290/bioinformatics-tools-and-software</link>
	<title><![CDATA[Bioinformatics tools and software]]></title>
	<description><![CDATA[<p><a href="http://drive5.com/usearch">USEARCH &gt;</a><br><span>Extreme high-throughput sequence analysis. Orders of magnitude faster than BLAST.</span>&nbsp;<a href="http://drive5.com/muscle">MUSCLE &gt;</a><br><span>Multiple sequence alignment. Faster and more accurate than CLUSTALW.</span></p>
<p>&nbsp;<a href="http://drive5.com/uparse">UPARSE &gt;</a><br><span>OTU clustering for 16S and other marker genes. Highly accurate OTU sequences and improved diversity measures.</span>&nbsp;<a href="http://drive5.com/uchime">UCHIME &gt;</a><br><span>Chimeric sequence detection.</span>&nbsp;<a href="http://drive5.com/piler">PILER &gt;</a><br><span>De novo genome repeat finder.</span>&nbsp;<a href="http://drive5.com/pilercr">PILER-CR &gt;</a><br><span>Detection of CRISPR repeats in bacterial genomes.</span>&nbsp;<a href="http://drive5.com/qscore">QSCORE &gt;</a><br><span>Compare two multiple alignments for benchmarking.</span>&nbsp;<a href="http://drive5.com/pals">PALS &gt;</a><br><span>Whole-genome alignment.</span>&nbsp;<a href="http://drive5.com/muscle/prefab.htm">PREFAB &gt;</a><br><span>Protein Reference Alignment Database.</span>&nbsp;<a href="http://drive5.com/bench">MSA benchmark collection &gt;</a><br><span>Selected multiple alignment benchmarks in a standardized FASTA format.</span></p><p>Address of the bookmark: <a href="http://drive5.com/software.html" rel="nofollow">http://drive5.com/software.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28554/megan6</guid>
	<pubDate>Mon, 25 Jul 2016 05:45:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28554/megan6</link>
	<title><![CDATA[MEGAN6]]></title>
	<description><![CDATA[<p>Microbiome analysis using a single application</p>
<p>MEGAN6 is a comprehensive toolbox for interactively analyzing microbiome data. All the interactive tools you need in one application.</p>
<ul>
<li>Taxonomic analysis using the NCBI taxonomy or a customized taxonomy such as SILVA</li>
<li>Functional analysis using InterPro2GO, SEED, eggNOG or KEGG</li>
<li>Bar charts, word clouds, Voronoi tree maps and many other charts</li>
<li>PCoA, clustering and networks</li>
<li>Supports metadata</li>
<li>MEGAN parses many different types of input</li>
</ul>
<p>Why use MEGAN6?</p>
<div>&nbsp;The software is:</div>
<div><ol>
<li>Easy to use. MEGAN6 is a single application and all features are available through menus, toolbars and graphics. No scripting skills required.</li>
<li>Powerful. MEGAN6 allows you to work with hundreds of samples containing&nbsp;hundreds of millions of sequencing reads. Blast-like analysis can be performed using DIAMOND.</li>
<li>Comprehensive. MEGAN6 offers a large range of analysis tools, and is under active development.</li>
</ol></div><p>Address of the bookmark: <a href="https://ab.inf.uni-tuebingen.de/software/megan6" rel="nofollow">https://ab.inf.uni-tuebingen.de/software/megan6</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29583/graph-genome-suite</guid>
	<pubDate>Fri, 28 Oct 2016 07:59:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29583/graph-genome-suite</link>
	<title><![CDATA[Graph Genome Suite]]></title>
	<description><![CDATA[<p><span>Seven Bridges is the biomedical data analysis company accelerating breakthroughs in genomics research for cancer, drug development and precision medicine. We build self-improving systems to analyze millions of genomes, including the&nbsp;</span><strong>Graph Genome Suite</strong><span>&nbsp;&mdash; the most advanced population genomics tools in the world.</span></p><p>Address of the bookmark: <a href="https://www.sbgenomics.com/graph/" rel="nofollow">https://www.sbgenomics.com/graph/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/29638/r-graphical-cookbook-by-winston-chang</guid>
	<pubDate>Fri, 04 Nov 2016 12:50:30 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/29638/r-graphical-cookbook-by-winston-chang</link>
	<title><![CDATA[R Graphical Cookbook by Winston Chang]]></title>
	<description><![CDATA[<p>R Graphical Cookbook by Winston Chang</p><p>A very nice book by Winston Chang for R enthusiasts. The R code presented in these pages is the R code actually used to produce the figures in the book. There will be differences compared to the code chunks shown in the text of the book, but in most cases the differences will be that these pages contain additional code to lay out multiple plots on a single "page".</p><p>The code presented for each figure is self-contained, i.e., all code required to produce the figure is included. This means that there is sometimes considerable overlap of code between several figures. In some cases, it may be necessary to install an add-on package from CRAN to get the code to run.</p><p>More books at http://www.e-reading.club/bookreader.php/137370/C486x_APPb.pdf</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/29638" length="37521" type="image/png" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30012/swalo</guid>
	<pubDate>Wed, 30 Nov 2016 05:06:05 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30012/swalo</link>
	<title><![CDATA[SWALO]]></title>
	<description><![CDATA[<p>SWALO (scaffolding with assembly likelihood optimization) is a method for scaffolding based on likelihood of genome assemblies computed using generative models for sequencing.</p>
<p><a href="https://atifrahman.github.io/SWALO/swalo-0.9.7-beta.tar.gz"><strong>Download</strong></a></p>
<p><strong>Git repository of SWALO is at <a href="https://github.com/atifrahman/SWALO">https://github.com/atifrahman/SWALO</a>.</strong></p><p>Address of the bookmark: <a href="https://atifrahman.github.io/SWALO/" rel="nofollow">https://atifrahman.github.io/SWALO/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30140/cutadapt</guid>
	<pubDate>Wed, 14 Dec 2016 09:59:52 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30140/cutadapt</link>
	<title><![CDATA[Cutadapt]]></title>
	<description><![CDATA[<p>Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.</p>
<p>Cutadapt helps with these trimming tasks by finding the adapter or primer sequences in an error-tolerant way. It can also modify and filter reads in various ways. Adapter sequences can contain IUPAC wildcard characters. Paired-end reads and even colorspace data are also supported. If you want, you can also just demultiplex your input data, without removing adapter sequences at all.</p>
<p>Cutadapt comes with an extensive suite of automated tests and is available under the terms of the MIT license.</p>
<p>If you use cutadapt, please cite&nbsp;<a href="http://dx.doi.org/10.14806/ej.17.1.200">DOI:10.14806/ej.17.1.200</a>&nbsp;.</p>
<p>More at&nbsp;https://github.com/marcelm/cutadapt</p><p>Address of the bookmark: <a href="http://cutadapt.readthedocs.io/en/stable/guide.html" rel="nofollow">http://cutadapt.readthedocs.io/en/stable/guide.html</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31564/htslib</guid>
	<pubDate>Wed, 15 Mar 2017 11:38:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31564/htslib</link>
	<title><![CDATA[HTSlib]]></title>
	<description><![CDATA[<p>Samtools is a suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories:</p>
<dl><dt>Samtools</dt><dd>Reading/writing/editing/indexing/viewing SAM/BAM/CRAM format</dd><dt>BCFtools</dt><dd>Reading/writing BCF2/VCF/gVCF files and calling/filtering/summarising SNP and short indel sequence variants</dd><dt>HTSlib</dt><dd>A C library for reading/writing high-throughput sequencing data</dd></dl>
<p>Samtools and BCFtools both use HTSlib internally, but these source packages contain their own copies of htslib so they can be built independently.</p><p>Address of the bookmark: <a href="http://www.htslib.org/" rel="nofollow">http://www.htslib.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32420/fastq-format</guid>
	<pubDate>Wed, 03 May 2017 04:23:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32420/fastq-format</link>
	<title><![CDATA[Fastq format]]></title>
	<description><![CDATA[<p><strong>FASTQ format</strong>&nbsp;is a text-based&nbsp;<a href="https://en.wikipedia.org/wiki/File_format" title="File format">format</a>&nbsp;for storing both a biological sequence (usually&nbsp;<a href="https://en.wikipedia.org/wiki/Nucleotide_sequence" title="Nucleotide sequence">nucleotide sequence</a>) and its corresponding quality scores. The sequence letter and quality score are each encoded with a single&nbsp;<a href="https://en.wikipedia.org/wiki/ASCII" title="ASCII">ASCII</a>&nbsp;character for brevity.</p>
<p>It was originally developed at the&nbsp;<a href="https://en.wikipedia.org/wiki/Wellcome_Trust_Sanger_Institute" title="Wellcome Trust Sanger Institute">Wellcome Trust Sanger Institute</a>&nbsp;to bundle a&nbsp;<a href="https://en.wikipedia.org/wiki/FASTA_format" title="FASTA format">FASTA</a>&nbsp;sequence and its quality data, but has recently become the&nbsp;<em>de facto</em>&nbsp;standard for storing the output of high-throughput sequencing instruments such as the&nbsp;<a href="https://en.wikipedia.org/wiki/Illumina_(company)" title="Illumina (company)">Illumina</a>&nbsp;Genome Analyzer.<sup id="cite_ref-Cock2009_1-0"><a href="https://en.wikipedia.org/wiki/FASTQ_format#cite_note-Cock2009-1">[1]</a></sup></p><p>Address of the bookmark: <a href="https://en.wikipedia.org/wiki/FASTQ_format" rel="nofollow">https://en.wikipedia.org/wiki/FASTQ_format</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>