<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43957?</link>
	<atom:link href="https://bioinformaticsonline.com/related/43957?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37493/fastq-stats-in-emoji</guid>
	<pubDate>Mon, 06 Aug 2018 10:20:20 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37493/fastq-stats-in-emoji</link>
	<title><![CDATA[Fastq stats in Emoji :)]]></title>
	<description><![CDATA[<p>Given one or more FASTQ files,&nbsp;<a href="https://fastqe.com/">fastqe</a>&nbsp;computes quality stats for each file and prints those stats as emoji... for some reason.</p>
<p>Given a fastq file in Illumina 1.8+/Sanger format, calculate the mean (rounded) score for each position and print a corresponding emoji!</p>
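<p>As a rough illustration of the per-position calculation described above, here is a minimal Python sketch. It assumes four-line FASTQ records with Phred+33 quality encoding (the Illumina 1.8+/Sanger convention); the function names and the four-symbol binning are hypothetical and are not fastqe's actual code or emoji table.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical four-bin symbol table, NOT fastqe's real emoji mapping.
SYMBOLS = ["💀", "😬", "🙂", "😎"]


def mean_quality_per_position(fastq_lines, offset=33):
    """Return the rounded mean Phred score at each read position.

    Assumes four-line records and Phred+33 encoding (offset=33).
    """
    totals = defaultdict(int)  # position -> sum of scores
    counts = defaultdict(int)  # position -> number of reads covering it
    for index, line in enumerate(fastq_lines):
        if index % 4 != 3:  # the quality string is the 4th line of a record
            continue
        for pos, char in enumerate(line.rstrip("\n")):
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [round(totals[pos] / counts[pos]) for pos in sorted(totals)]


def to_symbol(score):
    """Map a score to one of four illustrative bins of width 10."""
    return SYMBOLS[min(score // 10, len(SYMBOLS) - 1)]


# Two made-up reads of length 4.
reads = [
    "@r1", "ACGT", "+", "IIII",
    "@r2", "ACGT", "+", "5?I!",
]
means = mean_quality_per_position(reads)
print(means)
print(" ".join(to_symbol(score) for score in means))
```
</pre>
<p>For the two example reads, the rounded means are 30, 35, 40 and 20, so the first three positions land in the top bin and the last one in the bin below it.</p>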
<p><a href="https://github.com/lonsbio/fastqe/blob/master/docs/img/fastqe_binned.png" target="_blank"><img src="https://github.com/lonsbio/fastqe/raw/master/docs/img/fastqe_binned.png" alt="Example" style="border: 0px;"></a></p>
<p><a href="https://fastqe.com/">https://fastqe.com/</a></p><p>Address of the bookmark: <a href="https://github.com/lonsbio/fastqe" rel="nofollow">https://github.com/lonsbio/fastqe</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35144/converting-fastq-to-fasta</guid>
	<pubDate>Fri, 12 Jan 2018 03:49:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35144/converting-fastq-to-fasta</link>
	<title><![CDATA[Converting FASTQ to FASTA]]></title>
	<description><![CDATA[<div id="block-system-main"><div><div><div><div><div><div><p>There are several ways you can convert fastq to fasta sequences. Some methods are listed below.</p><h3>Using SED</h3><p><span><code><span>sed</span></code></span>&nbsp;can be used to selectively print the desired lines from a file, so if you print the first and second lines of every 4 lines (the <code>first~step</code> address form is a GNU sed extension), you get the sequence header and sequence needed for fasta format.</p><pre>sed -n '1~4s/^@/&gt;/p;2~4p' INFILE.fastq &gt; OUTFILE.fasta
</pre><h3>Using PASTE</h3><p>You can linearize every 4 lines into one tab-separated row with&nbsp;<span><code>paste</code></span>, keep the first and second fields, and then split them back onto separate lines.</p><pre>cat INFILE.fastq | paste - - - - | cut -f 1,2 | sed 's/^@/&gt;/' | tr "\t" "\n" &gt; OUTFILE.fasta
</pre><h3>EMBOSS:seqret</h3><p><code>seqret</code> is a general-purpose EMBOSS program for reading and writing sequences. One of its uses is FASTQ-to-FASTA conversion.</p><pre>seqret -sequence reads.fastq -outseq reads.fasta
</pre><h3>Using AWK</h3><p><span><code><span>awk</span></code></span>&nbsp;can be used for conversion as follows:</p><pre>cat infile.fq | awk '{if(NR%4==1) {printf("&gt;%s\n",substr($0,2));} else if(NR%4==2) print;}' &gt; file.fa
</pre><h3>FASTX-toolkit</h3><p><span><code>fastq_to_fasta</code></span>&nbsp;is part of the FASTX-toolkit and scales well to very large datasets.</p><pre>fastq_to_fasta -h
usage: fastq_to_fasta [-h] [-r] [-n] [-v] [-z] [-i INFILE] [-o OUTFILE]
# Remember to use -Q33 for illumina reads!
version 0.0.6
       [-h]         = This helpful help screen.
       [-r]         = Rename sequence identifiers to numbers.
       [-n]         = keep sequences with unknown (N) nucleotides.
                   Default is to discard such sequences.
       [-v]         = Verbose - report number of sequences.
                   If [-o] is specified,  report will be printed to STDOUT.
                   If [-o] is not specified (and output goes to STDOUT),
                   report will be printed to STDERR.
       [-z]         = Compress output with GZIP.
       [-i INFILE]  = FASTA/Q input file. default is STDIN.
       [-o OUTFILE] = FASTA output file. default is STDOUT.
</pre><h3>Bioawk</h3><p>Another option to convert fastq to fasta format using&nbsp;<span><code>bioawk</code></span></p><pre>bioawk -c fastx '{print "&gt;"$name"\n"$seq}' input.fastq &gt; output.fasta
</pre><h3>Seqtk</h3><p>From the same developer, there is another option using a tool called&nbsp;<span><code>seqtk</code></span></p><pre>seqtk seq -a input.fastq &gt; output.fasta
</pre><p>Note that you can use either compressed or uncompressed files for this tool</p></div></div></div></div></div></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/37396/converting-a-vcf-into-a-fasta-given-some-reference</guid>
	<pubDate>Fri, 20 Jul 2018 10:03:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/37396/converting-a-vcf-into-a-fasta-given-some-reference</link>
	<title><![CDATA[Converting a VCF into a FASTA given some reference !]]></title>
	<description><![CDATA[<p>Samtools/BCFtools (Heng Li) provides a Perl script,&nbsp;<a href="https://github.com/lh3/samtools/blob/master/bcftools/vcfutils.pl"><code>vcfutils.pl</code></a>,&nbsp;that does this via its&nbsp;<code>vcf2fq</code>&nbsp;function (lines 469-528).</p><p>This script has been modified by others to convert InDels as well, e.g.&nbsp;<a href="https://github.com/gringer/bioinfscripts/blob/master/vcf2fq.pl">this</a>&nbsp;by David Eccles:</p><pre><code>./vcf2fq.pl -f &lt;input.fasta&gt; &lt;all-site.vcf&gt; &gt; &lt;output.fastq&gt;</code></pre><p>https://github.com/gringer/bioinfscripts/blob/master/vcf2fq.pl</p><p>https://github.com/lh3/samtools/blob/master/bcftools/vcfutils.pl</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38666/mcat-motif-combining-and-association-tool</guid>
	<pubDate>Sun, 13 Jan 2019 06:27:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38666/mcat-motif-combining-and-association-tool</link>
	<title><![CDATA[MCAT: Motif Combining and Association Tool]]></title>
	<description><![CDATA[<p>This is a pipeline for finding motifs in fasta files.<br>It can be run from the command line as follows:</p>
<p>usage: orange_pipeline_refine.py [-h] [-w W] [--nmotifs NMOTIFS] [--iter ITER] [-c C]<br>[-s S] [-d] [-ff] [-v V]<br>positive_seq negative_seq</p>
<p>positional arguments:<br>positive_seq the fasta file for the positive sequences<br>negative_seq the fasta file for the negative sequences</p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://github.com/yanshen43/MCAT" rel="nofollow">https://github.com/yanshen43/MCAT</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27080/mrfast-micro-read-fast-alignment-search-tool</guid>
	<pubDate>Tue, 26 Apr 2016 03:50:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27080/mrfast-micro-read-fast-alignment-search-tool</link>
	<title><![CDATA[mrFAST:  Micro Read Fast Alignment Search Tool]]></title>
	<description><![CDATA[<p><span>mrFAST is a read mapper designed to map short reads to a reference genome, with a special emphasis on the discovery of structural variation and segmental duplications. mrFAST maps short reads with respect to a user-defined error threshold, including indels up to 4+4 bp. This manual describes how to choose the parameters and tune mrFAST with respect to the library settings. mrFAST is designed to find&nbsp;</span><strong><span style="text-decoration: underline;">'all'</span></strong><span>&nbsp; mappings for a given set of reads; however, it can return one "best" map location if the relevant parameter is invoked.</span></p>
<p><span>More at&nbsp;http://mrfast.sourceforge.net/manual.html</span></p><p>Address of the bookmark: <a href="http://mrfast.sourceforge.net/manual.html" rel="nofollow">http://mrfast.sourceforge.net/manual.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30085/fqtools</guid>
	<pubDate>Thu, 08 Dec 2016 09:31:12 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30085/fqtools</link>
	<title><![CDATA[fqtools]]></title>
	<description><![CDATA[<p><code>fqtools</code><span>&nbsp;is a software suite for fast processing of&nbsp;</span><code>FASTQ</code><span>&nbsp;files. Various file manipulations are supported. See below for a full list of the subcommands available and a brief description of their purpose. Most of the individual subcommands will take either a single file or a pair of files as input. If no input file is specified, fqtools will attempt to read data from&nbsp;</span><code>stdin</code><span>. In this case, it is advisable to specify the format of the data provided. For subcommands that generate FASTQ data, either a single file or a pair of files will be generated. If no&nbsp;</span><code>-o</code><span>&nbsp;argument is provided, single files will be written to&nbsp;</span><code>stdout</code><span>.</span></p><p>Address of the bookmark: <a href="https://github.com/alastair-droop/fqtools" rel="nofollow">https://github.com/alastair-droop/fqtools</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32420/fastq-format</guid>
	<pubDate>Wed, 03 May 2017 04:23:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32420/fastq-format</link>
	<title><![CDATA[Fastq format]]></title>
	<description><![CDATA[<p><strong>FASTQ format</strong>&nbsp;is a text-based&nbsp;<a href="https://en.wikipedia.org/wiki/File_format" title="File format">format</a>&nbsp;for storing both a biological sequence (usually&nbsp;<a href="https://en.wikipedia.org/wiki/Nucleotide_sequence" title="Nucleotide sequence">nucleotide sequence</a>) and its corresponding quality scores. Both the sequence letter and quality score are each encoded with a single&nbsp;<a href="https://en.wikipedia.org/wiki/ASCII" title="ASCII">ASCII</a>&nbsp;character for brevity.</p>
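<p>To make the one-character-per-value encoding concrete, the following Python sketch parses a single record and pairs each base with its decoded quality score. It assumes the common four-line record layout and the Sanger Phred+33 offset, neither of which is spelled out in the excerpt above; the helper names and the example read are hypothetical.</p>
<pre>
```python
from typing import NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def parse_record(lines):
    """Parse one four-line FASTQ record (header, sequence, '+', quality)."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return FastqRecord(header[1:], sequence, quality)


def phred_scores(quality, offset=33):
    """Decode each quality character to an integer score (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


# A made-up record: one ASCII character per base, one per quality score.
record = parse_record(["@read1\n", "GATTACA\n", "+\n", "IIIIHH#\n"])
pairs = list(zip(record.sequence, phred_scores(record.quality)))
print(record.name, pairs)
```
</pre>
<p>Here <code>I</code> decodes to 40, <code>H</code> to 39 and <code>#</code> to 2, so the last base of the example read is the least reliable.</p>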
<p>It was originally developed at the&nbsp;<a href="https://en.wikipedia.org/wiki/Wellcome_Trust_Sanger_Institute" title="Wellcome Trust Sanger Institute">Wellcome Trust Sanger Institute</a>&nbsp;to bundle a&nbsp;<a href="https://en.wikipedia.org/wiki/FASTA_format" title="FASTA format">FASTA</a>&nbsp;sequence and its quality data, but has recently become the&nbsp;<em>de facto</em>&nbsp;standard for storing the output of high-throughput sequencing instruments such as the&nbsp;<a href="https://en.wikipedia.org/wiki/Illumina_(company)" title="Illumina (company)">Illumina</a>&nbsp;Genome Analyzer.<sup id="cite_ref-Cock2009_1-0"><a href="https://en.wikipedia.org/wiki/FASTQ_format#cite_note-Cock2009-1">[1]</a></sup></p><p>Address of the bookmark: <a href="https://en.wikipedia.org/wiki/FASTQ_format" rel="nofollow">https://en.wikipedia.org/wiki/FASTQ_format</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35345/rgfa-powerful-and-convenient-handling-of-assembly-graphs</guid>
	<pubDate>Thu, 25 Jan 2018 05:47:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35345/rgfa-powerful-and-convenient-handling-of-assembly-graphs</link>
	<title><![CDATA[RGFA: powerful and convenient handling of assembly graphs]]></title>
	<description><![CDATA[<p><span>RGFA is an implementation of the proposed GFA specification in Ruby. It allows the user to conveniently parse, edit and write GFA files. Complex operations such as the separation of the implicit instances of repeats and the merging of linear paths can be performed. A typical application of RGFA is the editing of a graph, to finish the assembly of a sequence, using information not available to the assembler. We illustrate a use case, in which the assembly of a repetitive metagenomic fosmid insert was completed using a script based on RGFA.</span></p>
<p><span>https://github.com/ggonnella/rgfa</span></p><p>Address of the bookmark: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103826/" rel="nofollow">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103826/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36800/genomemapper-simultaneous-alignment-of-short-reads-against-multiple-genomes</guid>
	<pubDate>Fri, 25 May 2018 09:29:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36800/genomemapper-simultaneous-alignment-of-short-reads-against-multiple-genomes</link>
	<title><![CDATA[GenomeMapper: Simultaneous alignment of short reads against multiple genomes]]></title>
	<description><![CDATA[GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads with either ungapped or gapped alignments. It can be used to align against multiple genomes simultaneously or against a single reference. If you are unsure which one is the appropriate GenomeMapper, you might want to use the latter.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2768987/<p>Address of the bookmark: <a href="http://1001genomes.org/software/genomemapper.html" rel="nofollow">http://1001genomes.org/software/genomemapper.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/opportunity/view/24178/essentials-of-statistics-and-data-analysis-using-r</guid>
  <pubDate>Mon, 31 Aug 2015 01:32:12 -0500</pubDate>
  <link></link>
  <title><![CDATA[Essentials of Statistics and Data Analysis using R]]></title>
  <description><![CDATA[
<p>Clinical Development Services Agency (CDSA) is an extramural unit of Translational Health Science and Technology Institute (THSTI), Department of Biotechnology, Ministry of Science &amp; Technology, Government of India. CDSA has a national mandate of strengthening capacity and capability building in the area of Clinical development and Translational Research.</p>

<p>CDSA is pleased to announce a 4-day hands-on training program on “Essentials of Statistics and Data Analysis using R” at ICGEB, Aruna Asaf Ali Road, New Delhi on December 1 – 4, 2015. The program aims to develop skills in the basic principles of statistics for summarizing data and in the use of appropriate statistical tests, and to provide an understanding of data analysis using R. Didactic lectures with practical sessions will be delivered by experienced faculty from AIIMS and Novartis. Live classroom teaching with PowerPoint presentations, case studies, mock exercises, practical sessions on R, and group work with time for discussion and Q&amp;A sessions are added advantages of this workshop.</p>

<p>Please contact gayatrivishwakarma.cdsa@thsti.res.in or vineetabaloni.cdsa@thsti.res.in for program and registration details.</p>

<p>Please nominate participants or register yourself on or before November 6, 2015, along with the electronic transfer of the registration fee.</p>
]]></description>
</item>

</channel>
</rss>