<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/38666?offset=60</link>
	<atom:link href="https://bioinformaticsonline.com/related/38666?offset=60" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</guid>
	<pubDate>Fri, 10 Dec 2021 06:22:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</link>
	<title><![CDATA[Illumina based assembly pipeline steps !]]></title>
	<description><![CDATA[<h3 id="illumina">Illumina<a href="https://nf-co.re/viralrecon#illumina"><span></span></a></h3><ol>
<li>Merge re-sequenced FastQ files (<a href="http://www.linfo.org/cat.html"><code>cat</code></a>)</li>
<li>Read QC (<a href="https://www.bioinformatics.babraham.ac.uk/projects/fastqc/"><code>FastQC</code></a>)</li>
<li>Adapter trimming (<a href="https://github.com/OpenGene/fastp"><code>fastp</code></a>)</li>
<li>Removal of host reads (<a href="http://ccb.jhu.edu/software/kraken2/"><code>Kraken 2</code></a>; <em>optional</em>)</li>
<li>Variant calling<ol>
<li>Read alignment (<a href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml"><code>Bowtie 2</code></a>)</li>
<li>Sort and index alignments (<a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Primer sequence removal (<a href="https://github.com/andersen-lab/ivar"><code>iVar</code></a>; <em>amplicon data only</em>)</li>
<li>Duplicate read marking (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>; <em>optional</em>)</li>
<li>Alignment-level QC (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>, <a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Genome-wide and amplicon coverage QC plots (<a href="https://github.com/brentp/mosdepth/"><code>mosdepth</code></a>)</li>
<li>Choice of multiple variant calling and consensus sequence generation routes (<a href="https://github.com/andersen-lab/ivar"><code>iVar variants and consensus</code></a>; <em>default for amplicon data</em> <em>||</em> <a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>, <a href="https://github.com/arq5x/bedtools2/"><code>BEDTools</code></a>; <em>default for metagenomics data</em>)
<ul>
<li>Variant annotation (<a href="http://snpeff.sourceforge.net/SnpEff.html"><code>SnpEff</code></a>, <a href="http://snpeff.sourceforge.net/SnpSift.html"><code>SnpSift</code></a>)</li>
<li>Consensus assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
<li>Lineage analysis (<a href="https://github.com/cov-lineages/pangolin"><code>Pangolin</code></a>)</li>
<li>Clade assignment, mutation calling and sequence quality checks (<a href="https://github.com/nextstrain/nextclade"><code>Nextclade</code></a>)</li>
<li>Individual variant screenshots with annotation tracks (<a href="https://asciigenome.readthedocs.io/en/latest/"><code>ASCIIGenome</code></a>)</li>
</ul>
</li>
<li>Intersect variants across callers (<a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>)</li>
</ol></li>
<li><em>De novo</em> assembly<ol>
<li>Primer trimming (<a href="https://cutadapt.readthedocs.io/en/stable/guide.html"><code>Cutadapt</code></a>; <em>amplicon data only</em>)</li>
<li>Choice of multiple assembly tools (<a href="http://cab.spbu.ru/software/spades/"><code>SPAdes</code></a> <em>||</em> <a href="https://github.com/rrwick/Unicycler"><code>Unicycler</code></a> <em>||</em> <a href="https://github.com/GATB/minia"><code>minia</code></a>)
<ul>
<li>Blast to reference genome (<a href="https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch"><code>blastn</code></a>)</li>
<li>Contiguate assembly (<a href="https://www.sanger.ac.uk/science/tools/pagit"><code>ABACAS</code></a>)</li>
<li>Assembly report (<a href="https://github.com/BU-ISCIII/plasmidID"><code>PlasmidID</code></a>)</li>
<li>Assembly assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
</ul>
</li>
</ol></li>
<li>Present QC and visualisation for raw read, alignment, assembly and variant calling results (<a href="http://multiqc.info/"><code>MultiQC</code></a>)</li>
</ol>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44561/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</guid>
	<pubDate>Sat, 08 Jun 2024 16:25:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44561/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</link>
	<title><![CDATA[Bactopia: a flexible pipeline for complete analysis of bacterial genomes]]></title>
	<description><![CDATA[<p>Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!</p>
<p>Bactopia was inspired by&nbsp;<a href="https://staphopia.github.io/">Staphopia</a>, a workflow we (Tim Read and myself) released that is targeted towards&nbsp;<em>Staphylococcus aureus</em>&nbsp;genomes. Using what we learned from Staphopia and user feedback, Bactopia was developed from scratch with usability, portability, and speed in mind from the start.</p>
<p>Bactopia uses&nbsp;<a href="https://www.nextflow.io/">Nextflow</a>&nbsp;to manage the workflow, allowing for support of many types of environments (e.g. cluster or cloud). Bactopia allows for the usage of many public datasets as well as your own datasets to further enhance the analysis of your sequencing. Bactopia only uses software packages available from&nbsp;<a href="https://bioconda.github.io/">Bioconda</a>&nbsp;and&nbsp;<a href="https://conda-forge.org/">Conda-Forge</a>&nbsp;to make installation as simple as possible for&nbsp;<em>all</em>&nbsp;users.</p>
<p>To highlight the use of&nbsp;<a href="https://bactopia.github.io/latest/full-guide/">Bactopia</a>&nbsp;and&nbsp;<a href="https://bactopia.github.io/latest/bactopia-tools/">Bactopia Tools</a>, we performed an analysis of 1,664 public&nbsp;<em>Lactobacillus</em>&nbsp;genomes, focusing on&nbsp;<em>Lactobacillus crispatus</em>, a species that is a common part of the human vaginal microbiome. The results from this analysis are published in mSystems under the title:&nbsp;<em><a href="https://doi.org/10.1128/mSystems.00190-20">Bactopia: a flexible pipeline for complete analysis of bacterial genomes</a></em></p>
<p><a href="https://bactopia.github.io/latest/assets/bactopia-workflow.png"><img src="https://bactopia.github.io/latest/assets/bactopia-workflow.png" alt="Bactopia Workflow" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://bactopia.github.io/latest/" rel="nofollow">https://bactopia.github.io/latest/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35144/converting-fastq-to-fasta</guid>
	<pubDate>Fri, 12 Jan 2018 03:49:09 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35144/converting-fastq-to-fasta</link>
	<title><![CDATA[Converting FASTQ to FASTA]]></title>
	<description><![CDATA[<div id="block-system-main"><div><div><div><div><div><div><p>There are several ways to convert FASTQ sequences to FASTA. Some methods are listed below.</p><h3>Using SED</h3><p><span><code><span>sed</span></code></span>&nbsp;can be used to selectively print the desired lines from a file, so if you print the first and second lines of every four, converting the leading <code>@</code> to <code>&gt;</code>, you get the sequence header and sequence needed for FASTA format. The <code>first~step</code> address form used here is a GNU sed extension.</p><pre>sed -n '1~4s/^@/&gt;/p;2~4p' INFILE.fastq &gt; OUTFILE.fasta
</pre><h3>Using PASTE</h3><p>You can linearize every 4 lines into one tab-separated row with&nbsp;<span><code>paste</code></span>, keep the first and second fields, and split them back onto separate lines.</p><pre>cat INFILE.fastq | paste - - - - | cut -f 1,2 | sed 's/^@/&gt;/' | tr "\t" "\n" &gt; OUTFILE.fasta
</pre><h3>EMBOSS:seqret</h3><p>Standard script that can be used for many purposes. One such use is fastq-fasta conversion</p><pre>seqret -sequence reads.fastq -outseq reads.fasta
</pre><h3>Using AWK</h3><p><span><code><span>awk</span></code></span>&nbsp;can be used for conversion as follows:</p><pre>cat infile.fq | awk '{if(NR%4==1) {printf("&gt;%s\n",substr($0,2));} else if(NR%4==2) print;}' &gt; file.fa
</pre><h3>FASTX-toolkit</h3><p><span><code>fastq_to_fasta</code></span>&nbsp;is available in the FASTX-toolkit and scales well to very large datasets.</p><pre>fastq_to_fasta -h
usage: fastq_to_fasta [-h] [-r] [-n] [-v] [-z] [-i INFILE] [-o OUTFILE]
# Remember to use -Q33 for illumina reads!
version 0.0.6
       [-h]         = This helpful help screen.
       [-r]         = Rename sequence identifiers to numbers.
       [-n]         = keep sequences with unknown (N) nucleotides.
                   Default is to discard such sequences.
       [-v]         = Verbose - report number of sequences.
                   If [-o] is specified,  report will be printed to STDOUT.
                   If [-o] is not specified (and output goes to STDOUT),
                   report will be printed to STDERR.
       [-z]         = Compress output with GZIP.
       [-i INFILE]  = FASTA/Q input file. default is STDIN.
       [-o OUTFILE] = FASTA output file. default is STDOUT.
</pre><h3>Bioawk</h3><p>Another option is to convert FASTQ to FASTA format using&nbsp;<span><code>bioawk</code></span>:</p><pre>bioawk -c fastx '{print "&gt;"$name"\n"$seq}' input.fastq &gt; output.fasta
</pre><h3>Seqtk</h3><p>From the same developer, there is another option using a tool called&nbsp;<span><code>seqtk</code></span></p><pre>seqtk seq -a input.fastq &gt; output.fasta
</pre><p>Note that you can use either compressed or uncompressed files for this tool</p></div></div></div></div></div></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/9639/find-certain-filesdocuments-in-linux-os</guid>
	<pubDate>Sun, 06 Apr 2014 23:56:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/9639/find-certain-filesdocuments-in-linux-os</link>
	<title><![CDATA[Find certain files/documents in Linux OS]]></title>
	<description><![CDATA[<p>As bioinformaticians, we usually handle large datasets and can easily get lost among huge numbers of files and folders. Tracking down a missing file requires a powerful search command. The Linux find command is one of the most important and most frequently used commands on Linux systems. It searches for and lists files and directories that match the conditions you specify. You can find files by permissions, users, groups, file type, date, size and other criteria.<br /><br />In this article we share our day-to-day experience with the Linux find command in the form of examples. We show the 35 most used find command examples in Linux, divided into five parts from basic to advanced usage.</p><p><strong>Part I &ndash; Basic Find Commands for Finding Files with Names</strong><br />1. Find Files Using Name in Current Directory<br /><br />Find all the files whose name is gene.txt in the current working directory.<br /><br /># find . -name gene.txt<br /><br />./gene.txt<br /><br />2. Find Files Under Home Directory<br /><br />Find all the files under the /home directory with the name gene.txt.<br /><br /># find /home -name gene.txt<br /><br />/home/gene.txt<br /><br />3. Find Files Using Name and Ignoring Case<br /><br />Find all the files in the /home directory whose name is gene.txt, regardless of letter case.<br /><br /># find /home -iname gene.txt<br /><br />/home/gene.txt<br />/home/Gene.txt<br /><br />4. Find Directories Using Name<br /><br />Find all directories whose name is Gene in the / directory.<br /><br /># find / -type d -name Gene<br /><br />/Gene<br /><br />5. Find fasta Files Using Name<br /><br />Find all files whose name is gene.fasta in the current working directory.<br /><br /># find . -type f -name gene.fasta<br /><br />./gene.fasta<br /><br />6. 
Find all fasta Files in Directory<br /><br />Find all fasta files in a directory.<br /><br /># find . -type f -name "*.fasta"<br /><br />./gene.fasta<br />./cancer.fasta<br />./allgene.fasta<br /><br /><strong>Part II &ndash; Find Files Based on their Permissions</strong><br />7. Find Files With 777 Permissions<br /><br />Find all the files whose permissions are 777.<br /><br /># find . -type f -perm 0777 -print<br /><br />8. Find Files Without 777 Permissions<br /><br />Find all the files without permission 777.<br /><br /># find / -type f ! -perm 777<br /><br />9. Find SGID Files with 644 Permissions<br /><br />Find all the SGID bit files whose permissions are set to 644.<br /><br /># find / -perm 2644<br /><br />10. Find Sticky Bit Files with 551 Permissions<br /><br />Find all the Sticky Bit set files whose permissions are 551.<br /><br /># find / -perm 1551<br /><br />11. Find SUID Files<br /><br />Find all SUID set files.<br /><br /># find / -perm /u=s<br /><br />12. Find SGID Files<br /><br />Find all SGID set files.<br /><br /># find / -perm /g+s<br /><br />13. Find Readable Files<br /><br />Find all files that their owner can read (the test checks only the read bit, so writable files match too).<br /><br /># find / -perm /u=r<br /><br />14. Find Executable Files<br /><br />Find all Executable files.<br /><br /># find / -perm /a=x<br /><br />15. Find Files with 777 Permissions and Chmod to 644<br /><br />Find all 777 permission files and use the chmod command to set permissions to 644.<br /><br /># find / -type f -perm 0777 -print -exec chmod 644 {} \;<br /><br />16. Find Directories with 777 Permissions and Chmod to 755<br /><br />Find all 777 permission directories and use the chmod command to set permissions to 755.<br /><br /># find / -type d -perm 777 -print -exec chmod 755 {} \;<br /><br />17. Find and remove single File<br /><br />To find a single file called gene.txt and remove it.<br /><br /># find . -type f -name "gene.txt" -exec rm -f {} \;<br /><br />18. 
Find and remove Multiple Files<br /><br />To find and remove multiple files, such as .fa or .gb files, use:<br /><br /># find . -type f -name "*.fa" -exec rm -f {} \;<br /><br />OR<br /><br /># find . -type f -name "*.gb" -exec rm -f {} \;<br /><br />19. Find all Empty Files<br /><br />To find all empty files under a certain path.<br /><br /># find /tmp -type f -empty<br /><br />20. Find all Empty Directories<br /><br />To find all empty directories under a certain path.<br /><br /># find /tmp -type d -empty<br /><br />21. Find all Hidden Files<br /><br />To find all hidden files, use the command below.<br /><br /># find /tmp -type f -name ".*"<br /><br /><strong>Part III &ndash; Search Files Based On Owners and Groups</strong><br />22. Find Single File Based on User<br /><br />To find files called gene.txt that are owned by user root under the / directory.<br /><br /># find / -user root -name gene.txt<br /><br />23. Find all Files Based on User<br /><br />To find all files that belong to user rahul under the /home directory.<br /><br /># find /home -user rahul<br /><br />24. Find all Files Based on Group<br /><br />To find all files that belong to group developer under the /home directory.<br /><br /># find /home -group developer<br /><br />25. Find Particular Files of User<br /><br />To find all .txt files of user rahul under the /home directory.<br /><br /># find /home -user rahul -iname "*.txt"<br /><br /><strong>Part IV &ndash; Find Files and Directories Based on Date and Time</strong><br />26. Find Files Modified 50 Days Ago<br /><br />To find all the files which were modified 50 days ago.<br /><br /># find / -mtime 50<br /><br />27. Find Files Accessed 50 Days Ago<br /><br />To find all the files which were accessed 50 days ago.<br /><br /># find / -atime 50<br /><br />28. Find Last 50-100 Days Modified Files<br /><br />To find all the files which were modified more than 50 days ago and less than 100 days ago.<br /><br /># find / -mtime +50 -mtime -100<br /><br />29. 
Find Changed Files in Last 1 Hour<br /><br />To find all the files which were changed in the last hour.<br /><br /># find / -cmin -60<br /><br />30. Find Modified Files in Last 1 Hour<br /><br />To find all the files which were modified in the last hour.<br /><br /># find / -mmin -60<br /><br />31. Find Accessed Files in Last 1 Hour<br /><br />To find all the files which were accessed in the last hour.<br /><br /># find / -amin -60<br /><br /><strong>Part V &ndash; Find Files and Directories Based on Size</strong><br />32. Find 50MB Files<br /><br />To find all 50MB files, use:<br /><br /># find / -size 50M<br /><br />33. Find Size between 50MB &ndash; 100MB<br /><br />To find all the files which are greater than 50MB and less than 100MB.<br /><br /># find / -size +50M -size -100M<br /><br />34. Find and Delete Files Larger than 100MB<br /><br />To find all files larger than 100MB and delete them using one single command.<br /><br /># find / -size +100M -exec rm -rf {} \;<br /><br />35. Find Specific Files and Delete<br /><br />Find all .gb files larger than 10MB and delete them using one single command.<br /><br /># find / -type f -name "*.gb" -size +10M -exec rm {} \;</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43398/waafle-a-workflow-to-annotate-assemblies-and-find-lgt-events</guid>
	<pubDate>Thu, 23 Sep 2021 14:31:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43398/waafle-a-workflow-to-annotate-assemblies-and-find-lgt-events</link>
	<title><![CDATA[WAAFLE: a Workflow to Annotate Assemblies and Find LGT Events.]]></title>
	<description><![CDATA[<p><span>Lateral gene transfer (LGT) is an important mechanism for genome diversification in microbial communities, including the human microbiome. While methods exist to identify LGTs from sequenced isolate genomes, identifying LGTs from community metagenomes remains an open problem. To address this, we developed&nbsp;</span><span>WAAFLE</span><span>: a&nbsp;</span><span>W</span><span>orkflow to&nbsp;</span><span>A</span><span>nnotate&nbsp;</span><span>A</span><span>ssemblies and&nbsp;</span><span>F</span><span>ind&nbsp;</span><span>L</span><span>GT&nbsp;</span><span>E</span><span>vents.</span></p><p>Address of the bookmark: <a href="http://huttenhower.sph.harvard.edu/waafle" rel="nofollow">http://huttenhower.sph.harvard.edu/waafle</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44227/common-methods-to-discover-tandem-repeats</guid>
	<pubDate>Thu, 09 Mar 2023 02:40:52 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44227/common-methods-to-discover-tandem-repeats</link>
	<title><![CDATA[Common methods to discover tandem repeats]]></title>
	<description><![CDATA[<div><div><div><div><div><div><div><div><div><div><p>Tandem repeats are DNA sequences that are repeated in a contiguous manner in the genome. These sequences are often used as genetic markers and are important in many areas of genetics and genomics research. Here are some methods for discovering tandem repeats in genomes:</p><ol>
<li>
<p>Tandem Repeats Finder: Tandem Repeats Finder (TRF) is a software tool that identifies tandem repeats in DNA sequences. It is available for free download and operates on nucleotide sequences. The tool uses a statistical algorithm to identify repeats based on their length, copy number, and overall composition.</p>
</li>
<li>
<p>RepeatMasker: RepeatMasker is another software tool that can identify tandem repeats in DNA sequences. It works by comparing the input sequence to a database of known repeats and then identifies any tandem repeats that match those in the database.</p>
</li>
<li>
<p>PCR-based methods: Polymerase chain reaction (PCR) can be used to amplify and detect tandem repeats in genomic DNA. PCR primers are designed to flank the tandem repeat region, and amplification of the target DNA fragment can be visualized on a gel. This method can be useful for detecting novel tandem repeats and for genotyping.</p>
</li>
<li>
<p>Southern blotting: Southern blotting is a classic method for detecting DNA fragments in a sample. It can be used to detect tandem repeats by digesting genomic DNA with a restriction enzyme, separating the fragments by gel electrophoresis, and then probing the blot with a tandem repeat-specific probe.</p>
</li>
</ol><p>Overall, a combination of these methods can be used to comprehensively identify tandem repeats in genomes.</p></div></div></div></div></div></div></div></div></div></div>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/11457/commercial-and-public-next-gen-seq-ngs-software</guid>
	<pubDate>Tue, 03 Jun 2014 20:45:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/11457/commercial-and-public-next-gen-seq-ngs-software</link>
	<title><![CDATA[Commercial and public next-gen-seq (NGS) software]]></title>
	<description><![CDATA[<p><strong>Integrated solutions</strong><br /> <a href="http://www.clcbio.com/index.php?id=1240" target="_blank">CLCbio Genomics Workbench</a> - <em>de novo</em> and reference assembly of Sanger, Roche FLX, Illumina, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection, ChIP-seq, browser and other features. Commercial. Windows, Mac OS X and Linux.<br /><a href="http://g2.trac.bx.psu.edu/" target="_blank">Galaxy</a> - Galaxy = interactive and reproducible genomics. A job web portal.<br /> <a href="http://www.genomatix.de/products/index.html" target="_blank">Genomatix</a> - Integrated Solutions for Next Generation Sequencing data analysis.<br /> <a href="http://www.jmp.com/software/genomics/" target="_blank">JMP Genomics</a> - Next gen visualization and statistics tool from SAS. They are <a href="http://www.marketwatch.com/news/story/JMPR-Genomics-NCGR-Partnership-Foster/story.aspx?guid=%7B7AC9DE36-B6AA-4EDE-9CD5-633B29FE6154%7D" target="_blank">working with NCGR</a> to refine this tool and produce others.<br /> <a href="http://softgenetics.com/NextGENe.html" target="_blank">NextGENe</a> - <em>de novo</em> and reference assembly of Illumina, SOLiD and Roche FLX data. Uses a novel Condensation Assembly Tool approach where reads are joined via "anchors" into mini-contigs before assembly. Includes SNP detection, ChIP-seq, browser and other features. Commercial. Win or MacOS.<br /><a href="http://www.partek.com" target="_blank" title="Partek Incorporated">Partek</a>&nbsp;<span>- Commercial software for NGS, microarray, and qPCR data analysis. Streamlined analysis workflows for: ChIP-Seq, RNA-Seq, DNA-Seq, DNA Methylation, Gene Expression, Exon, miRNA Expression, Copy Number, Allele-Specific Copy Number, LOH, Association, Trio Analysis, and Tiling. 
Supports all commercial sequencing and microarray technologies.&nbsp;</span><br /> <a href="http://www.dnastar.com/products/SMGA.php" target="_blank">SeqMan Genome Analyser</a> - Software for Next Generation sequence assembly of Illumina, Roche FLX and Sanger data integrating with Lasergene Sequence Analysis software for additional analysis and visualization capabilities. Can use a hybrid templated/de novo approach. Commercial. Win or Mac OS X.<br /><a href="http://1001genomes.org/downloads/shore.html" target="_blank">SHORE</a> - SHORE, for Short Read, is a mapping and analysis pipeline for short DNA sequences produced on an Illumina Genome Analyzer. A suite created by the 1001 Genomes project. Source for POSIX.<br /> <a href="http://www.realtimegenomics.com/" target="_blank">SlimSearch</a> - Fledgling commercial product.<br />Synamatix has SXOligoSearch (<a href="http://synasite.mgrc.com.my:8080/sxog/NewSXOligoSearch.php" target="_blank">http://synasite.mgrc.com.my:8080/sxo...ligoSearch.php</a>)<br />The SWIFT suite is a software collection for fast index-based sequence comparison. It contains the following programs: SWIFT &mdash; fast local alignment search, guaranteeing to find epsilon-matches between two sequences; SWIFT BALSAM &mdash; a very fast program to find semiglobal non-gapped alignments based on k-mer seeds. <a href="http://bibiserv.techfak.uni-bielefeld.de/swift/" target="_blank">http://bibiserv.techfak.uni-bielefeld.de/swift/</a><br /><a href="http://http//bioinf.comav.upv.es/svn/biolib/biolib/src/" target="_blank">biolib</a> is a library and a set of scripts targeted at NGS. 
There are modules to: clean sequences (Sanger, 454, Illumina), parse caf, ace and bowtie map files, clean and filter contigs, look for SNPs and indels, filter SNPs, and compute statistics for reads, contigs and SNPs.</p><p><br /> <strong>Align/Assemble to a reference</strong><br /> <a href="https://secure.genome.ucla.edu/index.php/BFAST" target="_blank">BFAST</a> - Blat-like Fast Accurate Search Tool. Written by Nils Homer, Stanley F. Nelson and Barry Merriman at UCLA.<br /><a href="http://bowtie-bio.sourceforge.net/" target="_blank">Bowtie</a> - Ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of 25 million reads per hour on a typical workstation with 2 gigabytes of memory. Uses a Burrows-Wheeler-Transformed (BWT) index. <a href="http://seqanswers.com/forums/showthread.php?t=706" target="_blank">Link to discussion thread here</a>. Written by Ben Langmead and Cole Trapnell. Linux, Windows, and Mac OS X.<br /> <a href="http://maq.sourceforge.net/" target="_blank">BWA</a> - Heng Li's BWT Alignment program - a progression from Maq. BWA is a fast, lightweight tool that aligns short sequences to a sequence database, such as the human reference genome. By default, BWA finds an alignment within edit distance 2 to the query sequence. C++ source.<br /> <a href="http://bioinfo.cgrb.oregonstate.edu/docs/solexa/" target="_blank">ELAND</a> - Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome. Written by Illumina author Anthony J. Cox for the Solexa 1G machine.<br /> <a href="http://www.ebi.ac.uk/%7Eguy/exonerate/" target="_blank">Exonerate</a> - Various forms of pairwise alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. Authors are Guy St C Slater and Ewan Birney from EMBL. 
C for POSIX.<br /> <a href="http://1001genomes.org/downloads/genomemapper.html" target="_blank">GenomeMapper</a> - GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. A tool created by the 1001 Genomes project. Source for POSIX.<br /> <a href="http://www.gene.com/share/gmap/" target="_blank">GMAP</a> - GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. Developed by Thomas Wu and Colin Watanabe at Genentech. C/Perl for Unix.<br /> <a href="http://dna.cs.byu.edu/gnumap/" target="_blank">gnumap</a> - The Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. It seeks to align reads from nonunique repeats using statistics. From authors at Brigham Young University. C source/Unix.<br /> <a href="http://sourceforge.net/projects/maq/" target="_blank">MAQ</a> - Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data. Written by Heng Li from the Sanger Centre. Features extensive supporting tools for DIP/SNP detection, etc. C++ source<br /> <a href="http://bioinformatics.bc.edu/marthlab/Mosaik" target="_blank">MOSAIK</a> - MOSAIK produces gapped alignments using the Smith-Waterman algorithm. Features a number of support tools. Support for Roche FLX, Illumina, SOLiD, and Helicos. Written by Michael Str&ouml;mberg at Boston College. Win/Linux/MacOSX<br /> <a href="http://mrfast.sourceforge.net/" target="_blank">MrFAST and MrsFAST</a> - mrFAST &amp; mrsFAST are designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner. Robust to INDELs and MrsFAST has a bisulphite mode. Authors are from the University of Washington. 
C as source.<br /> <a href="http://mummer.sourceforge.net/" target="_blank">MUMmer</a> - MUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Released as a package providing an efficient suffix tree library, seed-and-extend alignment, SNP detection, repeat detection, and visualization tools. Version 3.0 was developed by Stefan Kurtz, Adam Phillippy, Arthur L Delcher, Michael Smoot, Martin Shumway, Corina Antonescu and Steven L Salzberg - most of whom are at The Institute for Genomic Research in Maryland, USA. POSIX OS required.<br /> <a href="http://www.novocraft.com/index.html" target="_blank">Novocraft</a> - Tools for reference alignment of paired-end and single-end Illumina reads. Uses a Needleman-Wunsch algorithm. Can support Bis-Seq. Commercial. Available free for evaluation, educational use and for use on open not-for-profit projects. Requires Linux or Mac OS X.<br /> <a href="http://pass.cribi.unipd.it/cgi-bin/pass.pl" target="_blank">PASS</a> - It supports Illumina, SOLiD and Roche-FLX data formats and allows the user to modulate very finely the sensitivity of the alignments. Spaced seed initial filter, then NW dynamic algorithm to a SW(like) local alignment. Authors are from CRIBI in Italy. Win/Linux.<br /> <a href="http://rulai.cshl.edu/rmap/" target="_blank">RMAP</a> - Assembles 20 - 64 bp Illumina reads to a FASTA reference genome. By Andrew D. Smith and Zhenyu Xuan at CSHL. (published in BMC Bioinformatics). POSIX OS required.<br /> <a href="http://biogibbs.stanford.edu/%7Ejiangh/SeqMap/" target="_blank">SeqMap</a> - Supports up to 5 or more bp mismatches/INDELs. Highly tunable. Written by Hui Jiang from the Wong lab at Stanford. Builds available for most OS's.<br /> <a href="http://compbio.cs.toronto.edu/shrimp/" target="_blank">SHRiMP</a> - Assembles to a reference sequence. Developed with Applied Biosystems' colourspace genomic representation in mind. 
Authors are Michael Brudno and Stephen Rumble at the University of Toronto. POSIX.<br /> <a href="http://www.bcgsc.ca/platform/bioinfo/software/slider" target="_blank"><span style="text-decoration: underline;">Slider</span></a> - An application for the Illumina Sequence Analyzer output that uses the probability files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. Authors are from BCGSC. Paper is <a href="http://seqanswers.com/forums/showthread.php?t=740" target="_blank">here</a>.<br /> <a href="http://soap.genomics.org.cn/" target="_blank">SOAP</a> - SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The updated version uses a BWT. Can call SNPs and INDELs. Author is Ruiqiang Li at the Beijing Genomics Institute. C++, POSIX.<br /> <a href="http://www.sanger.ac.uk/Software/analysis/SSAHA/" target="_blank">SSAHA</a> - SSAHA (Sequence Search and Alignment by Hashing Algorithm) is a tool for rapidly finding near exact matches in DNA or protein databases using a hash table. Developed at the Sanger Centre by Zemin Ning, Anthony Cox and James Mullikin. C++ for Linux/Alpha.<br /> <a href="http://socs.biology.gatech.edu/" target="_blank">SOCS</a> - Aligns SOLiD data. SOCS is built on an iterative variation of the Rabin-Karp string search algorithm, which uses hashing to reduce the set of possible matches, drastically increasing search speed. Authors are Ondov B, Varadarajan A, Passalacqua KD and Bergman NH.<br /> <a href="http://bibiserv.techfak.uni-bielefeld.de/swift/welcome.html" target="_blank">SWIFT</a> - The SWIFT suite is a software collection for fast index-based sequence comparison. It contains: SWIFT &mdash; fast local alignment search, guaranteeing to find epsilon-matches between two sequences. SWIFT BALSAM &mdash; a very fast program to find semiglobal non-gapped alignments based on k-mer seeds. 
Authors are Kim Rasmussen (SWIFT) and Wolfgang Gerlach (SWIFT BALSAM).<br /> <a href="http://synasite.mgrc.com.my:8080/sxog/NewSXOligoSearch.php" target="_blank">SXOligoSearch</a> - SXOligoSearch is a commercial platform offered by the Malaysian-based <a href="http://www.synamatix.com/" target="_blank">Synamatix</a>. Will align Illumina reads against a range of Refseq RNA or NCBI genome builds for a number of organisms. Web Portal. OS independent.<br /> <a href="http://www.vmatch.de/" target="_blank">Vmatch</a> - A versatile software tool for efficiently solving large scale sequence matching tasks. Vmatch subsumes the software tool REPuter, but is much more general, with a very flexible user interface, and improved space and time requirements. Essentially a large string matching toolbox. POSIX.<br /> <a href="http://www.bioinformaticssolutions.com/products/zoom/index.php" target="_blank">Zoom</a> - ZOOM (Zillions Of Oligos Mapped) is designed to map millions of short reads, generated by next-generation sequencing technology, back to the reference genomes, and carry out post-analysis. ZOOM is developed to be highly accurate, flexible, and user-friendly with speed being a critical priority. Commercial. Supports Illumina and SOLiD data.<br />NCGR uses GMAP (<a href="http://www.gene.com/share/gmap/" target="_blank">http://www.gene.com/share/gmap/</a>) to align Solexa reads. GMAP is free, though.<br />Exonerate (<a href="http://www.ebi.ac.uk/%7Eguy/exonerate/" target="_blank">http://www.ebi.ac.uk/~guy/exonerate/</a>)<br /> MUMmer (<a href="http://mummer.sourceforge.net/" target="_blank">http://mummer.sourceforge.net/</a>)<br /> A short-read mapper called gnumap (<a href="http://dna.cs.byu.edu/gnumap/" target="_blank">http://dna.cs.byu.edu/gnumap/</a>), designed to increase accuracy for reads with duplicate matches. 
Open source, creates viewable output (with Affy's Integrated Genome Browser), and produces results very similar to novocraft's.<br /><a href="http://socs.biology.gatech.edu/" target="_blank">SOCS</a> (short oligonucleotides in color space)<br />BFAST <a href="https://secure.genome.ucla.edu/index.php/BFAST" target="_blank">https://secure.genome.ucla.edu/index.php/BFAST</a></p><p><br /> <strong><em>De novo</em> Align/Assemble</strong><br /> <a href="http://www.bcgsc.ca/platform/bioinfo/software/abyss" target="_blank">ABySS</a> - Assembly By Short Sequences. ABySS is a de novo sequence assembler that is designed for very short reads. The single-processor version is useful for assembling genomes up to 40-50 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes. By Simpson JT and others at the Canada's Michael Smith Genome Sciences Centre. C++ as source. <br /> <a href="http://www.broad.mit.edu/science/programs/genome-biology/computational-rd/computational-research-and-development" target="_blank">ALLPATHS</a> - ALLPATHS: De novo assembly of whole-genome shotgun microreads. ALLPATHS is a whole genome shotgun assembler that can generate high quality assemblies from short reads. Assemblies are presented in a graph form that retains ambiguities, such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. Broad Institute.<br /> <a href="http://www.genomic.ch/edena.php" target="_blank">Edena</a> - Edena (Exact DE Novo Assembler) is an assembler dedicated to process the millions of very short reads produced by the Illumina Genome Analyzer. Edena is based on the traditional overlap layout paradigm. By D. Hernandez, P. Fran&ccedil;ois, L. Farinelli, M. Osteras, and J. Schrenzel. Linux/Win.<br /> <a href="http://euler-assembler.ucsd.edu/portal/" target="_blank">EULER-SR</a> - Short read <em>de novo</em> assembly. By Mark J. Chaisson and Pavel A. 
Pevzner from UCSD (published in Genome Research). Uses a de Bruijn graph approach.<br /> <a href="http://chevreux.org/projects_mira.html" target="_blank">MIRA2</a> - MIRA (Mimicking Intelligent Read Assembly) is able to perform true hybrid de-novo assemblies using reads gathered through 454 sequencing technology (GS20 or GS FLX). Compatible with 454, Solexa and Sanger data. Linux OS required.<br /> <a href="http://www.seqan.de/projects/consensus.html" target="_blank">SEQAN</a> - A Consistency-based Consensus Algorithm for De Novo and Reference-guided Sequence Assembly of Short Reads. By Tobias Rausch and others. C++, Linux/Win.<br /> <a href="http://sharcgs.molgen.mpg.de/" target="_blank">SHARCGS</a> - De novo assembly of short reads. Authors are Dohm JC, Lottaz C, Borodina T and Himmelbauer H. from the Max-Planck-Institute for Molecular Genetics.<br /> <a href="http://www.bcgsc.ca/platform/bioinfo/software/ssake" target="_blank">SSAKE</a> - The Short Sequence Assembly by K-mer search and 3' read Extension (SSAKE) is a genomics application for aggressively assembling millions of short nucleotide sequences by progressively searching for perfect 3'-most k-mers using a DNA prefix tree. Authors are Ren&eacute; Warren, Granger Sutton, Steven Jones and Robert Holt from Canada's Michael Smith Genome Sciences Centre. Perl/Linux.<br /> <a href="http://soap.genomics.org.cn/" target="_blank">SOAPdenovo</a> - Part of the SOAP suite. See above. <br /> <a href="https://sourceforge.net/projects/vcake" target="_blank">VCAKE</a> - De novo assembly of short reads with robust error correction. An improvement on early versions of SSAKE.<br /> <a href="http://www.ebi.ac.uk/%7Ezerbino/velvet/" target="_blank">Velvet</a> - Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Needs about 20-25X coverage and paired reads. 
Developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI).<br />SOAP (<a href="http://soap.genomics.org.cn" target="_blank">http://soap.genomics.org.cn</a>) by Ruiqiang Li, as has been pointed out by ECO.<br />Euler-SR (Euler-Short Reads Assembly, <a href="http://euler-assembler.ucsd.edu/portal/" target="_blank">http://euler-assembler.ucsd.edu/portal/</a>) by Mark J. Chaisson and Pavel A. Pevzner from UCSD. (published in Genome Research)<br />RMAP (A program for mapping Solexa reads, <a href="http://rulai.cshl.edu/rmap/" target="_blank">http://rulai.cshl.edu/rmap/</a>) by Andrew D. Smith and Zhenyu Xuan at CSHL. (published in BMC Bioinformatics)<br />A short-read aligner called Bowtie (<a href="http://bowtie-bio.sourceforge.net/" target="_blank">http://bowtie-bio.sourceforge.net/</a>), designed for fast mapping of Illumina reads.<br /> <br /> <strong>SNP/Indel Discovery</strong><br /> <a href="http://www.sanger.ac.uk/Software/analysis/ssahaSNP/" target="_blank">ssahaSNP</a> - ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence. Highly repetitive elements are filtered out by ignoring those kmer words with high occurrence numbers. More tuned for ABI Sanger reads. Developers are Adam Spargo and Zemin Ning from the Sanger Centre. Compaq Alpha, Linux-64, Linux-32, Solaris and Mac.<br /> <a href="http://bioinformatics.bc.edu/marthlab/PbShort" target="_blank">PolyBayesShort</a> - A re-incarnation of the PolyBayes SNP discovery tool developed by Gabor Marth at Washington University. This version is specifically optimized for the analysis of large numbers (millions) of high-throughput next-generation sequencer reads, aligned to whole chromosomes of model organism or mammalian genomes. Developers at Boston College. 
Linux-64 and Linux-32.<br /> <a href="http://bioinformatics.bc.edu/marthlab/PyroBayes" target="_blank">PyroBayes</a> - PyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines. It was designed to assign more accurate base quality estimates to the 454 pyrosequences. Developers at Boston College.<br />Maq is also able to find SNPs with its own alignment. It has a graphical viewer, but again for its own alignment format.<br />SSAHA has been optimized for short-reads, too. But yes, SSAHASNP appears in your "SNP/INDEL discovery" category.<br /> <br /> <strong>Genome Annotation/Genome Browser/Alignment Viewer/Assembly Database</strong><br /> <a href="http://bioinformatics.bc.edu/marthlab/EagleView" target="_blank">EagleView</a> - An information-rich genome assembler viewer. EagleView can display a dozen different types of information including base quality and flowgram signal. Developers at Boston College.<br /> <a href="http://www.sanger.ac.uk/Software/analysis/lookseq/" target="_blank">LookSeq</a> - LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data. LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualization of putative single nucleotide and structural variation. From the Sanger Centre.<br /> <a href="http://evolution.sysu.edu.cn/mapview/" target="_blank">MapView</a> - MapView: visualization of short reads alignment on desktop computer. From the Evolutionary Genomics Lab at Sun-Yat Sen University, China. Linux.<br /> <a href="http://www.bcgsc.ca/platform/bioinfo/software/sam" target="_blank">SAM</a> - Sequence Assembly Manager. Whole Genome Assembly (WGA) Management and Visualization Tool. It provides a generic platform for manipulating, analyzing and viewing WGA data, regardless of input type. 
Developers are Rene Warren, Yaron Butterfield, Asim Siddiqui and Steven Jones at Canada's Michael Smith Genome Sciences Centre. MySQL backend and Perl-CGI web-based frontend/Linux. <br /> <a href="http://staden.sourceforge.net/" target="_blank">STADEN</a> - Includes GAP4. GAP5, once completed, will handle next-gen sequencing data. A partially implemented test version is available <a href="https://sourceforge.net/project/show...kage_id=256957" target="_blank">here</a><br /> <a href="http://www.bcgsc.ca/platform/bioinfo/software/xmatchview" target="_blank">XMatchView</a> - A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada's Michael Smith Genome Sciences Centre. Python/Win or Linux.<br /> <br /> <strong>Counting e.g. ChIP-Seq, Bis-Seq, CNV-Seq</strong><br /> <a href="http://epigenomics.mcdb.ucla.edu/BS-Seq/download.html" target="_blank">BS-Seq</a> - The source code and data for the "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning" Nature paper by <a href="http://www.ncbi.nlm.nih.gov/sites/entrez?holding=&amp;db=pubmed&amp;cmd=search&amp;term=Shotgun%20Bisulphite%20Sequencing" target="_blank">Cokus et al.</a> (Steve Jacobsen's lab at UCLA). POSIX.<br /> <a href="http://woldlab.caltech.edu/chipseq/" target="_blank">ChIPSeq</a> - Program used by Johnson et al. (2007) in their Science publication.<br /> <a href="http://tiger.dbs.nus.edu.sg/cnv-seq/" target="_blank">CNV-Seq</a> - CNV-seq, a new method to detect copy number variation using high-throughput sequencing. Chao Xie and Martti T Tammi at the National University of Singapore. Perl/R.<br /> <a href="http://www.bcgsc.ca/platform/bioinfo/software/findpeaks" target="_blank">FindPeaks</a> - Performs analysis of ChIP-Seq experiments. 
It uses a naive algorithm for identifying regions of high coverage, which represent Chromatin Immunoprecipitation enrichment of sequence fragments, indicating the location of a bound protein of interest. Original algorithm by Matthew Bainbridge, in collaboration with Gordon Robertson. Current code and implementation by Anthony Fejes. Authors are from Canada's Michael Smith Genome Sciences Centre. JAVA/OS independent. Latest versions available as part of the <a href="http://vancouvershortr.sourceforge.net/" target="_blank">Vancouver Short Read Analysis Package</a><br /> <a href="http://liulab.dfci.harvard.edu/MACS/" target="_blank">MACS</a> - Model-based Analysis for ChIP-Seq. MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. Written by Yong Zhang and Tao Liu from Xiaole Shirley Liu's Lab. <br /> <a href="http://www.gersteinlab.org/proj/PeakSeq/" target="_blank">PeakSeq</a> - PeakSeq: Systematic Scoring of ChIP-Seq Experiments Relative to Controls. A two-pass approach for scoring ChIP-Seq data relative to controls. The first pass identifies putative binding sites and compensates for variation in the mappability of sequences across the genome. The second pass filters out sites that are not significantly enriched compared to the normalized input DNA and computes a precise enrichment and significance. By Rozowsky J et al. C/Perl.<br /> <a href="http://mendel.stanford.edu/sidowlab/downloads/quest/" target="_blank">QuEST</a> - Quantitative Enrichment of Sequence Tags. Sidow and Myers Labs at Stanford. 
From the 2008 publication <a href="http://www.ncbi.nlm.nih.gov/pubmed/18711362" target="_blank">Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data</a>. (C++)<br /> <a href="http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/" target="_blank">SISSRs</a> - Site Identification from Short Sequence Reads. BED file input. Raja Jothi @ NIH. Perl.<br />SeqMap (<a href="http://biogibbs.stanford.edu/%7Ejiangh/SeqMap/" target="_blank">http://biogibbs.stanford.edu/~jiangh/SeqMap/</a>) - works like ELAND; can handle 3 or more bp mismatches and also indels.<br />A ChIP-Seq analysis tool:&nbsp; <a href="http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/" target="_blank">http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/</a></p><p>See also <a href="http://seqanswers.com/forums/showthread.php?t=742" target="_blank">this thread</a> for ChIP-Seq, until I get time to update this list.<br /> <br /> <strong>Alternate Base Calling</strong><br /> <a href="http://svitsrv25.epfl.ch/R-doc/library/Rolexa/html/00Index.html" target="_blank">Rolexa</a> - R-based framework for base calling of Solexa data. Project <a href="http://www.biomedcentral.com/1471-2105/9/431" target="_blank">publication</a><br /> <a href="http://hannonlab.cshl.edu/Alta-Cyclic/main.html" target="_blank">Alta-cyclic</a> - "a novel Illumina Genome-Analyzer (Solexa) base caller"<br /> <br /> <strong>Transcriptomics</strong><br /> <a href="http://woldlab.caltech.edu/rnaseq/" target="_blank">ERANGE</a> - Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq. Supports Bowtie, BLAT and ELAND. From the Wold lab.<br /> <a href="http://www.genoscope.cns.fr/externe/gmorse/" target="_blank">G-Mo.R-Se</a> - G-Mo.R-Se is a method aimed at using RNA-Seq short reads to build de novo gene models. 
First, candidate exons are built directly from the positions of the reads mapped on the genome (without any ab initio assembly of the reads), and all the possible splice junctions between those exons are tested against unmapped reads. From CNS in France.<br /> <a href="http://evolution.sysu.edu.cn/english/software/mapnext.htm" target="_blank">MapNext</a> - MapNext: A software tool for spliced and unspliced alignments and SNP detection of short sequence reads. From the Evolutionary Genomics Lab at Sun-Yat Sen University, China.<br /> <a href="http://www.fml.tuebingen.mpg.de/raetsch/suppl/qpalma" target="_blank">QPalma</a> - Optimal Spliced Alignments of Short Sequence Reads. Authors are Fabio De Bona, Stephan Ossowski, Korbinian Schneeberger, and Gunnar R&auml;tsch. A paper is <a href="http://www.fml.tuebingen.mpg.de/raetsch/suppl/qpalma/qpalma-final.pdf" target="_blank">available</a>.<br /> <a href="http://biogibbs.stanford.edu/%7Ejiangh/rsat/" target="_blank">RSAT</a> - RSAT: RNA-Seq Analysis Tools. RSAT is developed and maintained by Hui Jiang at Stanford University.<br /> <a href="http://tophat.cbcb.umd.edu/" target="_blank">TopHat</a> - TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. 
TopHat is a collaborative effort between the University of Maryland and the University of California, Berkeley.<br />NGS-Trex: Next Generation Sequencing Transcriptome profile explorer (<a href="http://www.biomedcentral.com/1471-2105/14/S7/S10" target="_blank">http://www.biomedcentral.com/1471-2105/14/S7/S10</a>)</p><p>Reference</p><p>Illumina has a software list: <a href="http://www.illumina.com/pagesnrn.ilmn?ID=245" target="_blank">http://www.illumina.com/pagesnrn.ilmn?ID=245</a>.</p><p>Some software is also listed in this blog (<a href="http://www.fejes.ca/labels/DNA.html" target="_blank">http://www.fejes.ca/labels/DNA.html</a>).</p><p><a href="http://seqanswers.com/wiki/Software" target="_blank">http://seqanswers.com/wiki/Software</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27099/rasttk-algorithm-for-building-custom-annotation-pipelines-and-annotating-batches-of-genomes</guid>
	<pubDate>Wed, 27 Apr 2016 11:07:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27099/rasttk-algorithm-for-building-custom-annotation-pipelines-and-annotating-batches-of-genomes</link>
	<title><![CDATA[RASTtk : algorithm for building custom annotation pipelines and annotating batches of genomes]]></title>
	<description><![CDATA[<p>The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.</p>
<p>More at http://www.nature.com/articles/srep08365</p><p>Address of the bookmark: <a href="http://rast.nmpdr.org/" rel="nofollow">http://rast.nmpdr.org/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31209/dial</guid>
	<pubDate>Wed, 01 Mar 2017 08:42:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31209/dial</link>
	<title><![CDATA[DIAL]]></title>
	<description><![CDATA[<p>A computational pipeline for identifying single-base substitutions between two closely related genomes without the help of a reference genome. DIAL works even when the depth of coverage is insufficient for de novo assembly, and it can be extended to determine small insertions/deletions. Our main motivation is to use this tool to survey the genetic diversity of endangered species as the identified sequence differences can be used to design genotyping arrays to assist in the species' management.</p>
<p>http://www.bx.psu.edu/~ratan/</p><p>Address of the bookmark: <a href="http://www.bx.psu.edu/miller_lab/" rel="nofollow">http://www.bx.psu.edu/miller_lab/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36632/tulip-the-uncorrected-long-read-integration-pipeline</guid>
	<pubDate>Tue, 15 May 2018 09:06:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36632/tulip-the-uncorrected-long-read-integration-pipeline</link>
	<title><![CDATA[TULIP - The Uncorrected Long read Integration Pipeline]]></title>
	<description><![CDATA[TULIP currently consists of two Perl scripts, tulipseed.perl and tulipbulb.perl. These are very much intended as prototypes, and additional components and/or implementations are likely to follow.

Tulipseed takes as input alignment files of long reads to sparse short seeds, and outputs a graph and scaffold structures.<p>Address of the bookmark: <a href="https://github.com/Generade-nl/TULIP" rel="nofollow">https://github.com/Generade-nl/TULIP</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>