<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44223?offset=180</link>
	<atom:link href="https://bioinformaticsonline.com/related/44223?offset=180" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27035/spades</guid>
	<pubDate>Tue, 19 Apr 2016 08:37:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27035/spades</link>
	<title><![CDATA[SPAdes]]></title>
	<description><![CDATA[<p>SPAdes &ndash; St. Petersburg genome assembler &ndash; is intended for both standard isolates and single-cell MDA bacteria assemblies. This manual will help you to install and run SPAdes. SPAdes version 3.7.1 was released under GPLv2 on March 8, 2016 and can be downloaded from <a href="http://bioinf.spbau.ru/en/spades" target="_blank">http://bioinf.spbau.ru/en/spades</a>.</p>
<p>Manual at http://spades.bioinf.spbau.ru/release3.7.1/manual.html</p><p>Address of the bookmark: <a href="http://bioinf.spbau.ru/spades" rel="nofollow">http://bioinf.spbau.ru/spades</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27328/platanus</guid>
	<pubDate>Fri, 13 May 2016 05:12:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27328/platanus</link>
	<title><![CDATA[Platanus]]></title>
	<description><![CDATA[<p>Platanus is a novel <em>de novo</em> sequence assembler that can reconstruct genomic sequences of highly heterozygous diploids from massively parallel shotgun sequencing data.</p>
<p>The latest version is <a href="http://platanus.bio.titech.ac.jp/platanus/?page_id=14">1.2.4</a>.</p>
<p>To cite Platanus, please use the following:</p>
<p>Kajitani R, Toshimoto K, Noguchi H, Toyoda A, Ogura Y, Okuno M, Yabana M, Harada M, Nagayasu E, Maruyama H, Kohara Y, Fujiyama A, Hayashi T, Itoh T, &ldquo;Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads&rdquo;.&nbsp;Genome Res. 2014 Aug;24(8):1384-95. doi: 10.1101/gr.170720.113. [<a href="http://www.ncbi.nlm.nih.gov/pubmed/24755901">abstract</a> |<a href="http://genome.cshlp.org/content/24/8/1384.long"> full text</a>]</p><p>Address of the bookmark: <a href="http://platanus.bio.titech.ac.jp/" rel="nofollow">http://platanus.bio.titech.ac.jp/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/30140/cutadapt</guid>
	<pubDate>Wed, 14 Dec 2016 09:59:52 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/30140/cutadapt</link>
	<title><![CDATA[Cutadapt]]></title>
	<description><![CDATA[<p>Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.</p>
<p>Cutadapt helps with these trimming tasks by finding the adapter or primer sequences in an error-tolerant way. It can also modify and filter reads in various ways. Adapter sequences can contain IUPAC wildcard characters. Paired-end reads and even colorspace data are also supported. If you want, you can also just demultiplex your input data, without removing adapter sequences at all.</p>
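<p>As a rough illustration of what "error-tolerant" adapter finding means, here is a minimal Python sketch. It is not Cutadapt's algorithm or API: the functions <code>find_adapter</code> and <code>trim</code>, the parameter names and the example sequences are all hypothetical, and the sketch counts mismatches only, with no insertions, deletions or IUPAC wildcards.</p>

```python
def find_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Return the start of the leftmost acceptable adapter match, or None.

    Toy model (an assumption for illustration): mismatches only, and a match
    may run off the 3' end of the read, so a partial adapter is also found.
    """
    for start in range(len(read) - min_overlap + 1):
        # compare as much of the adapter as fits before the read ends
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            1 for a, b in zip(read[start:start + overlap], adapter) if a != b
        )
        # the error budget scales with the length of the compared region
        allowed = int(max_error_rate * overlap)
        if allowed >= mismatches:
            return start
    return None


def trim(read, adapter, **kwargs):
    """Remove the adapter and everything after it; return the read unchanged if no match."""
    start = find_adapter(read, adapter, **kwargs)
    return read if start is None else read[:start]


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"  # example sequence, chosen for illustration
    print(trim("ACGTACGTAC" + "AGATCGGTAGAGC" + "TTT", adapter))
```

<p>With the default error rate of 0.1, a 13-base adapter tolerates one mismatch, while a short partial match at the end of the read must be exact.</p>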
<p>Cutadapt comes with an extensive suite of automated tests and is available under the terms of the MIT license.</p>
<p>If you use cutadapt, please cite&nbsp;<a href="http://dx.doi.org/10.14806/ej.17.1.200">DOI:10.14806/ej.17.1.200</a>&nbsp;.</p>
<p>More at&nbsp;https://github.com/marcelm/cutadapt</p><p>Address of the bookmark: <a href="http://cutadapt.readthedocs.io/en/stable/guide.html" rel="nofollow">http://cutadapt.readthedocs.io/en/stable/guide.html</a></p>]]></description>
	<dc:creator>Bulbul</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32633/a-post-assembly-genome-improvement-toolkit-pagit-to-obtain-annotated-genomes-from-contigs</guid>
	<pubDate>Fri, 12 May 2017 10:50:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32633/a-post-assembly-genome-improvement-toolkit-pagit-to-obtain-annotated-genomes-from-contigs</link>
	<title><![CDATA[A Post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs]]></title>
	<description><![CDATA[<p>PAGIT addresses the need for software to generate high quality draft genomes. It is based on a series of programs that we developed:</p>
<p><a href="https://sourceforge.net/projects/abacas/files/">ABACAS</a>, that is able to contiguate contigs from a de novo assembly against a closely related reference.</p>
<p><a href="https://sourceforge.net/projects/image2/files/">IMAGE</a>, an iterative approach for closing gaps in assembled genomes using mate pair information. It is able to close gaps left open by the assembler in a draft genome, even when using the same data sets as used by the original assembler.</p>
<p><a href="http://icorn.sourceforge.net/">iCORN</a>, that enables errors in the consensus sequence to be corrected by iteratively mapping reads to the current assembly. An improved version, especially for correcting Pacific Biosciences (PacBio) assemblies, can be found&nbsp;<a href="ftp://ftp.sanger.ac.uk/pub4/resources/software/pagit/ICORN2/icorn2.V0.95.tgz">here</a>.</p>
<p><a href="https://ratt.svn.sourceforge.net/svnroot/ratt">RATT</a>, a tool to transfer the annotation from a reference genome, or an earlier assembly, onto the latest assembly.</p>
<p>PAGIT bundles these programs and makes them more accessible to users.</p><p>Address of the bookmark: <a href="http://www.sanger.ac.uk/science/tools/pagit" rel="nofollow">http://www.sanger.ac.uk/science/tools/pagit</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35540/hinge-long-read-assembly-achieves-optimal-repeat-resolution</guid>
	<pubDate>Wed, 07 Feb 2018 09:40:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35540/hinge-long-read-assembly-achieves-optimal-repeat-resolution</link>
	<title><![CDATA[HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution]]></title>
	<description><![CDATA[<p>Software accompanying "HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution"</p>
<ul>
<li>
<p>Preprint:&nbsp;<a href="http://biorxiv.org/content/early/2016/08/01/062117">http://biorxiv.org/content/early/2016/08/01/062117</a></p>
</li>
<li>
<p>Paper:&nbsp;<a href="http://genome.cshlp.org/content/27/5/747.full">http://genome.cshlp.org/content/27/5/747.full</a></p>
</li>
<li>
<p>An ipython notebook to reproduce results in the paper can be found in this&nbsp;<a href="https://github.com/govinda-kamath/HINGE-analyses">repository</a>.</p>
</li>
</ul>
<p>HINGE is an OLC (Overlap-Layout-Consensus) assembler. The idea of the pipeline is shown below.</p>
<p><a href="https://github.com/HingeAssembler/HINGE/blob/master/misc/High_level_overview.png" target="_blank"><img src="https://github.com/HingeAssembler/HINGE/raw/master/misc/High_level_overview.png" alt="image" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://github.com/HingeAssembler/HINGE" rel="nofollow">https://github.com/HingeAssembler/HINGE</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36514/evidentialgene-tr2aacds-mrna-transcript-assembly-software</guid>
	<pubDate>Tue, 08 May 2018 04:39:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36514/evidentialgene-tr2aacds-mrna-transcript-assembly-software</link>
	<title><![CDATA[EvidentialGene: tr2aacds, mRNA Transcript Assembly Software]]></title>
	<description><![CDATA[<p><span>EvidentialGene is a genome informatics project, "Evidence Directed Gene Construction for Eukaryotes", to construct high quality, accurate gene sets for animals and plants, developed by Don Gilbert at Indiana University, see</span><br><a href="http://arthropods.eugenes.org/EvidentialGene/" target="_blank">http://arthropods.eugenes.org/EvidentialGene/<span></span></a><br><br><span>Construction refers to the combination of classical gene prediction, and more recent gene assembly (de-novo and genome-assisted) methods. The basic Evigene methods involve using available best-of-breed gene prediction and assembly software, combining all evidence for genes, from expressed sequences, genome assembly sequences, related species protein sequences, and any other, to annotate and score gene constructions. Over-produced constructions are classified by gene evidence for best qualities per "locus", including genome-aligned and gene-transcript aligned (genome-free) locus identification. All software developed for EvidentialGene is publicly available. See project wiki/blog for notes.</span></p>
<p><span>Download&nbsp;</span></p>
<p>http://arthropods.eugenes.org/EvidentialGene/trassembly.html</p>
<p>https://sourceforge.net/p/evidentialgene/blog/</p><p>Address of the bookmark: <a href="http://arthropods.eugenes.org/EvidentialGene/trassembly.html" rel="nofollow">http://arthropods.eugenes.org/EvidentialGene/trassembly.html</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/42626/spades-team-announce-new-version-spades-v315</guid>
	<pubDate>Fri, 15 Jan 2021 10:24:27 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/42626/spades-team-announce-new-version-spades-v315</link>
	<title><![CDATA[SPAdes team announces new version SPAdes v3.15]]></title>
	<description><![CDATA[<p>The SPAdes team has announced SPAdes v3.15. This release includes new features such as:&nbsp;<br />- CoronaSPAdes pipeline for assembling full-length coronaviridae genomes from transcriptomic and metatranscriptomic data;&nbsp;<br />- Meta-Viral and RNA-Viral pipelines for identifying viral genomes in metagenomic and metatranscriptomic data;&nbsp;<br />- New algorithm for using trusted contigs;&nbsp;<br />- Switched to the mimalloc memory allocator;&nbsp;<br />- PlasmidSPAdes and bgcSPAdes can now take an assembly graph as input;&nbsp;<br />- Important improvements and fixes in the metaplasmid pipeline;&nbsp;<br />- Multiple performance improvements in the simplification and repeat resolution procedures.&nbsp;<br />Please consider updating.</p><p>Check out more at&nbsp;https://cab.spbu.ru/software/spades/</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</guid>
	<pubDate>Tue, 23 Mar 2021 05:32:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</link>
	<title><![CDATA[Public Databases for Bioinformatics !]]></title>
	<description><![CDATA[<pre>https://www.nature.com/articles/s41467-020-17155-y<br><br>Server Infrastructure:

File Server:

dhara: Synology 3614 Storage Appliance
4 Core Xeon
108TB disk storage
10Gb ethernet to SCG3
Access at: dhara:5000
Has btsync server (try it - it's much better than Dropbox)

Compute Servers:

nandi: Kundaje and Phi Server
24 intel cores
256GB RAM
500GB of SSD storage 
36TB RAID6 local storage
4 Intel Phis (space for 4 more GPUs)


durga: Montgomery and sensitive data
24 intel cores
256GB RAM
500GB of SSD RAID0 storage 
60TB RAID6 local storage

mitra: Bassik and Web/DB Server
24 core
256GB RAM 
500GB of SSD RAID0 storage 
36TB RAID6 local storage

vayu: Kundaje GPU server
4 core
64GB RAM 
200GB of SSD storage 
8TB RAID10 local storage
4 Nvidia GTX 970 4GB GPUs

amold: Bickel and SGE server
32 AMD core
128GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

wotan: Bickel and SGE server
64 AMD core
256GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

Filesystem:

/users/$USER
default home directory
full backups nightly 
nfs mount to dhara
should store code, papers, and other highly processed data here

/mnt/data/
globally accessible data
should store common data here
e.g. genomes and indexes, annotations, ENCODE data  
if you don't want this to count towards your quota you must chown

/mnt/lab_data/$LAB/
lab accessible data
should store lab project data here 
e.g. ATAC-seq prediction data, enhancer prediction, motif calls

/srv/scratch/$USER
fast local storage
not backed up, but on RAID, and data will never be deleted
most analysis should be performed here

/srv/persistent/$USER
fast local storage
synced nightly, but not backed up
       i.e. if the hard drives fail, or you delete something and notice
       within 24 hours, we can recover it. Otherwise not (vs. home, which is
       properly backed up).
intermediate analysis products that would be hard to recover should be stored here 
       e.g. stochastic analysis results that need to be kept so that paper 
       results can be reproduced

/srv/www/$LABNAME/
web accessible from mitra.stanford.edu
*NOT BACKED UP*

Some parallel programming patterns:

# gzip a bunch of files
parallel gzip -- *.FILESTOGZIP

# fork example in python (Python 3, Unix only because it relies on os.fork):
# (for more detailed examples look at
#  https://github.com/nboley/grit/ grit/lib/multiprocessing_utils.py)

import os
import time
import random

import multiprocessing

class ProcessSafeOPStream:
    """File-like wrapper whose writes are serialized across processes."""
    def __init__(self, writeable_obj):
        self.writeable_obj = writeable_obj
        self.lock = multiprocessing.Lock()
        self.name = self.writeable_obj.name

    def write(self, data):
        # hold the lock across write and flush so records never interleave
        with self.lock:
            self.writeable_obj.write(data)
            self.writeable_obj.flush()

    def close(self):
        self.writeable_obj.close()

def worker(queue, ofp):
    # Try without this (explicit reseed in each forked worker)
    random.seed()
    while True:
        i = queue.get()
        if i == 'FINISHED':
            return
        # simulate an expensive function
        x = random.random()
        time.sleep(x/10)
        # flush, because os._exit() below skips the normal stdout flush
        print(i, x, flush=True)
        ofp.write("%i\t%s\n" % (i, x))

NSIMS = 10000
NPROC = 25

# populate queue, followed by one 'FINISHED' sentinel per worker
todo = multiprocessing.Queue()
for i in range(NSIMS): todo.put(i)
for i in range(NPROC): todo.put('FINISHED')

ofp = ProcessSafeOPStream(open("output.txt", "w"))

pids = []
for i in range(NPROC):
    pid = os.fork()
    if pid == 0:
        # child: process tasks, then exit without running parent cleanup
        worker(todo, ofp)
        os._exit(0)
    else:
        pids.append(pid)

# parent: wait for every child before closing the output file
for pid in pids:
    os.waitpid(pid, 0)

ofp.close()

print("FINISHED")<br><br></pre>
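<p>The same queue-of-tasks pattern can also be written with a worker pool instead of manual <code>os.fork()</code> calls. The following is only a sketch of one possible alternative, not part of the original notes: the names <code>simulate</code> and <code>run_simulations</code> and the task counts are hypothetical. Because only the parent process writes the output file here, no cross-process lock is needed.</p>

```python
import multiprocessing
import random
import time


def simulate(i):
    """Stand-in for an expensive function: returns (task index, random draw)."""
    rng = random.Random()  # independent generator, seeded from the OS
    x = rng.random()
    time.sleep(x / 100)
    return i, x


def run_simulations(nsims, nproc, out_path):
    """Run simulate() for 0..nsims-1 on nproc workers; the parent writes results."""
    # the "fork" start method mirrors the os.fork() example and is Unix-only
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=nproc) as pool, open(out_path, "w") as ofp:
        # results arrive in completion order, not submission order
        for i, x in pool.imap_unordered(simulate, range(nsims)):
            ofp.write("%i\t%s\n" % (i, x))


if __name__ == "__main__":
    run_simulations(nsims=100, nproc=4, out_path="output.txt")
    print("FINISHED")
```

<p>The pool handles task distribution and worker shutdown, so the explicit 'FINISHED' sentinels and <code>os.waitpid</code> loop are not needed in this variant.</p>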
<p>For use case 1 we obtained the following ENCODE and ROADMAP datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz">https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam">https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam">https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam</a>. Blacklisted regions were obtained from&nbsp;<a href="http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz">http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz</a>. The human genome version hg38 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz</a>.</p>
<p>For use case 2 we used the set of narrowPeak files summarized in&nbsp;<a href="https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt">https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt</a>&nbsp;(archived version v1.0.1). The human genome version hg19 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz</a></p>
<p>For use case 3 we used the ENCODE datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam">https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig">https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam">https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam</a>&nbsp;as well as the GENCODE annotation v29 from&nbsp;<a href="ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz">ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz</a>.</p><p>Address of the bookmark: <a href="http://mitra.stanford.edu/" rel="nofollow">http://mitra.stanford.edu/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34519/bandage-interactive-visualization-of-de-novo-genome-assemblies</guid>
	<pubDate>Mon, 04 Dec 2017 10:09:37 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34519/bandage-interactive-visualization-of-de-novo-genome-assemblies</link>
	<title><![CDATA[Bandage: interactive visualization of de novo genome assemblies]]></title>
	<description><![CDATA[<p>Bandage (a Bioinformatics Application for Navigating&nbsp;<em>De&nbsp;novo</em>&nbsp;Assembly Graphs Easily) is a tool for visualizing assembly graphs with connections. Users can zoom in to specific areas of the graph and interact with it by moving nodes, adding labels, changing colors and extracting sequences. BLAST searches can be performed within the Bandage graphical user interface and the hits are displayed as highlights in the graph. By displaying connections between contigs, Bandage presents new possibilities for analyzing&nbsp;<em>de novo</em>&nbsp;assemblies that are not possible through investigation of contigs alone.</p>
<p><strong>Availability and implementation:</strong>&nbsp;Source code and binaries are freely available at&nbsp;<a href="https://github.com/rrwick/Bandage" target="pmc_ext">https://github.com/rrwick/Bandage</a>. Bandage is implemented in C++ and supported on Linux, OS X and Windows. A full feature list and screenshots are available at&nbsp;<a href="http://rrwick.github.io/Bandage" target="pmc_ext">http://rrwick.github.io/Bandage</a>.</p><p>Address of the bookmark: <a href="http://rrwick.github.io/Bandage/" rel="nofollow">http://rrwick.github.io/Bandage/</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34685/tools-for-bacterial-whole-genome-annotation</guid>
	<pubDate>Sat, 16 Dec 2017 17:37:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34685/tools-for-bacterial-whole-genome-annotation</link>
	<title><![CDATA[Tools for bacterial whole genome annotation]]></title>
	<description><![CDATA[<p><a href="http://rast.nmpdr.org/">RAST</a>&nbsp;&ndash;&nbsp;Web tool (upload contigs), uses the subsystems in the SEED database and&nbsp;provides detailed annotation and pathway analysis. Takes several hours per genome but I think this is the best way to get a high quality annotation (if you have only a few genomes to annotate).</p><p><a href="http://www.vicbioinformatics.com/software.prokka.shtml">Prokka</a>&nbsp;&ndash;&nbsp;Standalone command line tool, takes just a few minutes per genome.&nbsp;This is the best way to get good quality annotation in a flash, which is particularly useful if you have loads of genomes or need to annotate a pangenome or metagenome. Note however that the quality of functional information is not as good as RAST, and you&nbsp;will need several extra steps if you want to do&nbsp;functional profiling and pathway analysis of your genome(s)&hellip; which is in-built in RAST.</p><p>NCBI Prokaryotic Genome Annotation Pipeline is designed to annotate bacterial and archaeal genomes (chromosomes and plasmids).</p><p>Genome annotation is a multi-level process that includes prediction of protein-coding genes, as well as other functional genome units such as structural RNAs, tRNAs, small RNAs, pseudogenes, control regions, direct and inverted repeats, insertion sequences, transposons and other mobile elements.</p><p><a href="https://www.ncbi.nlm.nih.gov/genome/annotation_prok/">PGAP</a>: NCBI has developed an automatic prokaryotic genome annotation pipeline that combines&nbsp;<em>ab initio</em>&nbsp;gene prediction algorithms with homology based methods. 
The first version of the NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP;&nbsp;<a href="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;dopt=Abstract&amp;list_uids=18416670">see PubMed article</a>), developed in 2005, has been replaced with an upgraded version that is capable of processing a larger data volume.&nbsp; NCBI's annotation pipeline depends on several internal databases and is not currently available for download or use outside of the NCBI environment.</p><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC453985">BEACON</a> (automated tool for Bacterial GEnome Annotation ComparisON) is a fast tool for the automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at:&nbsp;<a href="http://www.cbrc.kaust.edu.sa/BEACON/" target="pmc_ext">http://www.cbrc.kaust.edu.sa/BEACON/</a>.</p><p><a href="http://www.kegg.jp/blastkoala/">BlastKOALA</a>: Assigns K numbers to the user's sequence data by BLAST searches against a nonredundant set of KEGG GENES. KOALA (KEGG Orthology And Links Annotation) is KEGG's internal annotation tool for K number assignment of KEGG GENES using SSEARCH computation. Annotate Sequence in KEGG Mapper and Pathogen Checker in KEGG Pathogen are special interfaces to this server and can be executed in an interactive mode. BlastKOALA is suitable for annotating fully sequenced genomes.</p><p><a href="http://www.sanger.ac.uk/science/tools/pagit">PAGIT</a>: Provides a toolkit for improving the quality of genome assemblies created by assembly software. 
PAGIT combines four tools: (i) ABACAS, which classifies and orients contigs and estimates the sizes of gaps between them; (ii) IMAGE, which uses paired-end reads to extend contigs and close gaps within the scaffolds; (iii) ICORN, which identifies and corrects small errors in consensus sequences; and (iv) RATT, which helps with annotation. The software was mainly created to analyze parasite genomes of up to about 300 Mb.</p><p><a href="http://www.yandell-lab.org/software/maker.html">MAKER: </a>A portable and easily configurable genome annotation pipeline. MAKER allows smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. It identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values. MAKER's inputs are minimal and its outputs can be directly loaded into a Generic Model Organism Database (GMOD). They can also be viewed in the Apollo genome browser; this feature of MAKER provides an easy means to annotate, view and edit individual contigs and BACs without the overhead of a database. MAKER is available for download and can be tested online via the MAKER Web Annotation Service (MWAS).</p><p><a href="https://www.sciencedirect.com/science/article/pii/S0167701215001207">MyPro</a> is a software pipeline for high-quality prokaryotic genome assembly and annotation. It was validated on 18 oral streptococcal strains to produce submission-ready, annotated draft genomes. MyPro, installed as a virtual machine and supported by updated databases, will enable biologists to perform quality prokaryotic genome assembly and annotation with ease.</p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>

</channel>
</rss>