<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43867?offset=220</link>
	<atom:link href="https://bioinformaticsonline.com/related/43867?offset=220" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27961/nearhgt</guid>
	<pubDate>Wed, 22 Jun 2016 05:41:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27961/nearhgt</link>
	<title><![CDATA[NearHGT]]></title>
	<description><![CDATA[<p>Horizontal gene transfer (HGT), the transfer of genetic material between organisms, is crucial for genetic innovation and the evolution of genome architecture. Existing HGT detection algorithms rely on a strong phylogenetic signal distinguishing the transferred sequence from ancestral (vertically derived) genes in its recipient genome. Detecting HGT between closely related species or strains is challenging, as the phylogenetic signal is usually weak and the nucleotide composition is normally nearly identical. Nevertheless, detecting HGT between congeneric species or strains is of great importance, especially in clinical microbiology, where understanding the emergence of new virulent and drug-resistant strains is crucial and often time-sensitive.</p>
<p>We developed a novel, self-contained technique named&nbsp;<em>Near HGT</em>, based on the&nbsp;<em>synteny index</em>, to measure the divergence of a gene from its native genomic environment and used it to identify candidate HGT events between closely related strains. The method confirms candidate transferred genes based on the&nbsp;<em>constant relative mutability</em>&nbsp;(CRM). Using CRM, the algorithm assigns a confidence score based on &ldquo;unusual&rdquo; sequence divergence. A gene exhibiting exceptional deviations according to both synteny and mutability criteria is considered a validated HGT product. We first applied the technique to a set of three&nbsp;<em>E. coli</em>&nbsp;strains and detected several highly probable horizontally acquired genes. We then compared the method to existing HGT detection tools using a larger strain data set.</p>
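<p>As a rough illustration of the synteny index idea described above, the following Python sketch counts the genes shared by the k-neighborhoods of a gene in two genomes. The function names, the neighborhood size k, and the toy gene orders are assumptions made for this example; they are not taken from the NearHGT code, and the published method may define or normalize the index differently.</p>
<pre>
```python
def k_neighborhood(gene_order, gene, k):
    """Return the set of genes within k positions of `gene` (excluding it).

    `gene_order` is a list of gene identifiers in chromosomal order.
    Linear (non-circular) chromosomes are assumed for simplicity.
    """
    pos = gene_order.index(gene)
    start = max(0, pos - k)
    window = gene_order[start:pos] + gene_order[pos + 1:pos + 1 + k]
    return set(window)


def synteny_index(order_a, order_b, gene, k):
    """Count genes shared by the k-neighborhoods of `gene` in both genomes."""
    hood_a = k_neighborhood(order_a, gene, k)
    hood_b = k_neighborhood(order_b, gene, k)
    return len(hood_a.intersection(hood_b))


# Toy gene orders (hypothetical): g3 keeps its neighbors, x sits in a new context.
strain_a = ["g1", "g2", "g3", "g4", "g5", "x", "g6", "g7"]
strain_b = ["x", "g1", "g2", "g3", "g4", "g5", "g6", "g7"]

si_conserved = synteny_index(strain_a, strain_b, "g3", k=2)
si_moved = synteny_index(strain_a, strain_b, "x", k=2)
```
</pre>
<p>In this toy data the gene that kept its neighbors scores higher than the gene that moved. A low score on its own only marks a candidate; as the abstract notes, the method also requires an exceptional deviation under the mutability criterion.</p>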
<p>When combined with additional approaches, our new algorithm provides a richer picture and brings us closer to the goal of detecting all newly acquired genes in a particular strain.</p>
<p><strong>Availability:</strong><span>&nbsp;The method is publicly available at&nbsp;</span><a href="http://research.haifa.ac.il/~ssagi/software/nearHGT.zip">http://research.haifa.ac.il/~ssagi/software/nearHGT.zip</a></p><p>Address of the bookmark: <a href="http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004408" rel="nofollow">http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004408</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37460/revigo-reduced-visualize-gene-ontology</guid>
	<pubDate>Tue, 31 Jul 2018 05:28:42 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37460/revigo-reduced-visualize-gene-ontology</link>
	<title><![CDATA[REVIGO: Reduced Visualize gene ontology]]></title>
	<description><![CDATA[<div>REViGO can take long lists of Gene Ontology terms and summarize them by removing redundant GO terms. The remaining terms can be visualized in semantic similarity-based scatterplots, interactive graphs, or tag clouds.&nbsp;<a href="http://dx.doi.org/10.1371/journal.pone.0021800">More about REViGO...</a>&nbsp;|&nbsp;<a href="http://revigo.irb.hr/about_hr.jsp"><img src="http://revigo.irb.hr/gfx/croatian-wCrown.png" alt="In Croatian" title="" width="12" height="15" style="border: 0px;"></a></div>
<div>Please enter a list of Gene Ontology IDs below, each on its own line. The GO IDs may be followed by p-values or another quantity which describes the GO term in a way meaningful to you.&nbsp;<img src="http://revigo.irb.hr/gfx/qmark.png" alt="For instance, you may provide a p-value          (statistical significance), a fold change, enrichment, or some          directly measured quantity such as average signal intensity from          microarrays, ion count from mass spec, or read count from RNA-seq.          You may also provide more than one value per line, although only the          first value will be used in GO term selection/clustering." title="" width="16" height="15" style="border: 0px;"></div><p>Address of the bookmark: <a href="http://revigo.irb.hr/" rel="nofollow">http://revigo.irb.hr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35802/bioinformatics-tools-to-detect-horizontal-gene-transfer-hgt-in-genomes</guid>
	<pubDate>Fri, 02 Mar 2018 04:56:23 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35802/bioinformatics-tools-to-detect-horizontal-gene-transfer-hgt-in-genomes</link>
	<title><![CDATA[Bioinformatics tools to detect horizontal gene transfer (HGT) in genomes]]></title>
	<description><![CDATA[<p>Horizontal gene transfer (HGT), the &ldquo;non-sexual movement of genetic material between two organisms&rdquo;, is relatively common in prokaryotes&nbsp;and single-celled eukaryotes, but a number of factors combine to make it far rarer in multicellular eukaryotes. In order for a eukaryotic species to gain a gene by HGT, foreign DNA must enter the host nucleus, integrate into the genome, and in more complex organisms it must enter the sequestered germline in order to be transmitted to offspring. Once there, it must not experience strong negative selection, despite potential for genetic incompatibility with the host genome and mismatch between the niche of the donor and the host. Over the longer term, foreign DNA may become &ldquo;domesticated&rdquo; in the recipient genome and provide novel function.</p><p>The following are popular tools for detecting HGT in genomes:</p><p><a href="http://www.trex.uqam.ca/index.php?action=hgt&amp;project=trex">T-REX</a>&nbsp;/&nbsp;<a href="http://www.trex.uqam.ca/download/hgt-detection_3.22.zip">3.22</a></p><p>HGT detection /&nbsp;download &amp; compile</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/20525630">20525630</a></p><p>&nbsp;</p><p><a href="http://compbio.engr.uconn.edu/software/RANGER-DTL/">RANGER-DTL</a>&nbsp;/&nbsp;<a href="http://compbio.engr.uconn.edu/software/RANGER-DTL/Linux.zip">2.0</a></p><p>HGT detection /&nbsp;download binary</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/22689773">22689773</a></p><p>&nbsp;</p><p><a href="https://bioinfocs.rice.edu/phylonet">PhyloNet</a>&nbsp;/&nbsp;<a href="https://bioinfocs.rice.edu/sites/g/files/bxs266/f/kcfinder/files/PhyloNet_3.6.1.jar">3.6.1</a></p><p>HGT detection /&nbsp;download binary</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/18662388">18662388</a></p><p>&nbsp;</p><p><a href="https://www.cs.hmc.edu/~hadas/jane/index.html">Jane</a>&nbsp;/&nbsp;<a href="https://www.cs.hmc.edu/~hadas/jane/form.html">4.01</a></p><p>HGT detection 
/&nbsp;download binary (!license!)</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/20181081">20181081</a></p><p>&nbsp;</p><p><a href="http://www.tree-puzzle.de/">TREE-PUZZLE</a>&nbsp;/&nbsp;<a href="http://www.tree-puzzle.de/tree-puzzle-5.3.rc16-linux.tar.gz">5.3.rc16</a></p><p>HGT detection /&nbsp;download &amp; compile</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/11934758">11934758</a></p><p>&nbsp;</p><p><a href="http://www.sigmath.es.osaka-u.ac.jp/shimo-lab/prog/consel/">CONSEL</a>&nbsp;/&nbsp;<a href="http://www.sigmath.es.osaka-u.ac.jp/shimo-lab/prog/consel/pub/cnsls020.tgz">0.20</a></p><p>HGT detection /&nbsp;download</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/11751242">11751242</a></p><p>&nbsp;</p><p><a href="http://darkhorse.ucsd.edu/">DarkHorse</a>&nbsp;/&nbsp;<a href="http://darkhorse.ucsd.edu/DarkHorse-1.5_rev170.tar.gz">1.5 rev170</a></p><p>HGT detection /&nbsp;download &amp; install</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/17274820">17274820</a></p><p>&nbsp;</p><p><a href="https://github.com/DittmarLab/HGTector">HGTector</a>&nbsp;/&nbsp;<a href="https://github.com/DittmarLab/HGTector/archive/wgshgt.zip">0.2.1</a></p><p>HGT detection /&nbsp;git clone</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/25159222">25159222</a></p><p>&nbsp;</p><p><a href="http://www5.esu.edu/cpsc/bioinfo/software/EGID/">EGID</a>&nbsp;/&nbsp;<a href="http://www5.esu.edu/cpsc/bioinfo/software/EGID/EGID_1.0.tar.gz">1.0</a></p><p>HGT detection /&nbsp;download</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/22355228">22355228</a></p><p>&nbsp;</p><p><a href="http://exon.gatech.edu/GeneMark/">GeneMarkS</a>&nbsp;/&nbsp;<a href="http://exon.gatech.edu/GeneMark/license_download.cgi">4.30</a></p><p>HGT detection / download binary (!license!)</p><p><a href="https://www.ncbi.nlm.nih.gov/pubmed/9461475">9461475</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38752/hgtector-an-automated-method-facilitating-genome-wide-discovery-of-putative-horizontal-gene-transfers</guid>
	<pubDate>Mon, 21 Jan 2019 06:50:05 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38752/hgtector-an-automated-method-facilitating-genome-wide-discovery-of-putative-horizontal-gene-transfers</link>
	<title><![CDATA[HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers]]></title>
	<description><![CDATA[<p>A computational pipeline for genome-wide detection of putative horizontal gene transfer (HGT) events based on sequence homology search hit distribution statistics</p>
<p>Authors: Qiyun Zhu (<a href="mailto:qiyunzhu@gmail.com">qiyunzhu@gmail.com</a>), Katharina Dittmar (<a href="mailto:katharinad@gmail.com">katharinad@gmail.com</a>)</p>
<p>Affiliation: Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, USA</p>
<p>Zhu Q, Kosoy M, Dittmar K. HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers.&nbsp;<em style="font-size: 12.8px;">BMC Genomics</em>. 2014. 15:717.</p>
<p>Usage: Simply execute&nbsp;<span style="font-size: 12.8px;">perl HGTector.pl</span>, or, open&nbsp;<span style="font-size: 12.8px;">GUI.html</span>&nbsp;in a web browser to see a step-by-step wizard.</p>
<p>Download&nbsp;<a href="https://github.com/DittmarLab/HGTector/archive/0.2.2.zip">HGTector 0.2.2</a>.</p><p>Address of the bookmark: <a href="https://github.com/DittmarLab/HGTector" rel="nofollow">https://github.com/DittmarLab/HGTector</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</guid>
	<pubDate>Tue, 23 Mar 2021 05:32:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</link>
	<title><![CDATA[Public Databases for Bioinformatics !]]></title>
	<description><![CDATA[<pre>https://www.nature.com/articles/s41467-020-17155-y<br><br>Server Infrastructure:

File Server:

dhara: Synology 3614 Storage Appliance
4 Core Xeon
108TB disk storage
10Gb ethernet to SCG3
Access atx: dhara:5000
Has btsync server (try it - it's much better than Dropbox)

Compute Servers:

nandi: Kundaje and Phi Server
24 intel cores
256GB RAM
500GB of SSD storage 
36TB RAID6 local storage
4 Intel Phis (space for 4 more GPUs)


durga: Montgomery and sensitive data
24 intel cores
256GB RAM
500GB of SSD RAID0 storage 
60TB RAID6 local storage

mitra: Bassik and Web/DB Server
24 core
256GB RAM 
500GB of SSD RAID0 storage 
36TB RAID6 local storage

vayu: Kundaje GPU server
4 core
64GB RAM 
200GB of SSD storage 
8TB RAID10 local storage
4 Nvidia GTX 970 4GB GPUs

amold: Bickel and SGE server
32 AMD core
128GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

wotan: Bickel and SGE server
64 AMD core
256GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

Filesystem:

/users/$USER
default home directory
full backups nightly 
nfs mount to dhara
should store code, papers, and other highly processed data here

/mnt/data/
globally accessible data
should store common data here
e.g. genomes and indexes, annotations, ENCODE data  
if you don't want this to count towards your quota you must chown

/mnt/lab_data/$LAB/
lab accessible data
should store lab project data here 
e.g. ATAC-seq prediction data, enhancer prediction, motif calls

/srv/scratch/$USER
fast local storage
not backed up, but on RAID, and data will never be deleted
most analysis should be performed here

/srv/persistent/$USER
fast local storage
synced nightly, but not backed up
       i.e. if the hard drives fail or you delete something and notice 
       within 24 hours we can recover. Otherwise not. (vs home which is 
       properly backed up )  
intermediate analysis products that would be hard to recover should be stored here 
       e.g. stochastic analysis results that need to be kept so that paper 
       results can be reproduced

/srv/www/$LABNAME/
web accessible from mitra.stanford.edu
*NOT BACKED UP*

Some parallel programming patterns:

# gzip a bunch of files
parallel gzip -- *.FILESTOGZIP

# fork example in python:
(for more detailed examples look at 
 https://github.com/nboley/grit/ grit/lib/multiprocessing_utils.py)

import os
import time
import random

import multiprocessing

class ProcessSafeOPStream( object ):
    def __init__( self, writeable_obj ):
        self.writeable_obj = writeable_obj
        self.lock = multiprocessing.Lock()
        self.name = self.writeable_obj.name
        return
    
    def write( self, data ):
        # hold the lock for the whole write+flush so lines never interleave
        with self.lock:
            self.writeable_obj.write( data )
            self.writeable_obj.flush()
        return
    
    def close( self ):
        self.writeable_obj.close()

def worker(queue, ofp):
    # Try without this (forked children may inherit the parent's random state)
    random.seed()
    while True:
        i = queue.get()
        if i == 'FINISHED': return
        # simulate an expensive function
        x = random.random()
        time.sleep(x/10)
        print(i, x)
        ofp.write("%i\t%s\n" % (i, x))

NSIMS = 10000
NPROC = 25

# populate queue, followed by one 'FINISHED' sentinel per worker
todo = multiprocessing.Queue()
for i in range(NSIMS): todo.put(i)
for i in range(NPROC): todo.put('FINISHED')

ofp = ProcessSafeOPStream( open("output.txt", "w") )

# fork the workers; each child runs worker() and exits without cleanup
pids = []
for i in range(NPROC):
    pid = os.fork()
    if pid == 0:
        worker(todo, ofp)
        os._exit(0)
    else:
        pids.append(pid)

# wait for every child before closing the shared output file
for pid in pids:
    os.waitpid(pid, 0)

ofp.close()

print("FINISHED")<br><br></pre>
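<p>For comparison, the same fan-out pattern can be sketched with a worker pool instead of explicit os.fork() calls. This is a minimal sketch and not part of the original notes: the names simulate and run_simulations, the per-task seeding, and the small task counts are assumptions chosen for illustration. Here only the parent process writes the output file, so no lock-protected stream is needed.</p>
<pre>
```python
import multiprocessing
import random


def simulate(i):
    """Stand-in for an expensive function; returns the task index and a draw."""
    # Seed a private generator per task so results do not depend on
    # whatever random state a worker process happened to inherit.
    rng = random.Random(i)
    return i, rng.random()


def run_simulations(nsims, nproc, out_path):
    """Run `nsims` tasks on `nproc` workers; the parent writes all results."""
    with multiprocessing.Pool(processes=nproc) as pool, open(out_path, "w") as ofp:
        # imap_unordered yields results as workers finish them.
        for i, x in pool.imap_unordered(simulate, range(nsims)):
            ofp.write("%i\t%s\n" % (i, x))


if __name__ == "__main__":
    # Small, hypothetical sizes; the fork example above uses 10000 and 25.
    run_simulations(nsims=100, nproc=4, out_path="output.txt")
    print("FINISHED")
```
</pre>
<p>Seeding by task index makes each result reproducible, which differs from the fork example, where every worker reseeds itself from the operating system.</p>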
<p>For use case 1 we obtained the following ENCODE and ROADMAP datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz">https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam">https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam">https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam</a>. Blacklisted regions were obtained from&nbsp;<a href="http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz">http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz</a>. The human genome version hg38 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz</a>.</p>
<p>For use case 2 we used the set of narrowPeak files summarized in&nbsp;<a href="https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt">https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt</a>&nbsp;(archived version v1.0.1). The human genome version hg19 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz</a></p>
<p>For use case 3 we used the ENCODE datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam">https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig">https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam">https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam</a>&nbsp;as well as the GENCODE annotation v29 from&nbsp;<a href="ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz">ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz</a>.</p><p>Address of the bookmark: <a href="http://mitra.stanford.edu/" rel="nofollow">http://mitra.stanford.edu/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34482/ribbon-visualizing-complex-genome-alignments-and-structural-variation</guid>
	<pubDate>Wed, 29 Nov 2017 07:40:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34482/ribbon-visualizing-complex-genome-alignments-and-structural-variation</link>
	<title><![CDATA[Ribbon: Visualizing complex genome alignments and structural variation:]]></title>
	<description><![CDATA[<p>Ribbon can be used for long reads, short reads, paired-end reads, and assembly/genome alignments. Instructions for each data format are available by clicking on "instructions" in each tab on the right.</p>
<p>Local installation:</p>
<p>You can install Ribbon locally from Github by following the instructions here:&nbsp;<a href="https://github.com/MariaNattestad/ribbon" target="_blank">https://github.com/MariaNattestad/Ribbon</a></p><p>Address of the bookmark: <a href="http://genomeribbon.com/" rel="nofollow">http://genomeribbon.com/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34569/ksnp30-snp-detection-and-phylogenetic-analysis-of-genomes-without-genome-alignment-or-reference-genome</guid>
	<pubDate>Fri, 08 Dec 2017 16:48:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34569/ksnp30-snp-detection-and-phylogenetic-analysis-of-genomes-without-genome-alignment-or-reference-genome</link>
	<title><![CDATA[kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome]]></title>
	<description><![CDATA[<p><span>Sept. 20, 2017: Version 3.1 released. Major upgrade. Version 3.1 fixes the problems with SNP annotation that arose when NCBI discontinued use of GI numbers. Please read carefully the Preface (page 3) and the File of annotated genomes section (pages 9-10) in the version 3.1 User Guide. Thanks to Tom Slezak for revising the get_genbank_file3 script and to Tod Stuber (USDA) for testing version 3.1 even though he doesn't need the annotation feature. All users are encouraged to upgrade to version 3.1.&nbsp;<br></span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/ksnp/files/" rel="nofollow">https://sourceforge.net/projects/ksnp/files/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/34707/string-graph-based-genome-assembly-software-and-tools</guid>
	<pubDate>Tue, 19 Dec 2017 17:17:38 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/34707/string-graph-based-genome-assembly-software-and-tools</link>
	<title><![CDATA[String graph based genome assembly software and tools !]]></title>
	<description><![CDATA[<p>In&nbsp;<a href="https://en.wikipedia.org/wiki/Graph_theory" title="Graph theory">graph theory</a>, a&nbsp;<strong>string graph</strong>&nbsp;is an&nbsp;<a href="https://en.wikipedia.org/wiki/Intersection_graph" title="Intersection graph">intersection graph</a>&nbsp;of&nbsp;<a href="https://en.wikipedia.org/wiki/Curve" title="Curve">curves</a>&nbsp;in the plane; each curve is called a "string".&nbsp;In genome assembly, the term refers to a different structure built from read overlaps, first proposed by E. W. Myers in a&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/21/suppl_2/ii79.full.pdf+html">2005 publication</a>.&nbsp;A&nbsp;recent&nbsp;<a href="http://genome.cshlp.org/content/early/2012/01/22/gr.126953.111">Genome Research paper</a>&nbsp;describing an innovative approach for assembling large genomes from NGS data caught our attention for several reasons: i) it gives a different, "string graph" perspective on the long-standing genome assembly problem; ii) the&nbsp;paper is coauthored by Jared Simpson, the developer of the&nbsp;<a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2694472/">ABySS assembler</a>,&nbsp;and Richard Durbin; iii)&nbsp;the Simpson-Durbin algorithm does not rely on de Bruijn graphs and instead employs a different graph construction approach called &lsquo;string graph&rsquo;.</p><p>The following genome assembly tools are based on string graphs:</p><p>1.&nbsp;SGA (String Graph Assembler)&nbsp;https://github.com/jts/sga</p><p>Assembles large genomes from high coverage short read data. SGA is designed as a modular set of programs, which are used to form an assembly pipeline. SGA implements a set of assembly algorithms based on the FM-index. As the FM-index is a compressed data structure, the algorithms are very memory efficient. The SGA assembly has three distinct phases. The first phase corrects base calling errors in the reads. The second phase assembles contigs from the corrected reads. The third phase uses paired end and/or mate pair data to build scaffolds from the contigs. 
The output of this software is a PDF report that allows the properties of the genome and data quality to be visually explored. By providing more information to the user at the start of an assembly project, this software will help increase awareness of the factors that make a given assembly easy or difficult, assist in the selection of software and parameters and help to troubleshoot an assembly if it runs into problems.</p><p>2.&nbsp;SAGE: String-overlap Assembly of GEnomes&nbsp;https://github.com/lucian-ilie/SAGE2</p><p>SAGE is a tool for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on string-overlap graph and maximum likelihood assembly, bringing an important number of new ideas, such as the efficient computation of the transitive reduction of the string overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers.</p><p>3.&nbsp;FSG: Fast String Graph</p><p>The new integrated assembler has been assessed on a standard benchmark, showing that fast string graph (FSG) is significantly faster than SGA while maintaining a moderate use of main memory, and showing practical advantages in running FSG on multiple threads. Moreover, we have studied the effect of coverage rates on the running times.</p><p>4.&nbsp;BASE&nbsp;https://github.com/dhlbh/BASE</p><p>It enhances the classic seed-extension approach by indexing the reads efficiently to generate adaptive seeds that have high probability to appear uniquely in the genome. 
Such seeds form the basis for BASE to build extension trees and then to use reverse validation to remove the branches based on read coverage and paired-end information, resulting in high-quality consensus sequences of reads sharing the seeds. Such consensus sequences are then extended to contigs.&nbsp;BASE is a practically efficient tool for constructing contigs, with significant improvement in quality for long NGS reads. It is relatively easy to extend BASE to include scaffolding.</p><p>5.&nbsp;Fermi&nbsp;https://github.com/lh3/fermi/</p><p>Fermi is a de novo assembler with a particular focus on assembling Illumina&nbsp;short sequence reads from a mammal-sized genome. In addition to the role of a&nbsp;typical assembler, fermi also aims to preserve heterozygotes which are often&nbsp;collapsed by other assemblers. Its ultimate goal is to find a minimal set of&nbsp;unitigs to represent all the information in raw reads.</p><p>If you want to learn about string graph assemblers, please read the following papers:</p><p>i)&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/21/suppl_2/ii79.full.pdf+html">The Fragment Assembly String Graph - E. W. Myers</a></p><p>This paper describes the string graph concept.</p><p>ii)&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/26/12/i367.full#ref-20">Efficient construction of an assembly string graph using the FM-index - Jared T. Simpson and Richard Durbin</a></p><p>This is an earlier paper from Simpson and Durbin.</p><p>iii)&nbsp;<a href="http://genome.cshlp.org/content/early/2012/01/22/gr.126953.111">Efficient de novo assembly of large genomes using compressed data structures - Jared T. Simpson and Richard Durbin</a></p><p>&nbsp;</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35135/alitv%E2%80%94interactive-visualization-of-whole-genome-comparisons</guid>
	<pubDate>Wed, 10 Jan 2018 07:08:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35135/alitv%E2%80%94interactive-visualization-of-whole-genome-comparisons</link>
	<title><![CDATA[AliTV—interactive visualization of whole genome comparisons]]></title>
	<description><![CDATA[<p>AliTV provides interactive visualization of whole genome alignments. AliTV reads multiple whole genome alignments or automatically generates alignments from the provided data. Optional feature annotations and phylogenetic information are supported. The user-friendly, web-browser based and highly customizable interface allows rapid exploration and manipulation of the visualized data as well as the export of publication-ready high-quality figures. AliTV is freely available at&nbsp;<a href="https://github.com/AliTVTeam/AliTV">https://github.com/AliTVTeam/AliTV</a></p>
<p>https://alitvteam.github.io/AliTV/</p><p>Address of the bookmark: <a href="https://github.com/AliTVTeam/AliTV" rel="nofollow">https://github.com/AliTVTeam/AliTV</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35543/genometools-the-versatile-open-source-genome-analysis-software</guid>
	<pubDate>Wed, 07 Feb 2018 10:44:18 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35543/genometools-the-versatile-open-source-genome-analysis-software</link>
	<title><![CDATA[GenomeTools: The versatile open source genome analysis software]]></title>
	<description><![CDATA[<p>The&nbsp;<em>GenomeTools</em>&nbsp;genome analysis system is a&nbsp;<a href="http://genometools.org/license.html">free</a>&nbsp;collection of bioinformatics&nbsp;<a href="http://genometools.org/tools.html">tools</a>&nbsp;(in the realm of genome informatics) combined into a single binary named&nbsp;<em>gt</em>. It is based on a C library named &ldquo;libgenometools&rdquo; which consists of several modules.</p>
<p>If you are interested in gene prediction, have a look at&nbsp;<a href="http://genomethreader.org/" title="GenomeThreader gene prediction        software"><em>GenomeThreader</em></a>.</p><p>Address of the bookmark: <a href="http://genometools.org/" rel="nofollow">http://genometools.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>