<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/40465?offset=90</link>
	<atom:link href="https://bioinformaticsonline.com/related/40465?offset=90" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38449/koala-keggs-internal-annotation-tool-for-k-number-assignment-of-kegg-genes-using-ssearch-computation</guid>
	<pubDate>Wed, 12 Dec 2018 09:16:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38449/koala-keggs-internal-annotation-tool-for-k-number-assignment-of-kegg-genes-using-ssearch-computation</link>
	<title><![CDATA[KOALA: KEGG's internal annotation tool for K number assignment of KEGG GENES using SSEARCH computation]]></title>
	<description><![CDATA[<p>KOALA (KEGG Orthology And Links Annotation) is KEGG's internal annotation tool for&nbsp;<a href="https://www.kegg.jp/kegg/ko.html">K number</a>&nbsp;assignment of KEGG GENES using SSEARCH computation. BlastKOALA and GhostKOALA assign K numbers to the user's sequence data by&nbsp;<a href="http://www.ncbi.nlm.nih.gov/blast/">BLAST</a>&nbsp;and&nbsp;<a href="http://www.bi.cs.titech.ac.jp/ghostx/">GHOSTX</a>&nbsp;searches, respectively, against a nonredundant set of KEGG GENES. Annotate Sequence in KEGG Mapper and Pathogen Checker in KEGG Pathogen are special interfaces to the BlastKOALA server and can be executed in an interactive mode. &nbsp;&nbsp; See&nbsp;<a href="https://www.kegg.jp/blastkoala/help_blastkoala.html" target="_blastkoala">Step-by-step Instructions</a>.</p>
<div>Reference: Kanehisa, M., Sato, Y., and Morishima, K. (2016) BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J. Mol. Biol. 428, 726-731. [<a href="http://www.ncbi.nlm.nih.gov/pubmed/26585406">pubmed</a>] [<a href="https://doi.org/10.1016/j.jmb.2015.11.006">pdf</a>]</div><p>Address of the bookmark: <a href="https://www.kegg.jp/blastkoala/" rel="nofollow">https://www.kegg.jp/blastkoala/</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40359/minipolish-a-tool-for-racon-polishing-of-miniasm-assemblies</guid>
	<pubDate>Tue, 03 Dec 2019 02:40:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40359/minipolish-a-tool-for-racon-polishing-of-miniasm-assemblies</link>
	<title><![CDATA[Minipolish: A tool for Racon polishing of miniasm assemblies]]></title>
	<description><![CDATA[<p><a href="https://github.com/lh3/miniasm">Miniasm</a>&nbsp;is a great long-read assembly tool: straightforward, effective and very fast. However, it does not include a polishing step, so its assemblies have a high error rate &ndash; they are essentially made of stitched-together pieces of long reads.</p>
<p><a href="https://github.com/isovic/racon">Racon</a>&nbsp;is a great polishing tool that can be used to clean up assembly errors. It's also very fast and well suited for long-read data. However, it operates on FASTA files, not the&nbsp;<a href="https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md">GFA graphs</a>&nbsp;that miniasm makes.</p>
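<p>To illustrate the format gap described above, here is a minimal Python sketch that pulls segment sequences out of a GFA1 graph as FASTA records. It assumes tab-separated S lines with the segment name in the second field and the sequence in the third, as laid out in the GFA1 specification. The function name gfa_segments_to_fasta and the example graph are hypothetical, and this is not a description of how Minipolish itself is implemented.</p>
<pre>
```python
def gfa_segments_to_fasta(gfa_text):
    """Return FASTA text for every segment (S line) in a GFA1 string.

    Hypothetical helper for illustration only. Link (L) lines and all
    other record types are ignored, so the graph structure is lost.
    """
    records = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # S lines look like: S, segment name, sequence, optional tags.
        # A sequence of "*" means "not stored", so such segments are skipped.
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            records.append(">%s\n%s\n" % (fields[1], fields[2]))
    return "".join(records)


# Tiny made-up graph: a header, two segments and one link between them.
example_gfa = (
    "H\tVN:Z:1.0\n"
    "S\tutg1\tACGT\n"
    "S\tutg2\tGGCC\n"
    "L\tutg1\t+\tutg2\t+\t0M\n"
)
fasta = gfa_segments_to_fasta(example_gfa)
print(fasta, end="")
```
</pre>
<p>Note that the link line is simply dropped in this conversion, which is the information a plain FASTA round trip cannot carry.</p>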
<p>That's where Minipolish comes in. With a single command, it will use Racon to polish up a miniasm assembly, while keeping the assembly in graph form.</p>
<p>It also takes care of some of the other nuances of polishing a miniasm assembly:</p>
<ul>
<li>Adding read depth information to contigs</li>
<li>Fixing sequence truncation that can occur in Racon</li>
<li>Adding circularising links to circular contigs if not already present (so they display better in&nbsp;<a href="https://github.com/rrwick/Bandage">Bandage</a>)</li>
<li>'Rotating' circular contigs between polishing rounds to ensure clean circularisation</li>
</ul><p>Address of the bookmark: <a href="https://github.com/rrwick/Minipolish" rel="nofollow">https://github.com/rrwick/Minipolish</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41125/chromonomer-a-tool-set-for-repairing-and-enhancing-assembled-genomes-through-integration-of-genetic-maps-and-conserved-synteny</guid>
	<pubDate>Mon, 17 Feb 2020 05:38:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41125/chromonomer-a-tool-set-for-repairing-and-enhancing-assembled-genomes-through-integration-of-genetic-maps-and-conserved-synteny</link>
	<title><![CDATA[Chromonomer: a tool set for repairing and enhancing assembled genomes through integration of genetic maps and conserved synteny]]></title>
	<description><![CDATA[<p>Chromonomer is a program designed to integrate a genome assembly with a genetic map. Chromonomer tries very hard to identify and remove markers that are out of order in the genetic map, when considered against their local assembly order; and to identify scaffolds that have been incorrectly assembled according to the genetic map, and split those scaffolds.</p><p>Address of the bookmark: <a href="http://catchenlab.life.illinois.edu/chromonomer/" rel="nofollow">http://catchenlab.life.illinois.edu/chromonomer/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41686/catbat-tool-for-taxonomic-classification-of-contigs-and-metagenome-assembled-genomes-mags</guid>
	<pubDate>Mon, 18 May 2020 10:53:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41686/catbat-tool-for-taxonomic-classification-of-contigs-and-metagenome-assembled-genomes-mags</link>
	<title><![CDATA[CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)]]></title>
	<description><![CDATA[<p>Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against the nr protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. CAT and BAT can be run from intermediate steps if files are formatted appropriately (see <a href="https://github.com/dutilh/CAT#usage">Usage</a>).</p><p>Address of the bookmark: <a href="https://github.com/dutilh/CAT" rel="nofollow">https://github.com/dutilh/CAT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42155/clustergrammer-is-a-web-based-tool-for-visualizing-high-dimensional-data-as-an-interactive-and-shareable-hierarchically-clustered-heatmap</guid>
	<pubDate>Sun, 23 Aug 2020 19:30:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42155/clustergrammer-is-a-web-based-tool-for-visualizing-high-dimensional-data-as-an-interactive-and-shareable-hierarchically-clustered-heatmap</link>
	<title><![CDATA[Clustergrammer is a web-based tool for visualizing high-dimensional data as an interactive and shareable hierarchically clustered heatmap]]></title>
	<description><![CDATA[<p><span>Clustergrammer is a web-based tool for visualizing high-dimensional data (e.g. a matrix) as an interactive and shareable hierarchically clustered heatmap. Clustergrammer's front end (</span><a href="http://clustergrammer.readthedocs.io/clustergrammer_js.html#clustergrammer-js">Clustergrammer-JS</a><span>) is built using&nbsp;</span><a href="https://d3js.org/">D3.js</a><span>&nbsp;and its back-end (</span><a href="http://clustergrammer.readthedocs.io/clustergrammer_py.html#clustergrammer-py">Clustergrammer-PY</a><span>) is built using Python. Clustergrammer produces highly interactive visualizations that enable intuitive exploration of high-dimensional data and has several biology-specific features (e.g. enrichment analysis, see&nbsp;</span><a href="http://clustergrammer.readthedocs.io/biology_specific_features.html#biology-specific-features">Biology-Specific Features</a><span>) to facilitate the exploration of gene-level biological data.&nbsp;</span></p><p>Address of the bookmark: <a href="https://github.com/MaayanLab/clustergrammer" rel="nofollow">https://github.com/MaayanLab/clustergrammer</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43090/loretta-a-user-friendly-tool-for-assembling-viral-genomes-from-pacbio-sequence-data</guid>
	<pubDate>Wed, 23 Jun 2021 07:54:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43090/loretta-a-user-friendly-tool-for-assembling-viral-genomes-from-pacbio-sequence-data</link>
	<title><![CDATA[LoReTTA, a user-friendly tool for assembling viral genomes from PacBio sequence data]]></title>
	<description><![CDATA[<p>LoReTTA (Long Read Template-Targeted Assembler) is a tool designed for performing <em>de novo</em> assembly of long reads generated from viral genomes on the PacBio platform. LoReTTA exploits a reference genome to guide the assembly process, an approach that has been successful with short reads.</p>
<p>https://academic.oup.com/ve/article/7/1/veab042/6248116</p><p>Address of the bookmark: <a href="https://academic.oup.com/ve/article/7/1/veab042/6248116" rel="nofollow">https://academic.oup.com/ve/article/7/1/veab042/6248116</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44527/alvis-a-tool-for-contig-and-read-alignment-visualisation-and-chimera-detection</guid>
	<pubDate>Wed, 08 May 2024 07:02:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44527/alvis-a-tool-for-contig-and-read-alignment-visualisation-and-chimera-detection</link>
	<title><![CDATA[Alvis: a tool for contig and read ALignment VISualisation and chimera detection]]></title>
	<description><![CDATA[<p><span>Alvis is a simple command-line tool that can generate visualisations for a number of common alignment analysis tasks. Alvis is a fast and portable tool that accepts input in a variety of alignment formats and outputs production-ready vector images. Additionally, Alvis highlights potentially chimeric reads or contigs, a common source of misassemblies.</span></p>
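<p>As a rough illustration of what "potentially chimeric" can mean, the Python sketch below flags a query whose neighbouring, non-overlapping parts align to different targets. This is an assumed, simplified heuristic and not Alvis's own detection criteria, which are not described here. The function name looks_chimeric, the tuple layout and the min_len threshold are all hypothetical.</p>
<pre>
```python
def looks_chimeric(alignments, min_len=100):
    """Flag a query whose separate parts align to different targets.

    alignments: list of (query_start, query_end, target_name) tuples
    for ONE query sequence. min_len is a hypothetical minimum alignment
    length; shorter alignments are ignored as unreliable.
    """
    # Keep only sufficiently long alignments, ordered along the query.
    kept = sorted(a for a in alignments if a[1] - a[0] >= min_len)
    for (s1, e1, t1), (s2, e2, t2) in zip(kept, kept[1:]):
        # Adjacent, non-overlapping query intervals on different targets.
        if s2 >= e1 and t1 != t2:
            return True
    return False


# Made-up examples: one read on a single target, one split across two.
normal_read = [(0, 5000, "chr1")]
chimeric_read = [(0, 2400, "chr1"), (2450, 5000, "chr7")]
print(looks_chimeric(normal_read))
print(looks_chimeric(chimeric_read))
```
</pre>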
<p>More at&nbsp;https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-021-04056-0</p><p>Address of the bookmark: <a href="https://github.com/SR-Martin/alvis" rel="nofollow">https://github.com/SR-Martin/alvis</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44655/ngenomesyn-an-easy-to-use-and-flexible-tool-for-publication-ready-visualization-of-syntenic-relationships-across-multiple-genomes</guid>
	<pubDate>Tue, 10 Sep 2024 04:54:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44655/ngenomesyn-an-easy-to-use-and-flexible-tool-for-publication-ready-visualization-of-syntenic-relationships-across-multiple-genomes</link>
	<title><![CDATA[NGenomeSyn: an easy-to-use and flexible tool for publication-ready visualization of syntenic relationships across multiple genomes]]></title>
	<description><![CDATA[<p>NGenomeSyn: an easy-to-use and flexible tool for publication-ready visualization of syntenic relationships across multiple genomes&nbsp;</p>
<p><img src="https://github.com/hewm2008/NGenomeSyn/raw/main/Example/example2/OUT3.png" alt="image" style="border: 0px;"></p>
<p><span>NGenomeSyn [multiple (N) Genome Synteny] is a tool for publication-ready visualization of syntenic relationships, across whole genomes or local regions, together with genomic features (e.g. repeats, structural variations, genes) across multiple genomes, with a high degree of customization. NGenomeSyn lets users visualize a large amount of data in a rich layout by simply adjusting options for moving, scaling, and rotating the target genomes. Moreover, NGenomeSyn can be applied to visualizing relationships in non-genomic data with similar input formats.</span></p>
<p>https://academic.oup.com/bioinformatics/article/39/3/btad121/7072460</p><p>Address of the bookmark: <a href="https://github.com/hewm2008/NGenomeSyn" rel="nofollow">https://github.com/hewm2008/NGenomeSyn</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</guid>
	<pubDate>Tue, 23 Mar 2021 05:32:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</link>
	<title><![CDATA[Public Databases for Bioinformatics!]]></title>
	<description><![CDATA[<pre>https://www.nature.com/articles/s41467-020-17155-y<br><br>Server Infrastructure:

File Server:

dhara: Synology 3614 Storage Appliance
4 Core Xeon
108TB disk storage
10Gb ethernet to SCG3
Access at: dhara:5000
Has btsync server (try it - it's much better than dropbox)

Compute Servers:

nandi: Kundaje and Phi Server
24 intel cores
256GB RAM
500GB of SSD storage 
36TB RAID6 local storage
4 Intel Phi's (space for 4 more GPU's)


durga: Montgomery and sensitive data
24 intel cores
256GB RAM
500GB of SSD RAID0 storage 
60TB RAID6 local storage

mitra: Bassik and Web/DB Server
24 core
256GB RAM 
500GB of SSD RAID0 storage 
36TB RAID6 local storage

vayu: Kundaje GPU server
4 core
64GB RAM 
200GB of SSD storage 
8TB RAID10 local storage
4 Nvidia GTX 970 4GB GPUs

amold: Bickel and SGE server
32 AMD core
128GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

wotan: Bickel and SGE server
64 AMD core
256GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

Filesystem:

/users/$USER
default home directory
full backups nightly 
nfs mount to dhara
should store code, papers, and other highly processed data here

/mnt/data/
globally accessible data
should store common data here
e.g. genomes and indexes, annotations, ENCODE data  
if you don't want this to count towards your quota you must chown

/mnt/lab_data/$LAB/
lab accessible data
should store lab project data here 
e.g. ATAC-seq prediction data, enhancer prediction, motif calls

/srv/scratch/$USER
fast local storage
not backed up, but on raid and data will never be deleted
most analysis should be performed here

/srv/persistent/$USER
fast local storage
synced nightly, but not backed up
       i.e. if the hard drives fail or you delete something and notice 
       within 24 hours, we can recover it. Otherwise not. (vs. home, which is 
       properly backed up)  
intermediate analysis products that would be hard to recover should be stored here 
       e.g. stochastic analysis results that need to be kept so that paper 
       results can be reproduced

/srv/www/$LABNAME/
web accessible from mitra.stanford.edu
*NOT BACKED UP*

Some parallel programming patterns:

# gzip a bunch of files
parallel gzip -- *.FILESTOGZIP

# fork example in python (Python 3):
# (for more detailed examples look at
#  https://github.com/nboley/grit/ grit/lib/multiprocessing_utils.py)

import os
import time
import random

import multiprocessing

class ProcessSafeOPStream(object):
    """File-like wrapper whose writes are serialized across processes."""
    def __init__(self, writeable_obj):
        self.writeable_obj = writeable_obj
        self.lock = multiprocessing.Lock()
        self.name = self.writeable_obj.name

    def write(self, data):
        # hold the lock for the write and the flush so lines never interleave
        with self.lock:
            self.writeable_obj.write(data)
            self.writeable_obj.flush()

    def close(self):
        self.writeable_obj.close()

def worker(queue, ofp):
    # reseed in each child so workers do not share the parent's random
    # state (Python 3.7+ already reseeds after fork); try without this
    random.seed()
    while True:
        i = queue.get()
        if i == 'FINISHED': return
        # simulate an expensive function
        x = random.random()
        time.sleep(x/10)
        # flush explicitly: os._exit() below skips normal buffer flushing
        print(i, x, flush=True)
        ofp.write("%i\t%s\n" % (i, x))

NSIMS = 10000
NPROC = 25

# populate queue, followed by one 'FINISHED' sentinel per worker
todo = multiprocessing.Queue()
for i in range(NSIMS): todo.put(i)
for i in range(NPROC): todo.put('FINISHED')

ofp = ProcessSafeOPStream(open("output.txt", "w"))

pids = []
for i in range(NPROC):
    pid = os.fork()
    if pid == 0:
        # child: process tasks, then exit without running parent cleanup
        worker(todo, ofp)
        os._exit(0)
    else:
        pids.append(pid)

# wait for every child to finish before closing the output file
for pid in pids:
    os.waitpid(pid, 0)

ofp.close()

print("FINISHED")<br><br></pre>
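<p>The fork example above manages child processes, sentinels and a write lock by hand. As a hedged alternative that is not part of the original notes, the same fan-out can be sketched with the standard library's multiprocessing.Pool. The names simulate, run and write_results, the per-task seeding and the reduced task counts are illustrative assumptions.</p>
<pre>
```python
import multiprocessing
import random


def simulate(i):
    """Stand-in for an expensive function; returns (index, random draw)."""
    # Seed per task so the result does not depend on which worker runs it.
    rng = random.Random(i)
    return i, rng.random()


def run(nsims=100, nproc=4):
    """Run simulate() for 0..nsims-1 on nproc worker processes."""
    # The pool hands out tasks and returns the results in input order.
    with multiprocessing.Pool(processes=nproc) as pool:
        return pool.map(simulate, range(nsims))


def write_results(results, path):
    """Write results from the parent only, so no write lock is needed."""
    with open(path, "w") as ofp:
        for i, x in results:
            ofp.write("%i\t%s\n" % (i, x))


if __name__ == "__main__":
    write_results(run(), "output.txt")
    print("FINISHED")
```
</pre>
<p>Because only the parent process writes the file, this variant does not need a process-safe output stream; the trade-off is that all results are held in memory until the pool finishes.</p>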
<p>For use case 1 we obtained the following ENCODE and ROADMAP datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz">https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam">https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam">https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam</a>. Blacklisted regions were obtained from&nbsp;<a href="http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz">http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz</a>. The human genome version hg38 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz</a>.</p>
<p>For use case 2 we used the set of narrowPeak files summarized in&nbsp;<a href="https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt">https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt</a>&nbsp;(archived version v1.0.1). The human genome version hg19 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz</a></p>
<p>For use case 3 we used the ENCODE datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam">https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig">https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam">https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam</a>&nbsp;as well as the GENCODE annotation v29 from&nbsp;<a href="ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz">ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz</a>.</p><p>Address of the bookmark: <a href="http://mitra.stanford.edu/" rel="nofollow">http://mitra.stanford.edu/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34482/ribbon-visualizing-complex-genome-alignments-and-structural-variation</guid>
	<pubDate>Wed, 29 Nov 2017 07:40:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34482/ribbon-visualizing-complex-genome-alignments-and-structural-variation</link>
	<title><![CDATA[Ribbon: Visualizing complex genome alignments and structural variation]]></title>
	<description><![CDATA[<p>Ribbon can be used for long reads, short reads, paired-end reads, and assembly/genome alignments. Instructions for each data format are available by clicking on "instructions" in each tab on the right.</p>
<p>Local installation:</p>
<p>You can install Ribbon locally from Github by following the instructions here:&nbsp;<a href="https://github.com/MariaNattestad/ribbon" target="_blank">https://github.com/MariaNattestad/Ribbon</a></p><p>Address of the bookmark: <a href="http://genomeribbon.com/" rel="nofollow">http://genomeribbon.com/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>