<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/39281?</link>
	<atom:link href="https://bioinformaticsonline.com/related/39281?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</guid>
	<pubDate>Tue, 23 Mar 2021 05:32:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</link>
	<title><![CDATA[Public Databases for Bioinformatics !]]></title>
	<description><![CDATA[<pre>https://www.nature.com/articles/s41467-020-17155-y<br><br>Server Infrastructure:

File Server:

dhara: Synology 3614 Storage Appliance
4 Core Xeon
108TB disk storage
10Gb ethernet to SCG3
Access at: dhara:5000
Has btsync server (try it - it's much better than Dropbox)

Compute Servers:

nandi: Kundaje and Phi Server
24 intel cores
256GB RAM
500GB of SSD storage 
36TB RAID6 local storage
4 Intel Phis (space for 4 more GPUs)


durga: Montgomery and sensitive data
24 intel cores
256GB RAM
500GB of SSD RAID0 storage 
60TB RAID6 local storage

mitra: Bassik and Web/DB Server
24 core
256GB RAM 
500GB of SSD RAID0 storage 
36TB RAID6 local storage

vayu: Kundaje GPU server
4 core
64GB RAM 
200GB of SSD storage 
8TB RAID10 local storage
4 Nvidia GTX 970 4GB GPUs

amold: Bickel and SGE server
32 AMD core
128GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

wotan: Bickel and SGE server
64 AMD core
256GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

Filesystem:

/users/$USER
default home directory
full backups nightly 
nfs mount to dhara
should store code, papers, and other highly processed data here

/mnt/data/
globally accessible data
should store common data here
e.g. genomes and indexes, annotations, ENCODE data  
if you don't want this to count towards your quota you must chown

/mnt/lab_data/$LAB/
lab accessible data
should store lab project data here 
e.g. ATAC-seq prediction data, enhancer prediction, motif calls

/srv/scratch/$USER
fast local storage
not backed up, but on RAID, and data will never be deleted
most analysis should be performed here

/srv/persistent/$USER
fast local storage
synced nightly, but not backed up
       i.e. if the hard drives fail or you delete something and notice 
       within 24 hours, we can recover it; otherwise not (unlike home, which is 
       properly backed up)
intermediate analysis products that would be hard to recover should be stored here 
       e.g. stochastic analysis results that need to be kept so that paper 
       results can be reproduced

/srv/www/$LABNAME/
web accessible from mitra.stanford.edu
*NOT BACKED UP*

Some parallel programming patterns:

# gzip a bunch of files
parallel gzip -- *.FILESTOGZIP
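
# the same pattern in python (a minimal sketch, not from the original notes:
# gzip_file and gzip_all are hypothetical helper names, and a thread pool
# from the standard library stands in for GNU parallel):

```python
import concurrent.futures
import glob
import gzip
import os
import shutil


def gzip_file(path):
    """Compress path to path + '.gz' and remove the original, as gzip does."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def gzip_all(pattern, nworkers=4):
    """Compress every file matching pattern, nworkers files at a time."""
    paths = sorted(glob.glob(pattern))
    with concurrent.futures.ThreadPoolExecutor(max_workers=nworkers) as pool:
        # map() returns results in the same order as paths
        return list(pool.map(gzip_file, paths))


if __name__ == "__main__":
    # "*.FILESTOGZIP" is the placeholder pattern from the shell example
    for gz_path in gzip_all("*.FILESTOGZIP"):
        print(gz_path)
```

concurrent.futures.ProcessPoolExecutor offers the same map() interface if the work needs separate processes.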

# fork example in python:
(for more detailed examples look at 
 https://github.com/nboley/grit/ grit/lib/multiprocessing_utils.py)

import os
import time
import random

import multiprocessing

class ProcessSafeOPStream( object ):
    def __init__( self, writeable_obj ):
        self.writeable_obj = writeable_obj
        self.lock = multiprocessing.Lock()
        self.name = self.writeable_obj.name
        return
    
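    def write( self, data ):
        # hold the lock so that writes from different processes do not
        # interleave; the with-block releases it even if write() raises
        with self.lock:
            self.writeable_obj.write( data )
            self.writeable_obj.flush()
        return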
    
    def close( self ):
        self.writeable_obj.close()

def worker(queue, ofp):
    # Try without this
    random.seed()
    while True:
        i = queue.get()
        # each worker stops when it pulls a 'FINISHED' sentinel
        if i == 'FINISHED': return
        # simulate an expensive function
        x = random.random()
        time.sleep(x/10)
        print(i, x)
        ofp.write("%i\t%s\n" % (i, x))

NSIMS = 10000
NPROC = 25

# populate queue
todo = multiprocessing.Queue()
for i in range(NSIMS): todo.put(i)
# one sentinel per worker, so that every worker terminates
for i in range(NPROC): todo.put('FINISHED')

ofp = ProcessSafeOPStream( open("output.txt", "w") )

pids = []
for i in range(NPROC):
    pid = os.fork()
    if pid == 0:
       # child process: do the work, then exit without running parent cleanup
       worker(todo, ofp)
       os._exit(0)
    else:
       pids.append(pid)

# wait for every child to finish
for pid in pids:
    os.waitpid(pid, 0)

ofp.close()

print("FINISHED")<br><br></pre>
<p>For use case 1 we obtained the following ENCODE and ROADMAP datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz">https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam">https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam">https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam</a>. Blacklisted regions were obtained from&nbsp;<a href="http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz">http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz</a>. The human genome version hg38 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz</a>.</p>
<p>For use case 2 we used the set of narrowPeak files summarized in&nbsp;<a href="https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt">https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt</a>&nbsp;(archived version v1.0.1). The human genome version hg19 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz</a>.</p>
<p>For use case 3 we used the ENCODE datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam">https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig">https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam">https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam</a>&nbsp;as well as the GENCODE annotation v29 from&nbsp;<a href="ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz">ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz</a>.</p><p>Address of the bookmark: <a href="http://mitra.stanford.edu/" rel="nofollow">http://mitra.stanford.edu/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/37049/chromomap-an-r-package-for-interactive-visualization-and-mapping-of-human-chromosomes</guid>
	<pubDate>Mon, 25 Jun 2018 17:22:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/37049/chromomap-an-r-package-for-interactive-visualization-and-mapping-of-human-chromosomes</link>
	<title><![CDATA[chromoMap-An R package for Interactive visualization and mapping of human chromosomes]]></title>
	<description><![CDATA[
<p>chromoMap is an R package that provides interactive, configurable and elegant graphics visualization of the human chromosomes, allowing users to map chromosome elements (like genes, SNPs etc.) on the chromosome plot. It introduces a special plot, the "chromosome heatmap", that, in addition to mapping elements, can visualize the data associated with chromosome elements (like gene expression) in the form of heat colors, which can be highly advantageous in scientific interpretation and research work. Because of the enormous size of the chromosomes, it is impractical to visualize each element on the same plot. Instead, chromoMap plots provide a magnified view of each chromosome location to render additional information and visualization specific to that location. You can map thousands of genes and view all mappings easily. Users can investigate detailed information about the mappings (like gene names or total genes mapped on a location) or view the magnified single- or double-stranded view of the chromosome at a location, showing each mapped element in sequential order (you will see this in the demos below). Not only that, the plots can be saved as HTML documents that can be customized and shared easily. In addition, you can include them in R Markdown or in R Shiny applications.</p>

<p>https://cran.r-project.org/web/packages/chromoMap/index.html</p>
]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36827/sex-detector-a-probabilistic-approach-to-study-sex-chromosomes-in-non-model-organisms</guid>
	<pubDate>Wed, 30 May 2018 15:57:31 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36827/sex-detector-a-probabilistic-approach-to-study-sex-chromosomes-in-non-model-organisms</link>
	<title><![CDATA[SEX-DETector: A Probabilistic Approach to Study Sex Chromosomes in Non-Model Organisms]]></title>
	<description><![CDATA[<p>SEX-DETector is a probabilistic method that relies on RNAseq data from a cross (parents and progeny of each sex) to infer autosomal and sex-linked genes (genes located on the non-recombining part of sex chromosomes).</p>
<h3>How does SEX-DETector work?</h3>
<p>SEX-DETector does not require prior sequencing of a reference genome: the same sequencing data can be used for the assembly and for the mapping of the reads. Full documentation of the pipeline can be found&nbsp;<a href="https://lbbe.univ-lyon1.fr/IMG/pdf/sex-detector_user_manual.pdf?1294/78de9ae01fbe949e85db7b4392a7854efeba225d">here</a>.</p>
<ul>
<li>We recommend&nbsp;<a href="http://github.com/trinityrnaseq/trinityrnaseq/wiki">Trinity</a>&nbsp;for the assembly.</li>
<li>Trinity components should be merged with&nbsp;<a href="http://seq.cs.iastate.edu/cap3.html">cap3</a>. Our code to perform the merging is available&nbsp;<a href="http://lbbe.univ-lyon1.fr/IMG/zip/cap3_on_trinity_output-2.zip?1517/9ee57874639c69f96319b15e301705489ffce5ce">here</a>.</li>
<li>We recommend&nbsp;<a href="http://bio-bwa.sourceforge.net/">BWA</a>&nbsp;for mapping of the reads.</li>
<li>When the mapping has been performed, the individuals need to be genotyped; SEX-DETector takes files produced by Reads2snp (which is available for download on the&nbsp;<a href="http://kimura.univ-montp2.fr/PopPhyl/index.php?section=tools">PopPhyl website</a>) as input.</li>
</ul><p>Address of the bookmark: <a href="http://lbbe.univ-lyon1.fr/-SEX-DETector-.html?lang=eg" rel="nofollow">http://lbbe.univ-lyon1.fr/-SEX-DETector-.html?lang=eg</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38577/genoviz-visualization-software-for-genomics</guid>
	<pubDate>Wed, 02 Jan 2019 04:07:57 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38577/genoviz-visualization-software-for-genomics</link>
	<title><![CDATA[GenoViz: Visualization software for genomics]]></title>
	<description><![CDATA[<p><span>GenoViz provides software applications and re-usable components for data visualization and data sharing in genomics. Our flagship product is Integrated Genome Browser (IGB).</span><br><br><span>For more information about IGB, visit&nbsp;</span><a href="http://bioviz.org/" target="_blank">http://bioviz.org<span></span></a><span>.</span><br><br><span>Source code for the project was hosted here for many years. In 2014, we moved to a new git repository at&nbsp;</span><a href="http://www.bitbucket.org/lorainelab/integrated-genome-browser" target="_blank">http://www.bitbucket.org/lorainelab/integrated-genome-browser<span></span></a><span>. We are still using SourceForge to distribute new releases of IGB as compiled code (igb.zip) you can use to run IGB on your computer.&nbsp;</span><br><br><span>If you have questions, feel free to get in touch. Contact project head Ann Loraine (</span><a href="mailto:aloraine@uncc.edu" target="_blank">aloraine@uncc.edu<span></span></a><span>) or lead developer David Norris (</span><a href="mailto:dcnorris@uncc.edu" target="_blank">dcnorris@uncc.edu<span></span></a><span>).</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/genoviz/" rel="nofollow">https://sourceforge.net/projects/genoviz/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/11592/xampp-starting-apache-fail-ubuntu</guid>
	<pubDate>Sat, 07 Jun 2014 05:52:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/11592/xampp-starting-apache-fail-ubuntu</link>
	<title><![CDATA[XAMPP: Starting Apache fail Ubuntu]]></title>
	<description><![CDATA[<p>After you install XAMPP on Linux, the most common problem is that Apache fails to start, usually because the system's own Apache, MySQL or ProFTPD services are already running. To fix this, stop those services first and then start XAMPP:</p><p>sudo /etc/init.d/apache2 stop</p><p>sudo /etc/init.d/mysql stop</p><p>sudo /etc/init.d/proftpd stop</p><p>sudo /opt/lampp/lampp start</p><p>&nbsp;</p><p><strong>phpMyAdmin &ldquo;Wrong permissions on configuration file, should not be world writable!&rdquo;</strong></p><p>After XAMPP is installed, the phpMyAdmin configuration file may have been left world-writable, which triggers this error. To fix it, restrict the file's permissions with chmod 0755:</p><pre>sudo chmod 0755 config.inc.php</pre>]]></description>
	<dc:creator>Ram Yash Pal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28269/4dgenome</guid>
	<pubDate>Mon, 04 Jul 2016 00:44:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28269/4dgenome</link>
	<title><![CDATA[4DGenome]]></title>
	<description><![CDATA[<p><span>Records in 4DGenome are compiled through comprehensive literature curation of experimentally-derived and computationally-predicted interactions. The current release contains 4,433,071 experimentally-derived and 3,605,176 computationally-predicted interactions in 5 organisms. Experimental data cover both high-throughput datasets and individual focused studies.&nbsp;</span><br><br><span>All interaction data are freely available in a standardized file format. Records can be queried by genomic regions, gene names, organism, and detection technology.&nbsp;</span></p><p>Address of the bookmark: <a href="http://4dgenome.research.chop.edu/" rel="nofollow">http://4dgenome.research.chop.edu/</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38556/reactome-pathway-database</guid>
	<pubDate>Mon, 31 Dec 2018 02:41:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38556/reactome-pathway-database</link>
	<title><![CDATA[Reactome Pathway Database]]></title>
	<description><![CDATA[<p><span>REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database. Our goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education. Founded in 2003, the Reactome project is led by Lincoln Stein of&nbsp;</span><a href="http://oicr.on.ca/">OICR</a><span>, Peter D&rsquo;Eustachio of&nbsp;</span><a href="http://nyulangone.org/">NYULMC</a><span>, Henning Hermjakob of&nbsp;</span><a href="http://www.ebi.ac.uk/">EMBL-EBI</a><span>, and Guanming Wu of&nbsp;</span><a href="http://www.ohsu.edu/">OHSU</a><span>.</span></p><p>Address of the bookmark: <a href="https://reactome.org/" rel="nofollow">https://reactome.org/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/44640/new-blast-core-nucleotide-database-core-nt</guid>
	<pubDate>Tue, 13 Aug 2024 07:12:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/44640/new-blast-core-nucleotide-database-core-nt</link>
	<title><![CDATA[New BLAST Core Nucleotide Database (core_nt)]]></title>
	<description><![CDATA[<p><span>The Core Nucleotide Database (core_nt) is now the default nucleotide BLAST database. Core_nt is also available on the command line. You get faster searches &amp; more focused results.</span></p><p><span><span>Core_nt contains the same eukaryotic transcript and gene-related sequences as nt. The core_nt database is nt without most eukaryotic chromosome sequences. Most nucleotide BLAST searches with core_nt will be similar to the nt database. However, core_nt is better than nt for accomplishing your most common BLAST search goals, such as identifying gene-related sequences like transcript sequences and complete bacterial chromosomes. This is because, in recent years, nt has acquired more low-relevance, non-annotated, and non-gene&nbsp;<span>content.&nbsp;</span></span></span></p><p><span> Learn more:&nbsp;https://ncbiinsights.ncbi.nlm.nih.gov/2024/07/18/new-blast-core-nucleotide-database/</span></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/39865/blast-nr-version-5-database-nr-v5</guid>
	<pubDate>Fri, 23 Aug 2019 11:35:35 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/39865/blast-nr-version-5-database-nr-v5</link>
	<title><![CDATA[BLAST nr version 5 database, (nr_v5)]]></title>
	<description><![CDATA[<p>NCBI has made changes to the nr version 5 database (nr_v5) to facilitate better search results and improved performance by reducing the number of redundant titles in the nr_v5 database used by webBLAST, which is also available to&nbsp;BLAST+ users.</p><p>The changes in nr preserve the taxonomic diversity of the entries in the database while reducing the number of titles for identical sequences. GenPept accessions are still accessible via&nbsp;<a href="http://www.ncbi.nlm.nih.gov/protein/$GENBANK_ACCESSION" target="_blank">www.ncbi.nlm.nih.gov/protein/$GENBANK_ACCESSION</a>&nbsp;or the IPG website&nbsp;<a href="https://www.ncbi.nlm.nih.gov/ipg/" target="_blank">https://www.ncbi.nlm.nih.gov/ipg/</a>.</p><p>The "Identical Proteins" link in the alignments section of the webBLAST results takes you to a full list of all accessions associated with a sequence.</p><p>For&nbsp;BLAST+ users downloading nr_v5: the database is now approximately 50% smaller, resulting in faster downloads and&nbsp;BLAST&nbsp;searches, and smaller disk space requirements. The database is downloadable at:&nbsp;<a href="ftp://ftp.ncbi.nlm.nih.gov/blast/db/v5/" target="_blank">ftp://ftp.ncbi.nlm.nih.gov/blast/db/v5/</a></p><p>For&nbsp;BLAST+ there is a cleanup script to help you manage the transition to this smaller database. The script removes unused database volumes:&nbsp;<a href="ftp://ftp.ncbi.nlm.nih.gov/blast/temp/cleanup-blastdb-volumes.py" target="_blank">ftp://ftp.ncbi.nlm.nih.gov/blast/temp/cleanup-blastdb-volumes.py</a></p><p>Here are the new rules on how we keep titles in nr_v5:</p><p>1.&nbsp;&nbsp;&nbsp; We keep all refseq, swissprot, pir and PDB titles.</p><p>2.&nbsp; &nbsp;&nbsp;We keep any GenPept titles with a TAXID that has not already been seen in the record.</p><p>3.&nbsp; &nbsp;&nbsp;We keep at least five GenPept titles regardless of whether the TAXIDS have been seen before or not in this record.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43011/deg-50-a-database-of-essential-genes-in-both-prokaryotes-and-eukaryotes</guid>
	<pubDate>Tue, 30 Mar 2021 11:47:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43011/deg-50-a-database-of-essential-genes-in-both-prokaryotes-and-eukaryotes</link>
	<title><![CDATA[DEG 5.0: a database of essential genes in both prokaryotes and eukaryotes]]></title>
	<description><![CDATA[<p><span>Essential genes are those indispensable for the survival of an organism, and their functions are therefore considered a foundation of life. Determination of a minimal gene set needed to sustain a life form, a fundamental question in biology, plays a key role in the emerging field of synthetic biology. </span></p>
<p><span></span><span>DEG is freely available at the website&nbsp;</span><a href="http://tubic.tju.edu.cn/deg" target="_blank">http://tubic.tju.edu.cn/deg</a><span>&nbsp;or&nbsp;</span><a href="http://www.essentialgene.org/" target="_blank">http://www.essentialgene.org</a><span>.</span></p><p>Address of the bookmark: <a href="http://www.essentialgene.org/" rel="nofollow">http://www.essentialgene.org/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>