<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/35131?offset=240</link>
	<atom:link href="https://bioinformaticsonline.com/related/35131?offset=240" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35292/pgap-x-extension-on-pan-genome-analysis-pipeline</guid>
	<pubDate>Tue, 23 Jan 2018 11:41:43 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35292/pgap-x-extension-on-pan-genome-analysis-pipeline</link>
	<title><![CDATA[PGAP-X: Extension on pan-genome analysis pipeline]]></title>
	<description><![CDATA[<p>PGAP-X is a microbial comparative genomic analysis platform with a graphical interface. A series of algorithms and methodologies has been developed and integrated to analyze and visualize genomic structural variation, gene distribution at different conservation levels, and genetic variation from a pan-genome perspective. Analytical results from many other programs, including genome alignments and ortholog clusters, can also be loaded into PGAP-X for further analysis or visualization. The workflow and a feature snapshot of PGAP-X are shown in Fig. 1 and Fig. 2.</p>
<div><img src="https://pgapx.ybzhao.com/image/f1.jpg" alt="image" style="border: 0px; border: 0px;"></div>
<div>&nbsp;</div>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://pgapx.ybzhao.com/" rel="nofollow">https://pgapx.ybzhao.com/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40298/environment-for-tree-exploration-ete-is-a-python-programming-toolkit-that-assists-in-the-recontruction-manipulation-analysis-and-visualization-of-phylogenetic-trees</guid>
	<pubDate>Wed, 27 Nov 2019 05:32:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40298/environment-for-tree-exploration-ete-is-a-python-programming-toolkit-that-assists-in-the-recontruction-manipulation-analysis-and-visualization-of-phylogenetic-trees</link>
	<title><![CDATA[Environment for Tree Exploration (ETE) is a Python programming toolkit that assists in the reconstruction, manipulation, analysis and visualization of phylogenetic trees]]></title>
	<description><![CDATA[<p><span>The Environment for Tree Exploration (ETE) is a Python programming toolkit that assists in the reconstruction, manipulation, analysis and visualization of phylogenetic trees (clustering trees and any other tree-like data structures are also supported).</span></p>
<p><span>Other tools</span></p>
<p><span><a href="https://github.com/shenwei356/taxonkit">https://github.com/shenwei356/taxonkit</a></span></p>
<p>&nbsp;</p>
<ul>
<li>ETE, version:&nbsp;<a href="https://pypi.org/project/ete3/3.1.1/">3.1.1</a></li>
<li>BioPython, version:&nbsp;<a href="https://pypi.org/project/biopython/1.73/">1.73</a></li>
<li>taxadb, version:&nbsp;<a href="https://pypi.org/project/taxadb/0.9.0">0.10.1</a></li>
<li>TaxonKit, version:&nbsp;<a href="https://github.com/shenwei356/taxonkit/releases/tag/0.10.1">0.5.0</a></li>
</ul><p>Address of the bookmark: <a href="https://pypi.org/project/ete3/3.1.1/" rel="nofollow">https://pypi.org/project/ete3/3.1.1/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42568/breedbase-is-a-comprehensive-breeding-management-and-analysis-software</guid>
	<pubDate>Wed, 06 Jan 2021 19:45:21 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42568/breedbase-is-a-comprehensive-breeding-management-and-analysis-software</link>
	<title><![CDATA[Breedbase is a comprehensive breeding management and analysis software]]></title>
	<description><![CDATA[<p><span>Breedbase is a comprehensive breeding management and analysis software. It can be used to design field layouts, collect phenotypic information using tablets, support the collection of genotyping samples in a field, store large amounts of high density genotypic information, and provide Genomic Selection related analyses and predictions. Breedbase supports the BrAPI standard.</span></p><p>Address of the bookmark: <a href="https://breedbase.org/" rel="nofollow">https://breedbase.org/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43815/kebabs-package-provides-functionality-for-kernel-based-analysis-of-biological-sequences-via-support-vector-machine-svm-based-methods</guid>
	<pubDate>Fri, 04 Mar 2022 00:14:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43815/kebabs-package-provides-functionality-for-kernel-based-analysis-of-biological-sequences-via-support-vector-machine-svm-based-methods</link>
	<title><![CDATA[kebabs: package provides functionality for kernel based analysis of biological sequences via Support Vector Machine (SVM) based methods]]></title>
	<description><![CDATA[<p><span>The&nbsp;</span><tt>kebabs</tt><span>&nbsp;package provides functionality for kernel based analysis of biological sequences via Support Vector Machine (SVM) based methods. Biological sequences include DNA, RNA, and amino acid (AA) sequences. Sequence kernels define similarity measures between sequences. The package implements some of the most important kernels for sequence analysis in a very flexible and efficient way and extends the standard position-independent functionality of these kernels in a novel way to take the position of patterns in the sequences into account for the similarity measure.</span></p>
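<p>To make the idea of a sequence kernel concrete, here is a minimal Python sketch of a position-independent spectrum (k-mer count) kernel. It is only an illustration and not the kebabs API (kebabs is an R package); the function names <tt>kmer_counts</tt> and <tt>spectrum_kernel</tt> and the default k=3 are assumptions made for this example.</p>
<pre>
```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq, ignoring where it occurs."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3, normalize=True):
    """Similarity of two sequences as the dot product of their k-mer counts.

    Hypothetical helper for illustration; with normalize=True the value is
    scaled so that a sequence compared with itself scores 1.0.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers missing from cb, so only shared k-mers add up
    dot = sum(n * cb[kmer] for kmer, n in ca.items())
    if not normalize:
        return float(dot)
    norm = sqrt(sum(n * n for n in ca.values()) * sum(n * n for n in cb.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(spectrum_kernel("ACGTACGT", "ACGTTCGT"))
```
</pre>
<p>A position-dependent kernel of the kind the package description mentions would additionally weight each shared pattern by how far apart it occurs in the two sequences; that step is left out of this sketch.</p>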
<p>http://www.bioinf.jku.at/software/kebabs/</p>
<p>http://bioconductor.org/packages/release/bioc/vignettes/kebabs/inst/doc/kebabs.pdf</p><p>Address of the bookmark: <a href="http://www.bioinf.jku.at/software/kebabs/" rel="nofollow">http://www.bioinf.jku.at/software/kebabs/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44541/powerful-books-for-learning-data-analysis-with-r</guid>
	<pubDate>Tue, 28 May 2024 07:42:56 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44541/powerful-books-for-learning-data-analysis-with-r</link>
	<title><![CDATA[Powerful books for learning data analysis with R]]></title>
	<description><![CDATA[<p><span>R is a powerful tool for data analysis, visualization, and machine learning. And it costs $0 to use! Here are five FREE books you can use to learn R today:</span></p>
<p><span>https://csgillespie.github.io/efficientR/</span></p>
<p><span>https://r-graphics.org/</span></p>
<p><span>https://rstudio-education.github.io/hopr/</span></p>
<p><span>https://r-pkgs.org/</span></p>
<p><span>https://r4ds.had.co.nz/</span></p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://r-graphics.org/" rel="nofollow">https://r-graphics.org/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34624/teacheng-teaching-engine-for-genomics</guid>
	<pubDate>Wed, 13 Dec 2017 17:55:23 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34624/teacheng-teaching-engine-for-genomics</link>
	<title><![CDATA[TeachEnG: Teaching Engine for Genomics]]></title>
	<description><![CDATA[<p>TeachEnG (pronounced &ldquo;teaching&rdquo;), a <span style="text-decoration: underline;">Teach</span>ing <span style="text-decoration: underline;">En</span>gine for <span style="text-decoration: underline;">G</span>enomics, provides educational games to help students and researchers understand key bioinformatics concepts. The current version includes interactive modules for sequence alignment and phylogenetic tree reconstruction algorithms, with accompanying video tutorials. <br><br> Please contact us via email (knoweng@illinois.edu) if you have any questions or suggestions.&nbsp;</p><p>Address of the bookmark: <a href="http://teacheng.illinois.edu/" rel="nofollow">http://teacheng.illinois.edu/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</guid>
	<pubDate>Tue, 23 Mar 2021 05:32:15 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42987/public-databases-for-bioinformatics</link>
	<title><![CDATA[Public Databases for Bioinformatics!]]></title>
	<description><![CDATA[<pre>https://www.nature.com/articles/s41467-020-17155-y<br><br>Server Infrastructure:

File Server:

dhara: Synology 3614 Storage Appliance
4 Core Xeon
108TB disk storage
10Gb ethernet to SCG3
Access at: dhara:5000
Has btsync server (try it - it's much better than Dropbox)

Compute Servers:

nandi: Kundaje and Phi Server
24 intel cores
256GB RAM
500GB of SSD storage 
36TB RAID6 local storage
4 Intel Phis (space for 4 more GPUs)


durga: Montgomery and sensitive data
24 intel cores
256GB RAM
500GB of SSD RAID0 storage 
60TB RAID6 local storage

mitra: Bassik and Web/DB Server
24 core
256GB RAM 
500GB of SSD RAID0 storage 
36TB RAID6 local storage

vayu: Kundaje GPU server
4 core
64GB RAM 
200GB of SSD storage 
8TB RAID10 local storage
4 Nvidia GTX 970 4GB GPUs

amold: Bickel and SGE server
32 AMD core
128GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

wotan: Bickel and SGE server
64 AMD core
256GB RAM 
200GB of SSD storage 
12TB RAID5 local storage

Filesystem:

/users/$USER
default home directory
full backups nightly 
nfs mount to dhara
should store code, papers, and other highly processed data here

/mnt/data/
globally accessible data
should store common data here
e.g. genomes and indexes, annotations, ENCODE data  
if you don't want this to count towards your quota you must chown

/mnt/lab_data/$LAB/
lab accessible data
should store lab project data here 
e.g. ATAC-seq prediction data, enhancer prediction, motif calls

/srv/scratch/$USER
fast local storage
not backed up, but on RAID, and data will never be deleted
most analysis should be performed here

/srv/persistent/$USER
fast local storage
synced nightly, but not backed up
       i.e., if the hard drives fail, or you delete something and notice
       within 24 hours, we can recover it. Otherwise not. (vs. home, which is
       properly backed up)
intermediate analysis products that would be hard to recover should be stored here 
       e.g. stochastic analysis results that need to be kept so that paper 
       results can be reproduced

/srv/www/$LABNAME/
web accessible from mitra.stanford.edu
*NOT BACKED UP*

Some parallel programming patterns:

# gzip a bunch of files
parallel gzip -- *.FILESTOGZIP

# fork example in Python 3:
# (for more detailed examples look at
#  https://github.com/nboley/grit/ grit/lib/multiprocessing_utils.py)

import os
import time
import random

import multiprocessing

class ProcessSafeOPStream(object):
    """Wrap a writeable file object so that forked processes can share it."""
    def __init__(self, writeable_obj):
        self.writeable_obj = writeable_obj
        # created before os.fork(), so every child inherits the same lock
        self.lock = multiprocessing.Lock()
        self.name = self.writeable_obj.name

    def write(self, data):
        # hold the lock across write + flush so that lines written by
        # different processes are never interleaved
        with self.lock:
            self.writeable_obj.write(data)
            self.writeable_obj.flush()

    def close(self):
        self.writeable_obj.close()

def worker(queue, ofp):
    # Try without this: on older Python versions forked children inherit
    # the parent's random state
    random.seed()
    while True:
        i = queue.get()
        # sentinel value: no more work for this worker
        if i == 'FINISHED': return
        # simulate an expensive function
        x = random.random()
        time.sleep(x/10)
        print(i, x)
        ofp.write("%i\t%s\n" % (i, x))

NSIMS = 10000
NPROC = 25

# populate queue: all work items first, then one sentinel per worker
todo = multiprocessing.Queue()
for i in range(NSIMS): todo.put(i)
for i in range(NPROC): todo.put('FINISHED')

ofp = ProcessSafeOPStream(open("output.txt", "w"))

pids = []
for i in range(NPROC):
    pid = os.fork()
    if pid == 0:
        # child: process queue items, then exit without running parent cleanup
        worker(todo, ofp)
        os._exit(0)
    else:
        pids.append(pid)

# parent: wait for every child to finish
for pid in pids:
    os.waitpid(pid, 0)

ofp.close()

print("FINISHED")<br><br></pre>
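<p>The fork-and-queue pattern above can also be written with the standard library's multiprocessing.Pool, which handles worker start-up, work distribution and shutdown itself. The following is a minimal sketch rather than a drop-in replacement: the names <tt>simulate</tt> and <tt>run_simulations</tt>, the per-task seeding and the chunk size are assumptions made for this illustration.</p>
<pre>
```python
import multiprocessing
import random


def simulate(i):
    """Hypothetical stand-in for an expensive function; returns (index, value)."""
    # seeding a private generator per task is an assumption of this sketch;
    # it makes every task repeatable regardless of which worker runs it
    rng = random.Random(i)
    return i, rng.random()


def run_simulations(nsims, nproc, out_path):
    """Run simulate() for range(nsims) on nproc workers and write the results."""
    written = 0
    with multiprocessing.Pool(processes=nproc) as pool, open(out_path, "w") as ofp:
        # results arrive in completion order; only the parent writes to the
        # file, so no lock around the output stream is needed
        for i, x in pool.imap_unordered(simulate, range(nsims), chunksize=64):
            ofp.write("%i\t%s\n" % (i, x))
            written += 1
    return written


if __name__ == "__main__":
    print(run_simulations(1000, 4, "output.txt"))
```
</pre>
<p>Because the workers only return values and the parent does all the writing, the process-safe stream wrapper is not needed in this variant. The trade-off is that every result passes through the parent process.</p>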
<p>For use case 1 we obtained the following ENCODE and ROADMAP datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz">https://www.encodeproject.org/files/ENCFF446WOD/@@download/ENCFF446WOD.bed.gz</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam">https://www.encodeproject.org/files/ENCFF546PJU/@@download/ENCFF546PJU.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam">https://www.encodeproject.org/files/ENCFF059BEU/@@download/ENCFF059BEU.bam</a>. Blacklisted regions were obtained from&nbsp;<a href="http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz">http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/hg38.blacklist.bed.gz</a>. The human genome version hg38 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz</a>.</p>
<p>For use case 2 we used the set of narrowPeak files summarized in&nbsp;<a href="https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt">https://github.com/wkopp/janggu_usecases/tree/master/extra/urls.txt</a>&nbsp;(archived version v1.0.1). The human genome version hg19 was obtained from&nbsp;<a href="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz">http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz</a></p>
<p>For use case 3 we used the ENCODE datasets&nbsp;<a href="https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam">https://www.encodeproject.org/files/ENCFF591XCX/@@download/ENCFF591XCX.bam</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig">https://www.encodeproject.org/files/ENCFF736LHE/@@download/ENCFF736LHE.bigWig</a>,&nbsp;<a href="https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam">https://www.encodeproject.org/files/ENCFF177HHM/@@download/ENCFF177HHM.bam</a>&nbsp;as well as the GENCODE annotation v29 from&nbsp;<a href="ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz">ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_29/gencode.v29.annotation.gtf.gz</a>.</p><p>Address of the bookmark: <a href="http://mitra.stanford.edu/" rel="nofollow">http://mitra.stanford.edu/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34482/ribbon-visualizing-complex-genome-alignments-and-structural-variation</guid>
	<pubDate>Wed, 29 Nov 2017 07:40:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34482/ribbon-visualizing-complex-genome-alignments-and-structural-variation</link>
	<title><![CDATA[Ribbon: Visualizing complex genome alignments and structural variation]]></title>
	<description><![CDATA[<p>Ribbon can be used for long reads, short reads, paired-end reads, and assembly/genome alignments. Instructions for each data format are available by clicking on "instructions" in each tab on the right.</p>
<p>Local installation:</p>
<p>You can install Ribbon locally from Github by following the instructions here:&nbsp;<a href="https://github.com/MariaNattestad/ribbon" target="_blank">https://github.com/MariaNattestad/Ribbon</a></p><p>Address of the bookmark: <a href="http://genomeribbon.com/" rel="nofollow">http://genomeribbon.com/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34571/mugsy-multiple-whole-genome-alignment-tool</guid>
	<pubDate>Fri, 08 Dec 2017 17:41:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34571/mugsy-multiple-whole-genome-alignment-tool</link>
	<title><![CDATA[Mugsy: multiple whole genome alignment tool]]></title>
	<description><![CDATA[<p><span>Mugsy is a multiple whole genome aligner. Mugsy uses Nucmer for pairwise alignment, a custom graph based segmentation procedure for identifying collinear regions, and the segment-based progressive multiple alignment strategy from Seqan::TCoffee. Mugsy accepts draft genomes in the form of multi-FASTA files and does not require a reference genome.</span></p>
<p>To cite Mugsy, use:</p>
<p>Angiuoli SV and Salzberg SL.&nbsp;<a href="http://bioinformatics.oxfordjournals.org/content/27/3/334">Mugsy: Fast multiple alignment of closely related whole genomes.</a> <em>Bioinformatics</em>&nbsp;2011 27(3):334-4</p><p>Address of the bookmark: <a href="http://mugsy.sourceforge.net/" rel="nofollow">http://mugsy.sourceforge.net/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34867/magic-blast-a-tool-for-mapping-large-next-generation-rna-or-dna-sequencing-runs-against-a-whole-genome-or-transcriptome</guid>
	<pubDate>Tue, 26 Dec 2017 22:23:39 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34867/magic-blast-a-tool-for-mapping-large-next-generation-rna-or-dna-sequencing-runs-against-a-whole-genome-or-transcriptome</link>
	<title><![CDATA[Magic-BLAST: a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome.]]></title>
	<description><![CDATA[<p>Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score that takes both reads of a pair into account simultaneously and, in the case of RNA-seq, locates the candidate introns and adds up the scores of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.</p>
<p>Magic-BLAST incorporates within the NCBI BLAST code framework ideas developed in the NCBI Magic pipeline, in particular hit extensions by local walk and jump&nbsp;<a href="http://www.ncbi.nlm.nih.gov/pubmed/26109056">(http://www.ncbi.nlm.nih.gov/pubmed/26109056)</a>, and recursive clipping of mismatches near the edges of the reads, which avoids accumulating artefactual mismatches near splice sites and is needed to distinguish short indels from substitutions near the edges.</p><p>Address of the bookmark: <a href="https://ncbi.github.io/magicblast/" rel="nofollow">https://ncbi.github.io/magicblast/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>