<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43614?offset=600</link>
	<atom:link href="https://bioinformaticsonline.com/related/43614?offset=600" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33741/diya-a-bacterial-annotation-pipeline-for-any-genomics-lab</guid>
	<pubDate>Fri, 30 Jun 2017 08:48:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33741/diya-a-bacterial-annotation-pipeline-for-any-genomics-lab</link>
	<title><![CDATA[DIYA: a bacterial annotation pipeline for any genomics lab]]></title>
	<description><![CDATA[<p><span>DIY Genomics is an open-source bioinformatics consortium that aims to put a collection of tools and libraries for sequence assembly and annotation into the hands of small-scale genomics labs. Projects include DIYA, MGAP, CRISPR, and DIYGV.</span></p>
<p><span>http://gmod.org/wiki/Diya</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/diyg/" rel="nofollow">https://sourceforge.net/projects/diyg/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44585/dram-distilled-and-refined-annotation-of-metabolism</guid>
	<pubDate>Sat, 06 Jul 2024 04:19:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44585/dram-distilled-and-refined-annotation-of-metabolism</link>
	<title><![CDATA[DRAM: Distilled and Refined Annotation of Metabolism]]></title>
	<description><![CDATA[<p><span>DRAM (Distilled and Refined Annotation of Metabolism) is a tool for annotating metagenome-assembled genomes (MAGs) and&nbsp;</span><a href="https://github.com/simroux/VirSorter">VirSorter</a><span>-identified viral contigs. DRAM annotates MAGs and viral contigs using&nbsp;</span><a href="https://www.kegg.jp/">KEGG</a><span>&nbsp;(if provided by the user),&nbsp;</span><a href="https://www.uniprot.org/">UniRef90</a><span>,&nbsp;</span><a href="https://pfam.xfam.org/">PFAM</a><span>,&nbsp;</span><a href="http://bcb.unl.edu/dbCAN2/">dbCAN</a><span>,&nbsp;</span><a href="https://www.ncbi.nlm.nih.gov/genome/viruses/">RefSeq viral</a><span>,&nbsp;</span><a href="http://vogdb.org/">VOGDB</a><span>&nbsp;and the&nbsp;</span><a href="https://www.ebi.ac.uk/merops/">MEROPS</a><span>&nbsp;peptidase database, as well as custom user databases. DRAM runs in two stages: first, an annotation step assigns database identifiers to genes; then a distill step curates these annotations into useful functional categories. Additionally, viral contigs are analyzed further to identify potential auxiliary metabolic genes (AMGs). This is done by assigning an auxiliary score and flags representing the confidence that a gene is both metabolic and viral.</span></p>
<p><img src="https://genomicsaotearoa.github.io/metagenomics_summer_school/figures/ex14_DRAM_annotation_rank.png" alt="image" style="border: 0px;"></p>
<p>Ref&nbsp;https://genomicsaotearoa.github.io/metagenomics_summer_school/day4/ex15_gene_annotation_part3/#overview-of-drampy-annotate-output&nbsp;</p><p>Address of the bookmark: <a href="https://github.com/WrightonLabCSU/DRAM" rel="nofollow">https://github.com/WrightonLabCSU/DRAM</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37965/kobas-a-web-server-for-geneprotein-functional-annotation-and-functional-gene-set-enrichment</guid>
	<pubDate>Fri, 19 Oct 2018 09:36:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37965/kobas-a-web-server-for-geneprotein-functional-annotation-and-functional-gene-set-enrichment</link>
	<title><![CDATA[KOBAS: a web server for gene/protein functional annotation and functional gene set enrichment]]></title>
	<description><![CDATA[<p><span>KOBAS 3.0 is a web server for gene/protein functional annotation (</span><a href="http://kobas.cbi.pku.edu.cn/annotate.php">Annotate</a><span>&nbsp;module) and functional gene set enrichment (Enrichment module). The Annotate module accepts a gene list as input, given as IDs or sequences, and generates annotations for each gene based on multiple databases of pathways, diseases, and Gene Ontology terms. The Enrichment module accepts either a gene list or gene expression data as input and reports the enriched gene sets with their names, a p-value or probability of enrichment, and an enrichment score, based on the results of multiple methods.</span></p><p>Address of the bookmark: <a href="http://kobas.cbi.pku.edu.cn/" rel="nofollow">http://kobas.cbi.pku.edu.cn/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38659/detail-annotation-of-genes</guid>
	<pubDate>Fri, 11 Jan 2019 05:23:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38659/detail-annotation-of-genes</link>
	<title><![CDATA[Detailed annotation of genes]]></title>
	<description><![CDATA[<p>gene_info recalculated daily<br>---------------------------------------------------------------------------<br> tab-delimited<br> one line per GeneID<br> Column header line is the first line in the file.<br> Note: subsets of gene_info are available in the DATA/GENE_INFO<br> directory (described later)<br>---------------------------------------------------------------------------</p>
<p>tax_id:<br> the unique identifier provided by NCBI Taxonomy<br> for the species or strain/isolate</p>
<p>GeneID:<br> the unique identifier for a gene<br> ASN1: geneid</p>
<p>Symbol:<br> the default symbol for the gene<br> ASN1: gene-&gt;locus</p>
<p>LocusTag:<br> the LocusTag value<br> ASN1: gene-&gt;locus-tag</p>
<p>Synonyms:<br> bar-delimited set of unofficial symbols for the gene</p>
<p>dbXrefs:<br> bar-delimited set of identifiers in other databases<br> for this gene. The unit of the set is database:value.<br> Note that HGNC and MGI include 'HGNC' and 'MGI', respectively,<br> in the value part of their identifier. Consequently,<br> dbXrefs for these databases will appear like:<br> HGNC:HGNC:1100<br> This would be interpreted as database='HGNC', value='HGNC:1100'<br> Example for MGI:<br> MGI:MGI:104537<br> This would be interpreted as database='MGI', value='MGI:104537'</p>
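<p>A minimal Python sketch of how a dbXrefs field could be split under the rule described above: entries are separated on the bar, and each entry is split on its first colon only, so the HGNC and MGI prefixes stay inside the value. The function name <code>parse_dbxrefs</code> is hypothetical, and treating '-' as an empty field is an assumption borrowed from how other columns in this file mark missing values.</p>

```python
def parse_dbxrefs(field):
    """Split a bar-delimited dbXrefs field into (database, value) pairs."""
    # Assumption: '-' marks an empty field, as it does for other columns.
    if field in ("", "-"):
        return []
    pairs = []
    for entry in field.split("|"):
        # Split on the first colon only, so 'HGNC:HGNC:1100' keeps
        # 'HGNC:1100' intact as the value.
        database, _, value = entry.partition(":")
        pairs.append((database, value))
    return pairs


print(parse_dbxrefs("HGNC:HGNC:1100"))
print(parse_dbxrefs("MGI:MGI:104537"))
```

<p>Splitting on every colon instead would break the HGNC and MGI identifiers into three parts, which is the case the note above warns about.</p>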
<p>chromosome:<br> the chromosome on which this gene is placed.<br> for mitochondrial genomes, the value 'MT' is used.</p>
<p>map location:<br> the map location for this gene</p>
<p>description:<br> a descriptive name for this gene</p>
<p>type of gene:<br> the type assigned to the gene according to the list of options<br> provided in https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/objects/entrezgene/entrezgene.asn</p>
<p>Symbol from nomenclature authority:<br> when not '-', indicates that this symbol is from a<br> nomenclature authority</p>
<p>Full name from nomenclature authority:<br> when not '-', indicates that this full name is from a<br> nomenclature authority</p>
<p>Nomenclature status:<br> when not '-', indicates the status of the name from the <br> nomenclature authority (O for official, I for interim)</p>
<p>Other designations:<br> pipe-delimited set of some alternate descriptions that<br> have been assigned to a GeneID<br> '-' indicates none is being reported.</p>
<p>Modification date:<br> the last date a gene record was updated, in YYYYMMDD format</p>
<p>Feature type:<br> pipe-delimited set of annotated features and their classes or <br> controlled vocabularies, displayed as feature_type:feature_class <br> or feature_type:controlled_vocabulary, when appropriate; derived <br> from select feature annotations on RefSeq(s) associated with the <br> GeneID</p><p>Address of the bookmark: <a href="ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/" rel="nofollow">ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39881/apollo-a-sequence-annotation-editor</guid>
	<pubDate>Tue, 27 Aug 2019 08:08:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39881/apollo-a-sequence-annotation-editor</link>
	<title><![CDATA[Apollo: a sequence annotation editor]]></title>
	<description><![CDATA[<p><span>The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool that allows biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them.</span></p><p>Address of the bookmark: <a href="https://genomebiology.biomedcentral.com/articles/10.1186/gb-2002-3-12-research0082" rel="nofollow">https://genomebiology.biomedcentral.com/articles/10.1186/gb-2002-3-12-research0082</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42491/eukulele-taxonomic-annotation-of-the-unsung-eukaryotic-microbes</guid>
	<pubDate>Sat, 26 Dec 2020 12:10:17 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42491/eukulele-taxonomic-annotation-of-the-unsung-eukaryotic-microbes</link>
	<title><![CDATA[EUKulele: Taxonomic annotation of the unsung eukaryotic microbes]]></title>
	<description><![CDATA[<p><span>EUKulele is an open-source software tool designed to assign taxonomy to microeukaryotes detected in meta-omic samples. It complements analysis approaches in other domains by accommodating assembly output and by reporting concrete metrics of the taxonomic completeness of each sample.</span></p><p>Address of the bookmark: <a href="https://github.com/AlexanderLabWHOI/EUKulele" rel="nofollow">https://github.com/AlexanderLabWHOI/EUKulele</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43877/crowdgo-machine-learning-and-semantic-similarity-guided-consensus-gene-ontology-annotation</guid>
	<pubDate>Thu, 26 May 2022 00:59:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43877/crowdgo-machine-learning-and-semantic-similarity-guided-consensus-gene-ontology-annotation</link>
	<title><![CDATA[CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation]]></title>
	<description><![CDATA[<p dir="auto">CrowdGO is a protein Gene Ontology predictor that takes a meta approach: it analyzes the predictions of other tools to improve on their precision and recall.</p>
<p dir="auto">Please note that the CrowdGO snakemake workflow is currently only tested on Ubuntu. It should work on OSX, but please report any errors to <a href="mailto:maarten.reijnders@unil.ch">maarten.reijnders@unil.ch</a> or create an issue.</p>
<p>https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010075</p><p>Address of the bookmark: <a href="https://gitlab.com/mreijnders/crowdgo" rel="nofollow">https://gitlab.com/mreijnders/crowdgo</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33847/omega2-metagenome-assembly-pipeline</guid>
	<pubDate>Mon, 10 Jul 2017 05:56:07 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33847/omega2-metagenome-assembly-pipeline</link>
	<title><![CDATA[Omega2: metagenome assembly pipeline]]></title>
	<description><![CDATA[<p><span>Omega found overlaps between reads using a prefix/suffix hash table. The overlap graph of reads was simplified by removing transitive edges and trimming short branches. Unitigs were generated based on minimum cost flow analysis of the overlap graph and then merged to contigs and scaffolds using mate-pair information. In comparison with three de Bruijn graph assemblers (SOAPdenovo, IDBA-UD and MetaVelvet), Omega provided comparable overall performance on a HiSeq 100-bp dataset and superior performance on a MiSeq 300-bp dataset. In comparison with Celera on the MiSeq dataset, Omega provided more continuous assemblies overall using a fraction of the computing time of existing overlap-layout-consensus assemblers. This indicates Omega can more efficiently assemble longer Illumina reads, and at deeper coverage, for metagenomic datasets.</span></p><p>Address of the bookmark: <a href="http://omega.omicsbio.org/" rel="nofollow">http://omega.omicsbio.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34416/miniasm-very-fast-olc-based-de-novo-assembler-for-noisy-long-reads</guid>
	<pubDate>Mon, 27 Nov 2017 07:58:49 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34416/miniasm-very-fast-olc-based-de-novo-assembler-for-noisy-long-reads</link>
	<title><![CDATA[miniasm: very fast OLC-based de novo assembler for noisy long reads]]></title>
	<description><![CDATA[<p>Miniasm is a very fast OLC-based&nbsp;<em>de novo</em>&nbsp;assembler for noisy long reads. It takes all-vs-all read self-mappings (typically produced by&nbsp;<a href="https://github.com/lh3/minimap">minimap</a>) as input and outputs an assembly graph in the&nbsp;<a href="https://github.com/pmelsted/GFA-spec/blob/master/GFA-spec.md">GFA</a>&nbsp;format. Unlike mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final&nbsp;<a href="http://wgs-assembler.sourceforge.net/wiki/index.php/Celera_Assembler_Terminology">unitig</a>&nbsp;sequences. Thus the per-base error rate is similar to that of the raw input reads.</p>
<p>So far, miniasm is at an early stage of development. It has only been tested on a dozen PacBio and Oxford Nanopore (ONT) bacterial data sets. Including the mapping step, it takes about 3 minutes to assemble a bacterial genome. Under the default settings, miniasm assembles 9 out of 12 PacBio data sets and 3 out of 4 ONT data sets into a single contig. The 12 PacBio data sets are&nbsp;<a href="https://github.com/PacificBiosciences/DevNet/wiki/E.-coli-Bacterial-Assembly">PacBio E. coli sample</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS473430">ERS473430</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS544009">ERS544009</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS554120">ERS554120</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS605484">ERS605484</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS617393">ERS617393</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS646601">ERS646601</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS659581">ERS659581</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS670327">ERS670327</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS685285">ERS685285</a>,&nbsp;<a href="http://www.ebi.ac.uk/ena/data/view/ERS743109">ERS743109</a>&nbsp;and a&nbsp;<a href="https://github.com/PacificBiosciences/DevNet/wiki/E.-coli-20kb-Size-Selected-Library-with-P6-C4/ce0533c1d2a957488594f0b29da61ffa3e4627e8">deprecated PacBio E. coli data set</a>. The ONT data were obtained from the&nbsp;<a href="http://lab.loman.net/2015/09/24/first-sqk-map-006-experiment/">Loman Lab</a>.</p>
<p>For a&nbsp;<em>C. elegans</em>&nbsp;<a href="https://github.com/PacificBiosciences/DevNet/wiki/C.-elegans-data-set">PacBio data set</a>&nbsp;(only 40X is used, not the whole data set), miniasm finishes the assembly, including read overlapping, in ~10 minutes with 16 CPUs. The total assembly size is 105Mb; the N50 is 1.94Mb. In comparison,&nbsp;<a href="https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/HGAP">HGAP3</a>&nbsp;produces a 104Mb assembly with an N50 of 1.61Mb.&nbsp;<a href="http://lh3lh3.users.sourceforge.net/download/ce-miniasm.png">This dotter plot</a>&nbsp;gives a global view of the miniasm assembly (on the X axis) and the HGAP3 assembly (on the Y axis). They are broadly comparable. Of course, the HGAP3 consensus sequences are much more accurate. In addition, on the whole data set (assembled in ~30 min), the miniasm N50 drops to 1.79Mb. Miniasm still needs improvement.</p>
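<p>For readers unfamiliar with the N50 figures quoted above, here is a small Python sketch of the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the assembly size. The function name <code>n50</code> and the example lengths are illustrative only and are not taken from miniasm or HGAP3.</p>

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Made-up contig lengths, for illustration only.
print(n50([10, 20, 30, 40]))
```

<p>Because N50 depends only on contig lengths, it measures contiguity and says nothing about per-base accuracy.</p>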
<p>Miniasm confirms that, at least for high-coverage bacterial genomes, it is possible to generate long contigs from raw PacBio or ONT reads without error correction. It also shows that&nbsp;<a href="https://github.com/lh3/minimap">minimap</a>&nbsp;can be used as a read overlapper, even though it is probably not as sensitive as more sophisticated overlappers such as&nbsp;<a href="https://github.com/marbl/MHAP">MHAP</a>&nbsp;and&nbsp;<a href="https://github.com/thegenemyers/DALIGNER">DALIGNER</a>. Coupled with long-read error correctors and consensus tools, miniasm may also be useful for producing high-quality assemblies.</p>
<p>Minimap and miniasm are ultrafast tools for (i) mapping and (ii) assembly. Designed for long, noisy reads, they do not have a correction or consensus step, and therefore the resulting assemblies are contiguous (i.e. long) but very noisy (i.e. full of errors).</p>
<p>We start with an all-against-all comparison:</p>
<div>
<pre><code>minimap -Sw5 -L100 -m0 -t8 reads.fq reads.fq | gzip -1 &gt; reads.paf.gz
</code></pre>
</div>
<p>Then we can assemble:</p>
<div>
<pre><code>miniasm -f reads.fq reads.paf.gz &gt; reads.gfa
</code></pre>
</div>
<p>Convert GFA to FASTA:</p>
<div>
<pre><code>awk '/^S/{print "&gt;"$2"\n"$3}' reads.gfa | fold &gt; reads.fa
</code></pre>
</div>
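<p>For anyone who prefers to do the GFA-to-FASTA step in a script, the following Python sketch reads segment (S) lines the same way the awk one-liner does and also counts them. It assumes tab-separated GFA fields, does not wrap sequence lines the way <code>fold</code> does, and uses made-up segment names and sequences purely for illustration.</p>

```python
def gfa_to_fasta(gfa_lines):
    """Yield (name, sequence) for each segment (S) line of a GFA file."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # Segment lines look like: S, name, sequence, then optional tags.
        if fields[0] == "S" and len(fields) >= 3:
            yield fields[1], fields[2]


# Hypothetical GFA content; a real run would iterate over open("reads.gfa").
example = [
    "H\tVN:Z:1.0\n",
    "S\tutg000001l\tACGTACGT\tLN:i:8\n",
    "L\tutg000001l\t+\tutg000002l\t-\t0M\n",
    "S\tutg000002l\tGGCC\n",
]
records = list(gfa_to_fasta(example))
for name, sequence in records:
    print(">" + name)
    print(sequence)
print(len(records), "contigs")
```

<p>Header and link lines are skipped, so the number of records equals the number of sequences in the resulting FASTA file.</p>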
<p>Then count the contigs:</p>
<div>
<pre><code>grep "&gt;" reads.fa | wc -l</code></pre>
</div>
<pre># Download sample PacBio from the PBcR website
wget -O- http://www.cbcb.umd.edu/software/PBcR/data/selfSampleData.tar.gz | tar zxf -
ln -s selfSampleData/pacbio_filtered.fastq reads.fq
# Install minimap and miniasm (requiring gcc and zlib)
git clone https://github.com/lh3/minimap &amp;&amp; (cd minimap &amp;&amp; make)
git clone https://github.com/lh3/miniasm &amp;&amp; (cd miniasm &amp;&amp; make)
# Overlap
minimap/minimap -Sw5 -L100 -m0 -t8 reads.fq reads.fq | gzip -1 &gt; reads.paf.gz
# Layout
miniasm/miniasm -f reads.fq reads.paf.gz &gt; reads.gfa</pre><p>Address of the bookmark: <a href="https://github.com/lh3/miniasm" rel="nofollow">https://github.com/lh3/miniasm</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36621/hapcut2-robust-and-accurate-haplotype-assembly-for-diverse-sequencing-technologies</guid>
	<pubDate>Tue, 15 May 2018 07:35:26 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36621/hapcut2-robust-and-accurate-haplotype-assembly-for-diverse-sequencing-technologies</link>
	<title><![CDATA[HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies]]></title>
	<description><![CDATA[<p>HapCUT2 is a maximum-likelihood-based tool for assembling haplotypes from DNA sequence reads, designed to "just work" with excellent speed and accuracy. We found that previously described haplotype assembly methods are specialized for specific read technologies or protocols, with slow or inaccurate performance on others. With this in mind, HapCUT2 is designed for speed and accuracy across diverse sequencing technologies, including but not limited to:</p>
<ul>
<li>NGS short reads (Illumina HiSeq)</li>
<li>clone-based sequencing (Fosmid or BAC clones)</li>
<li>SMRT reads (PacBio)</li>
<li>Oxford Nanopore reads</li>
<li>10X Genomics Linked-Reads</li>
<li>proximity-ligation (Hi-C) reads</li>
<li>high-coverage sequencing (&gt;40x coverage-per-SNP) using the above technologies</li>
<li>combinations of the above technologies (e.g. scaffold long reads with Hi-C reads)</li>
</ul>
<p>See the project README at the bookmark address for specific examples of command line options and best practices for some of these technologies.</p>
<p>NOTE: At this time HapCUT2 is for diploid organisms only. VCF input should contain diploid variants.</p>
<p>If you use HapCUT2 in your research, please cite:</p>
<p>Edge, P., Bafna, V. &amp; Bansal, V. HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies. Genome Res. gr.213462.116 (2016). doi:10.1101/gr.213462.116</p><p>Address of the bookmark: <a href="https://github.com/vibansal/HapCUT2" rel="nofollow">https://github.com/vibansal/HapCUT2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>