<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43607?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/related/43607?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42963/davi-deep-learning-based-tool-for-alignment-and-single-nucleotide-variant-identification</guid>
	<pubDate>Tue, 16 Mar 2021 05:41:33 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42963/davi-deep-learning-based-tool-for-alignment-and-single-nucleotide-variant-identification</link>
	<title><![CDATA[DAVI: Deep learning-based tool for alignment and single nucleotide variant identification]]></title>
	<description><![CDATA[<p>DAVI consists of models for both global and local alignment and for variant calling. We have evaluated the performance of DAVI against existing state-of-the-art tool sets and found that its accuracy and performance are comparable to those of existing tools used for benchmarking. We further demonstrate that while existing tools are based on data generated from a specific sequencing technology, the models proposed in DAVI are generic and can be used across different NGS technologies as well as across different species.</p>
<p>https://iopscience.iop.org/article/10.1088/2632-2153/ab7e19/pdf</p><p>Address of the bookmark: <a href="https://github.com/gguptaiitd/NEAT" rel="nofollow">https://github.com/gguptaiitd/NEAT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39383/geck-trio-based-comparative-benchmarking-of-variant-calls</guid>
	<pubDate>Sun, 19 May 2019 20:54:12 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39383/geck-trio-based-comparative-benchmarking-of-variant-calls</link>
	<title><![CDATA[geck: trio-based comparative benchmarking of variant calls]]></title>
	<description><![CDATA[<p><span>We determine the accuracy of our model by comparing the precision and recall of GATK Unified Genotyper and Haplotype Caller on the high-confidence SNPs of the NIST Ashkenazim trio and the two independent Platinum Genome trios. We show that our method is able to estimate&nbsp;</span><em>differential</em><span>&nbsp;precision and recall between the two pipelines with&nbsp;</span><span>10<sup>&minus;3</sup></span><span>&nbsp;uncertainty.</span></p><p>Address of the bookmark: <a href="https://github.com/sbg/geck" rel="nofollow">https://github.com/sbg/geck</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41146/lofreq-a-sequence-quality-aware-ultra-sensitive-variant-caller-for-ngs-data</guid>
	<pubDate>Tue, 18 Feb 2020 03:24:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41146/lofreq-a-sequence-quality-aware-ultra-sensitive-variant-caller-for-ngs-data</link>
	<title><![CDATA[LoFreq*: A sequence-quality aware, ultra-sensitive variant caller for NGS data]]></title>
	<description><![CDATA[<p>LoFreq* (i.e. LoFreq version 2) is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing (e.g. mapping or base/indel alignment uncertainty), which are usually ignored by other methods or only used for filtering.</p>
<p>https://github.com/CSB5/lofreq</p>
<p>http://csb5.github.io/lofreq/installation/</p>
<p>https://github.com/CSB5/lofreq/tree/master/dist</p><p>Address of the bookmark: <a href="http://csb5.github.io/lofreq/" rel="nofollow">http://csb5.github.io/lofreq/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43661/maftools</guid>
	<pubDate>Fri, 17 Dec 2021 03:18:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43661/maftools</link>
	<title><![CDATA[maftools]]></title>
	<description><![CDATA[<p>With advances in cancer genomics, <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a> (MAF) is being widely accepted and used to store detected somatic variants. <a href="http://cancergenome.nih.gov">The Cancer Genome Atlas</a> Project has sequenced over 30 different cancers, with the sample size for each cancer type exceeding 200. The <a href="https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files">resulting data</a>, consisting of somatic variants, are stored in <a href="https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/">Mutation Annotation Format</a>. This package attempts to summarize, analyze, annotate and visualize MAF files in an efficient manner, from either TCGA sources or any in-house studies, as long as the data is in MAF format.</p>
<p>https://www.bioconductor.org/packages/devel/bioc/vignettes/maftools/inst/doc/maftools.html</p><p>Address of the bookmark: <a href="https://github.com/PoisonAlien/maftools" rel="nofollow">https://github.com/PoisonAlien/maftools</a></p>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</guid>
	<pubDate>Mon, 10 Oct 2016 08:56:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</link>
	<title><![CDATA[PHYMMBL]]></title>
	<description><![CDATA[<p><span>Metagenomics sequencing projects collect samples of DNA from uncharacterized environments that may contain hundreds or even thousands of species. One of the main challenges in analyzing a metagenome is phylogenetic classification of raw sequence reads into groups representing the same or similar species. Such classification is a useful prerequisite for genome assembly and for analysis of the biological diversity present in a sample. The newest sequencing technologies have simultaneously made metagenomics easier, by making the sequencing process faster, and more difficult, by producing shorter read lengths than previous technologies. Methods for classifying sequences as short as 100 base pairs (bp) have until now been relatively inaccurate, requiring metagenomics projects to use older, long-read technologies.&nbsp;</span><strong>Phymm</strong><span>, a new classification approach for metagenomics data which uses interpolated Markov models (IMMs) to taxonomically classify DNA sequences, can accurately classify reads as short as 100 bp. Its accuracy for short reads represents a significant leap forward over previous composition-based classification methods.&nbsp;</span><strong>PhymmBL</strong><span>&nbsp;(rhymes with "thimble"), the hybrid classifier included in this distribution which combines analysis from both Phymm and&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/BLAST">BLAST</a><span>, produces even higher accuracy.</span></p><p>Address of the bookmark: <a href="http://www.cbcb.umd.edu/software/phymm/" rel="nofollow">http://www.cbcb.umd.edu/software/phymm/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33901/rnacon-web-server-for-the-prediction-and-classification-of-non-coding-rnas</guid>
	<pubDate>Mon, 17 Jul 2017 04:55:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33901/rnacon-web-server-for-the-prediction-and-classification-of-non-coding-rnas</link>
	<title><![CDATA[RNAcon: web-server for the prediction and classification of non-coding RNAs]]></title>
	<description><![CDATA[<p style="text-align: justify;">RNAcon is a web-server for the prediction and classification of non-coding RNAs. It uses an SVM-based model to discriminate between coding and non-coding RNAs (ncRNAs) and a RandomForest-based prediction model to classify ncRNAs into different classes. Graph properties derived from structural information were used to develop the prediction model.</p>
<p style="text-align: justify;">The&nbsp;<a href="http://crdd.osdd.net/raghava/rnacon/RNAcon_v1.0.tar.gz">standalone version (Linux-based command-line) of RNAcon</a>&nbsp;is freely available for the global scientific community.</p>
<p style="text-align: justify;">Reference:&nbsp;<a href="http://www.biomedcentral.com/1471-2164/15/127/abstract">Panwar, B.; Arora, A. and Raghava, G.P.S. (2014) Prediction and classification of ncRNAs using structural information</a>. BMC Genomics 2014, 15:127</p><p>Address of the bookmark: <a href="http://crdd.osdd.net/raghava/rnacon/" rel="nofollow">http://crdd.osdd.net/raghava/rnacon/</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41686/catbat-tool-for-taxonomic-classification-of-contigs-and-metagenome-assembled-genomes-mags</guid>
	<pubDate>Mon, 18 May 2020 10:53:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41686/catbat-tool-for-taxonomic-classification-of-contigs-and-metagenome-assembled-genomes-mags</link>
	<title><![CDATA[CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)]]></title>
	<description><![CDATA[<p>Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome-assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against the nr protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. CAT and BAT can be run from intermediate steps if files are formatted appropriately (see <a href="https://github.com/dutilh/CAT#usage">Usage</a>).</p><p>Address of the bookmark: <a href="https://github.com/dutilh/CAT" rel="nofollow">https://github.com/dutilh/CAT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43916/understanding-dump-files-from-ncbi-taxonomy-database</guid>
	<pubDate>Fri, 15 Jul 2022 04:29:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43916/understanding-dump-files-from-ncbi-taxonomy-database</link>
	<title><![CDATA[Understanding DUMP files from NCBI Taxonomy database !]]></title>
	<description><![CDATA[<p>The *.dmp files are bcp-like dumps from the GenBank taxonomy database.</p><p>General information:</p><p>Field terminator is "\t|\t"</p><p>Row terminator is "\t|\n"</p><p>&nbsp;</p><p>The nodes.dmp file consists of taxonomy nodes. The description for each node includes the following fields:</p><p>tax_id -- node id in GenBank taxonomy database</p><p>parent tax_id -- parent node id in GenBank taxonomy database</p><p>rank -- rank of this node (superkingdom, kingdom, ...)</p><p>embl code -- locus-name prefix; not unique</p><p>division id -- see division.dmp file</p><p>inherited div flag (1 or 0) -- 1 if node inherits division from parent</p><p>genetic code id -- see gencode.dmp file</p><p>inherited GC flag (1 or 0) -- 1 if node inherits genetic code from parent</p><p>mitochondrial genetic code id -- see gencode.dmp file</p><p>inherited MGC flag (1 or 0) -- 1 if node inherits mitochondrial genetic code from parent</p><p>GenBank hidden flag (1 or 0) -- 1 if name is suppressed in GenBank entry lineage</p><p>hidden subtree root flag (1 or 0) -- 1 if this subtree has no sequence data yet</p><p>comments -- free-text comments and citations</p><p>&nbsp;</p><p>Taxonomy names file (names.dmp):</p><p>tax_id -- the id of the node associated with this name</p><p>name_txt -- the name itself</p><p>unique name -- the unique variant of this name if the name is not unique</p><p>name class -- (synonym, common name, ...)</p><p>&nbsp;</p><p>Divisions file (division.dmp):</p><p>division id -- taxonomy database division id</p><p>division cde -- GenBank division code (three characters)</p><p>division name -- e.g. BCT, PLN, VRT, MAM, PRI...</p><p>comments</p><p>&nbsp;</p><p>Genetic codes file (gencode.dmp):</p><p>genetic code id -- GenBank genetic code id</p><p>abbreviation -- genetic code name abbreviation</p><p>name -- genetic code name</p><p>cde -- translation table for this genetic code</p><p>starts -- start codons for this genetic code</p><p>&nbsp;</p><p>Deleted nodes file (delnodes.dmp):</p><p>tax_id -- deleted node id</p><p>&nbsp;</p><p>Merged nodes file (merged.dmp):</p><p>old_tax_id -- id of the node that has been merged</p><p>new_tax_id -- id of the node that is the result of merging</p><p>&nbsp;</p><p>Citations file (citations.dmp):</p><p>cit_id -- the unique id of the citation</p><p>cit_key -- citation key</p><p>pubmed_id -- unique id in PubMed database (0 if not in PubMed)</p><p>medline_id -- unique id in MedLine database (0 if not in MedLine)</p><p>url -- URL associated with the citation</p><p>text -- any text (usually article name and authors).</p><p>-- The following characters are escaped in this text by a backslash:</p><p>-- newline (appears as "\n"),</p><p>-- tab character ("\t"),</p><p>-- double quotes ('\"'),</p><p>-- backslash character ("\\").</p><p>taxid_list -- list of node ids separated by a single space</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44644/dengue-lineages</guid>
	<pubDate>Fri, 16 Aug 2024 04:40:14 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44644/dengue-lineages</link>
	<title><![CDATA[Dengue Lineages !]]></title>
	<description><![CDATA[<p><span>Our dengue virus lineage system splits up the current genotypes into major and minor lineages to provide additional spatiotemporal resolution and a common language to discuss important genomic diversity. A full description of the lineage system can be found&nbsp;</span><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11118645/">here.</a></p>
<p>https://dengue-lineages.org/</p><p>Address of the bookmark: <a href="https://dengue-lineages.org/" rel="nofollow">https://dengue-lineages.org/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/videolist/watch/13267/the-genome-10k-project</guid>
	<pubDate>Tue, 29 Jul 2014 09:11:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/videolist/watch/13267/the-genome-10k-project</link>
	<title><![CDATA[The Genome 10K Project]]></title>
	<description><![CDATA[<iframe width="" height="" src="https://www.youtube-nocookie.com/embed/B57xDIGtCT0" frameborder="0" allowfullscreen></iframe>https://genome10k.soe.ucsc.edu

The Genome 10K project aims to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species, approximately one for every vertebrate genus. The trajectory of cost reduction in DNA sequencing suggests that this project will be feasible within a few years. Capturing the genetic diversity of vertebrate species would create an unprecedented resource for the life sciences and for worldwide conservation efforts.

The growing Genome 10K Community of Scientists (G10KCOS), made up of leading scientists representing major zoos, museums, research centers, and universities around the world, is dedicated to coordinating efforts in tissue specimen collection that will lay the groundwork for a large-scale sequencing and analysis project.]]></description>
	
</item>

</channel>
</rss>