<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/44644?offset=10</link>
	<atom:link href="https://bioinformaticsonline.com/related/44644?offset=10" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</guid>
	<pubDate>Mon, 10 Oct 2016 08:56:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29384/phymmbl</link>
	<title><![CDATA[PHYMMBL]]></title>
	<description><![CDATA[<p><span>Metagenomics sequencing projects collect samples of DNA from uncharacterized environments that may contain hundreds or even thousands of species. One of the main challenges in analyzing a metagenome is phylogenetic classification of raw sequence reads into groups representing the same or similar species. Such classification is a useful prerequisite for genome assembly and for analysis of the biological diversity present in a sample. The newest sequencing technologies have simultaneously made metagenomics easier, by making the sequencing process faster, and more difficult, by producing shorter read lengths than previous technologies. Methods for classifying sequences as short as 100 base pairs (bp) have until now been relatively inaccurate, requiring metagenomics projects to use older, long-read technologies.&nbsp;</span><strong>Phymm</strong><span>, a new classification approach for metagenomics data which uses interpolated Markov models (IMMs) to taxonomically classify DNA sequences, can accurately classify reads as short as 100 bp. Its accuracy for short reads represents a significant leap forward over previous composition-based classification methods.&nbsp;</span><strong>PhymmBL</strong><span>&nbsp;(rhymes with "thimble"), the hybrid classifier included in this distribution which combines analysis from both Phymm and&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/BLAST">BLAST</a><span>, produces even higher accuracy.</span></p><p>Address of the bookmark: <a href="http://www.cbcb.umd.edu/software/phymm/" rel="nofollow">http://www.cbcb.umd.edu/software/phymm/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44720/a-beginners-guide-to-using-kraken-for-taxonomic-classification</guid>
	<pubDate>Fri, 13 Dec 2024 11:29:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44720/a-beginners-guide-to-using-kraken-for-taxonomic-classification</link>
	<title><![CDATA[A Beginner's Guide to Using Kraken for Taxonomic Classification]]></title>
	<description><![CDATA[<div>Kraken is a popular bioinformatics tool designed for fast and accurate taxonomic classification of metagenomic sequences. Its efficiency and precision make it a go-to resource for analyzing microbial communities, including bacteria, viruses, archaea, and fungi. Whether you're new to bioinformatics or experienced in the field, Kraken is an indispensable tool for taxonomic analysis.</div><div><div><div><div dir="auto"><div><div><p>In this blog, we&rsquo;ll walk through the basics of Kraken, from installation to running an analysis, and highlight its key features and applications.</p><h4><strong>What is Kraken?</strong></h4><p>Kraken is a sequence classification tool that assigns taxonomic labels to DNA sequences using exact k-mer matching. It uses a reference database of genomes, dividing sequences into k-mers and identifying matches in a computationally efficient way.</p><h4><strong>Key Features of Kraken</strong></h4><ul>
<li><strong>Speed</strong>: Kraken processes data much faster than alignment-based methods.</li>
<li><strong>Accuracy</strong>: It uses a precise k-mer matching algorithm for high-resolution taxonomic assignments.</li>
<li><strong>Scalability</strong>: It can handle large metagenomic datasets.</li>
<li><strong>Custom Databases</strong>: You can build and use custom databases tailored to your research needs.</li>
</ul><h4><strong>Installing Kraken</strong></h4><ol>
<li>
<p><strong>System Requirements</strong></p>
<ul>
<li>A Unix-based operating system (Linux/macOS).</li>
<li>Sufficient computational resources for database building (RAM and disk space).</li>
</ul>
</li>
<li>
<p><strong>Installation Steps</strong></p>
<ul>
<li>Clone the Kraken repository from GitHub:
<div>
<div>&nbsp;</div>
<div dir="ltr"><code>git clone https://github.com/DerrickWood/kraken.git<br />cd kraken </code></div>
</div>
</li>
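<li>Build and install Kraken with the bundled install script (replace <code>/path/to/kraken</code> with your chosen install directory):
<div>
<div>&nbsp;</div>
<div dir="ltr"><code>./install_kraken.sh /path/to/kraken </code></div>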
</div>
</li>
<li>Add Kraken to your PATH for easy access:
<div>
<div>&nbsp;</div>
<div dir="ltr"><code><span style="font-size: 12.8px; font-weight: normal;">export</span> PATH=<span style="font-size: 12.8px; font-weight: normal;">$PATH</span>:/path/to/kraken </code></div>
</div>
</li>
</ul>
</li>
</ol><h4><strong>Preparing a Database</strong></h4><p>Kraken requires a database of reference genomes. You can use a pre-built database or create a custom one.</p><ol>
<li>
<p><strong>Downloading a Pre-built Database</strong><br />The Kraken authors distribute pre-built databases, such as the <em>MiniKraken</em> database, which is lightweight and suited to machines with limited memory. <code>kraken-build</code> has no option for fetching it; download the archive from the Kraken website and unpack it, for example:</p>
<div>
<div dir="ltr"><code>tar -xzf minikraken.tgz </code></div>
</div>
</li>
<li>
<p><strong>Building a Custom Database</strong><br />To include specific genomes, download FASTA files and build the database:</p>
<div>
<div dir="ltr"><code>kraken-build --download-taxonomy --db my_database<br />kraken-build --download-library bacteria --db my_database<br />kraken-build --build --threads 4 --db my_database </code></div>
</div>
<p>This process may take considerable time and resources, depending on the size of the database.</p>
</li>
</ol><h4><strong>Running Kraken</strong></h4><p>Once the database is ready, you can classify sequences.</p><ol>
<li>
<p><strong>Basic Usage</strong><br />Use the following command to classify sequences:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --threads 4 --fastq-input input_sequences.fastq --output kraken_output.txt </code></div>
</div>
<p>Key options:</p>
<ul>
<li><code>--db</code>: Specifies the database.</li>
<li><code>--threads</code>: Number of threads for parallel processing.</li>
<li><code>--fastq-input</code>: Declares that the input is FASTQ; without it, Kraken expects FASTA.</li>
</ul>
</li>
<li>
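<p><strong>Interpreting Results</strong><br />Kraken writes one tab-separated line per sequence with five fields: a classification flag (<code>C</code> for classified, <code>U</code> for unclassified), the sequence ID, the assigned taxonomy ID (0 if unclassified), the sequence length in bp, and a space-separated list showing which taxon each k-mer mapped to.</p>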
</li>
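<li>
<p><strong>Parsing the Output (illustrative sketch)</strong><br />Kraken's standard output is tab-separated, with the classification flag in the first column, the sequence ID in the second, and the taxonomy ID in the third. The Python snippet below is a minimal sketch that tallies those columns. It is not part of Kraken: the function name <code>summarize_kraken_output</code> and the two sample records are made up for illustration.</p>

```python
import csv
import io
from collections import Counter


def summarize_kraken_output(handle):
    """Count classified/unclassified reads and reads per taxonomy ID."""
    status_counts = Counter()
    taxon_counts = Counter()
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        status, _read_id, tax_id = row[0], row[1], row[2]
        status_counts[status] += 1
        if status == "C":
            taxon_counts[tax_id] += 1
    return status_counts, taxon_counts


# Two made-up records in the five-column layout (hypothetical data).
example = io.StringIO(
    "C\tread1\t562\t100\t562:40 0:30\n"
    "U\tread2\t0\t100\t0:70\n"
)
status_counts, taxon_counts = summarize_kraken_output(example)
print(status_counts["C"], status_counts["U"], dict(taxon_counts))
```

<p>For a real run, pass an open file instead of the in-memory example, for instance <code>open("kraken_output.txt", newline="")</code>.</p>
</li>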
</ol><h4><strong>Visualizing Kraken Results</strong></h4><p>Kraken results can be visualized using tools like <strong>Krona</strong> or converted to human-readable reports using <code>kraken-report</code>.</p><ol>
<li>
<p><strong>Generate a Report</strong></p>
<div>
<div dir="ltr"><code>kraken-report --db my_database kraken_output.txt &gt; kraken_report.txt </code></div>
</div>
</li>
<li>
<p><strong>Krona Visualization</strong><br />Install Krona and convert Kraken output for visualization:</p>
<div>
<div dir="ltr"><code>cut -f2,3 kraken_output.txt &gt; krona_input.txt<br />ktImportTaxonomy krona_input.txt -o krona_output.html </code></div>
</div>
<p>Open the HTML file in your browser to interactively explore the taxonomic classifications.</p>
</li>
</ol><h4><strong>Advanced Usage</strong></h4><ol>
<li>
<p><strong>Confidence Thresholds</strong><br />Kraken 1 applies confidence scoring as a post-processing step with <code>kraken-filter</code> and its <code>--threshold</code> option (the <code>--confidence</code> flag belongs to Kraken 2). Higher values reduce false positives but may miss some true positives:</p>
<div>
<div dir="ltr"><code>kraken-filter --db my_database --threshold 0.1 kraken_output.txt &gt; kraken_filtered.txt </code></div>
</div>
</li>
<li>
<p><strong>Paired-End Reads</strong><br />For paired-end sequencing data, use:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --fastq-input --paired reads_1.fastq reads_2.fastq </code></div>
</div>
</li>
<li>
<p><strong>Customizing K-mers</strong><br />Kraken lets you set custom k-mer and minimizer lengths when building a database, using the <code>--kmer-len</code> and <code>--minimizer-len</code> options of <code>kraken-build</code>.</p>
</li>
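<li>
<p><strong>How K-mer Matching Works (toy sketch)</strong><br />To make the role of the k-mer length concrete, the Python snippet below splits a read into overlapping k-mers and counts exact matches against a lookup table. It is a simplified illustration under stated assumptions, not Kraken's implementation: the function names and the toy database are hypothetical, k is set to 4 for readability (Kraken's default is 31), and the step in which Kraken maps each k-mer to a lowest common ancestor and resolves hits on the taxonomy tree is omitted.</p>

```python
def kmers(sequence, k):
    """Return all overlapping substrings of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def count_taxon_hits(read, kmer_to_taxon, k):
    """Count, per taxon, how many k-mers of the read have an exact match."""
    hits = {}
    for kmer in kmers(read, k):
        taxon = kmer_to_taxon.get(kmer)
        if taxon is not None:
            hits[taxon] = hits.get(taxon, 0) + 1
    return hits


# Hypothetical toy database with k = 4; the taxon labels are placeholders.
toy_db = {"ACGT": "taxon_A", "CGTA": "taxon_A", "TTTT": "taxon_B"}
print(count_taxon_hits("ACGTAC", toy_db, k=4))
```

<p>A longer k makes each match more specific to one genome but leaves fewer k-mers per read, which is the trade-off behind choosing a custom length.</p>
</li>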
</ol><h4><strong>Applications of Kraken</strong></h4><ul>
<li><strong>Microbial Ecology</strong>: Characterizing microbial communities in soil, water, and the human microbiome.</li>
<li><strong>Pathogen Detection</strong>: Identifying pathogens in clinical samples.</li>
<li><strong>Fungal Research</strong>: Analyzing fungal diversity in metagenomic datasets.</li>
<li><strong>Environmental Monitoring</strong>: Tracking microbial populations in diverse habitats.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Kraken is a versatile and efficient tool for taxonomic classification in metagenomics. Its speed, accuracy, and flexibility make it a favorite among bioinformaticians. By following this guide, you can set up and use Kraken to unlock insights into microbial and fungal communities, paving the way for discoveries in ecology, medicine, and biotechnology.</p></div></div></div></div></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41033/clark-fast-accurate-and-versatile-sequence-classification-system</guid>
	<pubDate>Sat, 15 Feb 2020 01:49:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41033/clark-fast-accurate-and-versatile-sequence-classification-system</link>
	<title><![CDATA[CLARK: Fast, accurate and versatile sequence classification system]]></title>
	<description><![CDATA[<p><a href="http://dx.doi.org/10.1186/s12864-015-1419-2"><strong>CLARK</strong></a><span> is a method based on supervised sequence classification using discriminative&nbsp;</span><em>k</em><span>-mers. On two distinct classification problems (see the article for details), namely (1) the taxonomic classification of metagenomic reads to known bacterial genomes, and (2) the assignment of BAC clones and transcripts to chromosome arms/centromeres (in the absence of a finished assembly for the reference genome), CLARK outperforms the best state-of-the-art methods in classification speed and precision.</span></p>
<p><span><a href="http://clark.cs.ucr.edu/Spaced/">http://clark.cs.ucr.edu/Spaced/</a></span></p><p>Address of the bookmark: <a href="http://clark.cs.ucr.edu/Spaced/" rel="nofollow">http://clark.cs.ucr.edu/Spaced/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43826/tiara-deep-learning-based-classification-system-for-eukaryotic-sequences</guid>
	<pubDate>Mon, 14 Mar 2022 23:02:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43826/tiara-deep-learning-based-classification-system-for-eukaryotic-sequences</link>
	<title><![CDATA[Tiara: deep learning-based classification system for eukaryotic sequences]]></title>
	<description><![CDATA[<p><span>With a large number of metagenomic datasets becoming available, eukaryotic metagenomics has emerged as a new challenge. The proper classification of eukaryotic nuclear and organellar genomes is an essential step toward a better understanding of eukaryotic diversity.</span></p><p>Address of the bookmark: <a href="https://academic.oup.com/bioinformatics/article/38/2/344/6375939" rel="nofollow">https://academic.oup.com/bioinformatics/article/38/2/344/6375939</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44329/metabuli-%EB%B6%84%EB%A6%AC-improves-metagenomic-read-classification</guid>
	<pubDate>Sat, 03 Jun 2023 20:15:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44329/metabuli-%EB%B6%84%EB%A6%AC-improves-metagenomic-read-classification</link>
	<title><![CDATA[Metabuli 분리 improves metagenomic read classification]]></title>
	<description><![CDATA[<p><span>Metabuli 분리 improves metagenomic read classification through metamers, DNA-AA k-mers, to be sensitive and specific, recovering 99% and 98% of DNA or AA classifiers.</span></p>
<p>&nbsp;</p>
<p><span><span>Metabuli is a metagenomic classifier that jointly analyzes both DNA and amino acid (AA) sequences. DNA-based classifiers can make specific classifications, exploiting point mutations to distinguish close taxa. AA-based classifiers have higher sensitivity in detecting homology between query and reference sequences, leveraging the higher conservation of AA sequences. Metabuli combines the information of both sequence types using a novel k-mer structure,&nbsp;</span><em>metamer</em><span>, to enable both specific and sensitive characterization of metagenomic samples. In addition, it can classify reads against a database of any size as long as it fits on the hard disk.</span> </span></p><p>Address of the bookmark: <a href="https://github.com/steineggerlab/Metabuli" rel="nofollow">https://github.com/steineggerlab/Metabuli</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43583/pango-lineage-analysis</guid>
	<pubDate>Mon, 15 Nov 2021 03:38:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43583/pango-lineage-analysis</link>
	<title><![CDATA[Pango Lineage Analysis !]]></title>
	<description><![CDATA[<p>The Pango nomenclature is used by researchers and public health agencies worldwide to track the transmission and spread of SARS-CoV-2, including variants of concern. This website documents all current Pango lineages and their spread, as well as various software tools that researchers can use to analyze SARS-CoV-2 sequence data.</p><p>Address of the bookmark: <a href="https://cov-lineages.org/resources/pangolin/output.html" rel="nofollow">https://cov-lineages.org/resources/pangolin/output.html</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33901/rnacon-web-server-for-the-prediction-and-classification-of-non-coding-rnas</guid>
	<pubDate>Mon, 17 Jul 2017 04:55:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33901/rnacon-web-server-for-the-prediction-and-classification-of-non-coding-rnas</link>
	<title><![CDATA[RNAcon: web-server for the prediction and classification of non-coding RNAs]]></title>
	<description><![CDATA[<p style="text-align: justify;">RNAcon is a web-server for the prediction and classification of non-coding RNAs. It uses an SVM-based model to discriminate between coding and non-coding RNAs (ncRNAs) and a RandomForest-based model to classify ncRNAs into different classes. Graph properties derived from structural information were used to develop the prediction models.</p>
<p style="text-align: justify;">The&nbsp;<a href="http://crdd.osdd.net/raghava/rnacon/RNAcon_v1.0.tar.gz">standalone version (Linux-based command-line) of RNAcon</a>&nbsp;is freely available for the global scientific community.</p>
<p style="text-align: justify;">Reference:&nbsp;<a href="http://www.biomedcentral.com/1471-2164/15/127/abstract">Panwar, B.; Arora, A. and Raghava, G.P.S. (2014) Prediction and classification of ncRNAs using structural information</a>. BMC Genomics 2014, 15:127</p><p>Address of the bookmark: <a href="http://crdd.osdd.net/raghava/rnacon/" rel="nofollow">http://crdd.osdd.net/raghava/rnacon/</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41686/catbat-tool-for-taxonomic-classification-of-contigs-and-metagenome-assembled-genomes-mags</guid>
	<pubDate>Mon, 18 May 2020 10:53:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41686/catbat-tool-for-taxonomic-classification-of-contigs-and-metagenome-assembled-genomes-mags</link>
	<title><![CDATA[CAT/BAT: tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)]]></title>
	<description><![CDATA[<p>Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome-assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against the nr protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. CAT and BAT can be run from intermediate steps if files are formatted appropriately (see <a href="https://github.com/dutilh/CAT#usage">Usage</a>).</p><p>Address of the bookmark: <a href="https://github.com/dutilh/CAT" rel="nofollow">https://github.com/dutilh/CAT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43916/understanding-dump-files-from-ncbi-taxonomy-database</guid>
	<pubDate>Fri, 15 Jul 2022 04:29:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43916/understanding-dump-files-from-ncbi-taxonomy-database</link>
	<title><![CDATA[Understanding DUMP files from NCBI Taxonomy database !]]></title>
	<description><![CDATA[<p>*.dmp files are bcp-like dump from GenBank taxonomy database</p><p>General information.</p><p>Field terminator is "\t|\t"</p><p>Row terminator is "\t|\n"</p><p>&nbsp;</p><p>nodes.dmp file consists of taxonomy nodes. The description for each node includes the following</p><p>fields:</p><p>tax_id -- node id in GenBank taxonomy database</p><p>&nbsp; parent tax_id -- parent node id in GenBank taxonomy database</p><p>&nbsp; rank -- rank of this node (superkingdom, kingdom, ...)&nbsp;</p><p>&nbsp; embl code -- locus-name prefix; not unique</p><p>&nbsp; division id -- see division.dmp file</p><p>&nbsp; inherited div flag&nbsp; (1 or 0) -- 1 if node inherits division from parent</p><p>&nbsp; genetic code id -- see gencode.dmp file</p><p>&nbsp; inherited GC&nbsp; flag&nbsp; (1 or 0) -- 1 if node inherits genetic code from parent</p><p>&nbsp; mitochondrial genetic code id -- see gencode.dmp file</p><p>&nbsp; inherited MGC flag&nbsp; (1 or 0) -- 1 if node inherits mitochondrial gencode from parent</p><p>&nbsp; GenBank hidden flag (1 or 0)&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- 1 if name is suppressed in GenBank entry lineage</p><p>&nbsp; hidden subtree root flag (1 or 0) &nbsp; &nbsp; &nbsp; -- 1 if this subtree has no sequence data yet</p><p>&nbsp; comments -- free-text comments and citations</p><p>&nbsp;</p><p>Taxonomy names file (names.dmp):</p><p>tax_id -- the id of node associated with this name</p><p>name_txt -- name itself</p><p>unique name -- the unique variant of this name if name not unique</p><p>name class -- (synonym, common name, ...)</p><p>&nbsp;</p><p>Divisions file (division.dmp):</p><p>division id -- taxonomy database division id</p><p>division cde -- GenBank division code (three characters)</p><p>division name -- e.g. 
BCT, PLN, VRT, MAM, PRI...</p><p>comments</p><p>&nbsp;</p><p>Genetic codes file (gencode.dmp):</p><p>genetic code id -- GenBank genetic code id</p><p>abbreviation -- genetic code name abbreviation</p><p>name -- genetic code name</p><p>cde -- translation table for this genetic code</p><p>starts -- start codons for this genetic code</p><p>&nbsp;</p><p>Deleted nodes file (delnodes.dmp):</p><p>tax_id -- deleted node id</p><p>&nbsp;</p><p>Merged nodes file (merged.dmp):</p><p>old_tax_id&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- id of nodes which has been merged</p><p>new_tax_id&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- id of nodes which is result of merging</p><p>Citations file (citations.dmp):</p><p>cit_id -- the unique id of citation</p><p>cit_key -- citation key</p><p>pubmed_id -- unique id in PubMed database (0 if not in PubMed)</p><p>medline_id -- unique id in MedLine database (0 if not in MedLine)</p><p>url -- URL associated with citation</p><p>text -- any text (usually article name and authors).</p><p>-- The following characters are escaped in this text by a backslash:</p><p>-- newline (appear as "\n"),</p><p>-- tab character ("\t"),</p><p>-- double quotes ('\"'),</p><p>-- backslash character ("\\").</p><p>taxid_list -- list of node ids separated by a single space</p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44506/mosquito-species-known-for-transmitting-the-dengue-virus</guid>
	<pubDate>Wed, 03 Apr 2024 00:05:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44506/mosquito-species-known-for-transmitting-the-dengue-virus</link>
	<title><![CDATA[Mosquito species known for transmitting the Dengue virus]]></title>
	<description><![CDATA[<p><span>Here is a list of mosquito species known for transmitting the Dengue virus along with essential and applied information about each species:</span><br /><br /><span>1. Aedes aegypti:</span><br /><span>- Geographical Distribution: Found in tropical and subtropical regions worldwide.</span><br /><span>- Biting Behavior: Daytime biter, prefers feeding indoors, often around human dwellings.</span><br /><span>- Role in Dengue Transmission: Primary vector responsible for transmitting Dengue virus to humans.</span><br /><br /><span>2. Aedes albopictus (Asian tiger mosquito):</span><br /><span>- Geographical Distribution: Found in tropical, subtropical, and temperate regions worldwide.</span><br /><span>- Biting Behavior: Daytime biter, feeds both indoors and outdoors, aggressive feeder.</span><br /><span>- Role in Dengue Transmission: Secondary vector, can transmit Dengue virus to humans.</span><br /><br /><span>3. Aedes polynesiensis:</span><br /><span>- Geographical Distribution: Found in Pacific Islands and coastal regions.</span><br /><span>- Biting Behavior: Daytime biter, prefers feeding outdoors, often near coastal areas.</span><br /><span>- Role in Dengue Transmission: Vector of Dengue virus in specific geographic regions.</span><br /><br /><span>4. Aedes scutellaris:</span><br /><span>- Geographical Distribution: Found in Southeast Asia, Pacific Islands, and coastal regions.</span><br /><span>- Biting Behavior: Daytime feeder, active in shaded areas, prefers outdoor environments.</span><br /><span>- Role in Dengue Transmission: Vector of Dengue virus, particularly in coastal areas.</span><br /><br /><span>5. 
Aedes africanus:</span><br /><span>- Geographical Distribution: Found in parts of Africa, including forested areas.</span><br /><span>- Biting Behavior: Daytime feeder, prefers shaded areas, bites humans and other animals.</span><br /><span>- Role in Dengue Transmission: Vector of Dengue virus in African regions.</span><br /><br /><span>Understanding the geographical distribution and biting behavior of these mosquito species is crucial for implementing effective control and prevention strategies to reduce Dengue virus transmission.</span></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

</channel>
</rss>