<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/43916?</link>
	<atom:link href="https://bioinformaticsonline.com/related/43916?" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43319/k-mers-tutorial-classification-and-taxonomy</guid>
	<pubDate>Thu, 26 Aug 2021 10:28:43 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43319/k-mers-tutorial-classification-and-taxonomy</link>
	<title><![CDATA[k-mers tutorial - classification and taxonomy]]></title>
	<description><![CDATA[<p>DNA k-mers underlie much of our assembly work, and we (along with many others!) have spent a lot of time thinking about how to&nbsp;<a href="http://www.pnas.org/content/109/33/13272">store k-mer graphs efficiently</a>,&nbsp;<a href="http://ivory.idyll.org/blog/what-is-diginorm.html">discard redundant data</a>, and&nbsp;<a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0101271">count them efficiently</a>.</p>
<p>More recently, we've been enthused about&nbsp;<a href="http://joss.theoj.org/papers/3d793c6e7db683bee7c03377a4a7f3c9">using k-mer based similarity measures</a>&nbsp;and&nbsp;<a href="http://ivory.idyll.org/blog/2016-sourmash-sbt.html">computing and searching k-mer-based sketch search databases for all the things</a>.</p>
<p>But I haven't spent much time talking about using k-mers for taxonomy, although that has become an&nbsp;<em>ahem</em>&nbsp;area of interest recently,&nbsp;<a href="http://www.biorxiv.org/content/early/2017/07/03/155358">if you read into our papers a bit</a>.</p>
<p>In this blog post I'm going to fix this by doing a little bit of a literature review and waxing enthusiastic about other people's work. Then in a future blog post I'll talk about how we're building off of this work in fun! and interesting? ways!</p><p>Address of the bookmark: <a href="http://ivory.idyll.org/blog/2017-something-about-kmers.html" rel="nofollow">http://ivory.idyll.org/blog/2017-something-about-kmers.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44720/a-beginners-guide-to-using-kraken-for-taxonomic-classification</guid>
	<pubDate>Fri, 13 Dec 2024 11:29:03 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44720/a-beginners-guide-to-using-kraken-for-taxonomic-classification</link>
	<title><![CDATA[A Beginner's Guide to Using Kraken for Taxonomic Classification]]></title>
	<description><![CDATA[<div>Kraken is a popular bioinformatics tool designed for fast and accurate taxonomic classification of metagenomic sequences. Its efficiency and precision make it a go-to resource for analyzing microbial communities, including bacteria, viruses, archaea, and fungi. Whether you're new to bioinformatics or experienced in the field, Kraken is an indispensable tool for taxonomic analysis.</div><div><div><div><div dir="auto"><div><div><p>In this blog, we&rsquo;ll walk through the basics of Kraken, from installation to running an analysis, and highlight its key features and applications.</p><h4><strong>What is Kraken?</strong></h4><p>Kraken is a sequence classification tool that assigns taxonomic labels to DNA sequences using exact k-mer matching. It uses a reference database of genomes, dividing sequences into k-mers and identifying matches in a computationally efficient way.</p><h4><strong>Key Features of Kraken</strong></h4><ul>
<li><strong>Speed</strong>: Kraken processes data much faster than alignment-based methods.</li>
<li><strong>Accuracy</strong>: It uses a precise k-mer matching algorithm for high-resolution taxonomic assignments.</li>
<li><strong>Scalability</strong>: It can handle large metagenomic datasets.</li>
<li><strong>Custom Databases</strong>: You can build and use custom databases tailored to your research needs.</li>
</ul><h4><strong>Installing Kraken</strong></h4><ol>
<li>
<p><strong>System Requirements</strong></p>
<ul>
<li>A Unix-based operating system (Linux/macOS).</li>
<li>Sufficient computational resources for database building (RAM and disk space).</li>
</ul>
</li>
<li>
<p><strong>Installation Steps</strong></p>
<ul>
<li>Clone the Kraken repository from GitHub:
<div>
<div dir="ltr"><code>git clone https://github.com/DerrickWood/kraken.git<br />cd kraken </code></div>
</div>
</li>
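<li>Compile and install Kraken with the bundled install script, passing the directory where the programs should be placed:
<div>
<div dir="ltr"><code>./install_kraken.sh /path/to/kraken </code></div>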
</div>
</li>
<li>Add Kraken to your PATH for easy access:
<div>
<div dir="ltr"><code>export PATH=$PATH:/path/to/kraken </code></div>
</div>
</li>
</ul>
</li>
</ol><h4><strong>Preparing a Database</strong></h4><p>Kraken requires a database of reference genomes. You can use a pre-built database or create a custom one.</p><ol>
<li>
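<p><strong>Downloading a Pre-built Database</strong><br />Kraken offers pre-built databases, such as the <em>MiniKraken</em> database, which is lightweight and suited to machines with limited memory. MiniKraken is distributed as a compressed archive on the Kraken website rather than through <code>kraken-build</code>. After downloading it, extract the archive (the file name below is a placeholder) and pass the extracted directory to <code>--db</code>:</p>
<div>
<div dir="ltr"><code>tar -xzf /path/to/minikraken_archive.tgz </code></div>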
</div>
</li>
<li>
<p><strong>Building a Custom Database</strong><br />To build your own database, download the NCBI taxonomy, download or add the genome libraries you need, and then build the database:</p>
<div>
<div dir="ltr"><code>kraken-build --download-taxonomy --db my_database<br />kraken-build --download-library bacteria --db my_database<br />kraken-build --build --threads 4 --db my_database </code></div>
</div>
<p>This process may take considerable time and resources, depending on the size of the database.</p>
</li>
</ol><h4><strong>Running Kraken</strong></h4><p>Once the database is ready, you can classify sequences.</p><ol>
<li>
<p><strong>Basic Usage</strong><br />Use the following command to classify sequences:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --threads 4 --fastq-input input_sequences.fastq --output kraken_output.txt </code></div>
</div>
<p>Key options:</p>
<ul>
<li><code>--db</code>: Specifies the database.</li>
<li><code>--threads</code>: Number of threads for parallel processing.</li>
<li><code>--fastq-input</code>: Indicates that the input is in FASTQ format (the default is FASTA).</li>
</ul>
</li>
<li>
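<p><strong>Interpreting Results</strong><br />Kraken writes one tab-separated line per sequence with five fields: a classification flag (<code>C</code> for classified, <code>U</code> for unclassified), the sequence ID, the assigned taxonomy ID (0 if unclassified), the sequence length in bp, and a space-delimited list of <code>taxid:count</code> pairs showing the LCA mapping of each k-mer. The output does not contain a confidence score.</p>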
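<p>As a rough illustration of reading this output, the Python sketch below tallies reads per taxonomy ID. It assumes the five tab-separated fields of Kraken 1 output (classification flag, sequence ID, taxonomy ID, length, LCA mapping); the function name and the example records are made up for illustration and are not real Kraken results.</p>
<pre>
```python
from collections import Counter


def tally_kraken_output(lines):
    """Count reads per taxonomy ID from Kraken-style output lines.

    Assumed layout (tab-separated): C/U flag, sequence ID, taxonomy ID,
    sequence length, LCA mapping of the k-mers.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        status, _seq_id, tax_id, _length, _lca = fields
        if status == "C":
            counts[tax_id] += 1
        else:
            counts["unclassified"] += 1
    return counts


# Made-up example records, not real Kraken results.
example = [
    "C\tread1\t562\t100\t562:20 0:50",
    "C\tread2\t562\t100\t562:70",
    "U\tread3\t0\t100\t0:70",
]
print(tally_kraken_output(example))
```
</pre>
<p>For a real run, the same function could be given an open file handle for <code>kraken_output.txt</code> instead of the example list.</p>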
</li>
</ol><h4><strong>Visualizing Kraken Results</strong></h4><p>Kraken results can be visualized using tools like <strong>Krona</strong> or converted to human-readable reports using <code>kraken-report</code>.</p><ol>
<li>
<p><strong>Generate a Report</strong></p>
<div>
<div dir="ltr"><code>kraken-report --db my_database kraken_output.txt &gt; kraken_report.txt </code></div>
</div>
</li>
<li>
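<p><strong>Krona Visualization</strong><br />Install Krona, then import the Kraken output, telling <code>ktImportTaxonomy</code> which columns hold the read ID (<code>-q 2</code>) and the taxonomy ID (<code>-t 3</code>):</p>
<div>
<div dir="ltr"><code>ktImportTaxonomy -q 2 -t 3 kraken_output.txt -o krona_output.html </code></div>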
</div>
<p>Open the HTML file in your browser to interactively explore the taxonomic classifications.</p>
</li>
</ol><h4><strong>Advanced Usage</strong></h4><ol>
<li>
<p><strong>Confidence Thresholds</strong><br />Kraken 1 applies confidence thresholds after classification: run <code>kraken-filter</code> on the output with the <code>--threshold</code> option (the <code>--confidence</code> flag belongs to Kraken 2). Higher values reduce false positives but may miss some true positives:</p>
<div>
<div dir="ltr"><code>kraken-filter --db my_database --threshold 0.1 kraken_output.txt &gt; kraken_filtered.txt </code></div>
</div>
</li>
<li>
<p><strong>Paired-End Reads</strong><br />For paired-end sequencing data, use:</p>
<div>
<div dir="ltr"><code>kraken --db my_database --fastq-input --paired reads_1.fastq reads_2.fastq </code></div>
</div>
</li>
<li>
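<p><strong>Customizing K-mers</strong><br />Kraken lets you set a custom k-mer length when building a database, using the <code>--kmer-len</code> option of <code>kraken-build</code> (the default is 31) together with <code>--minimizer-len</code> for the minimizer length.</p>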
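<p>To make the role of k concrete, here is a minimal Python sketch that splits a sequence into overlapping k-mers. It only illustrates the concept; Kraken's own indexing is more involved, and the function name and toy sequence are assumptions made for this example.</p>
<pre>
```python
def kmers(sequence, k):
    """Return every overlapping substring of length k, left to right."""
    count = max(len(sequence) - k + 1, 0)
    return [sequence[i:i + k] for i in range(count)]


read = "ATGCGTAC"  # toy sequence, chosen only for illustration
print(kmers(read, 3))       # ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAC']
print(len(kmers(read, 5)))  # 4
```
</pre>
<p>A longer k yields fewer k-mers per read, and each one is more specific, which is the trade-off a custom k-mer length adjusts.</p>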
</li>
</ol><h4><strong>Applications of Kraken</strong></h4><ul>
<li><strong>Microbial Ecology</strong>: Characterizing microbial communities in soil, water, and the human microbiome.</li>
<li><strong>Pathogen Detection</strong>: Identifying pathogens in clinical samples.</li>
<li><strong>Fungal Research</strong>: Analyzing fungal diversity in metagenomic datasets.</li>
<li><strong>Environmental Monitoring</strong>: Tracking microbial populations in diverse habitats.</li>
</ul><h4><strong>Conclusion</strong></h4><p>Kraken is a versatile and efficient tool for taxonomic classification in metagenomics. Its speed, accuracy, and flexibility make it a favorite among bioinformaticians. By following this guide, you can set up and use Kraken to unlock insights into microbial and fungal communities, paving the way for discoveries in ecology, medicine, and biotechnology.</p></div></div></div></div></div></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32134/lifemap</guid>
	<pubDate>Mon, 10 Apr 2017 05:42:37 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32134/lifemap</link>
	<title><![CDATA[Lifemap]]></title>
	<description><![CDATA[<p><strong>Lifemap</strong> is an interactive tool to explore the WHOLE NCBI TAXONOMY. The concept used in <strong>Lifemap</strong> is similar to the one used in cartography with tools like Google Maps&copy; or OpenStreetMap: exploring is done by zooming and panning.</p>
<div>
<p>&nbsp;The current tree contains ALL species present in NCBI taxonomy as of <span style="text-decoration: underline;">October 18th, 2016</span>: 1,135,169 species including 10,545 Archaea, 418,777 Bacteria and 705,847 Eukaryotes. The Lifemap tree is updated every two weeks.</p>
</div>
<p>&nbsp;All the nodes in the tree are clickable. This displays various information and options:</p>
<ul>
<li>The species name (and the associated common name if there is one)</li>
<li>The rank (kingdom, family, class, species...)</li>
<li>Ability to go to the corresponding node/species on NCBI web site (displayed in a new window)</li>
<li>Possibility to download the corresponding subtree in newick extended format</li>
<li>Possibility to get the whole lineage from the current node/tip to the root of the tree.</li>
</ul><p>Address of the bookmark: <a href="http://lifemap-ncbi.univ-lyon1.fr/" rel="nofollow">http://lifemap-ncbi.univ-lyon1.fr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28119/kraken-ultrafast-metagenomic-sequence-classification-using-exact-alignments</guid>
	<pubDate>Mon, 27 Jun 2016 11:01:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28119/kraken-ultrafast-metagenomic-sequence-classification-using-exact-alignments</link>
	<title><![CDATA[Kraken: ultrafast metagenomic sequence classification using exact alignments]]></title>
	<description><![CDATA[<p>Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of <em>k</em>-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at <a href="http://ccb.jhu.edu/software/kraken/" target="pmc_ext">http://ccb.jhu.edu/software/kraken/</a>.</p>
<p>Krona</p>
<p>https://sourceforge.net/p/krona/home/krona/</p><p>Address of the bookmark: <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053813/" rel="nofollow">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053813/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43607/classification-of-sars-cov2-variant</guid>
	<pubDate>Fri, 26 Nov 2021 12:53:12 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43607/classification-of-sars-cov2-variant</link>
	<title><![CDATA[Classification of SARS-CoV-2 Variants]]></title>
	<description><![CDATA[<p>The scientists established some guidelines for determining whether a variant is a legitimate branch of an existing lineage:</p><p>The variant should be transmitted from its original location to another "geographically distinct population"&mdash;say, another country or a province of a large and populous country.<br />It should differ from its ancestor by at least one nucleotide.<br />At least 95% of its genetic code should have been sequenced at least five times from different samples.</p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36644/tacoa-taxonomic-classification-of-environmental-genomic-fragments-using-a-kernelized-nearest-neighbor-approach</guid>
	<pubDate>Tue, 15 May 2018 09:52:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36644/tacoa-taxonomic-classification-of-environmental-genomic-fragments-using-a-kernelized-nearest-neighbor-approach</link>
	<title><![CDATA[TACOA: Taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach]]></title>
	<description><![CDATA[TACOA is software that can accurately predict the taxonomic origin of genomic fragments from metagenomic data sets by combining the advantages of the k-NN approach with a smoothing kernel function.

TACOA can be easily installed and run on a desktop computer, therefore allowing researchers to locally analyze their metagenomic sequence data or integrate it into their pipelines.<p>Address of the bookmark: <a href="http://www.cebitec.uni-bielefeld.de/index.php/2-uncategorised/99-tacoa" rel="nofollow">http://www.cebitec.uni-bielefeld.de/index.php/2-uncategorised/99-tacoa</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44236/type-of-ssr</guid>
	<pubDate>Thu, 09 Mar 2023 04:35:41 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44236/type-of-ssr</link>
	<title><![CDATA[Type of SSR]]></title>
	<description><![CDATA[<div><div><div><div><div><div><div><div><div><div><p>Simple sequence repeats (SSRs) are short DNA sequences consisting of tandem repeats of a short motif, typically 2-6 nucleotides long. SSRs are classified by the length and pattern of the repeated motif and by whether non-repeat nucleotides interrupt the repeat array. The four types of SSRs are:</p><ol>
<li>
<p>Perfect SSR: This is the simplest type of SSR, where the same repeat motif is present adjacent to each other without any interruption of any other nucleotide. For example, a perfect SSR with the repeat motif "CAT" would be "CATCATCATCAT", where the "CAT" sequence is repeated four times.</p>
</li>
<li>
<p>Imperfect SSR: This type of SSR contains repeat motifs that are interrupted by one or a few non-repeat nucleotides. For example, an imperfect SSR with the repeat motif "CAT" would be "CATCATGGCATCATCAT", where the "CAT" sequence is repeated twice, interrupted by "GG", and then repeated three more times.</p>
</li>
<li>
<p>Compound perfect SSR: This type of SSR contains two or more repeat motifs lying adjacent to each other, separated by no or very few intervening nucleotides. For example, a compound perfect SSR with the repeat motifs "CAT" and "GTC" would be "CATCATCATGTCGTC", where the "CAT" sequence is repeated three times, followed by the "GTC" sequence repeated twice.</p>
</li>
<li>
<p>Compound imperfect SSR: This type of SSR contains two or more repeat motifs interrupted by several non-repeat nucleotides. For example, a compound imperfect SSR with the repeat motifs "CAT" and "GTC" would be "CATCATCATNNNNNNNGTCGTCGTC", where the "CAT" sequence is repeated three times, interrupted by several non-repeat nucleotides, followed by the "GTC" sequence repeated three times.</p>
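<p>The example sequences for these four categories can be checked with a small Python sketch that finds uninterrupted runs of a motif. This is only an illustration built on a regular expression; the function name and the <code>min_repeats</code> parameter are assumptions, and dedicated SSR finders use more elaborate criteria.</p>
<pre>
```python
import re


def repeat_runs(sequence, motif, min_repeats=2):
    """Return (start, repeat_count) for each uninterrupted run of motif."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_repeats))
    return [(m.start(), len(m.group(0)) // len(motif))
            for m in pattern.finditer(sequence)]


perfect = "CATCATCATCAT"
imperfect = "CATCATGGCATCATCAT"
compound = "CATCATCATGTCGTC"

print(repeat_runs(perfect, "CAT"))    # [(0, 4)]
print(repeat_runs(imperfect, "CAT"))  # [(0, 2), (8, 3)]
print(repeat_runs(compound, "CAT"))   # [(0, 3)]
print(repeat_runs(compound, "GTC"))   # [(9, 2)]
```
</pre>
<p>One run covering the whole sequence indicates a perfect SSR, several runs of the same motif separated by other bases indicate an imperfect SSR, and runs of different motifs side by side indicate a compound SSR.</p>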
</li>
</ol></div></div></div></div></div></div></div></div></div></div>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/2839/look-up-a-biological-numbers</guid>
	<pubDate>Fri, 23 Aug 2013 03:27:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/2839/look-up-a-biological-numbers</link>
	<title><![CDATA[Look up biological numbers]]></title>
	<description><![CDATA[<p><strong>Did you ever need to look up a number</strong><span>&nbsp;like the volume of a cell or the cellular concentration of ATP, only to find yourself spending much more time than you wanted on the Internet or flipping through textbooks - all without much success?&nbsp;</span><br><br><span>Well, it didn&rsquo;t happen only to you. It is often surprising how difficult it can be to find concrete biological numbers, even for properties that have been measured numerous times. To help solve this for one and all, BioNumbers (</span><strong>the database of key numbers in molecular biology</strong><span>) was created. Along with the numbers, you'll find the relevant&nbsp;</span><strong>references to the original literature</strong><span>, useful comments, and related numbers.&nbsp;</span></p>
<p><span><span>To cite BioNumbers please refer to: Milo et al. Nucl. Acids Res. (2010) 38: D750-D753. When using a specific entry from the database it is highly recommended that you also specify the BioNumbers 6 digit ID, e.g. "BNID 100986, Milo et al 2010".&nbsp;</span></span></p><p>Address of the bookmark: <a href="http://bionumbers.hms.harvard.edu/" rel="nofollow">http://bionumbers.hms.harvard.edu/</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27691/histonedb-20-%E2%80%93-with-variants</guid>
	<pubDate>Fri, 03 Jun 2016 05:06:20 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27691/histonedb-20-%E2%80%93-with-variants</link>
	<title><![CDATA[HistoneDB 2.0 – with variants]]></title>
	<description><![CDATA[<p><span>This histone database can be used to explore the diversity of histone proteins and their sequence variants in many organisms. The resource was established to better understand how sequence variation may affect functional and structural features of nucleosomes. To get started, select a histone type to explore its variants.</span></p>
<p><span>More at&nbsp;http://www.ncbi.nlm.nih.gov/projects/HistoneDB2.0/index.fcgi/browse/</span></p><p>Address of the bookmark: <a href="http://www.ncbi.nlm.nih.gov/projects/HistoneDB2.0/index.fcgi/browse/" rel="nofollow">http://www.ncbi.nlm.nih.gov/projects/HistoneDB2.0/index.fcgi/browse/</a></p>]]></description>
	<dc:creator>Anjana</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/33976/goldgenomes-online-database</guid>
	<pubDate>Wed, 26 Jul 2017 07:49:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/33976/goldgenomes-online-database</link>
	<title><![CDATA[GOLD:Genomes Online Database]]></title>
	<description><![CDATA[<p><span>GOLD</span><span>: Genomes Online Database is a World Wide Web resource for comprehensive access to information on genome and metagenome sequencing projects around the world and their associated metadata.</span></p>
<p>https://gold.jgi.doe.gov/</p><p>Address of the bookmark: <a href="https://gold.jgi.doe.gov/" rel="nofollow">https://gold.jgi.doe.gov/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>