<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/34620?offset=250</link>
	<atom:link href="https://bioinformaticsonline.com/related/34620?offset=250" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	
<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/43044/kanthida-lab</guid>
  <pubDate>Wed, 28 Apr 2021 02:27:22 -0500</pubDate>
  <link>https://bioinformaticsonline.com/researchlabs/view/43044/kanthida-lab</link>
  <title><![CDATA[Kanthida Lab !]]></title>
  <description><![CDATA[
<p>Research interests:</p>

<p>Bioinformatics </p>

<p>High-throughput and high-dimensional data analysis</p>

<p>Microbiome data analysis (Main focus)</p>

<p>Next-generation and third-generation sequencing data analysis for genomics</p>

<p>Gene expression data analysis</p>

<p>Machine learning for biological data</p>

<p>Biomarker identification</p>

<p>Databases and web applications for biological data</p>

<p>More at <br />https://sites.google.com/mail.kmutt.ac.th/kanthida-k/home?authuser=0</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34552/edit-distance-application-in-bioinformatics</guid>
	<pubDate>Thu, 07 Dec 2017 08:46:51 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34552/edit-distance-application-in-bioinformatics</link>
	<title><![CDATA[Edit distance application in bioinformatics !]]></title>
	<description><![CDATA[<p>There are other popular measures of&nbsp;<a href="https://en.wikipedia.org/wiki/Edit_distance" title="Edit distance">edit distance</a>, which are calculated using a different set of allowable edit operations. For instance,</p><ul>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance" title="Damerau&ndash;Levenshtein distance">Damerau&ndash;Levenshtein distance</a>&nbsp;allows insertion, deletion, substitution, and the&nbsp;<a href="https://en.wikipedia.org/wiki/Transposition_(mathematics)" title="Transposition (mathematics)">transposition</a>&nbsp;of two adjacent characters;</li>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Longest_common_subsequence_problem" title="Longest common subsequence problem">longest common subsequence</a>&nbsp;(LCS) distance allows only insertion and deletion, not substitution;</li>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Hamming_distance" title="Hamming distance">Hamming distance</a>&nbsp;allows only substitution, so it applies only to strings of the same length;</li>
<li>the&nbsp;<a href="https://en.wikipedia.org/wiki/Jaro_distance" title="Jaro distance">Jaro distance</a>&nbsp;allows only&nbsp;<a href="https://en.wikipedia.org/wiki/Transposition_(mathematics)" title="Transposition (mathematics)">transposition</a>.</li>
</ul><p>&nbsp;</p><pre><span>use</span> Text<span>::</span>Levenshtein <span>qw</span><span>(</span>distance<span>);</span>

 <span>print</span> <span>distance</span><span>(</span><span>"foo"</span><span>,</span><span>"four"</span><span>);</span>
 <span># prints "2"</span>

 <span>my</span> <span>@words</span>     <span>=</span> <span>qw</span><span>/ four foo bar /</span><span>;</span>
 <span>my</span> <span>@distances</span> <span>=</span> <span>distance</span><span>(</span><span>"foo"</span><span>,</span><span>@words</span><span>);</span>

 <span>print</span> <span>"@distances"</span><span>;</span>
 <span># prints "2 0 3"</span><br /><br /><br /></pre><pre><span>use</span> Algorithm<span>::</span>LCSS <span>qw</span><span>(</span> LCSS CSS CSS_Sorted <span>);</span>
    <span>my</span> <span>$lcss_ary_ref</span> <span>=</span> <span>LCSS</span><span>(</span> <span>\</span><span>@SEQ1</span><span>,</span> <span>\</span><span>@SEQ2</span> <span>);</span>  <span># ref to array</span>
    <span>my</span> <span>$lcss_string</span>  <span>=</span> <span>LCSS</span><span>(</span> <span>$STR1</span><span>,</span> <span>$STR2</span> <span>);</span>    <span># string</span>
    <span>my</span> <span>$css_ary_ref</span> <span>=</span> <span>CSS</span><span>(</span> <span>\</span><span>@SEQ1</span><span>,</span> <span>\</span><span>@SEQ2</span> <span>);</span>    <span># ref to array of arrays</span>
    <span>my</span> <span>$css_str_ref</span> <span>=</span> <span>CSS</span><span>(</span> <span>$STR1</span><span>,</span> <span>$STR2</span> <span>);</span>      <span># ref to array of strings</span>
    <span>my</span> <span>$css_ary_ref</span> <span>=</span> <span>CSS_Sorted</span><span>(</span> <span>\</span><span>@SEQ1</span><span>,</span> <span>\</span><span>@SEQ2</span> <span>);</span>  <span># ref to array of arrays</span>
    <span>my</span> <span>$css_str_ref</span> <span>=</span> <span>CSS_Sorted</span><span>(</span> <span>$STR1</span><span>,</span> <span>$STR2</span> <span>);</span>    <span># ref to array of strings</span></pre><p>There are many different modules on CPAN for calculating the edit distance between two strings. Here's just a selection.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshteinXS">Text::LevenshteinXS</a>&nbsp;and&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshtein%3A%3AXS">Text::Levenshtein::XS</a>&nbsp;are both versions of the Levenshtein algorithm that require a C compiler, but will be a lot faster than the pure-Perl Text::Levenshtein used above.</p><p>The Damerau-Levenshtein edit distance is like the Levenshtein distance, but in addition to insertion, deletion and substitution, it also considers the transposition of two adjacent characters to be a single edit. The module&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshtein%3A%3ADamerau">Text::Levenshtein::Damerau</a>&nbsp;defaults to using a pure-Perl implementation, but if you've installed&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3ALevenshtein%3A%3ADamerau%3A%3AXS">Text::Levenshtein::Damerau::XS</a>&nbsp;then it will be a lot quicker.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3AWagnerFischer">Text::WagnerFischer</a>&nbsp;is an implementation of the Wagner-Fischer edit distance, which is similar to the Levenshtein distance, but applies different weights to each edit type.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3ABrew">Text::Brew</a>&nbsp;is an implementation of the Brew edit distance, which is another algorithm based on edit weights.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3AFuzzy">Text::Fuzzy</a>&nbsp;provides a number of operations for partial or fuzzy matching of text based on edit distance.&nbsp;<a href="http://search.cpan.org/perldoc?Text%3A%3AFuzzy%3A%3APP">Text::Fuzzy::PP</a>&nbsp;is a pure-Perl implementation of 
the same interface.</p><p><a href="http://search.cpan.org/perldoc?String%3A%3ASimilarity">String::Similarity</a>&nbsp;takes two strings and returns a value between 0 (meaning entirely different) and 1 (meaning identical). It appears to be based on edit distance.</p><p><a href="http://search.cpan.org/perldoc?Text%3A%3ADice">Text::Dice</a>&nbsp;calculates&nbsp;<a href="https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient">Dice's coefficient</a>&nbsp;for two strings. This formula was originally developed to measure the similarity of two different populations in ecological research.</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37830/nquire-a-statistical-framework-for-ploidy-estimation-using-next-generation-sequencing</guid>
	<pubDate>Thu, 04 Oct 2018 05:23:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37830/nquire-a-statistical-framework-for-ploidy-estimation-using-next-generation-sequencing</link>
	<title><![CDATA[nQuire: a statistical framework for ploidy estimation using next generation sequencing]]></title>
	<description><![CDATA[<p>nQuire provides a statistical framework to study organisms with intraspecific variation in ploidy. nQuire is likely to be useful in epidemiological studies of pathogens, artificial selection experiments, and for historical or ancient samples where intact nuclei are not preserved. It is implemented as a stand-alone Linux command line tool in the C programming language and is available at https://github.com/clwgg/nQuire under the MIT license.</p><p>Address of the bookmark: <a href="https://github.com/clwgg/nQuire" rel="nofollow">https://github.com/clwgg/nQuire</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27845/cnidaria-fast-reference-free-phylogenomic-clustering</guid>
	<pubDate>Thu, 16 Jun 2016 17:55:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27845/cnidaria-fast-reference-free-phylogenomic-clustering</link>
	<title><![CDATA[CNIDARIA: fast, reference-free phylogenomic clustering]]></title>
	<description><![CDATA[<p>Motivation: Identification of biological specimens is a major requirement for a range of applications. Reference-free methods analyse unprocessed sequencing data without relying on prior knowledge, but these do not scale to arbitrarily large genomes and arbitrarily large phylogenetic distances.</p>
<p>Results: We present Cnidaria, a practical tool for clustering genomic and transcriptomic data with no limitation on genome size or phylogenetic distances. We successfully simultaneously clustered 169 genomic and transcriptomic datasets from 4 kingdoms, achieving 100% accuracy at supra-species level and 78% accuracy for species level.</p>
<p>Availability and Implementation: Cnidaria is written in C++ and Python and is available at http://www.ab.wur.nl/cnidaria.</p>
<p>Contact: Saulo Aflitos - sauloal@gmail.com</p>
<p>Supplementary information: Supplementary data are available at Bioinformatics online.</p><p>Address of the bookmark: <a href="https://github.com/sauloal/cnidaria/wiki" rel="nofollow">https://github.com/sauloal/cnidaria/wiki</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39213/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</guid>
	<pubDate>Tue, 02 Apr 2019 21:54:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39213/flye-fast-and-accurate-de-novo-assembler-for-single-molecule-sequencing-reads</link>
	<title><![CDATA[Flye: Fast and accurate de novo assembler for single molecule sequencing reads]]></title>
	<description><![CDATA[<p><span>Flye is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package represents a complete pipeline: it takes raw PB / ONT reads as input and outputs polished contigs. Flye also includes a special mode for metagenome assembly.</span></p><p>Address of the bookmark: <a href="https://github.com/fenderglass/Flye" rel="nofollow">https://github.com/fenderglass/Flye</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41033/clark-fast-accurate-and-versatile-sequence-classification-system</guid>
	<pubDate>Sat, 15 Feb 2020 01:49:01 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41033/clark-fast-accurate-and-versatile-sequence-classification-system</link>
	<title><![CDATA[CLARK: Fast, accurate and versatile sequence classification system]]></title>
	<description><![CDATA[<p><a href="http://dx.doi.org/10.1186/s12864-015-1419-2"><strong>CLARK</strong></a><span> is a method based on supervised sequence classification using discriminative&nbsp;</span><em>k</em><span>-mers. On two distinct classification problems (see the article for details), namely (1) the taxonomic classification of metagenomic reads to known bacterial genomes, and (2) the assignment of BAC clones and transcripts to chromosome arms/centromeres (in the absence of a finished assembly for the reference genome), CLARK outperforms the best state-of-the-art methods in classification speed and precision.</span></p>
<p><span><a href="http://clark.cs.ucr.edu/Spaced/">http://clark.cs.ucr.edu/Spaced/</a></span></p><p>Address of the bookmark: <a href="http://clark.cs.ucr.edu/Spaced/" rel="nofollow">http://clark.cs.ucr.edu/Spaced/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</guid>
	<pubDate>Tue, 12 Dec 2017 17:23:31 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34618/mashmap-a-fast-and-approximate-software-for-mapping-long-reads-pacbioont-or-assembly-to-reference-genomes</link>
	<title><![CDATA[MashMap: a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s)]]></title>
	<description><![CDATA[<p><span>MashMap is a fast and approximate software for mapping long reads (PacBio/ONT) or assembly to reference genome(s). It maps a query sequence against a reference region if and only if its estimated alignment identity is above a specified threshold. It does not compute the alignments explicitly, but rather estimates a&nbsp;</span><em>k</em><span>-mer based&nbsp;</span><a href="https://en.wikipedia.org/wiki/Jaccard_index">Jaccard similarity</a><span>&nbsp;using a combination of&nbsp;</span><a href="http://www.cs.princeton.edu/courses/archive/spr05/cos598E/bib/p76-schleimer.pdf">Winnowing</a><span>&nbsp;and&nbsp;</span><a href="https://en.wikipedia.org/wiki/MinHash">MinHash</a><span>. This is then converted to an estimate of sequence identity using the&nbsp;</span><a href="http://mash.readthedocs.org/">Mash</a><span>&nbsp;distance. An appropriate&nbsp;</span><em>k</em><span>-mer sampling rate is automatically determined given minimum local alignment length and identity thresholds. The efficiency of the algorithm improves as both of these thresholds are increased.</span></p><p>Address of the bookmark: <a href="https://github.com/marbl/MashMap" rel="nofollow">https://github.com/marbl/MashMap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36618/lamsa-fast-split-read-alignment-with-long-approximate-matches</guid>
	<pubDate>Tue, 15 May 2018 04:44:42 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36618/lamsa-fast-split-read-alignment-with-long-approximate-matches</link>
	<title><![CDATA[LAMSA: fast split read alignment with long approximate matches]]></title>
	<description><![CDATA[LAMSA (Long Approximate Matches-based Split Aligner) is a novel split alignment approach that is fast and handles structural variant (SV) events well. It is well suited to aligning long reads (thousands of base pairs or more).

LAMSA takes advantage of the rareness of SVs to implement a specifically designed two-step strategy. It first splits the read into relatively long fragments and aligns them co-linearly to resolve small variations and sequencing errors and to mitigate the effect of repeats. The fragment alignments are then used in a sparse dynamic programming (SDP)-based split alignment approach that handles large or non-co-linear variants.
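
The two-step idea above can be illustrated with a minimal Python sketch. This is not LAMSA's code: the function names, the 1000-base fragment length, and the 50-base tolerance are all assumptions made for illustration, and the fragment hits are hand-written rather than produced by an aligner. The sketch cuts a read into long fragments and then flags the places where neighbouring fragment hits stop being co-linear, which is where a split alignment would be needed.

```python
def split_read(read, fragment_len=1000):
    """Cut a read into consecutive fragments, returning (offset, fragment) pairs.

    fragment_len is a hypothetical value; the last fragment may be shorter.
    """
    return [(start, read[start:start + fragment_len])
            for start in range(0, len(read), fragment_len)]


def colinear_breaks(hits, tolerance=50):
    """Return each index i where hit i and hit i + 1 are not co-linear.

    hits: (read_offset, reference_position) pairs, one per fragment, in read order.
    Neighbouring hits count as co-linear when their spacing on the reference
    matches their spacing on the read to within `tolerance` bases (an assumed cutoff).
    """
    breaks = []
    for i in range(len(hits) - 1):
        read_gap = hits[i + 1][0] - hits[i][0]
        ref_gap = hits[i + 1][1] - hits[i][1]
        if abs(ref_gap - read_gap) > tolerance:
            breaks.append(i)
    return breaks


# A 2400-base toy read splits into fragments of 1000, 1000 and 400 bases.
fragments = split_read("ACGT" * 600, fragment_len=1000)
print([(offset, len(seq)) for offset, seq in fragments])

# Made-up hits: the first two fragments are co-linear (3-base indel),
# the third maps far away, suggesting a large variant between them.
hits = [(0, 5000), (1000, 6003), (2000, 42000)]
print(colinear_breaks(hits))
```

Small discrepancies such as the 3-base shift are absorbed by the tolerance, while the large jump is reported as a break between the second and third fragments.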

We benchmarked LAMSA on simulated and real datasets with various read lengths and sequencing error rates. The results demonstrate that it is substantially faster than state-of-the-art long read aligners while also handling various categories of SVs well.

LAMSA is open source and free for non-commercial use.

LAMSA was mainly designed by Bo Liu &amp; Yan Gao and developed by Yan Gao at the Center for Bioinformatics, Harbin Institute of Technology, China.<p>Address of the bookmark: <a href="https://github.com/hitbc/LAMSA" rel="nofollow">https://github.com/hitbc/LAMSA</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37473/lsc-a-long-read-error-correction-tool</guid>
	<pubDate>Thu, 02 Aug 2018 07:39:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37473/lsc-a-long-read-error-correction-tool</link>
	<title><![CDATA[LSC: a long read error correction tool]]></title>
	<description><![CDATA[<h2>Getting Started</h2>
<p>These simple steps will help you integrate LSC into your transcriptomics analysis pipeline.</p>
<ul>
<li>Read the&nbsp;<a href="https://www.healthcare.uiowa.edu/labs/au/LSC/LSC_requirements.asp">LSC_requirements</a>&nbsp;for running LSC.</li>
<li><a href="https://www.healthcare.uiowa.edu/labs/au/LSC/LSC_download.asp">Download</a>&nbsp;and set up the LSC package.</li>
<li>Follow the&nbsp;<a href="https://www.healthcare.uiowa.edu/labs/au/LSC/LSC_tutorial.asp">tutorial</a>&nbsp;to see how LSC works on some example data.</li>
<li>Read the&nbsp;<a href="https://www.healthcare.uiowa.edu/labs/au/LSC/LSC_manual.asp">manual</a>&nbsp;if anything is unclear.</li>
<li>You're ready. Happy LSCing!</li>
</ul>
<h2>Latest publication</h2>
<p><span>Kin Fai Au, Jason Underwood, Lawrence Lee and Wing Hung Wong&nbsp;</span><br><strong>Improving PacBio Long Read Accuracy by Short Read Alignment&nbsp;</strong><span>[</span><a href="http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0046679">Manuscript</a><span>]&nbsp;</span><br><em>PLoS ONE</em><span>&nbsp;2012. 7(10): e46679. doi:10.1371/journal.pone.0046679</span></p><p>Address of the bookmark: <a href="https://www.healthcare.uiowa.edu/labs/au/LSC/" rel="nofollow">https://www.healthcare.uiowa.edu/labs/au/LSC/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40212/kalign-fast-multiple-sequence-alignment-program-for-biological-sequences</guid>
	<pubDate>Fri, 01 Nov 2019 00:20:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40212/kalign-fast-multiple-sequence-alignment-program-for-biological-sequences</link>
	<title><![CDATA[Kalign: fast multiple sequence alignment program for biological sequences.]]></title>
	<description><![CDATA[<p><span>Kalign is a fast multiple sequence alignment program for biological sequences.</span></p>
<p>Align sequences and output the alignment in MSF format:</p>
<pre><code>kalign -i BB11001.tfa -f msf -o out.msf
</code></pre>
<p>Align sequences and output the alignment in clustal format:</p>
<pre><code>kalign -i BB11001.tfa -f clu -o out.clu
</code></pre>
<p>Re-align sequences in an existing alignment:</p>
<pre><code>kalign -i BB11001.msf -o out.afa
</code></pre>
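<p>When several alignments have to be produced in one go, the invocations above could be assembled from a script. The following Python sketch is only an illustration and is not part of Kalign: the helper name <code>kalign_command</code> is hypothetical, and it uses only the <code>-i</code>, <code>-f</code> and <code>-o</code> options shown above. It prints the command lines rather than running them.</p>

```python
import shlex


def kalign_command(infile, outfile, fmt=None):
    """Build a kalign argument list from the -i, -f and -o options only.

    fmt is passed to -f when given (for example "msf" or "clu");
    when it is None, no -f option is added.
    """
    cmd = ["kalign", "-i", infile]
    if fmt is not None:
        cmd += ["-f", fmt]
    cmd += ["-o", outfile]
    return cmd


jobs = [
    ("BB11001.tfa", "out.msf", "msf"),
    ("BB11001.tfa", "out.clu", "clu"),
    ("BB11001.msf", "out.afa", None),
]
for infile, outfile, fmt in jobs:
    print(shlex.join(kalign_command(infile, outfile, fmt)))

# To execute a command instead of printing it (kalign must be installed and on PATH):
#   import subprocess
#   subprocess.run(kalign_command("BB11001.tfa", "out.msf", "msf"), check=True)
```

<p>Building the command as a list, rather than a single string, avoids shell quoting problems when file names contain spaces.</p>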
<p>Reformat existing alignment:</p>
<pre><code>kalign -i BB11001.msf -r afa -o out.afa</code></pre><p>Address of the bookmark: <a href="https://github.com/TimoLassmann/kalign" rel="nofollow">https://github.com/TimoLassmann/kalign</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>

</channel>
</rss>