<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/38670?offset=200</link>
	<atom:link href="https://bioinformaticsonline.com/related/38670?offset=200" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41405/sequence-tube-maps-displays-multiple-genomic-sequences-in-the-form-of-a-tube-map</guid>
	<pubDate>Wed, 11 Mar 2020 01:12:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41405/sequence-tube-maps-displays-multiple-genomic-sequences-in-the-form-of-a-tube-map</link>
	<title><![CDATA[Sequence Tube Maps: displays multiple genomic sequences in the form of a tube map]]></title>
	<description><![CDATA[<p>A JavaScript module for the visualization of genomic sequence graphs. It automatically generates a "tube map"-like visualization of sequence graphs which have been created with <a href="https://github.com/vgteam/vg">vg</a>. (<a href="https://github.com/vgteam/vg">https://github.com/vgteam/vg</a>)</p>
<h3>Link to working demo: <a href="https://vgteam.github.io/sequenceTubeMap/">https://vgteam.github.io/sequenceTubeMap/</a></h3>
<p><img src="https://raw.githubusercontent.com/vgteam/sequenceTubeMap/master/images/header.png" alt="image" style="border: 0px; border: 0px;"></p><p>Address of the bookmark: <a href="https://github.com/vgteam/sequenceTubeMap" rel="nofollow">https://github.com/vgteam/sequenceTubeMap</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43826/tiara-deep-learning-based-classification-system-for-eukaryotic-sequences</guid>
	<pubDate>Mon, 14 Mar 2022 23:02:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43826/tiara-deep-learning-based-classification-system-for-eukaryotic-sequences</link>
	<title><![CDATA[Tiara: deep learning-based classification system for eukaryotic sequences]]></title>
	<description><![CDATA[<p><span>With a large number of metagenomic datasets becoming available, eukaryotic metagenomics emerged as a new challenge. The proper classification of eukaryotic nuclear and organellar genomes is an essential step toward a better understanding of eukaryotic diversity.</span></p><p>Address of the bookmark: <a href="https://academic.oup.com/bioinformatics/article/38/2/344/6375939" rel="nofollow">https://academic.oup.com/bioinformatics/article/38/2/344/6375939</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44663/svbyeye-r-package-to-visualize-alignments-between-two-or-multiple-dna-sequences</guid>
	<pubDate>Tue, 17 Sep 2024 02:34:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44663/svbyeye-r-package-to-visualize-alignments-between-two-or-multiple-dna-sequences</link>
	<title><![CDATA[SVbyEye: R Package to visualize alignments between two or multiple DNA sequences]]></title>
	<description><![CDATA[<p dir="auto">R Package to visualize alignments between two or multiple DNA sequences including<br>a number of functionalities to facilitate processing of alignments in PAF format.</p>
<p dir="auto"><span>SVbyEye is an open-source R package to visualize and annotate sequence-to-sequence alignments, along with various functionalities to process alignments in PAF format. The tool facilitates the characterization of complex SVs in the context of sequence homology, helping resolve the mechanisms underlying their formation. Availability and implementation: SVbyEye is available at https://github.com/daewoooo/SVbyEye.</span></p>
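<p dir="auto">As a rough illustration of what processing alignments in PAF format involves, here is a minimal Python sketch that parses a single PAF line. It is not part of SVbyEye (which is an R package) and does not use its API; the names PafRecord, parse_paf_line and identity, as well as the example record, are hypothetical. It assumes the standard PAF layout of 12 tab-separated mandatory columns followed by optional TAG:TYPE:VALUE fields.</p>

```python
from collections import namedtuple

# The 12 mandatory PAF columns in file order (the labels are our own, hypothetical names).
PAF_FIELDS = [
    "q_name", "q_len", "q_start", "q_end", "strand",
    "t_name", "t_len", "t_start", "t_end",
    "n_match", "aln_len", "mapq",
]
TEXT_FIELDS = ("q_name", "strand", "t_name")

PafRecord = namedtuple("PafRecord", PAF_FIELDS + ["tags"])


def parse_paf_line(line):
    """Parse one tab-separated PAF line into a PafRecord."""
    cols = line.rstrip("\n").split("\t")
    if len(cols[:12]) != 12:
        raise ValueError("a PAF line needs 12 mandatory columns")
    values = []
    for name, raw in zip(PAF_FIELDS, cols):
        # Names and strand stay as text; the other mandatory columns are integers.
        values.append(raw if name in TEXT_FIELDS else int(raw))
    # Optional fields follow the SAM-like TAG:TYPE:VALUE convention, e.g. "cg:Z:1000M".
    tags = {}
    for field in cols[12:]:
        tag, type_code, value = field.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    return PafRecord(*values, tags=tags)


def identity(rec):
    """Fraction of matching residues over the alignment block length."""
    return rec.n_match / rec.aln_len


# Made-up example record: a 1 kb reverse-strand alignment of contig1 to chr1.
example = "contig1\t5000\t100\t1100\t-\tchr1\t100000\t2000\t3000\t950\t1000\t60\ttp:A:P\tcg:Z:1000M"
rec = parse_paf_line(example)
print(rec.t_name, rec.strand, identity(rec))
```

<p dir="auto">The identity computed here is simply column 10 (residue matches) divided by column 11 (alignment block length). Dedicated tools derive more detailed statistics from the CIGAR string in the cg tag, which this sketch only stores as text.</p>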
<p dir="auto">Author: David Porubsky</p><p>Address of the bookmark: <a href="https://github.com/daewoooo/SVbyEye" rel="nofollow">https://github.com/daewoooo/SVbyEye</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39019/iq-tree-efficient-software-for-phylogenomic-inference</guid>
	<pubDate>Mon, 18 Feb 2019 04:25:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39019/iq-tree-efficient-software-for-phylogenomic-inference</link>
	<title><![CDATA[IQ-TREE: Efficient software for phylogenomic inference]]></title>
	<description><![CDATA[<p><span>A fast and effective stochastic algorithm to infer phylogenetic trees by maximum likelihood.&nbsp;</span><em>IQ-TREE compares favorably to RAxML and PhyML</em><span>&nbsp;in terms of likelihoods with similar computing time.</span></p>
<p><span><span>IQ-TREE found higher likelihoods between 62.2% and 87.1% of the studied alignments, thus efficiently exploring the tree-space. If we use the IQ-TREE stopping rule, RAxML and PhyML are faster in 75.7% and 47.1% of the DNA alignments and 42.2% and 100% of the protein alignments, respectively. However, the range of obtaining higher likelihoods with IQ-TREE improves to 73.3&ndash;97.1%. IQ-TREE is freely available at&nbsp;</span><a href="http://www.cibiv.at/software/iqtree" target="">http://www.cibiv.at/software/iqtree</a></span></p><p>Address of the bookmark: <a href="http://www.iqtree.org/" rel="nofollow">http://www.iqtree.org/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/9639/find-certain-filesdocuments-in-linux-os</guid>
	<pubDate>Sun, 06 Apr 2014 23:56:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/9639/find-certain-filesdocuments-in-linux-os</link>
	<title><![CDATA[Find certain files/documents in Linux OS]]></title>
	<description><![CDATA[<p>As bioinformaticians, we routinely handle large datasets and can easily get lost among huge numbers of files and folders. Tracking down a missing file requires a powerful search command. The Linux find command is one of the most important and most frequently used commands on Linux systems. It searches for and lists files and directories that match the conditions you specify. You can search by permissions, users, groups, file type, date, size and other criteria.<br /><br />In this article we share our day-to-day experience with the Linux find command in the form of examples. We show the 35 most used find command examples in Linux, divided into five parts that run from basic to advanced usage.</p><p><strong>Part I &ndash; Basic Find Commands for Finding Files with Names</strong><br />1. Find Files Using Name in Current Directory<br /><br />Find all the files whose name is gene.txt in the current working directory.<br /><br /># find . -name gene.txt<br /><br />./gene.txt<br /><br />2. Find Files Under Home Directory<br /><br />Find all the files under the /home directory with the name gene.txt.<br /><br /># find /home -name gene.txt<br /><br />/home/gene.txt<br /><br />3. Find Files Using Name and Ignoring Case<br /><br />Find all the files whose name is gene.txt in the /home directory, matching both capital and small letters.<br /><br /># find /home -iname gene.txt<br /><br />/home/gene.txt<br />/home/Gene.txt<br /><br />4. Find Directories Using Name<br /><br />Find all directories whose name is Gene under the / directory.<br /><br /># find / -type d -name Gene<br /><br />/Gene<br /><br />5. Find fasta Files Using Name<br /><br />Find all files whose name is gene.fasta in the current working directory.<br /><br /># find . -type f -name gene.fasta<br /><br />./gene.fasta<br /><br />6. 
Find all fasta Files in Directory<br /><br />Find all fasta files in a directory.<br /><br /># find . -type f -name "*.fasta"<br /><br />./gene.fasta<br />./cancer.fasta<br />./allgene.fasta<br /><br /><strong>Part II &ndash; Find Files Based on their Permissions</strong><br />7. Find Files With 777 Permissions<br /><br />Find all the files whose permissions are 777.<br /><br /># find . -type f -perm 0777 -print<br /><br />8. Find Files Without 777 Permissions<br /><br />Find all the files without permission 777.<br /><br /># find / -type f ! -perm 777<br /><br />9. Find SGID Files with 644 Permissions<br /><br />Find all the SGID bit files whose permissions are set to 644.<br /><br /># find / -perm 2644<br /><br />10. Find Sticky Bit Files with 551 Permissions<br /><br />Find all the files with the sticky bit set whose permissions are 551.<br /><br /># find / -perm 1551<br /><br />11. Find SUID Files<br /><br />Find all SUID set files.<br /><br /># find / -perm /u=s<br /><br />12. Find SGID Files<br /><br />Find all SGID set files.<br /><br /># find / -perm /g+s<br /><br />13. Find Files Readable by Owner<br /><br />Find all files whose owner has read permission. Note that /u=r matches any file with the owner read bit set, so writable files are listed as well.<br /><br /># find / -perm /u=r<br /><br />14. Find Executable Files<br /><br />Find all executable files.<br /><br /># find / -perm /a=x<br /><br />15. Find Files with 777 Permissions and Chmod to 644<br /><br />Find all 777 permission files and use the chmod command to set their permissions to 644.<br /><br /># find / -type f -perm 0777 -print -exec chmod 644 {} \;<br /><br />16. Find Directories with 777 Permissions and Chmod to 755<br /><br />Find all 777 permission directories and use the chmod command to set their permissions to 755.<br /><br /># find / -type d -perm 777 -print -exec chmod 755 {} \;<br /><br />17. Find and remove single File<br /><br />To find a single file called gene.txt and remove it.<br /><br /># find . -type f -name "gene.txt" -exec rm -f {} \;<br /><br />18. 
Find and remove Multiple Files<br /><br />To find and remove multiple files, such as all .fa or .gb files, use.<br /><br /># find . -type f -name "*.fa" -exec rm -f {} \;<br /><br />OR<br /><br /># find . -type f -name "*.gb" -exec rm -f {} \;<br /><br />19. Find all Empty Files<br /><br />To find all empty files under a certain path.<br /><br /># find /tmp -type f -empty<br /><br />20. Find all Empty Directories<br /><br />To find all empty directories under a certain path.<br /><br /># find /tmp -type d -empty<br /><br />21. Find all Hidden Files<br /><br />To find all hidden files, use the command below.<br /><br /># find /tmp -type f -name ".*"<br /><br /><strong>Part III &ndash; Search Files Based On Owners and Groups</strong><br />22. Find Single File Based on User<br /><br />To find all files called gene.txt under the / root directory that are owned by root.<br /><br /># find / -user root -name gene.txt<br /><br />23. Find all Files Based on User<br /><br />To find all files that belong to user rahul under the /home directory.<br /><br /># find /home -user rahul<br /><br />24. Find all Files Based on Group<br /><br />To find all files that belong to group developer under the /home directory.<br /><br /># find /home -group developer<br /><br />25. Find Particular Files of User<br /><br />To find all .txt files of user rahul under the /home directory.<br /><br /># find /home -user rahul -iname "*.txt"<br /><br /><strong>Part IV &ndash; Find Files and Directories Based on Date and Time</strong><br />26. Find Last 50 Days Modified Files<br /><br />To find all the files which were modified exactly 50 days ago (use -mtime -50 for files modified within the last 50 days).<br /><br /># find / -mtime 50<br /><br />27. Find Last 50 Days Accessed Files<br /><br />To find all the files which were accessed exactly 50 days ago (use -atime -50 for files accessed within the last 50 days).<br /><br /># find / -atime 50<br /><br />28. Find Last 50-100 Days Modified Files<br /><br />To find all the files which were modified more than 50 days and less than 100 days ago.<br /><br /># find / -mtime +50 -mtime -100<br /><br />29. 
Find Changed Files in Last 1 Hour<br /><br />To find all the files which were changed in the last hour.<br /><br /># find / -cmin -60<br /><br />30. Find Modified Files in Last 1 Hour<br /><br />To find all the files which were modified in the last hour.<br /><br /># find / -mmin -60<br /><br />31. Find Accessed Files in Last 1 Hour<br /><br />To find all the files which were accessed in the last hour.<br /><br /># find / -amin -60<br /><br /><strong>Part V &ndash; Find Files and Directories Based on Size</strong><br />32. Find 50MB Files<br /><br />To find all 50MB files, use.<br /><br /># find / -size 50M<br /><br />33. Find Size between 50MB &ndash; 100MB<br /><br />To find all the files which are greater than 50MB and less than 100MB.<br /><br /># find / -size +50M -size -100M<br /><br />34. Find and Delete Files Larger than 100MB<br /><br />To find all files larger than 100MB and delete them using one single command.<br /><br /># find / -type f -size +100M -exec rm -f {} \;<br /><br />35. Find Specific Files and Delete<br /><br />Find all .gb files larger than 10MB and delete them using one single command. The pattern is quoted so that the shell does not expand it before find runs.<br /><br /># find / -type f -name "*.gb" -size +10M -exec rm {} \;</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43398/waafle-a-workflow-to-annotate-assemblies-and-find-lgt-events</guid>
	<pubDate>Thu, 23 Sep 2021 14:31:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43398/waafle-a-workflow-to-annotate-assemblies-and-find-lgt-events</link>
	<title><![CDATA[WAAFLE: a Workflow to Annotate Assemblies and Find LGT Events.]]></title>
	<description><![CDATA[<p><span>Lateral gene transfer (LGT) is an important mechanism for genome diversification in microbial communities, including the human microbiome. While methods exist to identify LGTs from sequenced isolate genomes, identifying LGTs from community metagenomes remains an open problem. To address this, we developed WAAFLE: a Workflow to Annotate Assemblies and Find LGT Events.</span></p><p>Address of the bookmark: <a href="http://huttenhower.sph.harvard.edu/waafle" rel="nofollow">http://huttenhower.sph.harvard.edu/waafle</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/4590/tigers-genome-sequenced</guid>
	<pubDate>Tue, 17 Sep 2013 16:48:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/4590/tigers-genome-sequenced</link>
	<title><![CDATA[Tigers genome sequenced]]></title>
	<description><![CDATA[<p>Fifteen scientists led by Dr Jong Bhak of Genome Research Foundation, South Korea, decoded as many as 3 billion nucleotides (organic molecules that form the basic building blocks of nucleic acids, such as DNA). They identified 20,000 genes related to various functions of the tiger.&nbsp;</p><p>The biggest and perhaps most fearsome of the world's big cats, the tiger, shares 95.6 percent of its DNA with humans' cute and furry companions, domestic cats.</p><p>The new research showed that big cats have genetic mutations that enabled them to be carnivores. The team also identified mutations that allow snow leopards to thrive at high altitudes.</p><p>Reference:</p><p><a href="http://www.nbcnews.com/science/your-cat-ferocious-tigers-share-lot-95-6-percent-their-4B11182690">http://www.nbcnews.com/science/your-cat-ferocious-tigers-share-lot-95-6-percent-their-4B11182690</a></p><p><a href="http://timesofindia.indiatimes.com/home/environment/flora-fauna/Gene-mapping-of-tiger-completed/articleshow/22671681.cms">http://timesofindia.indiatimes.com/home/environment/flora-fauna/Gene-mapping-of-tiger-completed/articleshow/22671681.cms</a></p><p>Paper:</p><p><a href="http://www.nature.com/ncomms/2013/130917/ncomms3433/full/ncomms3433.html">http://www.nature.com/ncomms/2013/130917/ncomms3433/full/ncomms3433.html</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34418/spades-hybrid-genome-assembly</guid>
	<pubDate>Mon, 27 Nov 2017 08:05:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34418/spades-hybrid-genome-assembly</link>
	<title><![CDATA[SPAdes hybrid genome assembly]]></title>
	<description><![CDATA[<p>When you have both Illumina and Nanopore data, SPAdes remains a good option for hybrid assembly. SPAdes was used to produce the&nbsp;<a href="https://gigascience.biomedcentral.com/articles/10.1186/s13742-015-0101-6">B. fragilis assembly</a>&nbsp;by Mick Watson&rsquo;s group.</p><p>Again, running spades.py will show you the options:</p><div><pre><code>spades.py
</code></pre></div><p>This produces:</p><div><pre><code>SPAdes genome assembler v3.10.1

Usage: /usr/local/SPAdes-3.10.1-Linux/bin/spades.py [options] -o &lt;output_dir&gt;

Basic options:
-o      &lt;output_dir&gt;    directory to store all the resulting files (required)
--sc                    this flag is required for MDA (single-cell) data
--meta                  this flag is required for metagenomic sample data
--rna                   this flag is required for RNA-Seq data
--plasmid               runs plasmidSPAdes pipeline for plasmid detection
--iontorrent            this flag is required for IonTorrent data
--test                  runs SPAdes on toy dataset
-h/--help               prints this usage message
-v/--version            prints version

Input data:
--12    &lt;filename&gt;      file with interlaced forward and reverse paired-end reads
-1      &lt;filename&gt;      file with forward paired-end reads
-2      &lt;filename&gt;      file with reverse paired-end reads
-s      &lt;filename&gt;      file with unpaired reads
--pe&lt;#&gt;-12      &lt;filename&gt;      file with interlaced reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-1       &lt;filename&gt;      file with forward reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-2       &lt;filename&gt;      file with reverse reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-s       &lt;filename&gt;      file with unpaired reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--pe&lt;#&gt;-&lt;or&gt;    orientation of reads for paired-end library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9; &lt;or&gt; = fr, rf, ff)
--s&lt;#&gt;          &lt;filename&gt;      file with unpaired reads for single reads library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-12      &lt;filename&gt;      file with interlaced reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-1       &lt;filename&gt;      file with forward reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-2       &lt;filename&gt;      file with reverse reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-s       &lt;filename&gt;      file with unpaired reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--mp&lt;#&gt;-&lt;or&gt;    orientation of reads for mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9; &lt;or&gt; = fr, rf, ff)
--hqmp&lt;#&gt;-12    &lt;filename&gt;      file with interlaced reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-1     &lt;filename&gt;      file with forward reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-2     &lt;filename&gt;      file with reverse reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-s     &lt;filename&gt;      file with unpaired reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--hqmp&lt;#&gt;-&lt;or&gt;  orientation of reads for high-quality mate-pair library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9; &lt;or&gt; = fr, rf, ff)
--nxmate&lt;#&gt;-1   &lt;filename&gt;      file with forward reads for Lucigen NxMate library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--nxmate&lt;#&gt;-2   &lt;filename&gt;      file with reverse reads for Lucigen NxMate library number &lt;#&gt; (&lt;#&gt; = 1,2,..,9)
--sanger        &lt;filename&gt;      file with Sanger reads
--pacbio        &lt;filename&gt;      file with PacBio reads
--nanopore      &lt;filename&gt;      file with Nanopore reads
--tslr  &lt;filename&gt;      file with TSLR-contigs
--trusted-contigs       &lt;filename&gt;      file with trusted contigs
--untrusted-contigs     &lt;filename&gt;      file with untrusted contigs

Pipeline options:
--only-error-correction runs only read error correction (without assembling)
--only-assembler        runs only assembling (without read error correction)
--careful               tries to reduce number of mismatches and short indels
--continue              continue run from the last available check-point
--restart-from  &lt;cp&gt;    restart run with updated options and from the specified check-point ('ec', 'as', 'k&lt;int&gt;', 'mc')
--disable-gzip-output   forces error correction not to compress the corrected reads
--disable-rr            disables repeat resolution stage of assembling

Advanced options:
--dataset       &lt;filename&gt;      file with dataset description in YAML format
-t/--threads    &lt;int&gt;           number of threads
                                [default: 16]
-m/--memory     &lt;int&gt;           RAM limit for SPAdes in Gb (terminates if exceeded)
                                [default: 250]
--tmp-dir       &lt;dirname&gt;       directory for temporary files
                                [default: &lt;output_dir&gt;/tmp]
-k              &lt;int,int,...&gt;   comma-separated list of k-mer sizes (must be odd and
                                less than 128) [default: 'auto']
--cov-cutoff    &lt;float&gt;         coverage cutoff value (a positive float number, or 'auto', or 'off') [default: 'off']
--phred-offset  &lt;33 or 64&gt;      PHRED quality offset in the input reads (33 or 64)
                                [default: auto-detect]
</code></pre></div><p>As you can see this is also a &ldquo;pipeline&rdquo; of tools that can be switched on or off. SPAdes takes quite a long time, so for the purposes of this practical, something like this may suffice:</p><div><pre><code>spades.py -t 4 <span>\</span>
          -m 32 <span>\</span>
          -k 31,51,71 <span>\</span>
          --only-assembler <span>\</span>
          -1 miseq.1.fastq -2 miseq.2.fastq <span>\</span>
          --nanopore minion.fastq <span>\</span>
          -o hybrid_assembly
</code></pre></div><p>In turn, these parameters mean</p><ul>
<li>use 4 threads</li>
<li>max memory is 32Gb</li>
<li>use 3 k-mer values to build the de Bruijn graph(s) - 31, 51 and 71</li>
<li>only run the assembler, not the correction algorithm (for speed)</li>
<li>read 1 and read 2 of the MiSeq data</li>
<li>the nanopore data</li>
<li>put the output in folder &ldquo;hybrid_assembly&rdquo;</li>
</ul>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34528/cope-an-accurate-k-mer-based-pair-end-reads-connection-tool-to-facilitate-genome-assembly</guid>
	<pubDate>Wed, 06 Dec 2017 02:08:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34528/cope-an-accurate-k-mer-based-pair-end-reads-connection-tool-to-facilitate-genome-assembly</link>
	<title><![CDATA[COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly]]></title>
	<description><![CDATA[<p><span>COPE (Connecting Overlapped Pair-End reads) is an efficient tool that connects overlapping pair-end reads using k-mer frequencies. The authors evaluated the tool on 30&times; simulated pair-end reads from Arabidopsis thaliana with 1% base error. COPE connected over 99% of reads with 98.8% accuracy, which is, respectively, 10 and 2% higher than the recently published tool FLASH. When COPE is applied to real reads for genome assembly, the resulting contigs are found to have fewer errors and give a 14-fold improvement in the N50 measurement when compared with the contigs produced using unconnected reads.</span></p><p>Address of the bookmark: <a href="ftp://ftp.genomics.org.cn/pub/cope" rel="nofollow">ftp://ftp.genomics.org.cn/pub/cope</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34685/tools-for-bacterial-whole-genome-annotation</guid>
	<pubDate>Sat, 16 Dec 2017 17:37:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34685/tools-for-bacterial-whole-genome-annotation</link>
	<title><![CDATA[Tools for bacterial whole genome annotation]]></title>
	<description><![CDATA[<p><a href="http://rast.nmpdr.org/">RAST</a>&nbsp;&ndash;&nbsp;Web tool (upload contigs) that uses the subsystems in the SEED database and&nbsp;provides detailed annotation and pathway analysis. It takes several hours per genome, but I think this is the best way to get a high-quality annotation (if you have only a few genomes to annotate).</p><p><a href="http://www.vicbioinformatics.com/software.prokka.shtml">Prokka</a>&nbsp;&ndash;&nbsp;Standalone command line tool that takes just a few minutes per genome.&nbsp;This is the best way to get good quality annotation in a flash, which is particularly useful if you have loads of genomes or need to annotate a pangenome or metagenome. Note however that the quality of functional information is not as good as with RAST, and you&nbsp;will need several extra steps if you want to do&nbsp;functional profiling and pathway analysis of your genome(s)&hellip; which are built into RAST.</p><p>NCBI Prokaryotic Genome Annotation Pipeline is designed to annotate bacterial and archaeal genomes (chromosomes and plasmids).</p><p>Genome annotation is a multi-level process that includes prediction of protein-coding genes, as well as other functional genome units such as structural RNAs, tRNAs, small RNAs, pseudogenes, control regions, direct and inverted repeats, insertion sequences, transposons and other mobile elements.</p><p><a href="https://www.ncbi.nlm.nih.gov/genome/annotation_prok/">PGAP</a>: NCBI has developed an automatic prokaryotic genome annotation pipeline that combines&nbsp;<em>ab initio</em>&nbsp;gene prediction algorithms with homology-based methods. 
The first version of the NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP;&nbsp;<a href="https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;dopt=Abstract&amp;list_uids=18416670">see Pubmed Article</a>), developed in 2005, has been replaced with an upgraded version that is capable of processing a larger data volume.&nbsp; NCBI's annotation pipeline depends on several internal databases and is not currently available for download or use outside of the NCBI environment.</p><p><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC453985">BEACON</a> (automated tool for Bacterial GEnome Annotation ComparisON) is a fast tool for an automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at:&nbsp;<a href="http://www.cbrc.kaust.edu.sa/BEACON/" target="pmc_ext">http://www.cbrc.kaust.edu.sa/BEACON/</a>.</p><p><a href="http://www.kegg.jp/blastkoala/">BlastKOALA</a>: Assigns K numbers to the user's sequence data by BLAST searches against a nonredundant set of KEGG GENES. KOALA (KEGG Orthology And Links Annotation) is KEGG's internal annotation tool for K number assignment of KEGG GENES using SSEARCH computation. Annotate Sequence in KEGG Mapper and Pathogen Checker in KEGG Pathogen are special interfaces to this server and can be executed in an interactive mode. BlastKOALA is suitable for annotating fully sequenced genomes.</p><p><a href="http://www.sanger.ac.uk/science/tools/pagit">PAGIT</a>: Provides a toolkit for improving the quality of genome assemblies created by assembly software. 
PAGIT combines four tools: (i) ABACAS, which classifies and orientates contigs and estimates the sizes of gaps between them; (ii) IMAGE, which uses paired-end reads to extend contigs and close gaps within the scaffolds; (iii) ICORN, for identifying and correcting small errors in consensus sequences; and (iv) RATT, to assist annotation. The software was mainly created to analyze parasite genomes of up to about 300 Mb.</p><p><a href="http://www.yandell-lab.org/software/maker.html">MAKER: </a>A portable and easily configurable genome annotation pipeline. MAKER allows smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. It identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values. MAKER's inputs are minimal and its outputs can be directly loaded into a Generic Model Organism Database (GMOD). They can also be viewed in the Apollo genome browser; this feature of MAKER provides an easy means to annotate, view and edit individual contigs and BACs without the overhead of a database. MAKER is available for download and can be tested online via the MAKER Web Annotation Service (MWAS).</p><p><a href="https://www.sciencedirect.com/science/article/pii/S0167701215001207">MyPro</a> is a software pipeline for high-quality prokaryotic genome assembly and annotation. It was validated on 18 oral streptococcal strains to produce submission-ready, annotated draft genomes. MyPro, installed as a virtual machine and supported by updated databases, will enable biologists to perform quality prokaryotic genome assembly and annotation with ease.</p>]]></description>
	<dc:creator>Radha Agarkar</dc:creator>
</item>

</channel>
</rss>