<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/37414?offset=40</link>
	<atom:link href="https://bioinformaticsonline.com/related/37414?offset=40" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</guid>
	<pubDate>Fri, 10 Dec 2021 06:22:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43634/illumina-based-assembly-pipeline-steps</link>
	<title><![CDATA[Illumina based assembly pipeline steps !]]></title>
	<description><![CDATA[<h3 id="illumina">Illumina<a href="https://nf-co.re/viralrecon#illumina"><span></span></a></h3><ol>
<li>Merge re-sequenced FastQ files (<a href="http://www.linfo.org/cat.html"><code>cat</code></a>)</li>
<li>Read QC (<a href="https://www.bioinformatics.babraham.ac.uk/projects/fastqc/"><code>FastQC</code></a>)</li>
<li>Adapter trimming (<a href="https://github.com/OpenGene/fastp"><code>fastp</code></a>)</li>
<li>Removal of host reads (<a href="http://ccb.jhu.edu/software/kraken2/"><code>Kraken 2</code></a>; <em>optional</em>)</li>
<li>Variant calling<ol>
<li>Read alignment (<a href="http://bowtie-bio.sourceforge.net/bowtie2/index.shtml"><code>Bowtie 2</code></a>)</li>
<li>Sort and index alignments (<a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Primer sequence removal (<a href="https://github.com/andersen-lab/ivar"><code>iVar</code></a>; <em>amplicon data only</em>)</li>
<li>Duplicate read marking (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>; <em>optional</em>)</li>
<li>Alignment-level QC (<a href="https://broadinstitute.github.io/picard/"><code>picard</code></a>, <a href="https://sourceforge.net/projects/samtools/files/samtools/"><code>SAMtools</code></a>)</li>
<li>Genome-wide and amplicon coverage QC plots (<a href="https://github.com/brentp/mosdepth/"><code>mosdepth</code></a>)</li>
<li>Choice of multiple variant calling and consensus sequence generation routes (<a href="https://github.com/andersen-lab/ivar"><code>iVar variants and consensus</code></a>; <em>default for amplicon data</em> <em>||</em> <a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>, <a href="https://github.com/arq5x/bedtools2/"><code>BEDTools</code></a>; <em>default for metagenomics data</em>)
<ul>
<li>Variant annotation (<a href="http://snpeff.sourceforge.net/SnpEff.html"><code>SnpEff</code></a>, <a href="http://snpeff.sourceforge.net/SnpSift.html"><code>SnpSift</code></a>)</li>
<li>Consensus assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
<li>Lineage analysis (<a href="https://github.com/cov-lineages/pangolin"><code>Pangolin</code></a>)</li>
<li>Clade assignment, mutation calling and sequence quality checks (<a href="https://github.com/nextstrain/nextclade"><code>Nextclade</code></a>)</li>
<li>Individual variant screenshots with annotation tracks (<a href="https://asciigenome.readthedocs.io/en/latest/"><code>ASCIIGenome</code></a>)</li>
</ul>
</li>
<li>Intersect variants across callers (<a href="http://samtools.github.io/bcftools/bcftools.html"><code>BCFTools</code></a>)</li>
</ol></li>
<li><em>De novo</em> assembly<ol>
<li>Primer trimming (<a href="https://cutadapt.readthedocs.io/en/stable/guide.html"><code>Cutadapt</code></a>; <em>amplicon data only</em>)</li>
<li>Choice of multiple assembly tools (<a href="http://cab.spbu.ru/software/spades/"><code>SPAdes</code></a> <em>||</em> <a href="https://github.com/rrwick/Unicycler"><code>Unicycler</code></a> <em>||</em> <a href="https://github.com/GATB/minia"><code>minia</code></a>)
<ul>
<li>Blast to reference genome (<a href="https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch"><code>blastn</code></a>)</li>
<li>Contiguate assembly (<a href="https://www.sanger.ac.uk/science/tools/pagit"><code>ABACAS</code></a>)</li>
<li>Assembly report (<a href="https://github.com/BU-ISCIII/plasmidID"><code>PlasmidID</code></a>)</li>
<li>Assembly assessment report (<a href="http://quast.sourceforge.net/quast"><code>QUAST</code></a>)</li>
</ul>
</li>
</ol></li>
<li>Present QC and visualisation for raw read, alignment, assembly and variant calling results (<a href="http://multiqc.info/"><code>MultiQC</code></a>)</li>
</ol>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44561/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</guid>
	<pubDate>Sat, 08 Jun 2024 16:25:08 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44561/bactopia-a-flexible-pipeline-for-complete-analysis-of-bacterial-genomes</link>
	<title><![CDATA[Bactopia: a flexible pipeline for complete analysis of bacterial genomes]]></title>
	<description><![CDATA[<p>Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!</p>
<p>Bactopia was inspired by&nbsp;<a href="https://staphopia.github.io/">Staphopia</a>, a workflow we (Tim Read and myself) released that is targeted towards&nbsp;<em>Staphylococcus aureus</em>&nbsp;genomes. Using what we learned from Staphopia and user feedback, Bactopia was developed from scratch with usability, portability, and speed in mind from the start.</p>
<p>Bactopia uses&nbsp;<a href="https://www.nextflow.io/">Nextflow</a>&nbsp;to manage the workflow, allowing for support of many types of environments (e.g. cluster or cloud). Bactopia allows for the usage of many public datasets as well as your own datasets to further enhance the analysis of your sequencing. Bactopia only uses software packages available from&nbsp;<a href="https://bioconda.github.io/">Bioconda</a>&nbsp;and&nbsp;<a href="https://conda-forge.org/">Conda-Forge</a>&nbsp;to make installation as simple as possible for&nbsp;<em>all</em>&nbsp;users.</p>
<p>To highlight the use of&nbsp;<a href="https://bactopia.github.io/latest/full-guide/">Bactopia</a>&nbsp;and&nbsp;<a href="https://bactopia.github.io/latest/bactopia-tools/">Bactopia Tools</a>, we performed an analysis of 1,664 public&nbsp;<em>Lactobacillus</em>&nbsp;genomes, focusing on&nbsp;<em>Lactobacillus crispatus</em>, a species that is a common part of the human vaginal microbiome. The results from this analysis are published in mSystems under the title:&nbsp;<em><a href="https://doi.org/10.1128/mSystems.00190-20">Bactopia: a flexible pipeline for complete analysis of bacterial genomes</a></em></p>
<p><a href="https://bactopia.github.io/latest/assets/bactopia-workflow.png"><img src="https://bactopia.github.io/latest/assets/bactopia-workflow.png" alt="Bactopia Workflow" style="border: 0px;"></a></p><p>Address of the bookmark: <a href="https://bactopia.github.io/latest/" rel="nofollow">https://bactopia.github.io/latest/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/11609/bioinformatician%E2%80%99s-pocket-reference</guid>
	<pubDate>Sun, 08 Jun 2014 09:56:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/11609/bioinformatician%E2%80%99s-pocket-reference</link>
	<title><![CDATA[Bioinformatician’s Pocket Reference !!]]></title>
	<description><![CDATA[<p><span>It is amusing how the brains of bioinformaticians work! Spending days learning a new programming language feels like more fun than a five-minute chat with the neighbours (unless under special circumstances!) in our own mother tongue. Today every bioinformatician keeps several languages and core IT toolkits on their plate. Being able to adapt different code snippets into custom workflows has become mandatory, so keeping syntax at our fingertips is essential. Although Google is the best way to solve a syntax problem, it is not a bad idea to keep reference sheets on our smartphones or to stick printed sheets on the back of the door, the old-fashioned way!</span></p><p>Address of the bookmark: <a href="http://infoplatter.wordpress.com/2014/04/06/bioinformaticians-pocket-reference/" rel="nofollow">http://infoplatter.wordpress.com/2014/04/06/bioinformaticians-pocket-reference/</a></p>]]></description>
	<dc:creator>RAJESH DETROJA</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/30867/perl-special-vars-quick-reference</guid>
	<pubDate>Tue, 07 Feb 2017 05:08:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/30867/perl-special-vars-quick-reference</link>
	<title><![CDATA[Perl Special Vars Quick Reference]]></title>
	<description><![CDATA[<table>
<tbody>
<tr>
<td><tt>$_</tt></td>
<td>The default or implicit variable.</td>
</tr>
<tr>
<td><tt>@_</tt></td>
<td>Subroutine parameters.</td>
</tr>
<tr>
<td><tt>$a</tt><br /><tt>$b</tt></td>
<td><a href="http://perldoc.perl.org/functions/sort.html">sort</a>&nbsp;comparison routine variables.</td>
</tr>
<tr>
<td><tt>@ARGV</tt></td>
<td>The command-line args.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Regular Expressions</span></td>
</tr>
<tr>
<td><tt>$&lt;digit&gt;</tt></td>
<td>Regexp parenthetical capture holders.</td>
</tr>
<tr>
<td><tt>$&amp;</tt></td>
<td>Last successful match (degrades performance).</td>
</tr>
<tr>
<td><tt>${^MATCH}</tt></td>
<td>Similar to&nbsp;<tt>$&amp;</tt>&nbsp;without performance penalty. Requires&nbsp;<tt>/p</tt>&nbsp;modifier.</td>
</tr>
<tr>
<td><tt>$`</tt></td>
<td>Prematch for last successful match string (degrades performance).</td>
</tr>
<tr>
<td><tt>${^PREMATCH}</tt></td>
<td>Similar to&nbsp;<tt>$`</tt>&nbsp;without performance penalty. Requires&nbsp;<tt>/p</tt>&nbsp;modifier.</td>
</tr>
<tr>
<td><tt>$'</tt></td>
<td>Postmatch for last successful match string (degrades performance).</td>
</tr>
<tr>
<td><tt>${^POSTMATCH}</tt></td>
<td>Similar to&nbsp;<tt>$'</tt>&nbsp;without performance penalty. Requires&nbsp;<tt>/p</tt>&nbsp;modifier.</td>
</tr>
<tr>
<td><tt>$+</tt></td>
<td>Last paren match.</td>
</tr>
<tr>
<td><tt>$^N</tt></td>
<td>Last closed paren match (last submatch).</td>
</tr>
<tr>
<td><tt>@+</tt></td>
<td>Offsets of ends of successful submatches in scope.</td>
</tr>
<tr>
<td><tt>@-</tt></td>
<td>Offsets of starts of successful submatches in scope.</td>
</tr>
<tr>
<td><tt>%+</tt></td>
<td>Like&nbsp;<tt>@+</tt>, but for named submatches.</td>
</tr>
<tr>
<td><tt>%-</tt></td>
<td>Like&nbsp;<tt>@-</tt>, but for named submatches.</td>
</tr>
<tr>
<td><tt>$^R</tt></td>
<td>Last regexp (?{code}) result.</td>
</tr>
<tr>
<td><tt>${^RE_DEBUG_FLAGS}</tt></td>
<td>Current value of regexp debugging flags. See&nbsp;<tt>use re 'debug';</tt></td>
</tr>
<tr>
<td><tt>${^RE_TRIE_MAXBUF}</tt></td>
<td>Control memory allocations for RE optimizations for large alternations.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Encoding</span></td>
</tr>
<tr>
<td><tt>${^ENCODING}</tt></td>
<td>The object reference to the Encode object, used to convert the source code to Unicode.</td>
</tr>
<tr>
<td><tt>${^OPEN}</tt></td>
<td>Internal use: \0 separated Input / Output layer information.</td>
</tr>
<tr>
<td><tt>${^UNICODE}</tt></td>
<td>Read-only Unicode settings.</td>
</tr>
<tr>
<td><tt>${^UTF8CACHE}</tt></td>
<td>State of the internal UTF-8 offset caching code.</td>
</tr>
<tr>
<td><tt>${^UTF8LOCALE}</tt></td>
<td>Indicates whether UTF8 locale was detected at startup.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">IO and Separators</span></td>
</tr>
<tr>
<td><tt>$.</tt></td>
<td>Current line number (or record number) of most recent filehandle.</td>
</tr>
<tr>
<td><tt>$/</tt></td>
<td>Input record separator.</td>
</tr>
<tr>
<td><tt>$|</tt></td>
<td>Output autoflush. 1=autoflush, 0=default. Applies to currently selected handle.</td>
</tr>
<tr>
<td><tt>$,</tt></td>
<td>Output field separator (lists)</td>
</tr>
<tr>
<td><tt>$\</tt></td>
<td>Output record separator.</td>
</tr>
<tr>
<td><tt>$"</tt></td>
<td>Output list separator. (interpolated lists)</td>
</tr>
<tr>
<td><tt>$;</tt></td>
<td>Subscript separator. (Use a real multidimensional array instead.)</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Formats</span></td>
</tr>
<tr>
<td><tt>$%</tt></td>
<td>Page number for currently selected output channel.</td>
</tr>
<tr>
<td><tt>$=</tt></td>
<td>Current page length.</td>
</tr>
<tr>
<td><tt>$-</tt></td>
<td>Number of lines left on page.</td>
</tr>
<tr>
<td><tt>$~</tt></td>
<td>Format name.</td>
</tr>
<tr>
<td><tt>$^</tt></td>
<td>Name of top-of-page format.</td>
</tr>
<tr>
<td><tt>$:</tt></td>
<td>Format line break characters</td>
</tr>
<tr>
<td><tt>$^L</tt></td>
<td>Form feed (default "\f").</td>
</tr>
<tr>
<td><tt>$^A</tt></td>
<td>Format Accumulator</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Status Reporting</span></td>
</tr>
<tr>
<td><tt>$?</tt></td>
<td>Child error. Status returned by the most recent pipe close, backtick command, <tt>wait()</tt> or <tt>waitpid()</tt> call, or <tt>system()</tt> operator.</td>
</tr>
<tr>
<td><tt>$!</tt></td>
<td>Operating System Error. (What just went 'bang'?)</td>
</tr>
<tr>
<td><tt>%!</tt></td>
<td>Error number hash</td>
</tr>
<tr>
<td><tt>$^E</tt></td>
<td>Extended Operating System Error (Extra error explanation).</td>
</tr>
<tr>
<td><tt>$@</tt></td>
<td>Eval error.</td>
</tr>
<tr>
<td><tt>${^CHILD_ERROR_NATIVE}</tt></td>
<td>Native status returned by the last pipe close, backtick (`` ) command, successful call to wait() or waitpid(), or from the system() operator.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">IDs and Process Information</span></td>
</tr>
<tr>
<td><tt>$$</tt></td>
<td>Process ID</td>
</tr>
<tr>
<td><tt>$&lt;</tt></td>
<td>Real user id of process.</td>
</tr>
<tr>
<td><tt>$&gt;</tt></td>
<td>Effective user id of process.</td>
</tr>
<tr>
<td><tt>$(</tt></td>
<td>Real group id of process.</td>
</tr>
<tr>
<td><tt>$)</tt></td>
<td>Effective group id of process.</td>
</tr>
<tr>
<td><tt>$0</tt></td>
<td>Program name.</td>
</tr>
<tr>
<td><tt>$^O</tt></td>
<td>Operating System name.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Perl Status Info</span></td>
</tr>
<tr>
<td><tt>$]</tt></td>
<td>Version and patch number of the perl interpreter, as a decimal number. See also&nbsp;<tt>$^V</tt>.</td>
</tr>
<tr>
<td><tt>$^C</tt></td>
<td>Current value of flag associated with&nbsp;<strong>-c</strong>&nbsp;switch.</td>
</tr>
<tr>
<td><tt>$^D</tt></td>
<td>Current value of debugging flags</td>
</tr>
<tr>
<td><tt>$^F</tt></td>
<td>Maximum system file descriptor.</td>
</tr>
<tr>
<td><tt>$^I</tt></td>
<td>Value of the&nbsp;<strong>-i</strong>&nbsp;(inplace edit) switch.</td>
</tr>
<tr>
<td><tt>$^M</tt></td>
<td>Emergency Memory pool.</td>
</tr>
<tr>
<td><tt>$^P</tt></td>
<td>Internal variable for debugging support.</td>
</tr>
<tr>
<td><tt>$^S</tt></td>
<td>Exceptions being caught. (eval)</td>
</tr>
<tr>
<td><tt>$^T</tt></td>
<td>Base time of program start.</td>
</tr>
<tr>
<td><tt>$^V</tt></td>
<td>Perl version.</td>
</tr>
<tr>
<td><tt>$^W</tt></td>
<td>Status of -w switch</td>
</tr>
<tr>
<td><tt>${^WARNING_BITS}</tt></td>
<td>Current set of warning checks enabled by&nbsp;<tt>use warnings;</tt></td>
</tr>
<tr>
<td><tt>$^X</tt></td>
<td>Perl executable name.</td>
</tr>
<tr>
<td><tt>${^GLOBAL_PHASE}</tt></td>
<td>Current phase of the Perl interpreter.</td>
</tr>
<tr>
<td><tt>$^H</tt></td>
<td>Internal use only: Hook into Lexical Scoping.</td>
</tr>
<tr>
<td><tt>%^H</tt></td>
<td>Internal use only: Useful to implement scoped pragmas.</td>
</tr>
<tr>
<td><tt>${^TAINT}</tt></td>
<td>Taint mode read-only flag.</td>
</tr>
<tr>
<td><tt>${^WIN32_SLOPPY_STAT}</tt></td>
<td>If true on Windows&nbsp;<tt>stat()</tt>&nbsp;won't try to open the file.</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Command Line Args</span></td>
</tr>
<tr>
<td><tt>ARGV</tt></td>
<td>Filehandle iterates over files from command line (see also&nbsp;<tt>&lt;&gt;</tt>).</td>
</tr>
<tr>
<td><tt>$ARGV</tt></td>
<td>Name of current file when reading &lt;&gt;</td>
</tr>
<tr>
<td><tt>@ARGV</tt></td>
<td>List of command line args.</td>
</tr>
<tr>
<td><tt>ARGVOUT</tt></td>
<td>Output filehandle for -i switch</td>
</tr>
<tr>
<td colspan="2" align="center"><span style="font-size: xx-small;">Miscellaneous</span></td>
</tr>
<tr>
<td><tt>@F</tt></td>
<td>Autosplit (-a mode) recipient.</td>
</tr>
<tr>
<td><tt>@INC</tt></td>
<td>List of library paths.</td>
</tr>
<tr>
<td><tt>%INC</tt></td>
<td>Keys are filenames, values are paths to modules included via&nbsp;<tt>use, require,&nbsp;</tt>or&nbsp;<tt>do</tt>.</td>
</tr>
<tr>
<td><tt>%ENV</tt></td>
<td>Hash containing current environment variables</td>
</tr>
<tr>
<td><tt>%SIG</tt></td>
<td>Signal handlers.</td>
</tr>
<tr>
<td><tt>$[</tt></td>
<td>Array and substr first element (Deprecated!).</td>
</tr>
</tbody>
</table><p>&nbsp;</p><p>See&nbsp;<a href="http://perldoc.perl.org/perlvar.html">perlvar</a>&nbsp;for detailed descriptions of each of these (and a few more) special variables.</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/34569/ksnp30-snp-detection-and-phylogenetic-analysis-of-genomes-without-genome-alignment-or-reference-genome</guid>
	<pubDate>Fri, 08 Dec 2017 16:48:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/34569/ksnp30-snp-detection-and-phylogenetic-analysis-of-genomes-without-genome-alignment-or-reference-genome</link>
	<title><![CDATA[kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome]]></title>
	<description><![CDATA[<p><span>Sept. 20, 2017: Version 3.1 released. Major upgrade. Version 3.1 fixes the problems with SNP annotation that arose when NCBI discontinued use of GI numbers. Please read carefully the Preface (page 3) and the File of annotated genomes section (pages 9-10) in the version 3.1 User Guide. Thanks to Tom Slezak for revising the get_genbank_file3 script and to Tod Stuber (USDA) for testing version 3.1 even though he doesn't need the annotation feature. All users are encouraged to upgrade to version 3.1.</span></p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/ksnp/files/" rel="nofollow">https://sourceforge.net/projects/ksnp/files/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36935/assemblytics-delta-file-to-analyze-alignments-of-an-assembly-to-another-assembly-or-a-reference-genome</guid>
	<pubDate>Thu, 14 Jun 2018 07:31:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36935/assemblytics-delta-file-to-analyze-alignments-of-an-assembly-to-another-assembly-or-a-reference-genome</link>
	<title><![CDATA[assemblytics: delta file to analyze alignments of an assembly to another assembly or a reference genome]]></title>
	<description><![CDATA[Download and install MUMmer
Align your assembly to a reference genome using nucmer (from MUMmer package)
$ nucmer -maxmatch -l 100 -c 500 REFERENCE.fa ASSEMBLY.fa -prefix OUT
Consult the MUMmer manual if you encounter problems

Optional: Gzip the delta file to speed up upload (usually 2-4X faster)
$ gzip OUT.delta
Then use the OUT.delta.gz file for upload.
Upload the .delta or .delta.gz file to Assemblytics.
Important: Use only contigs rather than scaffolds from the assembly. This will prevent false positives when the number of Ns in the scaffolded sequence does not match perfectly to the distance in the reference.
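
If only a scaffolded FASTA is available, it can be broken back into contigs before running nucmer. The following is a minimal Python sketch, not part of Assemblytics or MUMmer. It assumes that scaffold gaps are encoded as runs of N, and the function names, the "_contigN" naming scheme and the min_gap threshold of 10 are illustrative choices only.

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = (line[1:].split() or ["unnamed"])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_scaffold(name, seq, min_gap=10):
    """Split one scaffold into contigs at runs of at least min_gap N characters."""
    gap = re.compile("[Nn]{%d,}" % min_gap)
    pieces = [p for p in gap.split(seq) if p]
    return [("%s_contig%d" % (name, i), p) for i, p in enumerate(pieces, start=1)]


def scaffolds_to_contigs(lines, min_gap=10):
    """Return a list of (name, sequence) contigs for every scaffold in the input."""
    contigs = []
    for name, seq in read_fasta(lines):
        contigs.extend(split_scaffold(name, seq, min_gap))
    return contigs
```

The resulting contigs can be written back out as FASTA and used as ASSEMBLY.fa in the nucmer command above. Shorter runs of N are left in place, since they may be ambiguous bases rather than scaffold gaps.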

The "unique sequence length required" parameter sets the anchor used to decide whether an alignment is unique enough to call variants from safely; it serves as an alternative to the mapping quality filter used for read alignments.

<p>Address of the bookmark: <a href="http://assemblytics.com/" rel="nofollow">http://assemblytics.com/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32943/npscarf-scaffolding-and-completing-assemblies-in-real-time-fashion</guid>
	<pubDate>Tue, 23 May 2017 04:53:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32943/npscarf-scaffolding-and-completing-assemblies-in-real-time-fashion</link>
	<title><![CDATA[npScarf: Scaffolding and Completing Assemblies in Real-time Fashion]]></title>
	<description><![CDATA[<p><em>npScarf</em>&nbsp;(jsa.np.npscarf) is a program that scaffolds and completes draft genome assemblies in real time with Oxford Nanopore sequencing. The pipeline can run on a computing cluster as well as on a laptop computer for microbial datasets. It also facilitates the real-time analysis of positional information such as gene ordering and the detection of genes from mobile elements (plasmids and genomic islands).</p>
<p>Complete paper at&nbsp;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5321748/</p><p>Address of the bookmark: <a href="https://github.com/mdcao/npScarf" rel="nofollow">https://github.com/mdcao/npScarf</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43398/waafle-a-workflow-to-annotate-assemblies-and-find-lgt-events</guid>
	<pubDate>Thu, 23 Sep 2021 14:31:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43398/waafle-a-workflow-to-annotate-assemblies-and-find-lgt-events</link>
	<title><![CDATA[WAAFLE: a Workflow to Annotate Assemblies and Find LGT Events.]]></title>
	<description><![CDATA[<p><span>Lateral gene transfer (LGT) is an important mechanism for genome diversification in microbial communities, including the human microbiome. While methods exist to identify LGTs from sequenced isolate genomes, identifying LGTs from community metagenomes remains an open problem. To address this, we developed WAAFLE: a Workflow to Annotate Assemblies and Find LGT Events.</span></p><p>Address of the bookmark: <a href="http://huttenhower.sph.harvard.edu/waafle" rel="nofollow">http://huttenhower.sph.harvard.edu/waafle</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36518/mix-combining-multiple-assemblies-from-ngs-data</guid>
	<pubDate>Tue, 08 May 2018 04:58:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36518/mix-combining-multiple-assemblies-from-ngs-data</link>
	<title><![CDATA[MIX: Combining multiple assemblies from NGS data]]></title>
	<description><![CDATA[<p>Mix is a tool that combines two or more draft assemblies without relying on a reference genome, with the goal of reducing contig fragmentation and thus speeding up genome finishing. The proposed algorithm builds an extension graph where vertices represent extremities of contigs and edges represent existing alignments between these extremities. These alignment edges are used for contig extension. The resulting output assembly corresponds to a path in the extension graph that maximizes the cumulative contig length.</p>
<p>The Mix algorithm, approach and results were published in BMC Bioinformatics:&nbsp;<a href="http://www.biomedcentral.com/1471-2105/14/S15/S16">http://www.biomedcentral.com/1471-2105/14/S15/S16</a>.</p><p>Address of the bookmark: <a href="https://github.com/cbib/MIX" rel="nofollow">https://github.com/cbib/MIX</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37239/kat-a-k-mer-analysis-toolkit-to-quality-control-ngs-datasets-and-genome-assemblies</guid>
	<pubDate>Fri, 06 Jul 2018 03:36:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37239/kat-a-k-mer-analysis-toolkit-to-quality-control-ngs-datasets-and-genome-assemblies</link>
	<title><![CDATA[KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies]]></title>
	<description><![CDATA[<p>KAT is a suite of tools that analyse jellyfish hashes or sequence files (fasta or fastq) using kmer counts. The following tools are currently available in KAT:</p>
<ul>
<li><span>hist</span>: Creates a histogram of k-mer occurrences from a sequence file. Adds metadata to the output for easy plotting.</li>
<li><span>gcp</span>: K-mer GC Processor. Creates a matrix of the number of K-mers found given a GC count and a K-mer count.</li>
<li><span>comp</span>: K-mer comparison tool. Creates a matrix of shared K-mers between two (or three) sequence files or hashes.</li>
<li><span>sect</span>: SEquence Coverage estimator Tool. Estimates the coverage of each sequence in a file using K-mers from another sequence file.</li>
<li><span>blob</span>: Given reads and an assembly, calculates both the read and assembly K-mer coverage along with GC% for each sequence in the assembly.</li>
<li><span>filter</span>: Filtering tools. Contains tools for filtering k-mer hashes and FastQ/A files:
<ul>
<li><span>kmer</span>: Produces a k-mer hash containing only k-mers within specified coverage and GC tolerances.</li>
<li><span>seq</span>: Filters a sequence file based on whether or not the sequences contain k-mers within a provided hash.</li>
</ul>
</li>
<li><span>plot</span>: Plotting tools. Contains several plotting tools to visualise and compare K-mer distributions. The following plot tools are available:
<ul>
<li><span>density</span>: Creates a density plot from a matrix created with the "comp" tool. Typically this is used to compare two K-mer hashes produced by different NGS reads.</li>
<li><span>profile</span>: Creates a K-mer coverage plot for a single sequence. Takes the FASTA coverage output from the "sect" tool.</li>
<li><span>spectra-cn</span>: Creates a stacked histogram using a matrix created with the "comp" tool. Typically this is used to compare a jellyfish hash produced from a read set to a jellyfish hash produced from an assembly. The plot shows the number of distinct K-mers absent, as well as the copy number variation present within the assembly.</li>
<li><span>spectra-hist</span>: Creates a K-mer spectra plot for a set of K-mer histograms produced either by jellyfish-histo or kat-histo.</li>
<li><span>spectra-mx</span>: Creates a K-mer spectra plot for a set of K-mer histograms that are derived from selected rows or columns in a matrix produced by the "comp" tool.</li>
</ul>
</li>
</ul>
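<p>To make the idea behind the <span>hist</span> tool concrete, here is a minimal Python sketch of a k-mer occurrence histogram. It is not KAT's implementation: it counts k-mers in memory on the forward strand only, skips k-mers containing N, and the function names and the default k are illustrative assumptions.</p>

```python
from collections import Counter


def count_kmers(sequences, k=5):
    """Count every k-mer (forward strand only) across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def kmer_histogram(sequences, k=5):
    """Map each occurrence count to the number of distinct k-mers seen that many times."""
    occurrences = Counter(count_kmers(sequences, k).values())
    return dict(sorted(occurrences.items()))
```

<p>For example, <span>kmer_histogram(["ACGTAC"], k=2)</span> gives {1: 3, 2: 1}: three 2-mers occur once and one (AC) occurs twice. A real dataset would need a disk-backed or probabilistic counter rather than an in-memory dictionary.</p>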
<p>In addition, KAT contains a python script for analysing the mathematical distributions present in the K-mer spectra in order to determine how much content is present in each peak.</p>
<p>This README only contains some brief details of how to install and use KAT. For more extensive documentation please visit:&nbsp;<a href="https://kat.readthedocs.org/en/latest/">https://kat.readthedocs.org/en/latest/</a></p>
<p><a href="https://academic.oup.com/bioinformatics/article/33/4/574/2664339">https://academic.oup.com/bioinformatics/article/33/4/574/2664339</a></p><p>Address of the bookmark: <a href="https://github.com/TGAC/KAT" rel="nofollow">https://github.com/TGAC/KAT</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>