<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36533?offset=180</link>
	<atom:link href="https://bioinformaticsonline.com/related/36533?offset=180" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</guid>
	<pubDate>Fri, 08 Jun 2018 09:31:55 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36880/jvarkit-java-utilities-for-bioinformatics</link>
	<title><![CDATA[Jvarkit : Java utilities for Bioinformatics]]></title>
	<description><![CDATA[A collection of Java toolkits for bioinformatics:

Jvarkit : Java utilities for Bioinformatics<p>Address of the bookmark: <a href="http://lindenb.github.io/jvarkit/" rel="nofollow">http://lindenb.github.io/jvarkit/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</guid>
	<pubDate>Wed, 22 Aug 2018 10:40:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</link>
	<title><![CDATA[SimLoRD: A read simulator for third generation sequencing reads]]></title>
	<description><![CDATA[<p>SimLoRD is a read simulator for third generation sequencing reads and is currently focused on the Pacific Biosciences SMRT error model.</p>
<p>Reads are simulated from both strands of a provided or randomly generated reference sequence.</p>
<div id="rst-header-features">
<ul>
<li>The reference can be read from a FASTA file or randomly generated with a given GC content. It can consist of several chromosomes, whose structure is respected when drawing reads. (Simulation of genome rearrangements may be incorporated at a later stage.)</li>
<li>The read lengths can be determined in four ways: drawing from a log-normal distribution (typical for genomic DNA), sampling from an existing FASTQ file (typical for RNA), sampling from a text file with integers (RNA), or using a fixed length.</li>
<li>Quality values and number of passes depend on fragment length.</li>
<li>Provided subread error probabilities are modified according to the number of passes.</li>
<li>Outputs reads in FASTQ format and alignments in SAM format.</li>
</ul>
</div><p>Address of the bookmark: <a href="https://bitbucket.org/genomeinformatics/simlord/" rel="nofollow">https://bitbucket.org/genomeinformatics/simlord/</a></p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43254/quasr-quantification-and-annotation-of-short-reads-in-r</guid>
	<pubDate>Fri, 13 Aug 2021 07:44:05 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43254/quasr-quantification-and-annotation-of-short-reads-in-r</link>
	<title><![CDATA[QuasR: Quantification and annotation of short reads in R]]></title>
	<description><![CDATA[<p>The <em><a href="https://bioconductor.org/packages/3.14/QuasR">QuasR</a></em> package (short for <em>Qu</em>antify and <em>a</em>nnotate <em>s</em>hort reads in <em>R</em>) integrates the functionality of several <strong>R</strong> packages (such as <em><a href="https://bioconductor.org/packages/3.14/IRanges">IRanges</a></em> <span>(Lawrence et al. 2013)</span> and <em><a href="https://bioconductor.org/packages/3.14/Rsamtools">Rsamtools</a></em>) and external software (e.g.&nbsp;<code>bowtie</code>, through the <em><a href="https://bioconductor.org/packages/3.14/Rbowtie">Rbowtie</a></em> package, and <code>HISAT2</code>, through the <em><a href="https://bioconductor.org/packages/3.14/Rhisat2">Rhisat2</a></em> package). The package aims to cover the whole analysis workflow of typical high-throughput sequencing experiments, from the raw sequence reads through pre-processing and alignment to quantification. A single <strong>R</strong> script can contain all steps of a complete analysis, making it simple to document, reproduce or share the workflow containing all relevant details.</p><p>Address of the bookmark: <a href="https://www.bioconductor.org/packages/devel/bioc/vignettes/QuasR/inst/doc/QuasR.html" rel="nofollow">https://www.bioconductor.org/packages/devel/bioc/vignettes/QuasR/inst/doc/QuasR.html</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27841/covcal-coverage-read-count-calculator</guid>
	<pubDate>Wed, 15 Jun 2016 18:08:13 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27841/covcal-coverage-read-count-calculator</link>
	<title><![CDATA[CovCal: Coverage / Read Count Calculator]]></title>
	<description><![CDATA[<h2>Coverage / Read Count Calculator</h2>
<h4>Calculate how much sequencing you need to hit a target depth of coverage (or vice versa).</h4>
<p><span>Instructions:</span> set the read length/configuration and genome size, then select what you want to calculate.</p>
<p>Written by <a href="http://stephenturner.us/" target="blank">Stephen Turner</a>, based on the <a href="http://www.ncbi.nlm.nih.gov/pubmed/3294162" target="_blank">Lander-Waterman formula</a>, inspired by <a href="http://core-genomics.blogspot.com/2016/05/how-many-reads-to-sequence-genome.html" target="_blank">a similar calculator</a> written by James Hadfield. Coverage is calculated as <em>C=LN/G</em> and reads as <em>N=CG/L</em> where <em>C</em> = Coverage (X), <em>L</em> = Read length (bp), <em>G</em> = Haploid genome size (bp), and <em>N</em> = Number of reads. Source code <a href="https://github.com/stephenturner/covcalc" target="_blank">on GitHub</a>.</p><p>Address of the bookmark: <a href="http://apps.bioconnector.virginia.edu/covcalc/" rel="nofollow">http://apps.bioconnector.virginia.edu/covcalc/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</guid>
	<pubDate>Fri, 05 Jan 2018 03:58:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</link>
	<title><![CDATA[Jabba: Hybrid Error Correction for Long Sequencing Reads]]></title>
	<description><![CDATA[<p>Jabba is a hybrid error correction tool to correct third generation (PacBio / ONT) sequencing data, using second generation (Illumina) data.</p>
<p>Input</p>
<p>Jabba takes as input a concatenated de Bruijn graph and a set of sequences:</p>
<p>The de Bruijn graph should be in FASTA format with one entry per node, and the meta information should be in the format:<br>&gt;NODE <br>The set of sequences should be in FASTA or FASTQ format. These sequences will be corrected (e.g. PacBio reads). The corrections will be written to a file Jabba fasta.<br>The output is a file in FASTA format with corrections of the long reads, and additionally a file in the input format containing uncorrected reads.</p>
<p>https://github.com/biointec/jabba/wiki</p>
<p>https://almob.biomedcentral.com/articles/10.1186/s13015-016-0075-7</p><p>Address of the bookmark: <a href="https://github.com/biointec/jabba" rel="nofollow">https://github.com/biointec/jabba</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</guid>
	<pubDate>Tue, 07 Nov 2017 05:33:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</link>
	<title><![CDATA[Alignment-free sequence comparison tools available for next-generation sequencing data analysis]]></title>
	<description><![CDATA[<div><p><span>kallisto</span></p></div><div><p>Transcript abundance quantification from RNA-seq data (uses pseudoalignment for rapid determination of read compatibility with targets)</p><p>Software (C++)</p><p><a href="https://pachterlab.github.io/kallisto/">https://pachterlab.github.io/kallisto/</a></p><p>Sailfish</p><p>Estimation of isoform abundances from reference sequences and RNA-seq data (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="http://www.cs.cmu.edu/~ckingsf/software/sailfish/">http://www.cs.cmu.edu/~ckingsf/software/sailfish/</a></p><p>Salmon</p><p>Quantification of the expression of transcripts using RNA-seq data (uses&nbsp;<em>k</em>-mers)</p><p><a href="https://combine-lab.github.io/salmon/">https://combine-lab.github.io/salmon/</a></p><p>RNA-Skim</p><p>RNA-seq quantification at transcript-level (partitions the transcriptome into disjoint transcript clusters; uses&nbsp;<em>sig</em>-mers, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C++)</p><p><a href="http://www.csbio.unc.edu/rs/">http://www.csbio.unc.edu/rs/</a></p><p>Variant calling</p><p>ChimeRScope</p><p>Fusion transcript prediction using gene&nbsp;<em>k</em>-mers profiles of the RNA-seq paired-end reads</p><p>Software (Java)</p><p><a href="https://github.com/ChimeRScope/ChimeRScope/wiki">https://github.com/ChimeRScope/ChimeRScope/wiki</a></p><p>FastGT</p><p>Genotyping of known SNV/SNP variants directly from raw NGS sequence reads by counting unique&nbsp;<em>k</em>-mers</p><p>Software (C)</p><p><a href="https://github.com/bioinfo-ut/GenomeTester4/">https://github.com/bioinfo-ut/GenomeTester4/</a></p><p>Phy-Mer</p><p>Reference-independent mitochondrial haplogroup classifier from NGS data (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/danielnavarrogomez/phy-mer">https://github.com/danielnavarrogomez/phy-mer</a></p><p>LAVA</p><p>Genotyping of known SNPs (dbSNP and Affymetrix's Genome-Wide Human SNP Array) from raw NGS 
reads (<em>k</em>-mer based)</p><p>Software (C)</p><p><a href="http://lava.csail.mit.edu/">http://lava.csail.mit.edu/</a></p><p>MICADo</p><p>Detection of mutations in targeted third-generation NGS data (can distinguish patient-specific mutations; algorithm uses&nbsp;<em>k</em>-mers and is based on colored de Bruijn graphs)</p><p>Software (Python)</p><p><a href="http://github.com/cbib/MICADo">http://github.com/cbib/MICADo</a></p><p>General mapper</p><p>Minimap</p><p>Lightweight and fast read mapper and read overlap detector (uses the concept of &ldquo;minimizers&rdquo;, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C)</p><p><a href="https://github.com/lh3/minimap">https://github.com/lh3/minimap</a></p><p>Assembly</p><p>De novo genome assembly</p><p>MHAP</p><p>Produces highly continuous assembly (fully resolved chromosome arms) from third-generation long and noisy reads (10 kbp) using a dimensionality reduction technique MinHash</p><p>Software (Java)</p><p><a href="https://github.com/marbl/MHAP">https://github.com/marbl/MHAP</a></p><p>Miniasm</p><p>Assembler of long noisy reads (SMRT, ONT) using the Overlap-Layout Consensus (OLC) approach without the necessity of an error correction stage (uses minimap)</p><p>Software (C)</p><p><a href="https://github.com/lh3/miniasm">https://github.com/lh3/miniasm</a></p><p>LINKS</p><p>Scaffolding genome assembly with error-containing long sequences (e.g., ONT or PacBio reads, draft genomes)</p><p>Software (Perl)</p><p><a href="https://github.com/warrenlr/LINKS/">https://github.com/warrenlr/LINKS/</a></p><p>Read clustering</p><p>afcluster</p><p>Clustering of reads from different genes and different species based on&nbsp;<em>k</em>-mer counts</p><p>Software (C++)</p><p><a href="https://github.com/luscinius/afcluster">https://github.com/luscinius/afcluster</a></p><p>QCluster</p><p>Clustering of reads with alignment-free measures (<em>k</em>-mer based) and quality values</p><p>Software (C++)</p><p><a 
href="http://www.dei.unipd.it/~ciompin/main/qcluster.html">http://www.dei.unipd.it/~ciompin/main/qcluster.html</a></p><p>Read error correction</p><p>Lighter</p><p>Correction of sequencing errors in raw, whole genome sequencing reads (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="https://github.com/mourisl/Lighter">https://github.com/mourisl/Lighter</a></p><p>QuorUM</p><p>Error corrector for Illumina reads using <em>k</em>-mers</p><p>Software (C++)</p><p><a href="https://github.com/gmarcais/Quorum">https://github.com/gmarcais/Quorum</a></p><p>Trowel</p><p>Software (C++)</p><p><a href="https://sourceforge.net/projects/trowel-ec/">https://sourceforge.net/projects/trowel-ec/</a></p><p>Metagenomics</p><p>Assembly-free phylogenomics</p><p>AAF</p><p>Phylogeny reconstruction directly from unassembled raw sequence data from whole genome sequencing projects; provides bootstrap support to assess uncertainty in the tree topology (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/fanhuan/AAF">https://github.com/fanhuan/AAF</a></p><p>kSNP v3</p><p>Reference-free SNP identification and estimation of phylogenetic trees using SNPs (based on&nbsp;<em>k</em>-mer analysis)</p><p>Software (C)</p><p><a href="https://sourceforge.net/projects/ksnp/files/">https://sourceforge.net/projects/ksnp/files/</a></p><p>NGS-MC</p><p>Phylogeny of species based on NGS reads using alignment-free sequence dissimilarity measures d2* and d2S under different Markov chain models (using&nbsp;<em>k</em>-words)</p><p>R package</p><p><a href="http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html">http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html</a></p><p>Species identification/taxonomic profiling</p><p>CLARK</p><p>Taxonomic classification of metagenomic reads to known bacterial genomes using&nbsp;<em>k</em>-mer search and LCA assignment</p><p>Software (C++)</p><p><a href="http://clark.cs.ucr.edu/">http://clark.cs.ucr.edu/</a></p><p>FOCUS</p><p>Reports 
organisms present in metagenomic samples and profiles their abundances (uses composition-based approach and non-negative least squares for prediction)</p><p>Web service, Software (Python)</p><p><a href="http://edwards.sdsu.edu/FOCUS/">http://edwards.sdsu.edu/FOCUS/</a></p><p>GSM</p><p>Estimation of abundances of microbial genomes in metagenomic samples (<em>k</em>-mer based)</p><p>Software (Go)</p><p><a href="https://github.com/pdtrang/GSM">https://github.com/pdtrang/GSM</a></p><p>Mash</p><p>Species identification using assembled or unassembled Illumina, PacBio, and ONT data (based on MinHash dimensionality-reduction technique)</p><p>Software (C++)</p><p><a href="https://github.com/marbl/mash">https://github.com/marbl/mash</a></p><p>Kraken</p><p>Taxonomic assignment in metagenome analysis by exact&nbsp;<em>k</em>-mer search; LCA assignment of short reads based on a comprehensive sequence database</p><p>Software (C++)</p><p><a href="https://ccb.jhu.edu/software/kraken/">https://ccb.jhu.edu/software/kraken/</a></p><p>LMAT</p><p>Assignment of taxonomic labels to reads by&nbsp;<em>k</em>-mer searches in a precomputed database</p><p>Software (C++/Python)</p><p><a href="https://sourceforge.net/projects/lmat/">https://sourceforge.net/projects/lmat/</a></p><p>stringMLST</p><p><em>k</em>-mer-based tool for MLST directly from genome sequencing reads</p><p>Software (Python)</p><p><a href="http://jordan.biology.gatech.edu/page/software/stringMLST">http://jordan.biology.gatech.edu/page/software/stringMLST</a></p><p>Taxonomer</p><p><em>k</em>-mer-based ultrafast metagenomics tool for assigning taxonomy to sequencing reads from clinical and environmental samples</p><p>Web service</p><p><a href="http://taxonomer.iobio.io/">http://taxonomer.iobio.io/</a></p><p>Other</p><p>d2-tools</p><p>Word-based (<em>k</em>-tuple) comparison (pairwise dissimilarity matrix using d2S measure) of metatranscriptomic samples from NGS reads</p><p>Software (Python/R)</p><p><a 
href="https://code.google.com/p/d2-tools/">https://code.google.com/p/d2-tools/</a></p><p>VirHostMatcher</p><p>Prediction of hosts from metagenomic viral sequences based on ONF using various distance measures (e.g., d2)</p><p>Software (C++)</p><p><a href="https://github.com/jessieren/VirHostMatcher">https://github.com/jessieren/VirHostMatcher</a></p><p>MetaFast</p><p>Statistics calculation of metagenome sequences and the distances between them based on assembly using de Bruijn graphs and Bray&ndash;Curtis dissimilarity measure</p><p>Software (Java)</p><p><a href="https://github.com/ctlab/metafast">https://github.com/ctlab/metafast</a></p></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40544/ngs-bits-short-read-sequencing-tools</guid>
	<pubDate>Thu, 16 Jan 2020 23:14:00 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40544/ngs-bits-short-read-sequencing-tools</link>
	<title><![CDATA[ngs-bits - Short-read sequencing tools]]></title>
	<description><![CDATA[<p>Binaries of&nbsp;<em>ngs-bits</em>&nbsp;are available via Bioconda. Alternatively,&nbsp;<em>ngs-bits</em>&nbsp;can be built from sources:</p>
<ul>
<li><span>Binaries</span>&nbsp;for&nbsp;<a href="https://github.com/imgag/ngs-bits/blob/master/doc/install_bioconda.md">Linux/macOS</a></li>
<li>From&nbsp;<span>sources</span>&nbsp;for&nbsp;<a href="https://github.com/imgag/ngs-bits/blob/master/doc/install_unix.md">Linux/macOS</a></li>
<li>From&nbsp;<span>sources</span>&nbsp;for&nbsp;<a href="https://github.com/imgag/ngs-bits/blob/master/doc/install_win.md">Windows</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/imgag/ngs-bits" rel="nofollow">https://github.com/imgag/ngs-bits</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</guid>
	<pubDate>Fri, 04 Oct 2024 02:45:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</link>
	<title><![CDATA[Libraries or management tools for high throughput sequencing data]]></title>
	<description><![CDATA[<ul>
<li><a href="http://gatb.inria.fr/"><span>GATB</span></a>&nbsp;Library.&nbsp;The&nbsp;<span>Genome Analysis Toolbox with de-Bruijn graph.&nbsp;</span>A large part of the tools developed by the GenScale team are based on this library.<br />These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large volumes of read data from any kind of organism, such as bacteria, plants, animals and even complex samples (<em>e.g.</em>&nbsp;metagenomes). Among them are the following (the full list is available here:&nbsp;<a href="https://gatb.inria.fr/software/">https://gatb.inria.fr/software/</a>):</li>
<li><a href="https://github.com/morispi/LRez"><span>LRez</span></a>: C++ Library and toolkit for the barcode-based management and indexation of linked-read datasets.</li>
</ul><h2>Variant calling and/or genotyping</h2><ul>
<li><a href="https://gatb.inria.fr/software/discosnp/" title="DiscoSNP">DiscoSNP++ and&nbsp;discoSnpRAD</a>: Reference-free small variant discovery (SNPs and indels)</li>
<li><a href="https://gatb.inria.fr/software/mind-the-gap/" title="MindTheGap">MindTheGap</a>: Detection and assembly of large insertion variants</li>
<li><a href="https://gatb.inria.fr/software/takeabreak/" title="TakeABreak">TakeABreak</a>:&nbsp;reference-free inversion discovery tool</li>
<li><a href="https://github.com/llecompte/SVJedi">SVJedi</a>: Structural Variant genotyper with long read data</li>
<li><a href="https://github.com/SandraLouise/SVJedi-graph">SVJedi-graph</a>: Structural Variant genotyper with long read data using a variation graph</li>
</ul><h2>Sequence assembly</h2><ul>
<li><a href="https://github.com/cguyomar/MinYS">MinYS</a>: reference-guided genome assembly in metagenomics data</li>
<li><a href="https://github.com/anne-gcd/MTG-Link">MTG-link</a>: local assembly tool for linked-read data</li>
<li><a href="https://gatb.inria.fr/software/minia/" title="Minia">Minia</a>: De novo short read assembler</li>
<li><a href="https://gatb.inria.fr/de-novo-genome-assembly/">de-novo pipeline</a>:&nbsp;<em>de-novo</em>&nbsp;assembly pipeline (error correction / contigs / scaffolding) for genomes and meta-genomes</li>
<li><a href="https://gatb.inria.fr/software/mapsembler/" title="Mapsembler2">Mapsembler2</a>: Targeted assembly (not maintained)</li>
</ul><h2>Managing k-mers &amp; indexation</h2><ul>
<li><a href="https://github.com/lrobidou/findere">findere</a>:&nbsp;simple strategy for speeding up queries and for reducing false positive calls from any Approximate Membership Query data structure.
<ul>
<li><a href="https://github.com/lrobidou/fimpera">fimpera</a>&nbsp;extends findere adding the abundance information.</li>
</ul>
</li>
<li><a href="https://github.com/tlemane/kmtricks">kmtricks</a>:&nbsp;modular tool suite for counting kmers, and constructing Bloom filters or kmer matrices, for large collections of sequencing data.</li>
<li><a href="https://github.com/tlemane/kmindex">kmindex&nbsp;</a>is a tool for indexing and querying sequencing samples. It is built on top of kmtricks.</li>
<li><a href="https://github.com/pierrepeterlongo/back_to_sequences">back to sequences</a>: Find sequences (reads, unitigs, genes) related to a set of kmers in large datasets, in a matter of seconds.</li>
<li><a href="https://github.com/vicLeva/bqf">Backpack Quotient Filter</a>:&nbsp;k-mer indexing data structure with abundance</li>
<li><a href="http://github.com/GATB/rconnector">short read connector</a>:&nbsp;Detect similar reads in potentially large read sets</li>
<li><a href="https://gatb.inria.fr/software/dsk/" title="DSK">DSK</a>:&nbsp;Count k-mers in sequences</li>
</ul><h2>Pangenome graph manipulation</h2><ul>
<li><a href="https://github.com/Tharos-ux/pancat">Pancat</a>: Pangenome Comparison and Analysis Toolkit</li>
<li><a href="https://pypi.org/project/gfagraphs/">GFAGraphs</a>: a Python library to handle pangenome graph files in GFA format.</li>
</ul><h2>Comparative metagenomics with k-mers</h2><ul>
<li><a href="https://github.com/GATB/simka">Simka and SimkaMin</a>:&nbsp;Comparative metagenomics for large-scale datasets</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/compreads-metagenomic-data-analysis/">Comparead &amp; Commet</a>:&nbsp;comparison of metagenomic datasets</li>
</ul><h2>Species and bacterial strains identification</h2><ul>
<li><a href="https://github.com/gsiekaniec/ORI">ORI</a>: software using long nanopore reads to identify bacteria present in a sample at the strain level</li>
<li><a href="https://github.com/kevsilva/StrainFLAIR">StrainFLAIR</a>:&nbsp;STRAIN-level proFiLing using vArIation gRaph</li>
</ul><h2>General-purpose sequencing data manipulation</h2><ul>
<li><a href="https://team.inria.fr/genscale/ngs-software/gassst/">GASSST</a>:&nbsp;long read mapper</li>
<li><a href="https://gatb.inria.fr/software/leon/" title="Leon">Leon</a>: short read compressor (now included in GATB-core)</li>
<li><a href="https://gatb.inria.fr/software/bloocoo/" title="Bloocoo">Bloocoo</a>:&nbsp;short read corrector</li>
<li><a href="https://github.com/GATB/bcalm">BCALM</a>:&nbsp;Construct compacted de Bruijn graphs (unitigs)</li>
</ul><h2>Protein Structure</h2><ul>
<li><a href="https://team.inria.fr/genscale/protein-structure/a-purva-contact-map-overlap-solver/">A_Purva</a>:&nbsp;Contact Map Overlap solver</li>
<li><a href="https://team.inria.fr/genscale/protein-structure/md-jeep-distance-geomtry-solver/">MD-Jeep</a>:&nbsp;Distance Geometry solver</li>
<li><a href="https://team.inria.fr/genscale/csa-comparative-structural-alignment/">CSA</a>:&nbsp;Comparative Structural Alignment</li>
</ul><h2>Workflow</h2><ul>
<li><a href="https://team.inria.fr/genscale/workflows/slicee/">SLICEE</a>:&nbsp;parallel execution of bioinformatics workflows</li>
</ul><h2>Comparative Genomics</h2><ul>
<li><a href="https://team.inria.fr/genscale/comparative-genomics/cassis/">CASSIS</a>:&nbsp;detection of rearrangement breakpoints</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/plast-intensive-sequence-comparison/">PLAST</a>:&nbsp;intensive bank-to-bank sequence comparison</li>
<li><a href="https://github.com/stephanierobin/DrjBreakpointFinder">DRJBreakpointFinder</a>: detection and precise localization of excision sites in proviral segments</li>
</ul>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36865/perga-a-paired-end-read-guided-de-novo-assembler-for-extending-contigs-using-svm-and-look-ahead-approach</guid>
	<pubDate>Tue, 05 Jun 2018 09:57:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36865/perga-a-paired-end-read-guided-de-novo-assembler-for-extending-contigs-using-svm-and-look-ahead-approach</link>
	<title><![CDATA[PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach]]></title>
	<description><![CDATA[PERGA - Paired End Reads Guided Assembler

PERGA is a novel read-guided de novo assembly approach that adopts a greedy-like prediction strategy for assembling reads into contigs and scaffolds. Instead of using single-end reads to construct contigs, PERGA uses paired-end reads and different read overlap sizes from O ≥ Omax to Omin to resolve gaps and branches. Moreover, by constructing a decision model with a machine learning approach based on branch features, PERGA can determine the correct extension in 99.7% of cases. PERGA tries to extend the contigs by all feasible nucleotides and uses a look-ahead technique to determine whether these multiple extensions are due to sequencing errors or repeats. It also tries to separate the different repeats of nearby genomic regions to make the assembly longer and more accurate.

The simulated E. coli paired-end read data were generated using GemSim (KE McElroy, F Luciani, T Thomas. Gemsim: General, Error-Model Based Simulator of Next-Generation Sequencing Data. BMC Genomics 2012, 13:74), with coverage of 50x, 60x, and 100x and a read length of 100 bp, and can be downloaded from https://github.com/zhuxiao/data_PERGA.<p>Address of the bookmark: <a href="https://github.com/hitbio/PERGA" rel="nofollow">https://github.com/hitbio/PERGA</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</guid>
	<pubDate>Tue, 03 Jul 2018 04:09:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37221/asplice-a-scalable-and-memory-efficient-algorithm-for-de-novo-transcriptome-assembly</link>
	<title><![CDATA[ASplice: a scalable and memory-efficient algorithm for de novo transcriptome assembly]]></title>
	<description><![CDATA[With increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries.

Researchers at Texas A&amp;M University developed a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while using as many RNA-Seq libraries as possible, even when they contain hundreds of gigabases of data. New techniques were developed so that computations can be performed on a computing cluster with a moderate amount of physical memory.

Availability – A software program that implements the algorithm is available at: http://faculty.cse.tamu.edu/shsze/asplice.

Sze SH, Pimsler ML, Tomberlin JK, Jones CD, Tarone AM. (2017) A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms. BMC Genomics 18(Suppl 4):387.<p>Address of the bookmark: <a href="http://faculty.cse.tamu.edu/shsze/asplice/" rel="nofollow">http://faculty.cse.tamu.edu/shsze/asplice/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>