<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/39213?offset=130</link>
	<atom:link href="https://bioinformaticsonline.com/related/39213?offset=130" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32946/grass-a-generic-algorithm-for-scaffolding-next-generation-sequencing-assemblies</guid>
	<pubDate>Tue, 23 May 2017 05:20:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32946/grass-a-generic-algorithm-for-scaffolding-next-generation-sequencing-assemblies</link>
	<title><![CDATA[GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies.]]></title>
	<description><![CDATA[<p><span>GRASS (GeneRic ASsembly Scaffolder) is a novel algorithm for scaffolding second-generation sequencing assemblies that can use diverse information sources. GRASS offers a mixed-integer programming formulation of the contig scaffolding problem, which combines contig order, distance and orientation in a single optimization objective. The resulting optimization problem is solved using an expectation-maximization procedure and an unconstrained binary quadratic programming approximation of the original problem. We compared GRASS with existing HTS scaffolders using Illumina paired reads of three bacterial genomes. Our algorithm constructs a comparable number of scaffolds, but makes fewer errors. This result is further improved when additional data, in the form of related genome sequences, are used.</span></p><p>Address of the bookmark: <a href="https://github.com/AlexeyG/GRASS" rel="nofollow">https://github.com/AlexeyG/GRASS</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38169/amstat-display-statistics-of-large-sequence-files-from-next-generation-sequencing-projects</guid>
	<pubDate>Fri, 09 Nov 2018 13:34:56 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38169/amstat-display-statistics-of-large-sequence-files-from-next-generation-sequencing-projects</link>
	<title><![CDATA[SAMStat: display statistics of large sequence files from next generation sequencing projects]]></title>
	<description><![CDATA[<p><span>SAMStat is an efficient C program to quickly display statistics of large sequence files from next generation sequencing projects. When applied to&nbsp;</span><a href="http://samstat.sourceforge.net/#about">SAM/BAM</a><span>&nbsp;files, all statistics are reported separately for unmapped, poorly mapped and accurately mapped reads. This allows for identification of a variety of problems that cause poor mapping, such as remaining linker and adaptor sequences. Apart from this, SAMStat can be used to verify individual processing steps in large analysis pipelines.</span></p><p>Address of the bookmark: <a href="http://samstat.sourceforge.net/" rel="nofollow">http://samstat.sourceforge.net/</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42917/fings-filters-for-next-generation-sequencing</guid>
	<pubDate>Sat, 27 Feb 2021 01:18:35 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42917/fings-filters-for-next-generation-sequencing</link>
	<title><![CDATA[FiNGS: Filters for Next Generation Sequencing]]></title>
	<description><![CDATA[<h2>Key features</h2>
<ul>
<li><strong>Filters SNVs from any variant caller to remove false positives</strong></li>
<li><strong>Calculates metrics based on BAM files and provides filtering not possible with other tools</strong></li>
<li><strong>Fully user-configurable filtering (including which filters to use and their thresholds)</strong></li>
<li><strong>Option to use filters identical to ICGC recommendations</strong></li>
</ul>
<p>FiNGS provides researchers with a tool to reproducibly filter somatic variants that is simple to both deploy and use, with filters and thresholds that are fully configurable by the user. It ingests and emits standard variant call format (VCF) files and will slot into existing sequencing pipelines. It allows users to develop and implement their own filtering strategies and simple sharing of these with others.</p>
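<p>To illustrate what user-configurable threshold filtering can look like, here is a minimal Python sketch. It is not FiNGS code and does not use the FiNGS API: the threshold names (<code>min_depth</code>, <code>min_alt_reads</code>, <code>min_vaf</code>), their values and the dictionary layout of a variant are all assumptions made for illustration.</p>
<pre>
```python
# Hypothetical thresholds; the names and values are illustrative only
# and are not taken from FiNGS.
THRESHOLDS = {"min_depth": 10, "min_alt_reads": 3, "min_vaf": 0.05}


def failed_filters(variant, thresholds=THRESHOLDS):
    """Return the names of the filters a variant fails (empty list = pass).

    `variant` is assumed to be a dict with integer keys "depth" (total
    reads at the site) and "alt_reads" (reads supporting the variant).
    """
    depth = variant["depth"]
    alt_reads = variant["alt_reads"]
    # Variant allele frequency; defined as 0.0 when there is no coverage.
    vaf = alt_reads / depth if depth else 0.0

    failed = []
    if thresholds["min_depth"] > depth:
        failed.append("min_depth")
    if thresholds["min_alt_reads"] > alt_reads:
        failed.append("min_alt_reads")
    if thresholds["min_vaf"] > vaf:
        failed.append("min_vaf")
    return failed


well_supported = {"depth": 40, "alt_reads": 8}
weakly_supported = {"depth": 8, "alt_reads": 1}
print(failed_filters(well_supported))
print(failed_filters(weakly_supported))
```
</pre>
<p>Keeping the thresholds in a plain dictionary means a filtering strategy can be saved to a file and shared, which is the kind of reuse the description above refers to.</p>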
<p>FiNGS reliably improves upon the precision of default variant caller outputs and performs better than other tools designed for the same task.</p><p>Address of the bookmark: <a href="https://github.com/cpwardell/FiNGS" rel="nofollow">https://github.com/cpwardell/FiNGS</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</guid>
	<pubDate>Fri, 04 Oct 2024 02:45:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</link>
	<title><![CDATA[Libraries or management tools for high throughput sequencing data]]></title>
	<description><![CDATA[<ul>
<li><a href="http://gatb.inria.fr/"><span>GATB</span></a>&nbsp;Library.&nbsp;The&nbsp;<span>Genome Analysis Toolbox with de-Bruijn graph.&nbsp;</span>Many of the tools developed by the GenScale team are based on this library.<br />These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large amounts of read data from any kind of organism, such as bacteria, plants, animals and even complex samples (<em>e.g.</em>&nbsp;metagenomes). The full list of tools is available here:&nbsp;<a href="https://gatb.inria.fr/software/">https://gatb.inria.fr/software/</a>.</li>
<li><a href="https://github.com/morispi/LRez"><span>LRez</span></a>: C++ Library and toolkit for the barcode-based management and indexation of linked-read datasets.</li>
</ul><h2>Variant calling and/or genotyping</h2><ul>
<li><a href="https://gatb.inria.fr/software/discosnp/" title="DiscoSNP">DiscoSNP++ and&nbsp;discoSnpRAD</a>: Reference-free small variant discovery (SNPs and indels)</li>
<li><a href="https://gatb.inria.fr/software/mind-the-gap/" title="MindTheGap">MindTheGap</a>: Detection and assembly of large insertion variants</li>
<li><a href="https://gatb.inria.fr/software/takeabreak/" title="TakeABreak">TakeABreak</a>:&nbsp;reference-free inversion discovery tool</li>
<li><a href="https://github.com/llecompte/SVJedi">SVJedi</a>: Structural Variant genotyper with long read data</li>
<li><a href="https://github.com/SandraLouise/SVJedi-graph">SVJedi-graph</a>: Structural Variant genotyper with long read data using a variation graph</li>
</ul><h2>Sequence assembly</h2><ul>
<li><a href="https://github.com/cguyomar/MinYS">MinYS</a>: reference-guided genome assembly in metagenomics data</li>
<li><a href="https://github.com/anne-gcd/MTG-Link">MTG-link</a>: local assembly tool for linked-read data</li>
<li><a href="https://gatb.inria.fr/software/minia/" title="Minia">Minia</a>: De novo short read assembler</li>
<li><a href="https://gatb.inria.fr/de-novo-genome-assembly/">de-novo pipeline</a>:&nbsp;<em>de-novo</em>&nbsp;assembly pipeline (error correction / contigs / scaffolding) for genomes and meta-genomes</li>
<li><a href="https://gatb.inria.fr/software/mapsembler/" title="Mapsembler2">Mapsembler2</a>: Targeted assembly (not maintained)</li>
</ul><h2>Managing k-mers &amp; indexation</h2><ul>
<li><a href="https://github.com/lrobidou/findere">findere</a>:&nbsp;simple strategy for speeding up queries and for reducing false positive calls from any Approximate Membership Query data structure.
<ul>
<li><a href="https://github.com/lrobidou/fimpera">fimpera</a>&nbsp;extends findere adding the abundance information.</li>
</ul>
</li>
<li><a href="https://github.com/tlemane/kmtricks">kmtricks</a>:&nbsp;modular tool suite for counting kmers, and constructing Bloom filters or kmer matrices, for large collections of sequencing data.</li>
<li><a href="https://github.com/tlemane/kmindex">kmindex</a>&nbsp;is a tool for indexing and querying sequencing samples. It is built on top of kmtricks.</li>
<li><a href="https://github.com/pierrepeterlongo/back_to_sequences">back to sequences</a>: Find sequences (reads, unitigs, genes) related to a set of kmers in large datasets, in a matter of seconds.</li>
<li><a href="https://github.com/vicLeva/bqf">Backpack Quotient Filter</a>:&nbsp;k-mer indexing data structure with abundance</li>
<li><a href="http://github.com/GATB/rconnector">short read connector</a>:&nbsp;Detect similar reads in potentially large read sets</li>
<li><a href="https://gatb.inria.fr/software/dsk/" title="DSK">DSK</a>:&nbsp;Count k-mers in sequences</li>
</ul><h2>Pangenome graph manipulation</h2><ul>
<li><a href="https://github.com/Tharos-ux/pancat">Pancat</a>: Pangenome Comparison and Analysis Toolkit</li>
<li><a href="https://pypi.org/project/gfagraphs/">GFAGraphs</a>: a Python library to handle pangenome graph files in GFA format.</li>
</ul><h2>Comparative metagenomics with k-mers</h2><ul>
<li><a href="https://github.com/GATB/simka">Simka and SimkaMin</a>:&nbsp;Comparative metagenomics for large-scale datasets</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/compreads-metagenomic-data-analysis/">Comparead &amp; Commet</a>:&nbsp;comparison of metagenomic datasets</li>
</ul><h2>Species and bacterial strains identification</h2><ul>
<li><a href="https://github.com/gsiekaniec/ORI">ORI</a>: software using long nanopore reads to identify bacteria present in a sample at the strain level</li>
<li><a href="https://github.com/kevsilva/StrainFLAIR">StrainFLAIR</a>:&nbsp;STRAIN-level proFiLing using vArIation gRaph</li>
</ul><h2>General-purpose sequencing data manipulation</h2><ul>
<li><a href="https://team.inria.fr/genscale/ngs-software/gassst/">GASSST</a>:&nbsp;long read mapper</li>
<li><a href="https://gatb.inria.fr/software/leon/" title="Leon">Leon</a>: short read compressor (now included in GATB-core)</li>
<li><a href="https://gatb.inria.fr/software/bloocoo/" title="Bloocoo">Bloocoo</a>:&nbsp;short read corrector</li>
<li><a href="https://github.com/GATB/bcalm">BCALM</a>:&nbsp;Construct compacted de Bruijn graphs (unitigs)</li>
</ul><h2>Protein Structure</h2><ul>
<li><a href="https://team.inria.fr/genscale/protein-structure/a-purva-contact-map-overlap-solver/">A_Purva</a>:&nbsp;Contact Map Overlap solver</li>
<li><a href="https://team.inria.fr/genscale/protein-structure/md-jeep-distance-geomtry-solver/">MD-Jeep</a>:&nbsp;Distance Geometry solver</li>
<li><a href="https://team.inria.fr/genscale/csa-comparative-structural-alignment/">CSA</a>:&nbsp;Comparative Structural Alignment</li>
</ul><h2>Workflow</h2><ul>
<li><a href="https://team.inria.fr/genscale/workflows/slicee/">SLICEE</a>:&nbsp;parallel execution of bioinformatics workflows</li>
</ul><h2>Comparative Genomics</h2><ul>
<li><a href="https://team.inria.fr/genscale/comparative-genomics/cassis/">CASSIS</a>:&nbsp;detection of rearrangement breakpoints</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/plast-intensive-sequence-comparison/">PLAST</a>:&nbsp;intensive bank-to-bank sequence comparison</li>
<li><a href="https://github.com/stephanierobin/DrjBreakpointFinder">DRJBreakpointFinder</a>: detection and precise localization of excision sites in proviral segments</li>
</ul>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38535/nanopack-visualizing-and-processing-long-read-sequencing-data</guid>
	<pubDate>Tue, 25 Dec 2018 21:20:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38535/nanopack-visualizing-and-processing-long-read-sequencing-data</link>
	<title><![CDATA[NanoPack: visualizing and processing long-read sequencing data]]></title>
	<description><![CDATA[The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools.<p>Address of the bookmark: <a href="https://github.com/wdecoster/nanopack" rel="nofollow">https://github.com/wdecoster/nanopack</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36635/circlator-automated-circularization-of-genome-assemblies-using-long-sequencing-reads</guid>
	<pubDate>Tue, 15 May 2018 09:42:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36635/circlator-automated-circularization-of-genome-assemblies-using-long-sequencing-reads</link>
	<title><![CDATA[Circlator: automated circularization of genome assemblies using long sequencing reads]]></title>
	<description><![CDATA[A tool to circularize genome assemblies. The algorithm and benchmarks are described in the Genome Biology manuscript. 

Citation: "Circlator: automated circularization of genome assemblies using long sequencing reads", Hunt et al, Genome Biology 2015 Dec 29;16(1):294. doi: 10.1186/s13059-015-0849-0. PMID: 26714481.<p>Address of the bookmark: <a href="http://sanger-pathogens.github.io/circlator/" rel="nofollow">http://sanger-pathogens.github.io/circlator/</a></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</guid>
	<pubDate>Wed, 22 Aug 2018 10:40:27 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37574/simlord-a-read-simulator-for-third-generation-sequencing-reads</link>
	<title><![CDATA[SimLoRD: A read simulator for third generation sequencing reads]]></title>
	<description><![CDATA[<p>SimLoRD is a read simulator for third generation sequencing reads and is currently focused on the Pacific Biosciences SMRT error model.</p>
<p>Reads are simulated from both strands of a provided or randomly generated reference sequence.</p>
<div id="rst-header-features">
<ul>
<li>The reference can be read from a FASTA file or randomly generated with a given GC content. It can consist of several chromosomes, whose structure is respected when drawing reads. (Simulation of genome rearrangements may be incorporated at a later stage.)</li>
<li>The read lengths can be determined in four ways: drawing from a log-normal distribution (typical for genomic DNA), sampling from an existing FASTQ file (typical for RNA), sampling from a text file with integers (RNA), or using a fixed length.</li>
<li>Quality values and number of passes depend on fragment length.</li>
<li>Provided subread error probabilities are modified according to the number of passes.</li>
<li>Outputs reads in FASTQ format and alignments in SAM format.</li>
</ul>
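<p>As an illustration of the first option above (drawing read lengths from a log-normal distribution), here is a minimal Python sketch. It is not SimLoRD's implementation: the parameters <code>mu</code>, <code>sigma</code> and <code>min_length</code>, and their values, are assumptions chosen only so that the example produces plausible long-read lengths.</p>
<pre>
```python
import random


def sample_read_lengths(n, mu=8.5, sigma=0.4, min_length=50, seed=0):
    """Draw n read lengths from a log-normal distribution.

    mu and sigma are the mean and standard deviation of the underlying
    normal distribution. All default values are illustrative assumptions,
    not SimLoRD defaults.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    lengths = []
    while len(lengths) != n:
        length = int(rng.lognormvariate(mu, sigma))
        if length >= min_length:  # discard implausibly short reads
            lengths.append(length)
    return lengths


lengths = sample_read_lengths(1000)
print("shortest:", min(lengths), "longest:", max(lengths))
```
</pre>
<p>A log-normal distribution is right-skewed, so most sampled lengths cluster around the median while a few reads are much longer.</p>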
</div><p>Address of the bookmark: <a href="https://bitbucket.org/genomeinformatics/simlord/" rel="nofollow">https://bitbucket.org/genomeinformatics/simlord/</a></p>]]></description>
	<dc:creator>Aaryan Lokwani</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/27839/lorma-a-tool-for-correcting-sequencing-errors-in-long-reads-such-those-produced-by-pacific-biosciences-sequencing-machines</guid>
	<pubDate>Wed, 15 Jun 2016 17:18:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/27839/lorma-a-tool-for-correcting-sequencing-errors-in-long-reads-such-those-produced-by-pacific-biosciences-sequencing-machines</link>
	<title><![CDATA[LoRMA: a tool for correcting sequencing errors in long reads such as those produced by Pacific Biosciences sequencing machines]]></title>
	<description><![CDATA[<p>LoRMA is a tool for correcting sequencing errors in long reads such as those produced by Pacific Biosciences sequencing machines.</p>
<p>Publication:</p>
<ul>
<li>L. Salmela, R. Walve, E. Rivals, and E. Ukkonen: Accurate selfcorrection of errors in long reads using de Bruijn graphs. Accepted to RECOMB-Seq 2016.</li>
</ul>
<p>Download:</p>
<ul>
<li><a href="https://www.cs.helsinki.fi/u/lmsalmel/LoRMA/LoRMA-0.3.tar.gz">LoRMA 0.3 source files</a></li>
<li><a href="https://www.cs.helsinki.fi/u/lmsalmel/LoRMA/README.txt">README</a></li>
</ul><p>Address of the bookmark: <a href="https://www.cs.helsinki.fi/u/lmsalmel/LoRMA/" rel="nofollow">https://www.cs.helsinki.fi/u/lmsalmel/LoRMA/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</guid>
	<pubDate>Fri, 05 Jan 2018 03:58:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35055/jabba-hybrid-error-correction-for-long-sequencing-reads</link>
	<title><![CDATA[Jabba: Hybrid Error Correction for Long Sequencing Reads]]></title>
	<description><![CDATA[<p>Jabba is a hybrid error correction tool to correct third generation (PacBio / ONT) sequencing data, using second generation (Illumina) data.</p>
<p>Input</p>
<p>Jabba takes as input a concatenated de Bruijn graph and a set of sequences:</p>
<p>The de Bruijn graph should be in FASTA format with one entry per node; the meta information should be in the format:<br>&gt;NODE <br>The set of sequences should be in FASTA or FASTQ format. These sequences will be corrected (e.g. PacBio reads). The corrections will be written to a file Jabba fasta.<br>The output is a file in FASTA format with corrections of the long reads, and additionally a file in the input format containing uncorrected reads.</p>
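<p>To make the "one FASTA entry per node" layout concrete, here is a minimal Python sketch that reads such a file into a dictionary. It is not part of Jabba. Only the leading <code>NODE</code> of the header comes from the description above; the numbers after it in the example are placeholders, because the full header layout is not given here.</p>
<pre>
```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Sequences that span several lines are joined into one string.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Placeholder graph with two nodes; the header fields after NODE are made up.
example = [">NODE 1", "ACGTAC", "GTA", ">NODE 2", "TTGCA"]
nodes = dict(read_fasta(example))
print(len(nodes), "nodes")
```
</pre>
<p>To read from disk, pass an open file handle instead of the list, since iterating over a file yields its lines.</p>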
<p>https://github.com/biointec/jabba/wiki</p>
<p>https://almob.biomedcentral.com/articles/10.1186/s13015-016-0075-7</p><p>Address of the bookmark: <a href="https://github.com/biointec/jabba" rel="nofollow">https://github.com/biointec/jabba</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42826/ktrim-an-extra-fast-and-accurate-adapter-and-quality-trimmer-for-sequencing-data</guid>
	<pubDate>Thu, 11 Feb 2021 21:39:05 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42826/ktrim-an-extra-fast-and-accurate-adapter-and-quality-trimmer-for-sequencing-data</link>
	<title><![CDATA[Ktrim: an extra-fast and accurate adapter- and quality-trimmer for sequencing data]]></title>
	<description><![CDATA[<p>Ktrim&nbsp;is written in&nbsp;<code>C++</code>&nbsp;for GNU Linux/Unix platforms. After uncompressing the source package, you can find an executable file,&nbsp;<code>ktrim</code>, under the&nbsp;<code>bin/</code>&nbsp;directory. It was compiled using&nbsp;<code>g++ v4.8.5</code>&nbsp;and linked with&nbsp;<code>libz v1.2.7</code>&nbsp;for Linux x86_64 systems. If you cannot run it (usually because of an outdated&nbsp;<code>libc++</code>&nbsp;or&nbsp;<code>libz</code>&nbsp;library), or you want to build a version optimized for your system, you can re-compile the programs:</p>
<p>user@linux$ make clean &amp;&amp; make</p><p>Address of the bookmark: <a href="https://github.com/hellosunking/Ktrim" rel="nofollow">https://github.com/hellosunking/Ktrim</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>