<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36884?offset=220</link>
	<atom:link href="https://bioinformaticsonline.com/related/36884?offset=220" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</guid>
	<pubDate>Fri, 04 Oct 2024 02:45:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</link>
	<title><![CDATA[Libraries or management tools for high throughput sequencing data]]></title>
	<description><![CDATA[<ul>
<li><a href="http://gatb.inria.fr/"><span>GATB</span></a>&nbsp;Library.&nbsp;The&nbsp;<span>Genome Analysis Toolbox with de-Bruijn graph.&nbsp;</span>Many of the tools developed by the GenScale team are based on this library.<br />These tools enable the analysis of data sets of any size on multi-core desktop computers, including very large volumes of read data from any kind of organism, such as bacteria, plants, animals and even complex samples (<em>e.g.</em>&nbsp;metagenomes). Some of them are listed below (the full list is available here:&nbsp;<a href="https://gatb.inria.fr/software/">https://gatb.inria.fr/software/</a>).</li>
<li><a href="https://github.com/morispi/LRez"><span>LRez</span></a>: C++ Library and toolkit for the barcode-based management and indexation of linked-read datasets.</li>
</ul><h2>Variant calling and/or genotyping</h2><ul>
<li><a href="https://gatb.inria.fr/software/discosnp/" title="DiscoSNP">DiscoSNP++ and&nbsp;discoSnpRAD</a>: Reference-free small variant discovery (SNPs and indels)</li>
<li><a href="https://gatb.inria.fr/software/mind-the-gap/" title="MindTheGap">MindTheGap</a>: Detection and assembly of large insertion variants</li>
<li><a href="https://gatb.inria.fr/software/takeabreak/" title="TakeABreak">TakeABreak</a>:&nbsp;reference-free inversion discovery tool</li>
<li><a href="https://github.com/llecompte/SVJedi">SVJedi</a>: Structural Variant genotyper with long read data</li>
<li><a href="https://github.com/SandraLouise/SVJedi-graph">SVJedi-graph</a>: Structural Variant genotyper with long read data using a variation graph</li>
</ul><h2>Sequence assembly</h2><ul>
<li><a href="https://github.com/cguyomar/MinYS">MinYS</a>: reference-guided genome assembly in metagenomic data</li>
<li><a href="https://github.com/anne-gcd/MTG-Link">MTG-link</a>: local assembly tool for linked-read data</li>
<li><a href="https://gatb.inria.fr/software/minia/" title="Minia">Minia</a>: De novo short read assembler</li>
<li><a href="https://gatb.inria.fr/de-novo-genome-assembly/">de-novo pipeline</a>:&nbsp;<em>de-novo</em>&nbsp;assembly pipeline (error correction / contigs / scaffolding) for genomes and meta-genomes</li>
<li><a href="https://gatb.inria.fr/software/mapsembler/" title="Mapsembler2">Mapsembler2</a>: Targeted assembly (not maintained)</li>
</ul><h2>Managing k-mers &amp; indexation</h2><ul>
<li><a href="https://github.com/lrobidou/findere">findere</a>:&nbsp;simple strategy for speeding up queries and for reducing false positive calls from any Approximate Membership Query data structure.
<ul>
<li><a href="https://github.com/lrobidou/fimpera">fimpera</a>&nbsp;extends findere by adding abundance information.</li>
</ul>
</li>
<li><a href="https://github.com/tlemane/kmtricks">kmtricks</a>:&nbsp;modular tool suite for counting kmers, and constructing Bloom filters or kmer matrices, for large collections of sequencing data.</li>
<li><a href="https://github.com/tlemane/kmindex">kmindex&nbsp;</a>is a tool for indexing and querying sequencing samples. It is built on top of kmtricks.</li>
<li><a href="https://github.com/pierrepeterlongo/back_to_sequences">back to sequences</a>: Find sequences (reads, unitigs, genes) related to a set of kmers in large datasets, in a matter of seconds.</li>
<li><a href="https://github.com/vicLeva/bqf">Backpack Quotient Filter</a>:&nbsp;k-mer indexing data structure with abundance</li>
<li><a href="http://github.com/GATB/rconnector">short read connector</a>:&nbsp;Detects similar reads in potentially large read sets</li>
<li><a href="https://gatb.inria.fr/software/dsk/" title="DSK">DSK</a>:&nbsp;Counts k-mers in sequences</li>
</ul><h2>Pangenome graph manipulation</h2><ul>
<li><a href="https://github.com/Tharos-ux/pancat">Pancat</a>: Pangenome Comparison and Analysis Toolkit</li>
<li><a href="https://pypi.org/project/gfagraphs/">GFAGraphs</a>: a Python library to handle pangenome graph files in GFA format.</li>
</ul><h2>Comparative metagenomics with k-mers</h2><ul>
<li><a href="https://github.com/GATB/simka">Simka and SimkaMin</a>:&nbsp;Comparative metagenomics for large-scale datasets</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/compreads-metagenomic-data-analysis/">Comparead &amp; Commet</a>:&nbsp;comparison of metagenomic datasets</li>
</ul><h2>Species and bacterial strains identification</h2><ul>
<li><a href="https://github.com/gsiekaniec/ORI">ORI</a>: software using long nanopore reads to identify bacteria present in a sample at the strain level</li>
<li><a href="https://github.com/kevsilva/StrainFLAIR">StrainFLAIR</a>:&nbsp;STRAIN-level proFiLing using vArIation gRaph</li>
</ul><h2>General-purpose sequencing data manipulation</h2><ul>
<li><a href="https://team.inria.fr/genscale/ngs-software/gassst/">GASSST</a>:&nbsp;long read mapper</li>
<li><a href="https://gatb.inria.fr/software/leon/" title="Leon">Leon</a>: short read compressor (now included in GATB-core)</li>
<li><a href="https://gatb.inria.fr/software/bloocoo/" title="Bloocoo">Bloocoo</a>:&nbsp;short read corrector</li>
<li><a href="https://github.com/GATB/bcalm">BCALM</a>:&nbsp;Construct compacted de Bruijn graphs (unitigs)</li>
</ul><h2>Protein Structure</h2><ul>
<li><a href="https://team.inria.fr/genscale/protein-structure/a-purva-contact-map-overlap-solver/">A_Purva</a>:&nbsp;Contact Map Overlap solver</li>
<li><a href="https://team.inria.fr/genscale/protein-structure/md-jeep-distance-geomtry-solver/">MD-Jeep</a>:&nbsp;Distance Geometry solver</li>
<li><a href="https://team.inria.fr/genscale/csa-comparative-structural-alignment/">CSA</a>:&nbsp;Comparative Structural Alignment</li>
</ul><h2>Workflow</h2><ul>
<li><a href="https://team.inria.fr/genscale/workflows/slicee/">SLICEE</a>:&nbsp;parallel execution of bioinformatics workflows</li>
</ul><h2>Comparative Genomics</h2><ul>
<li><a href="https://team.inria.fr/genscale/comparative-genomics/cassis/">CASSIS</a>:&nbsp;detection of rearrangement breakpoints</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/plast-intensive-sequence-comparison/">PLAST</a>:&nbsp;intensive bank-to-bank sequence comparison</li>
<li><a href="https://github.com/stephanierobin/DrjBreakpointFinder">DRJBreakpointFinder</a>: detection and precise localization of excision sites in proviral segments</li>
</ul>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37674/qualimap2-evaluating-next-generation-sequencing-alignment-data</guid>
	<pubDate>Tue, 11 Sep 2018 04:44:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37674/qualimap2-evaluating-next-generation-sequencing-alignment-data</link>
	<title><![CDATA[Qualimap2: Evaluating next generation sequencing alignment data]]></title>
	<description><![CDATA[<p><strong>Qualimap 2</strong><span>&nbsp;is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.&nbsp;</span><br><br><span>Supported types of experiments include:</span></p>
<ul>
<li>Whole-genome sequencing</li>
<li>Whole-exome sequencing</li>
<li>RNA-seq (special mode available)</li>
<li>ChIP-seq</li>
</ul><p>Address of the bookmark: <a href="http://qualimap.bioinfo.cipf.es/" rel="nofollow">http://qualimap.bioinfo.cipf.es/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38535/nanopack-visualizing-and-processing-long-read-sequencing-data</guid>
	<pubDate>Tue, 25 Dec 2018 21:20:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38535/nanopack-visualizing-and-processing-long-read-sequencing-data</link>
	<title><![CDATA[NanoPack: visualizing and processing long-read sequencing data]]></title>
	<description><![CDATA[The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools.<p>Address of the bookmark: <a href="https://github.com/wdecoster/nanopack" rel="nofollow">https://github.com/wdecoster/nanopack</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40598/mitoz-a-toolkit-for-animal-mitochondrial-genome-assembly-annotation-and-visualization</guid>
	<pubDate>Fri, 24 Jan 2020 04:09:15 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40598/mitoz-a-toolkit-for-animal-mitochondrial-genome-assembly-annotation-and-visualization</link>
	<title><![CDATA[MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization]]></title>
	<description><![CDATA[<p><span>MitoZ is a Python3-based toolkit that aims to automatically filter paired-end raw data (fastq files), assemble the genome, search for mitogenome sequences in the genome assembly result, annotate the mitogenome (producing a GenBank file), and visualize the mitogenome. MitoZ is available from&nbsp;</span><code>https://github.com/linzhi2013/MitoZ</code><span>.</span></p>
<p><span><a href="https://academic.oup.com/nar/article/47/11/e63/5377471">https://academic.oup.com/nar/article/47/11/e63/5377471</a></span></p><p>Address of the bookmark: <a href="https://github.com/linzhi2013/MitoZ" rel="nofollow">https://github.com/linzhi2013/MitoZ</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41009/genomics-public-data-links</guid>
	<pubDate>Thu, 13 Feb 2020 00:20:00 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41009/genomics-public-data-links</link>
	<title><![CDATA[genomics public data links !]]></title>
	<description><![CDATA[<p>List of publicly available databases hosted on Google servers.</p>
<p>More at <a href="https://software.broadinstitute.org/gatk/download/bundle">https://software.broadinstitute.org/gatk/download/bundle</a></p>
<p><a href="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/GATK/">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/GATK/</a>.</p>
<p><a href="ftp://ftp.broadinstitute.org/bundle/hg38/hg38bundle/">ftp://ftp.broadinstitute.org/bundle/hg38/hg38bundle/</a></p><p>Address of the bookmark: <a href="https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0?pli=1" rel="nofollow">https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0?pli=1</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41148/pbmm2a-minimap2-frontend-for-pacbio-native-data-formats</guid>
	<pubDate>Tue, 18 Feb 2020 03:36:22 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41148/pbmm2a-minimap2-frontend-for-pacbio-native-data-formats</link>
	<title><![CDATA[pbmm2: A minimap2 frontend for PacBio native data formats]]></title>
	<description><![CDATA[<p><em>pbmm2</em> is a SMRT C++ wrapper for <a href="https://github.com/lh3/minimap2">minimap2</a>'s C API. Its purpose is to support native PacBio in- and output, provide sets of recommended parameters, generate sorted output on-the-fly, and postprocess alignments. Sorted output can be used directly for polishing using GenomicConsensus, if BAM has been used as input to <em>pbmm2</em>. Benchmarks show that <em>pbmm2</em> outperforms BLASR in sequence identity, number of mapped bases, and especially runtime. <em>pbmm2</em> is the official replacement for BLASR.</p><p>Address of the bookmark: <a href="https://github.com/PacificBiosciences/pbmm2" rel="nofollow">https://github.com/PacificBiosciences/pbmm2</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/38277/understating-pacbio-reads-name</guid>
	<pubDate>Fri, 23 Nov 2018 07:36:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/38277/understating-pacbio-reads-name</link>
	<title><![CDATA[Understanding PacBio read names]]></title>
	<description><![CDATA[<pre>m140415_143853_42175_c100635972550000001823121909121417_s1_p0/553/3100_11230 0.99 24
└1┘└─────2─────┘└──3─┘└────────────────4────────────────┘└5┘└6┘└7┘└────8────┘└─9─┘└10┘
</pre><ol>
<li>"<code>m</code>" =&nbsp;<em>movie</em></li>
<li>Time of Run Start (<code>yymmdd_hhmmss</code>)</li>
<li>Instrument Serial Number</li>
<li>SMRT Cell Barcode</li>
<li>Set Number (a.k.a. "Look Number". Deprecated field, used in earlier versions of RS)</li>
<li>Part Number (usually "<code>p0</code>", "<code>X0</code>" when using expired reagents)</li>
<li>ZMW hole number</li>
<li>Subread Region (<code>start_stop</code>&nbsp;using polymerase read coordinates)</li>
<li>readScore</li>
<li>barcodeScore</li>
</ol>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35033/bbsplit-read-binning-tool-for-metagenomes-and-contaminated-libraries</guid>
	<pubDate>Wed, 03 Jan 2018 00:25:27 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35033/bbsplit-read-binning-tool-for-metagenomes-and-contaminated-libraries</link>
	<title><![CDATA[BBSplit: Read Binning Tool for Metagenomes and Contaminated Libraries]]></title>
	<description><![CDATA[<p>BBSplit internally uses BBMap to map reads to multiple genomes at once and determines which genome they match best. This differs from ordinary mapping. If a genome (say, human) contains an exact repeat somewhere, reads mapping to it will be mapped ambiguously. But if you want to determine whether reads are mouse or human, it does not matter whether they map ambiguously within human, only whether they are ambiguous between human and mouse. BBSplit tracks this additional ambiguity information and decides how to use it based on the &ldquo;ambig2&rdquo; flag. The normal use of BBSplit is like Seal, either quantifying how many reads go to each reference, or splitting the reads into multiple output files, one per reference. BBSplit can only be run using references indexed with BBSplit, as they contain additional information regarding which sequences came from which reference file.</p><p><span>BBSplit is a tool that bins reads by mapping to multiple references simultaneously, using&nbsp;</span><a href="http://seqanswers.com/forums/showthread.php?t=41057" target="_blank">BBMap</a><span>. The reads go to the bin of the reference they map to best. There are also disambiguation options, such that reads that map to multiple references can be binned with all of them, none of them, one of them, or put in a special "ambiguous" file for each of them. 
Paired reads will always be kept together.</span><br /><br /><span>For example, if you had a library that was contaminated with E. coli and Salmonella, you could do this:</span><br /><br /><strong>bbsplit.sh in=reads.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu=clean.fq int=t</strong><br /><br /><span>This will produce 3 output files:</span><br /><strong>out_ecoli.fq</strong><span>&nbsp;(E. coli reads)</span><br /><strong>out_salmonella.fq</strong><span>&nbsp;(Salmonella reads)</span><br /><strong>clean.fq</strong><span>&nbsp;(unmapped reads)</span><br /><br /><span>In this case, "int=t" means that the input file is paired and interleaved. For single-end reads you would leave that out. For paired reads in 2 files, you would do this:</span><br /><strong>bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq</strong></p><p><strong><span>BBSplit is available here:</span><br /><a href="https://sourceforge.net/projects/bbmap/" target="_blank">https://sourceforge.net/projects/bbmap/</a></strong></p><p><span>The sensitivity can be raised to be equivalent to BBMap with these flags: "minratio=0.56 minhits=1 maxindel=16000"</span></p>]]></description>
	<dc:creator>Poonam Mahapatra</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/43977/read-simulators</guid>
	<pubDate>Fri, 30 Sep 2022 06:48:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/43977/read-simulators</link>
	<title><![CDATA[Read Simulators]]></title>
	<description><![CDATA[<h1>Short Read Simulators</h1><p>With the popularity of next-generation sequencing (NGS) technologies, many NGS read simulators have been developed. Currently, many of the popular short read simulators are designed to simulate reads mimicking the Illumina, 454 and SOLiD platforms. Listed below are some popular short read simulators. Links to their publications are provided as well.</p><ol>
<li><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0003373" target="_blank">MetaSim</a></li>
<li><a href="https://github.com/lh3/wgsim" target="_blank">wgsim</a></li>
<li><a href="https://github.com/timmassingham/simNGS" target="_blank">SimNGS</a></li>
<li><a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0049110" target="_blank">ArtificialFastqGenerator</a></li>
<li id="e943"><a href="https://academic.oup.com/bioinformatics/article/35/3/521/5055123" target="_blank">InSilicoSeq</a></li>
</ol><h1>Long Read Simulators</h1><p id="d469">With the advancements in sequencing technologies, scientists have shown an increasing interest in using third-generation sequencing (TGS) technologies. Currently, many of the popular long read simulators are designed to simulate reads mimicking the two main TGS technologies: (1)&nbsp;<em>Pacific Biosciences (PacBio)</em>&nbsp;and (2)&nbsp;<em>Oxford Nanopore (ONT)</em>. Listed below are some of the popular and recently introduced PacBio and ONT simulators. Links to their publications are provided as well.</p><h2><span>PacBio Simulators</span></h2><ol>
<li><a href="https://academic.oup.com/bioinformatics/article/29/1/119/273243" target="_blank">PBSIM</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/32/24/3829/2525710" target="_blank">LongISLND</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/32/17/2704/2450740" target="_blank">SimLoRD</a></li>
<li><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2208-0" target="_blank">NPBSS</a></li>
<li id="fed0"><a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2901-7" target="_blank">PaSS</a></li>
</ol><h2><span>ONT Simulators</span></h2><ol>
<li id="f145"><a href="https://academic.oup.com/gigascience/article/6/4/gix010/3051934" target="_blank">NanoSim</a></li>
<li id="c6f5"><a href="https://ieeexplore.ieee.org/document/8621253" target="_blank">Nanopore SimulatION</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/34/17/2899/4962495" target="_blank">DeepSimulator</a></li>
<li><a href="https://academic.oup.com/bioinformatics/article/36/8/2578/5698265" target="_blank">DeepSimulator1.5</a></li>
</ol>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/39190/chipulate-a-python3-framework-to-simulate-read-counts-in-a-chip-seq-experiment</guid>
	<pubDate>Mon, 25 Mar 2019 12:46:47 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/39190/chipulate-a-python3-framework-to-simulate-read-counts-in-a-chip-seq-experiment</link>
	<title><![CDATA[ChIPulate: A Python3 framework to simulate read counts in a ChIP-seq experiment]]></title>
	<description><![CDATA[<p><span style="color: #202020; font-size: 13px; font-style: normal; font-weight: 400; text-align: start; background-color: #ffffff; float: none;">ChIPulate is a ChIP-seq simulation pipeline for assessing the impact of various biological and experimental sources of variation on several outcomes of a ChIP-seq experiment, viz., the recoverability of the TF binding motif, accuracy of TF-DNA binding detection, the sensitivity of inferred TF-DNA binding strength, and number of replicates needed to confidently infer binding strength.<span> <br></span></span></p><p>Address of the bookmark: <a href="https://github.com/vishakad/chipulate" rel="nofollow">https://github.com/vishakad/chipulate</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>

</channel>
</rss>