<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/35057?offset=50</link>
	<atom:link href="https://bioinformaticsonline.com/related/35057?offset=50" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44731/exploring-bacterial-comparative-genomics-a-bioinformatics-approach</guid>
	<pubDate>Sat, 14 Dec 2024 12:31:14 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44731/exploring-bacterial-comparative-genomics-a-bioinformatics-approach</link>
	<title><![CDATA[Exploring Bacterial Comparative Genomics: A Bioinformatics Approach]]></title>
	<description><![CDATA[<p>In the world of microbiology, bacteria have long fascinated scientists for their diversity, adaptability, and crucial roles in ecosystems and human health. Comparative genomics&mdash;a field that involves analyzing and comparing the genomes of different organisms&mdash;has revolutionized our understanding of bacterial evolution, adaptation, and pathogenicity. By leveraging bioinformatics tools and techniques, researchers can uncover genomic insights that were once hidden. This blog delves into the principles, methodologies, and applications of bacterial comparative genomics from a bioinformatics perspective.</p><h4><strong>What is Bacterial Comparative Genomics?</strong></h4><p>Comparative genomics involves the systematic comparison of genomes across different bacterial species or strains. This approach allows scientists to:</p><ul>
<li>
<p>Identify conserved and unique genes.</p>
</li>
<li>
<p>Explore genetic determinants of pathogenicity.</p>
</li>
<li>
<p>Understand bacterial evolution and phylogenetics.</p>
</li>
<li>
<p>Investigate horizontal gene transfer and its role in antibiotic resistance.</p>
</li>
</ul><p>Bioinformatics is central to these analyses, enabling the processing and interpretation of large-scale genomic data.</p><h4><strong>Key Steps in Bacterial Comparative Genomics</strong></h4><ol>
<li>
<p><strong>Genome Sequencing and Assembly</strong>: The process begins with obtaining high-quality bacterial genome sequences. Advances in next-generation sequencing (NGS) technologies have made it faster and more affordable to sequence bacterial genomes. Tools such as SPAdes and Velvet are commonly used for genome assembly.</p>
</li>
<li>
<p><strong>Genome Annotation</strong>: Annotating a genome involves identifying genes, regulatory elements, and other genomic features. Automated tools like Prokka and RAST provide functional annotations, allowing researchers to predict the roles of genes and proteins.</p>
</li>
<li>
<p><strong>Genome Alignment</strong>: Aligning genomes is crucial for identifying conserved regions, single-nucleotide polymorphisms (SNPs), and structural variations. Tools like Mauve and progressiveMauve are commonly employed for whole-genome alignments.</p>
</li>
<li>
<p><strong>Comparative Analyses</strong>:</p>
<ul>
<li>
<p><strong>Core and Pan-genome Analysis</strong>: The core genome consists of genes shared across all strains of a species, while the pan-genome includes all genes found in any strain. Software like Roary and BPGA can perform core and pan-genome analyses.</p>
</li>
<li>
<p><strong>Phylogenetic Analysis</strong>: Comparative genomics often involves reconstructing evolutionary relationships. Tools such as MEGA and IQ-TREE facilitate phylogenetic tree construction based on genomic data.</p>
</li>
<li>
<p><strong>Functional Enrichment Analysis</strong>: To understand the biological significance of unique or shared genes, functional enrichment analysis using databases like GO (Gene Ontology) and KEGG is essential.</p>
</li>
</ul>
</li>
</ol><div>&nbsp;<strong style="font-size: 1em;">Recommended Bioinformatics Tools for Comparative Genomics</strong></div><p>Here are some additional bioinformatics tools that can aid bacterial comparative genomics:</p><ul>
<li>
<p><strong>OrthoFinder</strong>: For accurate ortholog identification across multiple genomes.</p>
</li>
<li>
<p><strong>PanOCT</strong>: Specifically designed for pan-genome clustering and annotation.</p>
</li>
<li>
<p><strong>FastANI</strong>: A tool for calculating Average Nucleotide Identity (ANI) for microbial genome comparisons.</p>
</li>
<li>
<p><strong>Circos</strong>: For visually comparing genomic data through circular genome plots.</p>
</li>
<li>
<p><strong>Galaxy Platform</strong>: A user-friendly web-based platform offering numerous genomic analysis tools.</p>
</li>
<li>
<p><strong>BLAST</strong>: Essential for sequence alignment and similarity searches.</p>
</li>
<li>
<p><strong>PhyloSift</strong>: Focused on phylogenetic analysis of microbial genomes using marker genes.</p>
</li>
</ul><p>These tools, in combination with the methods discussed, provide a robust framework for conducting comprehensive comparative genomic studies.</p><h4><strong>Applications of Bacterial Comparative Genomics</strong></h4><ol>
<li>
<p><strong>Understanding Pathogenicity</strong>: Comparative genomics helps identify virulence factors that distinguish pathogenic strains from non-pathogenic relatives. For instance, comparing genomes of <em>Escherichia coli</em> strains has revealed key genetic determinants of pathogenicity in enterohemorrhagic strains.</p>
</li>
<li>
<p><strong>Antibiotic Resistance Research</strong>: The spread of antibiotic resistance genes through horizontal gene transfer is a major global concern. Comparative analyses can trace the origins and dissemination of resistance genes, aiding in the development of countermeasures.</p>
</li>
<li>
<p><strong>Microbial Ecology and Evolution</strong>: By studying genomic variations, researchers can understand how bacteria adapt to different environments. This is particularly relevant for extremophiles and symbiotic bacteria.</p>
</li>
<li>
<p><strong>Vaccine Development</strong>: Identifying conserved antigens across pathogenic strains is critical for vaccine design. Comparative genomics has been instrumental in developing vaccines against pathogens like <em>Neisseria meningitidis</em>.</p>
</li>
<li>
<p><strong>Biotechnology Applications</strong>: Comparative studies can uncover unique metabolic pathways in bacteria, paving the way for applications in bioremediation, synthetic biology, and industrial microbiology.</p>
</li>
</ol><h4><strong>Challenges in Bacterial Comparative Genomics</strong></h4><p>While the field has made significant strides, several challenges remain:</p><ul>
<li>
<p><strong>Data Overload</strong>: The rapid growth of sequencing data requires robust computational infrastructure and efficient algorithms.</p>
</li>
<li>
<p><strong>Genome Plasticity</strong>: High rates of horizontal gene transfer and genome rearrangements in bacteria complicate comparative analyses.</p>
</li>
<li>
<p><strong>Annotation Accuracy</strong>: Automated annotation tools are not infallible, and manual curation is often needed for high-confidence results.</p>
</li>
<li>
<p><strong>Interpreting Non-Coding Regions</strong>: Understanding the functional significance of non-coding genomic regions remains a challenge.</p>
</li>
</ul><h4><strong>Future Directions</strong></h4><p>The integration of bacterial comparative genomics with other &lsquo;omics&rsquo; approaches&mdash;such as transcriptomics, proteomics, and metabolomics&mdash;promises a more comprehensive understanding of bacterial biology. Additionally, advancements in machine learning and artificial intelligence are likely to further enhance bioinformatics analyses, enabling the prediction of complex phenotypes from genomic data.</p><h4><strong>Conclusion</strong></h4><p>Bacterial comparative genomics, driven by bioinformatics, continues to unravel the complexities of bacterial life. From combating antibiotic resistance to uncovering the secrets of microbial evolution, this interdisciplinary field holds immense potential for addressing pressing challenges in microbiology and beyond. As technology advances, so too will our ability to harness the power of comparative genomics for scientific and societal benefit.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/fun/view/4196/chemical-elements-of-bioinformatics</guid>
	<pubDate>Tue, 03 Sep 2013 16:35:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/fun/view/4196/chemical-elements-of-bioinformatics</link>
	<title><![CDATA[Chemical Elements of Bioinformatics]]></title>
	<description><![CDATA[<p>You are probably familiar with the periodic table and its colour pattern, but this time you will be amazed by a new table of elements from Eagle Genomics. Just check it out and have fun :)</p><p><a href="http://elements.eaglegenomics.com/">http://elements.eaglegenomics.com/</a></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</guid>
	<pubDate>Tue, 07 Nov 2017 05:33:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</link>
	<title><![CDATA[Alignment-free sequence comparison tools available for next-generation sequencing data analysis]]></title>
	<description><![CDATA[<div><p><span>kallisto</span></p></div><div><p>Transcript abundance quantification from RNA-seq data (uses pseudoalignment for rapid determination of read compatibility with targets)</p><p>Software (C++)</p><p><a href="https://pachterlab.github.io/kallisto/">https://pachterlab.github.io/kallisto/</a></p><p>Sailfish</p><p>Estimation of isoform abundances from reference sequences and RNA-seq data (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="http://www.cs.cmu.edu/~ckingsf/software/sailfish/">http://www.cs.cmu.edu/~ckingsf/software/sailfish/</a></p><p>Salmon</p><p>Quantification of the expression of transcripts using RNA-seq data (uses&nbsp;<em>k</em>-mers)</p><p><a href="https://combine-lab.github.io/salmon/">https://combine-lab.github.io/salmon/</a></p><p>RNA-Skim</p><p>RNA-seq quantification at transcript-level (partitions the transcriptome into disjoint transcript clusters; uses&nbsp;<em>sig</em>-mers, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C++)</p><p><a href="http://www.csbio.unc.edu/rs/">http://www.csbio.unc.edu/rs/</a></p><p>Variant calling</p><p>ChimeRScope</p><p>Fusion transcript prediction using gene&nbsp;<em>k</em>-mers profiles of the RNA-seq paired-end reads</p><p>Software (Java)</p><p><a href="https://github.com/ChimeRScope/ChimeRScope/wiki">https://github.com/ChimeRScope/ChimeRScope/wiki</a></p><p>FastGT</p><p>Genotyping of known SNV/SNP variants directly from raw NGS sequence reads by counting unique&nbsp;<em>k</em>-mers</p><p>Software (C)</p><p><a href="https://github.com/bioinfo-ut/GenomeTester4/">https://github.com/bioinfo-ut/GenomeTester4/</a></p><p>Phy-Mer</p><p>Reference-independent mitochondrial haplogroup classifier from NGS data (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/danielnavarrogomez/phy-mer">https://github.com/danielnavarrogomez/phy-mer</a></p><p>LAVA</p><p>Genotyping of known SNPs (dbSNP and Affymetrix's Genome-Wide Human SNP Array) from raw NGS 
reads (<em>k</em>-mer based)</p><p>Software (C)</p><p><a href="http://lava.csail.mit.edu/">http://lava.csail.mit.edu/</a></p><p>MICADo</p><p>Detection of mutations in targeted third-generation NGS data (can distinguish patients&rsquo; specific mutations; algorithm uses&nbsp;<em>k</em>-mers and is based on colored de Bruijn graphs)</p><p>Software (Python)</p><p><a href="http://github.com/cbib/MICADo">http://github.com/cbib/MICADo</a></p><p>General mapper</p><p>Minimap</p><p>Lightweight and fast read mapper and read overlap detector (uses the concept of &ldquo;minimizers&rdquo;, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C)</p><p><a href="https://github.com/lh3/minimap">https://github.com/lh3/minimap</a></p><p>Assembly</p><p>De novo genome assembly</p><p>MHAP</p><p>Produces highly continuous assembly (fully resolved chromosome arms) from third-generation long and noisy reads (10 kbp) using the MinHash dimensionality reduction technique</p><p>Software (Java)</p><p><a href="https://github.com/marbl/MHAP">https://github.com/marbl/MHAP</a></p><p>Miniasm</p><p>Assembler of long noisy reads (SMRT, ONT) using the Overlap-Layout-Consensus (OLC) approach without the necessity of an error correction stage (uses minimap)</p><p>Software (C)</p><p><a href="https://github.com/lh3/miniasm">https://github.com/lh3/miniasm</a></p><p>LINKS</p><p>Scaffolding genome assembly with error-containing long sequence (e.g., ONT or PacBio reads, draft genomes)</p><p>Software (Perl)</p><p><a href="https://github.com/warrenlr/LINKS/">https://github.com/warrenlr/LINKS/</a></p><p>Read clustering</p><p>afcluster</p><p>Clustering of reads from different genes and different species based on&nbsp;<em>k</em>-mer counts</p><p>Software (C++)</p><p><a href="https://github.com/luscinius/afcluster">https://github.com/luscinius/afcluster</a></p><p>QCluster</p><p>Clustering of reads with alignment-free measures (<em>k</em>-mer based) and quality values</p><p>Software (C++)</p><p><a 
href="http://www.dei.unipd.it/~ciompin/main/qcluster.html">http://www.dei.unipd.it/~ciompin/main/qcluster.html</a></p><p>Reads error correction</p><p>Lighter</p><p>Correction of sequencing errors in raw, whole genome sequencing reads (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="https://github.com/mourisl/Lighter">https://github.com/mourisl/Lighter</a></p><p>QuorUM</p><p>Error corrector for Illumina reads using <em>k</em>-mers</p><p>Software (C++)</p><p><a href="https://github.com/gmarcais/Quorum">https://github.com/gmarcais/Quorum</a></p><p>Trowel</p><p>Software (C++)</p><p><a href="https://sourceforge.net/projects/trowel-ec/">https://sourceforge.net/projects/trowel-ec/</a></p><p>Metagenomics</p><p>Assembly-free phylogenomics</p><p>AAF</p><p>Phylogeny reconstruction directly from unassembled raw sequence data from whole genome sequencing projects; provides bootstrap support to assess uncertainty in the tree topology (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/fanhuan/AAF">https://github.com/fanhuan/AAF</a></p><p>kSNP v3</p><p>Reference-free SNP identification and estimation of phylogenetic trees using SNPs (based on&nbsp;<em>k</em>-mer analysis)</p><p>Software (C)</p><p><a href="https://sourceforge.net/projects/ksnp/files/">https://sourceforge.net/projects/ksnp/files/</a></p><p>NGS-MC</p><p>Phylogeny of species based on NGS reads using alignment-free sequence dissimilarity measures d2* and d2S under different Markov chain models (using&nbsp;<em>k</em>-words)</p><p>R package</p><p><a href="http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html">http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html</a></p><p>Species identification/taxonomic profiling</p><p>CLARK</p><p>Taxonomic classification of metagenomic reads to known bacterial genomes using&nbsp;<em>k</em>-mer search and LCA assignment</p><p>Software (C++)</p><p><a href="http://clark.cs.ucr.edu/">http://clark.cs.ucr.edu/</a></p><p>FOCUS</p><p>Reports 
organisms present in metagenomic samples and profiles their abundances (uses composition-based approach and non-negative least squares for prediction)</p><p>Web service Software (Python)</p><p><a href="http://edwards.sdsu.edu/FOCUS/">http://edwards.sdsu.edu/FOCUS/</a></p><p>GSM</p><p>Estimation of abundances of microbial genomes in metagenomic samples (<em>k</em>-mer based)</p><p>Software (Go)</p><p><a href="https://github.com/pdtrang/GSM">https://github.com/pdtrang/GSM</a></p><p>Mash</p><p>Species identification using assembled or unassembled Illumina, PacBio, and ONT data (based on MinHash dimensionality-reduction technique)</p><p>Software (C++)</p><p><a href="https://github.com/marbl/mash">https://github.com/marbl/mash</a></p><p>Kraken</p><p>Taxonomic assignment in metagenome analysis by exact&nbsp;<em>k</em>-mer search; LCA assignment of short reads based on a comprehensive sequence database</p><p>Software (C++)</p><p><a href="https://ccb.jhu.edu/software/kraken/">https://ccb.jhu.edu/software/kraken/</a></p><p>LMAT</p><p>Assignment of taxonomic labels to reads by&nbsp;<em>k</em>-mers searches in precomputed database</p><p>Software (C++/Python)</p><p><a href="https://sourceforge.net/projects/lmat/">https://sourceforge.net/projects/lmat/</a></p><p>stringMLST</p><p><em>k</em>-mer-based tool for MLST directly from the genome sequencing reads</p><p>Software (Python)</p><p><a href="http://jordan.biology.gatech.edu/page/software/stringMLST">http://jordan.biology.gatech.edu/page/software/stringMLST</a></p><p>Taxonomer</p><p><em>k</em>-mer-based ultrafast metagenomics tool for assigning taxonomy to sequencing reads from clinical and environmental samples</p><p>Web service</p><p><a href="http://taxonomer.iobio.io/">http://taxonomer.iobio.io/</a></p><p>Other</p><p>d2-tools</p><p>Word-based (<em>k</em>-tuple) comparison (pairwise dissimilarity matrix using d2S measure) of metatranscriptomic samples from NGS reads</p><p>Software (Python/R)</p><p><a 
href="https://code.google.com/p/d2-tools/">https://code.google.com/p/d2-tools/</a></p><p>VirHostMatcher</p><p>Prediction of hosts from metagenomic viral sequences based on ONF using various distance measures (e.g., d2)</p><p>Software (C++)</p><p><a href="https://github.com/jessieren/VirHostMatcher">https://github.com/jessieren/VirHostMatcher</a></p><p>MetaFast</p><p>Statistics calculation of metagenome sequences and the distances between them based on assembly using de Bruijn graphs and Bray&ndash;Curtis dissimilarity measure</p><p>Software (Java)</p><p><a href="https://github.com/ctlab/metafast">https://github.com/ctlab/metafast</a></p></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/35429/list-of-visualization-tools-for-genome-alignments</guid>
	<pubDate>Fri, 02 Feb 2018 13:25:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/35429/list-of-visualization-tools-for-genome-alignments</link>
	<title><![CDATA[List of visualization tools for genome alignments]]></title>
	<description><![CDATA[<p><span>Genome</span><span>&nbsp;browsers are useful not only for showing final results but also for improving analysis protocols, testing data quality, and generating result drafts. Their integration into analysis pipelines allows parameters to be optimized, which leads to better results. Sometimes, however, we need publication-ready figures of genomes. The following is a list of genome alignment visualization tools that could be useful for the analysis and&nbsp;interpretation of results:</span></p><p>ABySS Explorer</p><p>Interactive Java application that uses a novel graph-based representation to display a sequence assembly and associated metadata</p><p>http://www.bcgsc.ca/platform/bioinfo/software/abyss-explorer</p><p>BamView</p><p>Genome browser and annotation tool that allows visualization of sequence features, next-generation sequencing (NGS) data and the results of analyses within the context of the sequence, and also its six-frame translation</p><p>http://www.sanger.ac.uk/resources/software/artemis/</p><p>DNannotator&nbsp;</p><p>Annotation web toolkit for regional genomic sequences</p><p>http://bioapp.psych.uic.edu/DNannotator.htm</p><p>JVM&nbsp;</p><p>Java Visual Mapping tool for NGS reads</p><p>http://www.springer.com/cda/content/document/cda_downloaddocument/9789401792448-c2.pdf?SGWID=0-0-45-1487072-p176815501</p><p>LookSeq&nbsp;</p><p>Web-based visualization of sequences derived from multiple sequencing technologies. 
Low- or high-depth read pileups and easy visualization of putative single nucleotide and structural variation</p><p>http://lookseq.sourceforge.net</p><p>MagicViewer&nbsp;</p><p>Visualization of short read alignment, identification of genetic variation and association with annotation information of a reference genome</p><p>http://bioinformatics.zj.cn/magicviewer/</p><p>MapView&nbsp;</p><p>Alignments of huge-scale single-end and paired-end short reads</p><p>http://omictools.com/mapview-s1367.html</p><p>MultiPipMaker</p><p>Computes alignments of similar regions in two DNA sequences. The resulting alignments are summarized with a &lsquo;percent identity plot&rsquo; (pip)</p><p>http://pipmaker.bx.psu.edu/pipmaker/</p><p>PileLineGUI&nbsp;</p><p>Handling genome position files in NGS studies</p><p>http://sing.ei.uvigo.es/pileline/pilelinegui.html</p><p>SAMtools tview&nbsp;</p><p>Simple and fast text alignment viewer; NGS compatible</p><p>http://www.htslib.org/</p><p>SEWAL</p><p>Uses a locality-sensitive hashing algorithm to enumerate all unique sequences in an entire Illumina sequencing run</p><p>http://www.sourceforge.net/projects/sewal</p><p>STAR&nbsp;</p><p>A web-based integrated solution for the management and visualization of sequencing data</p><p>http://wanglab.ucsd.edu/star/browser</p><p>SVA&nbsp;</p><p>Software for annotating and visualizing sequenced human genomes</p><p>http://www.svaproject.org</p><p>Integrative Genomics Viewer (IGV)&nbsp;</p><p>Visualization of large heterogeneous datasets, providing a smooth and intuitive user experience at all levels of genome resolution</p><p>https://www.broadinstitute.org/igv/</p><p>ZOOM Lite&nbsp;</p><p>NGS data mapping and visualization software</p><p>http://bioinfor.com/zoom/lite/</p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36842/gap-filling-or-contigs-extensions-tools</guid>
	<pubDate>Fri, 01 Jun 2018 08:07:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36842/gap-filling-or-contigs-extensions-tools</link>
	<title><![CDATA[Gap filling or Contigs extensions tools !]]></title>
	<description><![CDATA[
<p>There are many tools to perform gap filling using Illumina short reads, for example "GapFiller: a de novo assembly approach to fill the gap within paired reads" or "Toward almost closed genomes with GapFiller". There are also some tools like GAPresolution that can help to perform local re-assemblies using 454 reads. We used GAPresolution, but it is not very good software; it is useful only in some specific situations.</p>

<p>Take a look at the PRICE software from the DeRisi lab. It's meant to do something very similar. http://derisilab.ucsf.edu/index.php?page=software</p>

<p>You could also look at SSPACE (http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/sspacev12/), ATLAS tools (http://www.hgsc.bcm.tmc.edu/content/bcm-hgsc-software), and SCARPA (http://compbio.cs.toronto.edu/hapsembler/scarpa.html).</p>

<p>See the PAGIT protocol: http://www.sanger.ac.uk/resources/software/pagit/ </p>

<p>In particular, take a look at the IMAGE tool: http://genomebiology.com/2010/11/4/R41 </p>

<p>SOAPdenovo also has a function for scaffolding. Not sure about ABySS.</p>

<p>A useful explanation of several tools can be found here:</p>

<p>https://bioinformaticsonline.com/search?q=scaffolding&amp;entity_type=object&amp;entity_subtype=bookmarks&amp;offset=0&amp;search_type=entities</p>

<p>I could be wrong, but the above answers to your hypothetical scenario appear to miss the point that you aren't interested in assembling the full genome, just the 100 kb part you're interested in. I suggest the following algorithm:</p>

<p>1. Start with the initial assembly C0 of the contigs you have identified as overlapping your region of interest, and the set S of reads those contigs contain. Let C = C0.</p>

<p>2. Repeat:<br />a. Identify paired-end reads (not in C) for which one or both ends align within, or extending, contigs in C.<br />b. Identify unpaired reads that align extending these new paired-end reads.<br />c. Construct a new assembly C' from C and the new reads identified in (a) and (b).<br />d. Trim C' so it does not extend more than 100 kb to either end of C0. Set C = C'.<br />e. Let S' denote the reads that contribute to C'. If S' does not contain any reads not present in S, stop. Otherwise, set S = S'.</p>

<p>3. If you don't have a complete assembly of the region of interest, generate an STS for each end of each contig, probe a library for clones including these STSes, subclone these clones into a paired-end sequencing vector, and generate paired-end reads for this library; then try steps (1) and (2) again, adding these new sequencing reads to what you had before.</p>

<p>4. If your average sequencing depth for the region of interest exceeds 25 or so without filling all gaps, it is likely that the remaining gaps represent sequences that are not getting cloned in your sequencing vectors. Try different sequencing vectors.</p>
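
<p>The loop in step 2 is a fixed-point iteration: keep recruiting reads that extend what you already have, and stop once a pass adds nothing new. Below is a minimal Python sketch of that stopping logic only. It is not a real assembler: the function names (overlaps, recruit_reads), the exact suffix/prefix matching, and the min_overlap value are all assumptions made for illustration, and the sketch ignores read pairing, sequencing errors, and the 100 kb trimming of step 2d.</p>

```python
def overlaps(a, b, min_overlap):
    """Toy test: does a suffix of one read exactly equal a prefix of the other?"""
    longest = min(len(a), len(b))
    for size in range(longest, min_overlap - 1, -1):
        if a[-size:] == b[:size] or b[-size:] == a[:size]:
            return True
    return False


def recruit_reads(seed_reads, all_reads, min_overlap=4):
    """Grow the read set S until one full pass adds no new read (step 2e)."""
    recruited = set(seed_reads)
    while True:
        # Reads not yet in S that overlap at least one read already in S.
        new_reads = {
            read
            for read in all_reads
            if read not in recruited
            and any(overlaps(read, kept, min_overlap) for kept in recruited)
        }
        if not new_reads:
            return recruited  # S' adds nothing to S: stop
        recruited |= new_reads


if __name__ == "__main__":
    seed = ["ACGTACGG"]
    pool = ["ACGGTTTA", "TTTACCCA", "GGGGGGGG"]
    print(sorted(recruit_reads(seed, pool)))
```

<p>In this toy input the second pool read is only reachable through the first one, so it is picked up on the second pass; the unrelated read is never recruited. A real implementation would replace the exact-match test with an aligner and rebuild the assembly on every pass.</p>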
]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38743/molinspiration-broad-range-of-cheminformatics-software-tools-supporting-molecule-manipulation</guid>
	<pubDate>Sun, 20 Jan 2019 05:32:40 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38743/molinspiration-broad-range-of-cheminformatics-software-tools-supporting-molecule-manipulation</link>
	<title><![CDATA[molinspiration: broad range of cheminformatics software tools supporting molecule manipulation]]></title>
	<description><![CDATA[<p><span>Molinspiration offers a&nbsp;</span><a href="https://www.molinspiration.com/products.html">broad range of cheminformatics software tools</a><span>&nbsp;supporting molecule manipulation and processing, including SMILES and SDfile conversion, normalization of molecules, generation of tautomers, molecule fragmentation, calculation of various molecular properties needed in QSAR, molecular modelling and drug design, high quality molecule depiction, and molecular database tools supporting substructure and similarity searches. Our products also support fragment-based virtual screening, bioactivity prediction and data visualization. Molinspiration tools are written in Java and can therefore be used on practically any computer platform.</span></p><p>Address of the bookmark: <a href="https://www.molinspiration.com/" rel="nofollow">https://www.molinspiration.com/</a></p>]]></description>
	<dc:creator>BioJoker</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41559/dahak-benchmarking-and-containerization-of-tools-for-analysis-of-complex-non-clinical-metagenomes</guid>
	<pubDate>Thu, 09 Apr 2020 04:56:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41559/dahak-benchmarking-and-containerization-of-tools-for-analysis-of-complex-non-clinical-metagenomes</link>
	<title><![CDATA[Dahak: benchmarking and containerization of tools for analysis of complex non-clinical metagenomes.]]></title>
	<description><![CDATA[<p><span>Dahak is a software suite that integrates state-of-the-art open source tools for metagenomic analyses. Tools in the dahak software suite will perform various steps in metagenomic analysis workflows including data pre-processing, metagenome assembly, taxonomic and functional classification, genome binning, and gene assignment. We aim to deliver the analytical framework as a robust and reliable containerized workflow system, which will be free from dependency, installation, and execution problems typically associated with other open-source bioinformatics solutions. This will maximize the transparency, data provenance (i.e., the process of tracing the origins of data and its movement through the workflow), and reproducibility.</span></p>
<p><span>More at&nbsp;<a href="https://dahak-metagenomics.github.io/dahak/">https://dahak-metagenomics.github.io/dahak/</a></span></p><p>Address of the bookmark: <a href="https://github.com/dahak-metagenomics/dahak" rel="nofollow">https://github.com/dahak-metagenomics/dahak</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</guid>
	<pubDate>Tue, 27 Oct 2020 19:42:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/42275/frequent-parameters-for-bioinformatics-tools</link>
	<title><![CDATA[Frequent parameters for bioinformatics tools !]]></title>
	<description><![CDATA[<div><div>Third party executable parameters and options.</div><div>&nbsp;</div><div>Trimmomatic</div><div>&nbsp;</div><div>&ldquo;ILLUMINACLIP:...:2:30:10&rdquo;</div><div>&ldquo;LEADING:15&rdquo;</div><div>&ldquo;TRAILING:15&rdquo;</div><div>&ldquo;SLIDINGWINDOW:4:20&rdquo;</div><div>&ldquo;MINLEN:20&rdquo;</div><div>&ldquo;TOPHRED33&rdquo;</div><div>&nbsp;</div><div>Filtlong</div><div>--min_length 500</div><div>--min_mean_q 85</div><div>--min_window_q 65</div><div>&nbsp;</div><div>FastQ Screen</div><div>--aligner bowtie2 (bwa for PacBio)</div><div>--subset 1000 (for PacBio)</div><div>&nbsp;</div><div>SPAdes</div><div>--careful</div><div>--disable-gzip-output</div><div>--cov-cutoff auto</div><div>--phred-offset 33</div><div>&nbsp;</div><div>HGAP</div><div>Pbalign.task_options.min_accuracy: 70</div><div>Pbalign.task_options.no_split_subreads: false</div><div>Genomic_consensus.task_options.min_confidence: 40</div><div>falcon_ns.task_options.HGAP_GenomeLength_str:</div><div>6000000</div><div>Pbcoretools.task_options.read_length: 0</div><div>Genomic_consensus.task_options.use_score: 0</div><div>Pbalign.task_options.min_length: 50</div><div>Pbalign.task_options.algorithm_options: --minMatch 12</div><div>--bestn 10 --minPctSimilarity 70.0</div><div>Pbalign.task_options.hit_policy: randombest</div><div>Pbcoretools.task_options.other_filters: rq &gt;= 0.7</div><div>Pbalign.task_options.concordant: false</div><div>Genomic_consensus.task_options.min_coverage: 5</div><div>falcon_ns.task_options.HGAP_SeedCoverage_str: 30</div><div>falcon_ns.task_options.HGAP_AggressiveAsm_bool: false</div><div>Genomic_consensus.task_options.algorithm: best</div><div>falcon_ns.task_options.HGAP_SeedLengthCutoff_str: -1</div><div>Genomic_consensus.task_options.diploid: false</div><div>&nbsp;</div><div>MeDuSa</div><div>-random 
100</div><div>&nbsp;</div><div>Prokka</div><div>--usegenus</div><div>--force</div><div>--addgenes</div><div>--rfam</div><div>--rawproduct</div><div>&nbsp;</div><div>cmsearch (taxonomy, 16S)</div><div>--rfam</div><div>--noali</div><div>&nbsp;</div><div>blastn (taxonomy, 16S)</div><div>-evalue 1E-10</div><div>&nbsp;</div><div>blastn (MLST)</div><div>-ungapped</div></div><div><div>-dust no</div><div>-evalue 1E-20</div><div>-word_size 32</div><div>-culling_limit 2</div><div>-perc_identity 95</div><div>&nbsp;</div><div>blastp (VF)</div><div>-culling_limit 2</div><div>&nbsp;</div><div>RGI (ABR)</div><div>--input_type contig</div><div>&nbsp;</div><div>bowtie2 (mapping)</div><div>--sensitive</div><div>&nbsp;</div><div>minimap2 (mapping)</div><div>-a</div><div>-x map-ont</div><div>&nbsp;</div><div>samtools mpileup (SNP&nbsp;detection)</div><div>-uRI</div><div>&nbsp;</div><div>bcftools call (SNP detection)</div><div>--variants-only</div><div>--skip-variants indels</div><div>--output-type v</div><div>--ploidy 1</div><div>-c</div><div>&nbsp;</div><div>SNPsift filter (SNP detection)</div><div>"( QUAL &gt;= 30 ) &amp; (( na FILTER ) | (FILTER = 'PASS')) &amp;</div><div>( DP &gt;= 20 ) &amp; ( MQ &gt;= 20 )"</div><div>&nbsp;</div><div>SNPeff ann (SNP detection)</div><div>-nodownload</div><div>-no-intron</div><div>-no-downstream</div><div>-no SPLICE_SITE_REGION</div><div>-upDownStreamLen 250</div><div>&nbsp;</div><div>bcftools consensus</div><div>(phylogenetic tree)</div><div>--haplotype 1</div><div>&nbsp;</div><div>fasttreemp</div><div>-nt</div><div>-boot 100</div><div>&nbsp;</div><div>roary</div><div>-e</div><div>-n</div><div>-cd 100</div><div>-g 100000</div></div>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43260/bioinformatics-tools-for-telomere-to-telomere-assembly</guid>
	<pubDate>Tue, 17 Aug 2021 13:17:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43260/bioinformatics-tools-for-telomere-to-telomere-assembly</link>
	<title><![CDATA[Bioinformatics tools for telomere-to-telomere assembly!]]></title>
	<description><![CDATA[<p>●&nbsp;<a href="https://github.com/arangrhie/merfin" target="_blank">Merfin</a>&nbsp;&ndash; k-mer-based assembly and variant calling evaluation for improved consensus accuracy (Arang Rhie)<br />●&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2020.11.11.378133v1" target="_blank">PanGenie</a>&nbsp;&ndash; algorithm that leverages a pangenome reference built from haplotype-resolved genome assemblies in conjunction with k-mer count information from raw, short-read sequencing data to genotype a wide spectrum of genetic variation (Tobias Marschall)<br />●&nbsp;<a href="https://github.com/ConesaLab/SQANTI3" target="_blank">SQANTI3</a>&nbsp;&ndash; an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline (Roc&iacute;o Amor&iacute;n de Heged&uuml;s&nbsp;<a href="https://twitter.com/rocioadh" target="_blank">@rocioadh</a>)<br />●&nbsp;<a href="https://github.com/GenomeRIK/tama" target="_blank">tama</a>&nbsp;(Transcriptome Annotation by Modular Algorithms) &ndash; software designed for processing Iso-Seq data and other long-read transcriptome data (Richard Kuo&nbsp;<a href="https://twitter.com/GenomeRIK" target="_blank">@GenomeRIK</a>)<br />●&nbsp;<a href="https://github.com/PacificBiosciences/pbAA" target="_blank">pbaa</a>&nbsp;(PacBio Amplicon Analysis) &ndash; separates complex mixtures of amplicon targets from genomic samples to cluster and generate high-quality consensus sequences from HiFi reads (Zev Kronenberg&nbsp;<a href="https://twitter.com/zevkronenberg" target="_blank">@zevkronenberg</a>)<br />●&nbsp;<a href="https://github.com/yuanyuan929/bellerophon" target="_blank">bellerophon</a>&nbsp;&ndash; analyzes MHC typing and other low-complexity gene amplicon data; performs allele calling while detecting polymorphic sites within the sequences and removing potential chimeric sequence variants (Yuanyuan Cheng&nbsp;<a href="https://twitter.com/Yuanyuan929" 
target="_blank">@Yuanyuan929</a>)<br />●&nbsp;<a href="https://github.com/amwenger/svpack" target="_blank">svpack</a>&nbsp;&ndash; tools for filtering, comparing, and annotating structural variant (SV) calls in VCF format (Aaron Wenger)<br />●&nbsp;<a href="https://github.com/AntonBankevich/jumboDB" target="_blank">JumboDB</a>&nbsp;&ndash; tool for de Bruijn graph construction (Anton Bankevich&nbsp;<a href="https://twitter.com/AntonBankevich" target="_blank">@AntonBankevich</a>)<br />●&nbsp;<a href="https://github.com/ksahlin/ultra" target="_blank">uLTRA</a>&nbsp;&ndash; tool for splice alignment of long transcriptomic reads to a genome, guided by a database of exon annotations (Kristoffer Sahlin&nbsp;<a href="https://twitter.com/krsahlin" target="_blank">@krsahlin</a>)<br />●&nbsp;<a href="https://www.biorxiv.org/content/10.1101/2021.01.25.428044v1.full.pdf" target="_blank">LeafGo</a>&nbsp;&ndash; workflow to rapidly produce high-quality de novo plant genomes (Luca Ermini&nbsp;<a href="https://twitter.com/ermini_luca" target="_blank">@ermini_luca</a>)</p><p>Reference:</p><p>https://www.pacb.com/blog/young-investigators-share-stellar-science-career-advice-and-bioinformatics-tools-at-smrt-leiden-2021/</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43791/comparative-genomics-visualisation-tools</guid>
	<pubDate>Thu, 17 Feb 2022 05:37:55 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43791/comparative-genomics-visualisation-tools</link>
	<title><![CDATA[Comparative genomics visualisation tools!]]></title>
	<description><![CDATA[<p>Comparative genomics visualisation tools!</p><p>Address of the bookmark: <a href="https://cmdcolin.github.io/awesome-genome-visualization/?latest=true&amp;selected=%23BRIG&amp;tag=Comparative" rel="nofollow">https://cmdcolin.github.io/awesome-genome-visualization/?latest=true&amp;selected=%23BRIG&amp;tag=Comparative</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>

</channel>
</rss>