<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/40531?offset=710</link>
	<atom:link href="https://bioinformaticsonline.com/related/40531?offset=710" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44002/interesting-bioinformatics-resources</guid>
	<pubDate>Fri, 11 Nov 2022 06:30:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44002/interesting-bioinformatics-resources</link>
	<title><![CDATA[Interesting Bioinformatics Resources !]]></title>
	<description><![CDATA[<p>1. A reproducible workflow.&nbsp;<a href="https://www.youtube.com/watch?v=s3JldKoA0zw">https://www.youtube.com/watch?v=s3JldKoA0zw</a>&nbsp;This two-minute video will change your mind about reproducible research.&nbsp;</p><p>2. Parallel sequencing lives, or what makes large sequencing projects successful&nbsp;<a href="https://academic.oup.com/gigascience/article/6/11/gix100/4557140?login=false">https://academic.oup.com/gigascience/article/6/11/gix100/4557140?login=false</a></p><p>3. Common-sense approaches to sharing tabular data alongside publication&nbsp;<a href="https://www.sciencedirect.com/science/article/pii/S2666389921002300">https://www.sciencedirect.com/science/article/pii/S2666389921002300</a></p><p>4. A Reproducible Data Analysis Workflow with R Markdown, Git, Make, and Docker&nbsp;<a href="https://psyarxiv.com/8xzqy/">https://psyarxiv.com/8xzqy/</a></p><p>5. Practical Computational Reproducibility in the Life Sciences&nbsp;<a href="https://www.cell.com/cell-systems/fulltext/S2405-4712(18)30140-6">https://www.cell.com/cell-systems/fulltext/S2405-4712(18)30140-6</a></p><p>6. The Importance of Reproducible Research in High-Throughput Biology, a video by Dr. Keith A. Baggerly from MD Anderson&nbsp;<a href="https://www.youtube.com/watch?v=7gYIs7uYbMo">https://www.youtube.com/watch?v=7gYIs7uYbMo</a>&nbsp;Highly recommended.</p><p>7. Ten Simple Rules for Reproducible Computational Research&nbsp;<a href="http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285">http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285</a></p><p>8. Good Enough Practices in Scientific Computing&nbsp;<a href="http://arxiv.org/abs/1609.00037">http://arxiv.org/abs/1609.00037</a>&nbsp;</p><p>9. Best Practices for Scientific Computing&nbsp;<a href="https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001745">https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001745</a></p><p>10. 
A Quick Guide to Organizing Computational Biology Projects&nbsp;<a href="http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.100042">http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.100042</a>&nbsp; A must read for computational biologists!</p><p>11. Reproducibility of computational workflows is automated using continuous analysis&nbsp;<a href="https://www.nature.com/articles/nbt.3780">https://www.nature.com/articles/nbt.3780</a></p><p>12. Five selfish reasons to work reproducibly&nbsp;<a href="https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0850-7">https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0850-7</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44618/important-bioinformatics-tools</guid>
	<pubDate>Tue, 30 Jul 2024 05:03:29 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44618/important-bioinformatics-tools</link>
	<title><![CDATA[Important Bioinformatics Tools !]]></title>
	<description><![CDATA[<p><span>1. Ktrim: An extra-fast, accurate adapter trimmer for sequencing data. It processes FASTQ files from multiple lanes with minimal mismatching and over-trimming of adapters.</span><span><br /></span><span><br /></span><span>2. BWA-MEM: A reliable alignment tool (particularly for mapping ALT contigs and HLA genes, which are not fully addressed in BWA-MEM2).</span><span><br /></span><span><br /></span><span>3. Sambamba markdup: Quickly marks or removes duplicate reads using Picard's criteria.</span><span><br /></span><span><br /></span><span>4. ichorCNA: Estimates the tumor DNA fraction in cell-free DNA from ultra-low-pass whole genome sequencing (0.1x coverage) based on copy number alterations (CNA).</span><span><br /></span><span><br /></span><span>5. Fragle: A deep learning method for quantifying ctDNA levels from cell-free DNA fragmentomic profiles. It detects tumor fractions (TF) as low as ~1% and works with targeted genomic panel sequencing data.</span><span><br /></span><span><br /></span><span>6. AlfredQC: A quality control tool for high-throughput sequencing data. It assesses metrics like read quality scores, GC content, and duplication rates, visualized through detailed plots and summary statistics.</span><span><br /></span><span><br /></span><span>7. Mosdepth: A fast tool for calculating sequencing coverage depth from BAM and CRAM files, offering a quicker alternative to samtools/sambamba depth.</span><span><br /></span><span><br /></span><span>8. Bedtools: A versatile toolkit for genomics, enabling operations like intersect, merge, count, and shuffle on genomic intervals across formats such as BAM, BED, GFF/GTF, and VCF.</span><span><br /></span><span><br /></span><span>9. Datamash: A command-line tool for basic numeric, textual, and statistical operations on input data streams. 
It supports operations such as grouping, sorting, transposing, and performing arithmetic calculations on tabular data.</span><span><br /></span><span><br /></span><span>10.</span><span> </span><a href="http://gwf.app/" target="_self">gwf.app</a><span>: A pragmatic alternative to Snakemake. Developed at</span><span> </span><a href="https://www.linkedin.com/company/aarhus-university-denmark-/" target="_self"><span>Aarhus University</span></a><span>, this flexible, generic workflow tool builds and runs large scientific workflows.</span></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44914/predicting-pathogen-virulence-using-bioinformatics-tools</guid>
	<pubDate>Tue, 04 Nov 2025 07:55:53 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44914/predicting-pathogen-virulence-using-bioinformatics-tools</link>
	<title><![CDATA[Predicting Pathogen Virulence Using Bioinformatics Tools]]></title>
	<description><![CDATA[<p>In the genomic era, the ability to predict the virulence potential of pathogens has become an indispensable part of infectious disease research. With the exponential growth of microbial genome data, bioinformatics tools now enable scientists to identify virulence factors, model pathogen behavior, and even forecast outbreak risks &mdash; all from sequence data.</p><p>In an age where pathogens continue to evolve and cross boundaries, understanding <strong>what makes them virulent</strong>&mdash;that is, capable of causing disease&mdash;has become a critical focus in modern microbiology and genomics. <strong>Virulence prediction</strong> bridges computational biology, genomics, and machine learning to forecast the pathogenic potential of microbes before they strike.</p><h3>What Is Virulence?</h3><p><em>Virulence</em> refers to the degree of damage a pathogen can inflict on its host. It is determined by a combination of genetic factors&mdash;called <strong>virulence factors (VFs)</strong>&mdash;that allow the organism to attach, invade, evade, and harm the host. These include genes coding for toxins, secretion systems, adhesins, and enzymes that disrupt host defenses.</p><p>Understanding virulence factors not only helps in deciphering the mechanisms of infection but also provides early warning signs for emerging threats.</p><h3>Why Predict Virulence?</h3><p>Traditional virulence studies relied heavily on experimental infection models, which, although accurate, are <strong>time-consuming, expensive, and ethically constrained</strong>.<br /> Today, the availability of whole-genome sequences and large-scale pathogen databases has paved the way for <strong>in silico virulence prediction</strong>&mdash;a computational approach that can screen thousands of genomes within hours.</p><p>This approach enables researchers to:</p><ul>
<li>
<p>Rapidly identify potential <strong>high-risk strains</strong>.</p>
</li>
<li>
<p>Prioritize pathogens for <strong>containment, surveillance, or further study</strong>.</p>
</li>
<li>
<p>Guide <strong>vaccine development</strong> and <strong>drug target discovery</strong>.</p>
</li>
<li>
<p>Support <strong>One Health frameworks</strong>, linking animal, human, and environmental health data.</p>
</li>
</ul><h3>How Is Virulence Predicted?</h3><p>Virulence prediction combines <strong>bioinformatics pipelines</strong> with <strong>machine learning</strong> and <strong>comparative genomics</strong>. The process generally involves:</p><ol>
<li>
<p><strong>Genome Annotation:</strong> Identifying genes and coding sequences in microbial genomes.</p>
</li>
<li>
<p><strong>Feature Extraction:</strong> Comparing sequences with curated databases like <strong>VFDB (Virulence Factor Database)</strong>, <strong>PATRIC</strong>, or <strong>Victors</strong>.</p>
</li>
<li>
<p><strong>Pattern Recognition:</strong> Using algorithms (e.g., Random Forest, SVM, or deep learning models) to classify genes or strains as virulent or non-virulent based on sequence patterns, motifs, and protein domains.</p>
</li>
<li>
<p><strong>Scoring and Visualization:</strong> Assigning a virulence score or confidence level and visualizing it through heatmaps or genome maps.</p>
</li>
</ol><h3>Tools and Resources for Virulence Prediction</h3><p>A number of tools and databases make virulence prediction accessible to the scientific community:</p><ul>
<li>
<p><strong>VFanalyzer</strong> &ndash; For identifying virulence genes based on VFDB.</p>
</li>
<li>
<p><strong>PathoFact</strong> &ndash; Predicts virulence, antimicrobial resistance (AMR), and toxin genes from metagenomic data.</p>
</li>
<li>
<p><strong>Pangenome-based models</strong> &ndash; Identify virulence-associated gene clusters across strains.</p>
</li>
<li>
<p><strong>Machine learning models</strong> &ndash; Use features like GC content, codon usage bias, or protein domains to predict pathogenicity.</p>
</li>
</ul><p>Emerging tools now integrate <strong>multi-omic data</strong>&mdash;including transcriptomics, proteomics, and metabolomics&mdash;to understand virulence in a systems biology framework.</p><h3>Applications in the Real World</h3><p>Virulence prediction has major implications across public health and research sectors:</p><ul>
<li>
<p><strong>Epidemic preparedness:</strong> Early identification of virulent strains in outbreak samples.</p>
</li>
<li>
<p><strong>AMR surveillance:</strong> Linking virulence profiles with antibiotic resistance determinants.</p>
</li>
<li>
<p><strong>Environmental monitoring:</strong> Predicting pathogenic potential of soil or waterborne microbes.</p>
</li>
<li>
<p><strong>Clinical diagnostics:</strong> Supporting personalized treatment through pathogen profiling.</p>
</li>
</ul><p>For instance, integrating virulence prediction pipelines into <strong>national surveillance networks</strong> could enable faster risk assessment and response to infectious outbreaks.</p><h3>The Road Ahead</h3><p>As machine learning and genomics advance, virulence prediction will evolve from simple gene-based detection to <strong>dynamic, context-aware models</strong> that account for host&ndash;pathogen interactions, environmental signals, and evolutionary adaptation.</p><p>Future tools may predict <strong>not just if a strain is virulent</strong>, but <strong>under what conditions</strong> it expresses that virulence&mdash;bridging the gap between genotype and phenotype.</p><h3>In Summary</h3><p>Virulence prediction is redefining how we understand and anticipate infectious diseases. By coupling <strong>genomic insights</strong> with <strong>computational intelligence</strong>, researchers can identify potential threats earlier, design smarter interventions, and ultimately, strengthen our preparedness against emerging pathogens.</p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</guid>
	<pubDate>Tue, 07 Nov 2017 05:33:33 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34221/alignment-free-sequence-comparison-tools-available-for-next-generation-sequencing-data-analysis</link>
	<title><![CDATA[Alignment-free sequence comparison tools available for next-generation sequencing data analysis]]></title>
	<description><![CDATA[<div><p><span>kallisto</span></p></div><div><p>Transcript abundance quantification from RNA-seq data (uses pseudoalignment for rapid determination of read compatibility with targets)</p><p>Software (C++)</p><p><a href="https://pachterlab.github.io/kallisto/">https://pachterlab.github.io/kallisto/</a></p><p>Sailfish</p><p>Estimation of isoform abundances from reference sequences and RNA-seq data (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="http://www.cs.cmu.edu/~ckingsf/software/sailfish/">http://www.cs.cmu.edu/~ckingsf/software/sailfish/</a></p><p>Salmon</p><p>Quantification of the expression of transcripts using RNA-seq data (uses&nbsp;<em>k</em>-mers)</p><p><a href="https://combine-lab.github.io/salmon/">https://combine-lab.github.io/salmon/</a></p><p>RNA-Skim</p><p>RNA-seq quantification at transcript-level (partitions the transcriptome into disjoint transcript clusters; uses&nbsp;<em>sig</em>-mers, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C++)</p><p><a href="http://www.csbio.unc.edu/rs/">http://www.csbio.unc.edu/rs/</a></p><p>Variant calling</p><p>ChimeRScope</p><p>Fusion transcript prediction using gene&nbsp;<em>k</em>-mers profiles of the RNA-seq paired-end reads</p><p>Software (Java)</p><p><a href="https://github.com/ChimeRScope/ChimeRScope/wiki">https://github.com/ChimeRScope/ChimeRScope/wiki</a></p><p>FastGT</p><p>Genotyping of known SNV/SNP variants directly from raw NGS sequence reads by counting unique&nbsp;<em>k</em>-mers</p><p>Software (C)</p><p><a href="https://github.com/bioinfo-ut/GenomeTester4/">https://github.com/bioinfo-ut/GenomeTester4/</a></p><p>Phy-Mer</p><p>Reference-independent mitochondrial haplogroup classifier from NGS data (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/danielnavarrogomez/phy-mer">https://github.com/danielnavarrogomez/phy-mer</a></p><p>LAVA</p><p>Genotyping of known SNPs (dbSNP and Affymetrix's Genome-Wide Human SNP Array) from raw NGS 
reads (<em>k</em>-mer based)</p><p>Software (C)</p><p><a href="http://lava.csail.mit.edu/">http://lava.csail.mit.edu/</a></p><p>MICADo</p><p>Detection of mutations in targeted third-generation NGS data (can distinguish patients&rsquo; specific mutations; algorithm uses&nbsp;<em>k</em>-mers and is based on colored de Bruijn graphs)</p><p>Software (Python)</p><p><a href="http://github.com/cbib/MICADo">http://github.com/cbib/MICADo</a></p><p>General mapper</p><p>Minimap</p><p>Lightweight and fast read mapper and read overlap detector (uses the concept of &ldquo;minimizers&rdquo;, a special type of&nbsp;<em>k</em>-mers)</p><p>Software (C)</p><p><a href="https://github.com/lh3/minimap">https://github.com/lh3/minimap</a></p><p>Assembly</p><p>De novo genome assembly</p><p>MHAP</p><p>Produces highly continuous assembly (fully resolved chromosome arms) from third-generation long and noisy reads (10 kbp) using the MinHash dimensionality-reduction technique</p><p>Software (Java)</p><p><a href="https://github.com/marbl/MHAP">https://github.com/marbl/MHAP</a></p><p>Miniasm</p><p>Assembler of long noisy reads (SMRT, ONT) using the Overlap-Layout-Consensus (OLC) approach without the necessity of an error correction stage (uses minimap)</p><p>Software (C)</p><p><a href="https://github.com/lh3/miniasm">https://github.com/lh3/miniasm</a></p><p>LINKS</p><p>Scaffolding genome assembly with error-containing long sequences (e.g., ONT or PacBio reads, draft genomes)</p><p>Software (Perl)</p><p><a href="https://github.com/warrenlr/LINKS/">https://github.com/warrenlr/LINKS/</a></p><p>Read clustering</p><p>afcluster</p><p>Clustering of reads from different genes and different species based on&nbsp;<em>k</em>-mer counts</p><p>Software (C++)</p><p><a href="https://github.com/luscinius/afcluster">https://github.com/luscinius/afcluster</a></p><p>QCluster</p><p>Clustering of reads with alignment-free measures (<em>k</em>-mer based) and quality values</p><p>Software (C++)</p><p><a 
href="http://www.dei.unipd.it/~ciompin/main/qcluster.html">http://www.dei.unipd.it/~ciompin/main/qcluster.html</a></p><p>Reads error correction</p><p>Lighter</p><p>Correction of sequencing errors in raw, whole genome sequencing reads (<em>k</em>-mer based)</p><p>Software (C++)</p><p><a href="https://github.com/mourisl/Lighter">https://github.com/mourisl/Lighter</a></p><p>QuorUM</p><p>Error corrector for Illumina reads using <em>k</em>-mers</p><p>Software (C++)</p><p><a href="https://github.com/gmarcais/Quorum">https://github.com/gmarcais/Quorum</a></p><p>Trowel</p><p>Software (C++)</p><p><a href="https://sourceforge.net/projects/trowel-ec/">https://sourceforge.net/projects/trowel-ec/</a></p><p>Metagenomics</p><p>Assembly-free phylogenomics</p><p>AAF</p><p>Phylogeny reconstruction directly from unassembled raw sequence data from whole genome sequencing projects; provides bootstrap support to assess uncertainty in the tree topology (<em>k</em>-mer based)</p><p>Software (Python)</p><p><a href="https://github.com/fanhuan/AAF">https://github.com/fanhuan/AAF</a></p><p>kSNP v3</p><p>Reference-free SNP identification and estimation of phylogenetic trees using SNPs (based on&nbsp;<em>k</em>-mer analysis)</p><p>Software (C)</p><p><a href="https://sourceforge.net/projects/ksnp/files/">https://sourceforge.net/projects/ksnp/files/</a></p><p>NGS-MC</p><p>Phylogeny of species based on NGS reads using alignment-free sequence dissimilarity measures d2* and d2S under different Markov chain models (using&nbsp;<em>k</em>-words)</p><p>R package</p><p><a href="http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html">http://www-rcf.usc.edu/~fsun/Programs/NGS-MC/NGS-MC.html</a></p><p>Species identification/taxonomic profiling</p><p>CLARK</p><p>Taxonomic classification of metagenomic reads to known bacterial genomes using&nbsp;<em>k</em>-mer search and LCA assignment</p><p>Software (C++)</p><p><a href="http://clark.cs.ucr.edu/">http://clark.cs.ucr.edu/</a></p><p>FOCUS</p><p>Reports 
organisms present in metagenomic samples and profiles their abundances (uses composition-based approach and non-negative least squares for prediction)</p><p>Web service, Software (Python)</p><p><a href="http://edwards.sdsu.edu/FOCUS/">http://edwards.sdsu.edu/FOCUS/</a></p><p>GSM</p><p>Estimation of abundances of microbial genomes in metagenomic samples (<em>k</em>-mer based)</p><p>Software (Go)</p><p><a href="https://github.com/pdtrang/GSM">https://github.com/pdtrang/GSM</a></p><p>Mash</p><p>Species identification using assembled or unassembled Illumina, PacBio, and ONT data (based on MinHash dimensionality-reduction technique)</p><p>Software (C++)</p><p><a href="https://github.com/marbl/mash">https://github.com/marbl/mash</a></p><p>Kraken</p><p>Taxonomic assignment in metagenome analysis by exact&nbsp;<em>k</em>-mer search; LCA assignment of short reads based on a comprehensive sequence database</p><p>Software (C++)</p><p><a href="https://ccb.jhu.edu/software/kraken/">https://ccb.jhu.edu/software/kraken/</a></p><p>LMAT</p><p>Assignment of taxonomic labels to reads by&nbsp;<em>k</em>-mer searches in a precomputed database</p><p>Software (C++/Python)</p><p><a href="https://sourceforge.net/projects/lmat/">https://sourceforge.net/projects/lmat/</a></p><p>stringMLST</p><p><em>k</em>-mer-based tool for MLST directly from the genome sequencing reads</p><p>Software (Python)</p><p><a href="http://jordan.biology.gatech.edu/page/software/stringMLST">http://jordan.biology.gatech.edu/page/software/stringMLST</a></p><p>Taxonomer</p><p><em>k</em>-mer-based ultrafast metagenomics tool for assigning taxonomy to sequencing reads from clinical and environmental samples</p><p>Web service</p><p><a href="http://taxonomer.iobio.io/">http://taxonomer.iobio.io/</a></p><p>Other</p><p>d2-tools</p><p>Word-based (<em>k</em>-tuple) comparison (pairwise dissimilarity matrix using d2S measure) of metatranscriptomic samples from NGS reads</p><p>Software (Python/R)</p><p><a 
href="https://code.google.com/p/d2-tools/">https://code.google.com/p/d2-tools/</a></p><p>VirHostMatcher</p><p>Prediction of hosts from metagenomic viral sequences based on ONF using various distance measures (e.g., d2)</p><p>Software (C++)</p><p><a href="https://github.com/jessieren/VirHostMatcher">https://github.com/jessieren/VirHostMatcher</a></p><p>MetaFast</p><p>Statistics calculation of metagenome sequences and the distances between them based on assembly using de Bruijn graphs and Bray&ndash;Curtis dissimilarity measure</p><p>Software (Java)</p><p><a href="https://github.com/ctlab/metafast">https://github.com/ctlab/metafast</a></p></div>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36512/hisat2-a-fast-and-sensitive-alignment-program-for-mapping-next-generation-sequencing-reads</guid>
	<pubDate>Tue, 08 May 2018 04:27:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36512/hisat2-a-fast-and-sensitive-alignment-program-for-mapping-next-generation-sequencing-reads</link>
	<title><![CDATA[HISAT2: a fast and sensitive alignment program for mapping next-generation sequencing reads]]></title>
	<description><![CDATA[<p><strong>HISAT2</strong><span>&nbsp;is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome). Based on an extension of BWT for graphs&nbsp;</span><a href="http://dl.acm.org/citation.cfm?id=2674828">[Sir&eacute;n et al. 2014]</a><span>, we designed and implemented a graph FM index (GFM), an original approach and its first implementation to the best of our knowledge. In addition to using one global GFM index that represents a population of human genomes, HISAT2 uses a large set of small GFM indexes that collectively cover the whole genome (each index representing a genomic region of 56 Kbp, with 55,000 indexes needed to cover the human population). These small indexes (called local indexes), combined with several alignment strategies, enable rapid and accurate alignment of sequencing reads. This new indexing scheme is called a Hierarchical Graph FM index (HGFM).&nbsp;</span></p>
<p><span>more at&nbsp;https://ccb.jhu.edu/software/hisat2/index.shtml</span></p><p>Address of the bookmark: <a href="https://github.com/infphilo/hisat2" rel="nofollow">https://github.com/infphilo/hisat2</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/38238/list-of-motif-discovery-tools</guid>
	<pubDate>Tue, 20 Nov 2018 03:54:26 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/38238/list-of-motif-discovery-tools</link>
	<title><![CDATA[List of motif discovery tools !]]></title>
	<description><![CDATA[<div><div>In genetics, a sequence motif is a nucleotide or amino-acid sequence pattern that is widespread and has, or is conjectured to have, a biological significance. For proteins, a sequence motif is distinguished from a structural motif, a motif formed by the three-dimensional arrangement of amino acids which may not be adjacent.</div><div>&nbsp;</div><div>The following is a list of tools for motif discovery:</div><div>&nbsp;</div><div><a href="http://genius.embnet.dkfz-heidelberg.de/menu/biounit/open-husar/">2Dsweep -- protein annotation by secondary structure elements</a></div><p>Perform secondary structure predictions on protein sequences.</p></div><div><div><a href="http://floresta.eead.csic.es/3dfootprint/">3D-footprint -- database of DNA-binding protein structures</a></div><p>Find binding specificity information about DNA-protein complexes.</p></div><div><div><a href="http://floresta.eead.csic.es/3dfootprint/">3D-footprint: DNA-binding protein database</a></div><p>Find information about the binding specificity of DNA-binding proteins.</p></div><div><div><a href="http://3d-partner.life.nctu.edu.tw/">3D-partner -- a web server to infer interacting partners and binding models</a></div><p>Predict interacting partners and binding models.</p></div><div><div><a href="http://motif.stanford.edu/distributions/3motif/">3MOTIF -- a protein structure visualization system for conserved sequence motifs</a></div><p>Use this web-based sequence motif visualization system to display sequence motif information in its appropriate three-dimensional (3D) context.</p></div><div><div><a href="http://bioinfo.mpiz-koeln.mpg.de/afawe/">AFAWE -- Automatic functional annotation in a distributed Web Services Environment</a></div><p>Protein function prediction and annotation in an integrated environment powered by web service.</p></div><div><div><a href="http://anchor.enzim.hu/">ANCHOR -- Prediction of Protein Binding Regions in Disordered Proteins</a></div><p>Find 
information about protein binding.</p></div><div><div><a href="http://annie.bii.a-star.edu.sg/annie/home.do">ANNIE -- ANNotation and Interpretation Environment for Protein Sequences</a></div><p>Use to predict function from de novo protein sequences.</p></div><div><div><a href="http://bioinformatica.isa.cnr.it/ASC/">Active Sequences Collection (ASC) database -- A new tool to assign functions to protein sequences</a></div><p>Search for short active protein sequences with demonstrated biological activities.</p></div><div><div><a href="http://blocks.fhcrc.org/">Blocks -- Ungapped segments in conserved protein sequences</a></div><p>Search for ungapped segments corresponding to the most highly conserved regions of proteins.</p></div><div><div><a href="http://cast.engr.uic.edu/">CASTp -- computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues</a></div><p>Identify and measure surface accessible pockets as well as interior inaccessible cavities, for proteins and other molecules.</p></div><div><div><a href="http://www.ebi.ac.uk/thornton-srv/databases/CSA">CSA -- The Catalytic Site Atlas</a></div><p>To search for catalytic residue annotation for enzymes in the Protein Data Bank.</p></div><div><div><a href="http://www.sbg.bio.ic.ac.uk/~confunc/">ConFunc -- Conserved residue Protein Function Prediction Server</a></div><p>Predict protein function using Gene Ontology.</p></div><div><div><a href="http://consurf.tau.ac.il/">ConSurf-DB -- evolutionary conservation profiles of protein structures database</a></div><p>Automatically calculate evolutionary conservation scores of key amino acid residues and map them on protein structures.</p></div><div><div><a href="http://salilab.org/DBAli/">DBAli -- A Database of Structure Alignments</a></div><p>Mine the protein structure space.</p></div><div><div><a href="http://dilimot.embl.de/">DILIMOT -- discovery of linear motifs in proteins</a></div><p>Predict short linear 
motifs (3-8 residues) in a set of protein sequences.</p></div><div><div><a href="http://www.ebi.ac.uk/dasty/">Dasty2 -- an Ajax protein DAS client</a></div><p>A web client for visualizing protein sequence feature information using DAS.</p></div><div><div><a href="http://genius.embnet.dkfz-heidelberg.de/menu/biounit/open-husar/">DomainSweep -- protein annotation by domain analysis</a></div><p>Identify the domain architecture within a protein sequence.</p></div><div><div><a href="http://e1ds.csbb.ntu.edu.tw/">E1DS -- catalytic site prediction based on 1D signatures of concurrent conservation</a></div><p>Predict enzyme catalytic sites.</p></div><div><div><a href="http://elm.eu.org/">ELM -- Eukaryotic Linear Motif Resource</a></div><p>Predict functional sites in eukaryotic proteins.</p></div><div><div><a href="http://us.expasy.org/tools/#proteome">EXPASY Proteome Tools Collection</a></div><p>Use a collection of tools for protein analyses.</p></div><div><div><a href="http://us.expasy.org/tools/findmod/">EXPASY-Findmod</a></div><p>Predict potential protein post-translational modifications and find potential single amino acid substitutions in peptides.</p></div><div><div><a href="http://mbs.cbrc.jp/EzCatDB/">EzCatDB -- the Enzyme Catalytic-mechanism Database</a></div><p>Search for information related to the catalytic mechanisms of enzymes.</p></div><div><div><a href="http://bioinf.cs.ucl.ac.uk/ffpred/">FFPred -- feature-based function prediction</a></div><p>An integrated feature-based function prediction server for vertebrate proteomes.</p></div><div><div><a href="http://www.ebi.ac.uk/printsscan/">FingerPRINT Scan</a></div><p>Identify the closest matching PRINTS sequence motif fingerprints in a protein sequence.</p></div><div><div><a href="http://firedb.bioinfo.cnio.es/">FireDB -- a database of functionally important residues from proteins of known structure</a></div><p>Search for functional annotation of important sites in proteins with known 
structures.</p></div><div><div><a href="http://bioserv.rpbs.univ-paris-diderot.fr/cgi-bin/Frog2">Frog2 -- a FRee Online druG 3D conformation generator</a></div><p>Produce 3D conformations of small drug compounds.</p></div><div><div><a href="http://www.hgpd.jp/">HGPD -- Human Gene and Protein Database</a></div><p>A database presenting experiment-based results in human proteomics.</p></div><div><div><a href="http://hhsenser.tuebingen.mpg.de/">HHsenser -- exhaustive transitive profile search using HMM&ndash;HMM comparison</a></div><p>Conduct exhaustive intermediate profile searches of a set of homologous protein sequences.</p></div><div><div><a href="http://loschmidt.chemi.muni.cz/hotspotwizard/">HotSpot Wizard -- Substrate Specificity Hot Spot Identification web server</a></div><p>Design protein mutations in site-directed mutagenesis.</p></div><div><div><a href="http://phylogenomics.berkeley.edu/intrepid/">INTREPID -- INformation-theoretic TREe traversal for Protein functional site IDentification</a></div><p>Use for protein functional site identification.</p></div><div><div><a href="http://www.cbs.dtu.dk/">Integrating protein annotation resources through the Distributed Annotation System</a></div><p>Annotate protein using this integrated annotation resource.</p></div><div><div><a href="http://www.ebi.ac.uk/InterProScan/">InterProScan -- protein domains identifier</a></div><p>Identify protein family (and DNA) domains, patterns, motifs, protein families, and functional sites.</p></div><div><div><a href="http://kfc.mitchell-lab.org/">KFC -- Knowledge-based FADE and Contacts</a></div><p>Interactive forecasting of protein interaction hot spots.</p></div><div><div><a href="http://biominer.bime.ntu.edu.tw/magiicpro/">MAGIIC-PRO -- detecting functional signatures by efficient discovery of long patterns in protein sequences</a></div><p>Discover long patterns in protein sequences.</p></div><div><div><a href="http://prodata.swmed.edu/malisam">MALISAM -- Manual ALIgnments for 
Structurally Analogous Motifs</a></div><p>Database containing pairs of structural analogs and their alignments.</p></div><div><div><a href="http://meme.nbcr.net/">MEME -- discovering and analyzing DNA and protein sequence motifs</a></div><p>Find sequence patterns in DNA and protein sequences.</p></div><div><div><a href="http://www.nii.res.in/modpropep.html">MODPROPEP -- a program for knowledge-based modeling of protein-peptide complexes</a></div><p>A web server for knowledge-based modeling of protein-peptide complexes, specifically peptides in complex with major histocompatibility complex (MHC) proteins and kinases.</p></div><div><div><a href="http://www.bioinfo.tsinghua.edu.cn/~tigerchen/memo.html">MeMo -- a web tool for prediction of protein methylation modifications</a></div><p>Predict protein methylation sites.</p></div><div><div><a href="http://caps.ncbs.res.in/MegaMotifbase/index.html">MegaMotifBase -- a database of structural motifs in protein families and superfamilies</a></div><p>Find structural segments or motifs for protein structures.</p></div><div><div><a href="http://mnm.engr.uconn.edu/MNM/SMSSearchServlet">Minimotif Miner -- a tool for investigating protein function</a></div><p>Find motifs in a protein sequence.</p></div><div><div><a href="http://umber.sbs.man.ac.uk/dbbrowser/motif3d/motif3d.html">Motif3D -- Relating protein sequence motifs to 3D structure</a></div><p>Visualize protein sequence motifs on the 3D protein structures.</p></div><div><div><a href="http://myhits.isb-sib.ch/cgi-bin/motif_scan">MotifScan</a></div><p>Find presence of any known protein motif (Prosite and Pfam) in a protein sequence.</p></div><div><div><a href="http://bioinfo3d.cs.tau.ac.il/MultiBind">MultiBind -- Multiple Alignment of Protein Binding Sites</a></div><p>Recognize spatial chemical binding patterns common to a set of protein structures.</p></div><div><div><a href="http://mendel.imp.univie.ac.at/myristate/SUPLpredictor.htm">NMT -- The MYR 
Predictor</a></div><p>Analyze proteins for the presence of an N-terminal N-myristoylation site.</p></div><div><div><a href="http://www.cbs.dtu.dk/services/NetNGlyc/">NetNGlyc -- N-Glycosylation sites prediction tool</a></div><p>Find the presence of N-Glycosylation sites in human proteins.</p></div><div><div><a href="http://www.cbs.dtu.dk/services/NetOGlyc/">NetOGlyc 3.1 -- O-glycosylation sites prediction tool</a></div><p>Find the presence of O-GalNAc (mucin type) glycosylation sites in mammalian proteins.</p></div><div><div><a href="http://www.cbs.dtu.dk/services/NetPhos/">NetPhos 2.0 -- Phosphorylation sites predictions</a></div><p>Analyze eukaryotic proteins for the presence of serine, threonine and tyrosine phosphorylation sites.</p></div><div><div><a href="http://www.cbs.dtu.dk/services/NetPhosK/">NetPhosK 1.0 Server -- kinase specific eukaryotic protein phosphorylation sites prediction tool</a></div><p>Find possible kinase-specific phosphorylation sites in eukaryotic proteins.</p></div><div><div><a href="http://networkin.info/search.php">NetworKIN -- a resource for exploring cellular phosphorylation networks</a></div><div>&nbsp;</div></div><div><div><a href="http://neuroproteomics.scs.uiuc.edu/neuropred.html">NeuroPred -- a tool to predict cleavage sites in neuropeptide precursors and provide the masses of the resulting peptides</a></div><p>Predict cleavage sites at basic amino acid locations in neuropeptide precursor sequences.</p></div><div><div><a href="http://www.ebi.ac.uk/patentdata/nr/">Non-Redundant Patent Sequences - Patented Sequence Database</a></div><p>Find information about patented nucleotide and protein sequences.</p></div><div><div><a href="http://www.cbs.dtu.dk/databases/OGLYCBASE/">O-GLYCBASE</a></div><p>Search for information about glycoproteins with O-linked and C-linked glycosylation sites.</p></div><div><div><a href="http://www.pandora.cs.huji.ac.il/">PANDORA -- Protein ANnotation Diagram ORiented Analysis</a></div><p>Find information about 
protein sequence annotations.</p></div><div><div><a href="http://sunserver.cdfd.org.in:8080/protease/PAR_3D/index.html">PAR-3D -- Protein Active site Residue - 3D structural motif</a></div><p>A server to predict protein active site residues.</p></div><div><div><a href="http://wwwmgs.bionet.nsc.ru/mgs/gnw/pdbsite/">PDBSite -- a database of the 3D structure of protein functional sites</a></div><p>Search for structural and functional information on the protein functional sites.</p></div><div><div><a href="http://wwwmgs.bionet.nsc.ru/mgs/systems/fastprot/pdbsitescan.html">PDBSiteScan -- A program for searching for active, binding and posttranslational modification sites in the 3D structures of proteins</a></div><p>Search 3D protein fragments similar in structure to known active, binding and posttranslational modification sites.</p></div><div><div><a href="http://pedant.gsf.de/">PEDANT -- Protein Extraction, Description and ANalysis Tool</a></div><p>Conduct genome-wide functional and structural analysis.</p></div><div><div><a href="http://phosida.org/">PHOSIDA -- Phosphorylation site database</a></div><p>Search for phosphorylation data of any protein of interest.</p></div><div><div><a href="http://www.phosphorylation.biochem.vt.edu/">PHOSPHORYLATION SITE DATABASE</a></div><p>Search for information on prokaryotic proteins that undergo serine, threonine, or tyrosine phosphorylation.</p></div><div><div><a href="http://www.jcvi.org/pn-utility/web/smarty_wrapper/about.php">PNU -- Protein Naming Utility</a></div><p>Determine correct names for proteins.</p></div><div><div><a href="http://mbs.cbrc.jp/poodle/poodle-s.html">POODLE-S -- Prediction Of Order and Disorder by machine LEarning</a></div><p>Web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix.</p></div><div><div><a href="http://gemdock.life.nctu.edu.tw/ppisearch/">PPISearch -- Protein-Protein Interaction Search</a></div><p>Find 
homologous protein-protein interactions across multiple species.</p></div><div><div><a href="http://www.ebi.ac.uk/ppsearch/">PPSearch</a></div><p>Search your query sequence against PROSITE pattern database for protein motifs.</p></div><div><div><a href="http://pridb.gdcb.iastate.edu/">PRIDB -- Protein-RNA Interface DataBase</a></div><p>Find information about protein-RNA complexes from the Protein Data Bank (PDB).</p></div><div><div><a href="http://umber.sbs.man.ac.uk/dbbrowser/PRINTS/">PRINTS and its automatic supplement, prePRINTS -- A compendium of protein fingerprints</a></div><p>Search for protein fingerprints.</p></div><div><div><a href="http://www.expasy.org/prosite/">PROSITE</a></div><p>Identify protein families and domains for a given protein sequence.</p></div><div><div><a href="http://www.imtech.res.in/raghava/prrdb/">PRRDB -- Pattern Recognition Receptor Database</a></div><p>A comprehensive database of pattern-recognition receptors and their ligands.</p></div><div><div><a href="http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl">PatMatch -- a program for finding patterns in peptide and nucleotide sequences</a></div><p>Search for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences.</p></div><div><div><a href="http://pepcyber.umn.edu/PPEP/">PepCyber:P~PEP -- a database of human protein protein interactions mediated by phosphoprotein-binding domains</a></div><p>Database specialized in documenting human PPBD-containing proteins and PPBD-mediated interactions.</p></div><div><div><a href="http://us.expasy.org/tools/peptidecutter/">PeptideCutter -- protein cleavage sites prediction tool</a></div><p>Predicts potential protease cleavage sites and sites cleaved by chemicals in a given protein sequence.</p></div><div><div><a href="http://phobius.binf.ku.dk/">Phobius -- A combined transmembrane topology and signal peptide predictor</a></div><p>Predict combined transmembrane 
topology and signal peptides.</p></div><div><div><a href="http://phospho.elm.eu.org/">Phospho.ELM -- a database of phosphorylation sites</a></div><p>Search for eukaryotic phosphorylation sites.</p></div><div><div><a href="http://www.phospho3d.org/">Phospho3D -- a database of three-dimensional structures of protein phosphorylation sites</a></div><p>Search for 3D structure and functional annotation of phosphorylation sites in proteins.</p></div><div><div><a href="http://www.phosphosite.org/">PhosphoSite -- A bioinformatics resource dedicated to physiological protein phosphorylation.</a></div><p>Search the database of in vivo phosphorylation sites of human and mouse proteins</p></div><div><div><a href="http://pxgrid.med.monash.edu.au/polyq/">PolyQ -- Polyglutamine Database</a></div><p>Find information about polyglutamine (polyQ) repeats.</p></div><div><div><a href="http://www.ebi.ac.uk/pratt/">Pratt Protein motif and pattern discovery</a></div><p>Find the presence of protein motifs and patterns in an amino acid sequence.</p></div><div><div><a href="http://www.predisi.de/">PrediSi -- Prediction of Signal Peptides and their Cleavage Positions</a></div><p>Predict signal peptide sequences and their cleavage positions in bacterial and eukaryotic amino acid sequences.</p></div><div><div><a href="http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/">ProFunc -- a server for predicting protein function from 3D structure</a></div><p>Predict protein functions based on known structures.</p></div><div><div><a href="http://bioinfo41.weizmann.ac.il/promate/promateus.html">ProMateus--an open research approach to protein-binding sites analysis</a></div><p>Predict the location of potential protein-protein binding sites for unbound proteins.</p></div><div><div><a href="http://www.proteus.cs.huji.ac.il/">ProTeus -- identifying signatures in protein termini</a></div><p>Identify short linear signatures in protein termini.</p></div><div><div><a 
href="http://genius.embnet.dkfz-heidelberg.de/menu/cgi-bin/w2h-open/w2h.open/w2h.startthis?SIMGO=w2h%2ewelcome">ProtSweep -- protein annotation by homology</a></div><p>Analyze and identify newly obtained protein sequences.</p></div><div><div><a href="http://protemot.csbb.ntu.edu.tw/">Protemot -- prediction of protein binding sites with automatically extracted geometrical templates</a></div><p>Predict protein binding sites in a protein sequence based on geometrical analysis of protein tertiary substructures.</p></div><div><div><a href="http://quasimotifinder.tau.ac.il/">QuasiMotiFinder -- protein annotation by searching for evolutionarily conserved motif-like patterns</a></div><p>Search for evolutionarily conserved motif-like patterns in protein sequences.</p></div><div><div><a href="http://bindr.gdcb.iastate.edu/RNABindR">RNABindR -- software for prediction of RNA binding residues in proteins</a></div><p>Web-based server for analyzing and predicting RNA binding sites in proteins.</p></div><div><div><a href="http://caps.ncbs.res.in/scanmot/scanmot.html">SCANMOT -- searching for similar sequences using a simultaneous scan of multiple sequence motifs</a></div><p>Search for similarities between proteins by simultaneous matching of multiple motifs.</p></div><div><div><a href="http://bioinf.fbb.msu.ru/SDPpred/">SDPpred -- A Tool for Prediction of Amino Acid Residues that Determine Differences in Functional Specificity of Homologous Proteins</a></div><p>Predict residues in protein sequences that determine the proteins' functional specificity.</p></div><div><div><a href="http://tamm.mit.edu/SDR/">SDR -- Specificity Determining Residues Database</a></div><p>Predict specificity-determining residues in protein families.</p></div><div><div><a href="http://bioware.ucd.ie/~slimdisc/">SLiMDisc -- Short, Linear Motif Discovery</a></div><p>Find shared motifs in proteins with a common attribute.</p></div><div><div><a href="http://sumosp.biocuckoo.org/">SUMOsp -- a web server for 
sumoylation site prediction</a></div><p>Conduct in silico sumoylation site prediction.</p></div><div><div><a href="http://oxytricha.princeton.edu/SWAKK/">SWAKK -- a web server for detecting positive selection in proteins using a sliding window substitution rate analysis</a></div><p>Detect protein sequence regions under positive selection.</p></div><div><div><a href="http://www.expasy.org/tools/scanprosite/">ScanProsite</a></div><p>Search for motifs and patterns within protein sequences.</p></div><div><div><a href="http://www.expasy.org/tools/scanprosite/">ScanProsite -- detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins</a></div><p>Detect patterns, profiles and motifs in a protein sequence.</p></div><div><div><a href="http://scansite.mit.edu/">ScanSite 2.0 -- Proteome-wide prediction of cell signaling interactions using short sequence motifs</a></div><p>Search for motifs within proteins that are likely to be phosphorylated by specific protein kinases or bind to domains such as SH2 domains, 14-3-3 domains or PDZ domains.</p></div><div><div><a href="http://sepresa.bio-x.cn/">SePreSA -- SErver for the PREdiction of populations susceptible to Serious Adverse drug reaction</a></div><p>Find information about populations carrying polymorphisms within protein binding pockets that make them susceptible to serious adverse drug reactions (SADRs).</p></div><div><div><a href="http://motif.genome.jp/">Sequence Motif Search</a></div><p>Search for the presence of a motif in either an amino acid or a nucleotide sequence.</p></div><div><div><a href="http://www.csbio.sjtu.edu.cn/bioinf/Signal-3L/">Signal-3L -- A 3-layer approach for predicting signal peptides</a></div><p>Predict signal peptides.</p></div><div><div><a href="http://www.cbs.dtu.dk/services/SignalP/">SignalP -- Machine learning approaches to the prediction of signal peptides, their cleavage sites, and other protein sorting signals</a></div><p>Predict 
signal peptides and their cleavage sites.</p></div><div><div><a href="http://us.expasy.org/tools/sulfinator/">Sulfinator -- tyrosine sulfation sites prediction tool</a></div><p>Predict the presence of tyrosine sulfation sites in protein sequences.</p></div><div><div><a href="http://bioinf-services.charite.de/supersite/">SuperSite -- Ligand Binding Site Database</a></div><p>Look at protein structure from a ligand and binding site perspective.</p></div><div><div><a href="http://www.ch.embnet.org/">Swiss EMBnet node web server</a></div><p>Use a collection of bioinformatics tools at this portal site.</p></div><div><div><a href="http://bioinfo.montp.cnrs.fr/?r=t-reks">T-REKS -- identification of Tandem REpeats in sequences with a K-meanS based algorithm</a></div><p>Find information about tandem repeats in proteins that carry fundamental biological functions and are related to a number of human diseases.</p></div><div><div><a href="http://tmbeta-genome.cbrc.jp/TMFunction/">TMFunction -- The Functional Database of Membrane Proteins</a></div><p>Find information about functional residues in alpha-helical and beta-barrel membrane proteins.</p></div><div><div><a href="http://topdom.enzim.hu/">TOPDOM -- Conservatively Located Domains and Motifs in Transmembrane Proteins</a></div><p>Database of domains and motifs with conservative location in transmembrane proteins.</p></div><div><div><a href="http://motif.stanford.edu/distributions/emotif/">The EMOTIF database</a></div><p>Search for highly conserved and specific protein sequence motifs.</p></div><div><div><a href="http://treedetv2.bioinfo.cnio.es/treedet/index.html">TreeDet -- Predicting Functional Residues in Protein Sequence Alignments</a></div><p>Predict functional sites in protein sequence alignments using different methodologies.</p></div><div><div><a href="http://motif.bmi.ohio-state.edu/ChIPMotifs/">W-ChIPMotifs -- ChIP-based protein Motif discovery web server</a></div><p>Find de novo protein motifs from chromatin 
immunoprecipitation data.</p></div><div><div><a href="http://feature.stanford.edu/webfeature/">WebFEATURE -- an interactive web tool for identifying and visualizing functional sites on macromolecular structures</a></div><p>Scan query structures for functional sites in both proteins and nucleic acids.</p></div><div><div><a href="http://wwwmgs.bionet.nsc.ru/mgs/programs/panalyst/">WebProAnalyst -- an interactive tool for analysis of quantitative structure–activity relationships in protein families</a></div><p>Analyze quantitative structure-activity relationship of related protein families.</p></div><div><div><a href="http://motif.stanford.edu/distributions/eblocks/">eBLOCKs -- enumerating conserved protein blocks to achieve maximal sensitivity and specificity</a></div><p>Search for ungapped alignments of highly conserved regions among a protein family or superfamily.</p></div><div><div><a href="http://ef-site.hgc.jp/eF-seek/">eF-seek -- prediction of the functional sites of proteins by searching for similar electrostatic potential and molecular surface shape</a></div><p>Predict the functional sites of proteins.</p></div><div><div><a href="http://firedb.bioinfo.cnio.es/Php/FireStar.php">firestar -- prediction of functionally important residues using structural templates and alignment reliability</a></div><p>An expert system for predicting ligand-binding residues in protein structures.</p></div><div><div><a href="http://caps.ncbs.res.in/imotdb/">iMOTdb -- a comprehensive collection of spatially interacting motifs in proteins</a></div><p>Automatically identify spatially interacting motifs among distantly related proteins sharing similar folds and possessing common ancestral lineage.</p></div>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41996/wgd%E2%80%94simple-command-line-tools-for-the-analysis-of-ancient-whole-genome-duplications</guid>
	<pubDate>Thu, 23 Jul 2020 05:49:45 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41996/wgd%E2%80%94simple-command-line-tools-for-the-analysis-of-ancient-whole-genome-duplications</link>
	<title><![CDATA[wgd—simple command line tools for the analysis of ancient whole-genome duplications]]></title>
	<description><![CDATA[<p><span>wgd is an easy-to-use command-line tool for<span>&nbsp;</span></span><em>K</em><sub>S</sub><span><span>&nbsp;</span>distribution construction. The wgd suite provides commonly used<span>&nbsp;</span></span><em>K</em><sub>S</sub><span><span>&nbsp;</span>and colinearity analysis workflows together with tools for modeling and visualization, making these analyses conveniently accessible to genomics researchers.</span></p>
<p><a href="https://academic.oup.com/bioinformatics/article/35/12/2153/5162749">https://academic.oup.com/bioinformatics/article/35/12/2153/5162749</a></p><p>Address of the bookmark: <a href="https://github.com/arzwa/wgd" rel="nofollow">https://github.com/arzwa/wgd</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/43084/frequently-used-bioinformatics-tools-for-viral-genome-analysis</guid>
	<pubDate>Wed, 23 Jun 2021 07:40:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/43084/frequently-used-bioinformatics-tools-for-viral-genome-analysis</link>
	<title><![CDATA[Frequently used bioinformatics tools for viral genome analysis !]]></title>
	<description><![CDATA[<p><strong>IVA: accurate de novo assembly of RNA virus genomes.</strong><br /> Hunt M, Gall A, Ong SH, Brener J, Ferns B, Goulder P, Nastouli E, Keane JA, Kellam P, Otto TD.<br /> Bioinformatics. 2015 Jul 15;31(14):2374-6. doi: <a href="http://bioinformatics.oxfordjournals.org/content/31/14/2374.long">10.1093/bioinformatics/btv120</a>. Epub 2015 Feb 28.</p><p><a href="http://www.nature.com/nmeth/journal/v9/n1/full/nmeth.1814.html"><strong>Adapter sequences</strong></a>:<br /> <strong>Optimal enzymes for amplifying sequencing libraries.</strong><br /> Quail, M. A. et al. Nat. Methods 9, 10-1 (2012).</p><p><a href="http://genome.cshlp.org/content/early/2012/01/12/gr.131383.111"><strong>GAGE</strong></a>:<br /> <strong>GAGE: A critical evaluation of genome assemblies and assembly algorithms.</strong><br /> Salzberg, S. L. et al. Genome Res. 22, 557-67 (2012).</p><p><a href="http://www.biomedcentral.com/1471-2105/14/160"><strong>KMC</strong></a>:<br /> <strong>Disk-based k-mer counting on a PC.</strong><br /> Deorowicz, S., Debudaj-Grabysz, A. &amp; Grabowski, S. BMC Bioinformatics 14, 160 (2013).</p><p><a href="http://genomebiology.com/2014/15/3/R46"><strong>Kraken</strong></a>:<br /> <strong>Kraken: ultrafast metagenomic sequence classification using exact alignments.</strong><br /> Wood, D. E. &amp; Salzberg, S. L. Genome Biol. 15, R46 (2014).</p><p><a href="http://genomebiology.com/2004/5/2/r12"><strong>MUMmer</strong></a>:<br /> <strong>Versatile and open software for comparing large genomes.</strong><br /> Kurtz, S. et al. Genome Biol. 5, R12 (2004).</p><p><strong>R</strong>:<br /> <strong>R: A language and environment for statistical computing.</strong><br /> R Core Team (2013). R Foundation for Statistical Computing, Vienna, Austria. 
URL <a href="http://www.R-project.org/">http://www.R-project.org/</a>.</p><p><a href="http://nar.oxfordjournals.org/content/39/9/e57"><strong>RATT</strong></a>:<br /> <strong>RATT: Rapid Annotation Transfer Tool.</strong><br /> Otto, T. D., Dillon, G. P., Degrave, W. S. &amp; Berriman, M. Nucleic Acids Res. 39, e57 (2011).</p><p><a href="http://bioinformatics.oxfordjournals.org/content/25/16/2078.abstract"><strong>SAMtools</strong></a>:<br /> <strong>The Sequence Alignment/Map format and SAMtools.</strong><br /> Li, H. et al. Bioinformatics 25, 2078-9 (2009).</p><p><a href="http://bioinformatics.oxfordjournals.org/content/early/2014/04/12/bioinformatics.btu170"><strong>Trimmomatic</strong></a>:<br /> <strong>Trimmomatic: A flexible trimmer for Illumina Sequence Data.</strong><br /> Bolger, A. M., Lohse, M. &amp; Usadel, B. Bioinformatics 1-7 (2014).</p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44551/bioinformatic-tools-for-pathogens-informatics-at-cvr</guid>
	<pubDate>Sat, 08 Jun 2024 15:59:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44551/bioinformatic-tools-for-pathogens-informatics-at-cvr</link>
	<title><![CDATA[Bioinformatic tools for pathogens informatics at CVR]]></title>
	<description><![CDATA[<div><div><div><div><div><p>Novel sequencing and analytical approaches focused on studying viruses and virus-host interactions. Below you will find summaries and links to a number of bioinformatic tools that have been developed @ CVR.</p></div><div><h3><a href="http://giffordlabcvr.github.io/DIGS-tool/" target="_blank" title="DIGS">DIGS</a></h3></div><div><p>The database-integrated genome-screening (DIGS) tool provides a framework for implementing automated in silico screening of sequence databases using BLAST in combination with a relational database (MySQL).</p></div><div><h3><a href="https://bioinformatics.cvr.ac.uk/software/discvr/" target="" title="DisCVR">DisCVR</a></h3></div><div><p>DisCVR is a Diagnostic tool for detecting known human viruses in clinical samples from Next-Generation Sequencing (NGS) data. The tool uses a simple and straightforward Graphical User Interface and is optimized on Windows OS without compromising speed and accuracy.</p></div><div><h3><a href="http://josephhughes.github.io/DiversiTools/" target="_blank" title="DiversiTools">DiversiTools</a></h3></div><div><p>DiversiTools is a computational tool that is specifically tailored towards viral HTS data sets and the analysis of the underlying viral populations that they represent. It was initially developed in collaboration with a number of virologists interested in characterising the intra-host diversity of viral populations and studying their evolution across transmission chains at the micro-evolutionary scale.</p></div><div><h3><a href="http://glue-tools.cvr.gla.ac.uk/" target="_blank" title="GLUE">GLUE</a></h3></div><div><p>GLUE is a flexible data-centric bioinformatics environment for virus sequence data, with a focus on virus evolution and genomic variation. GLUE has been applied to a range of viruses. 
A GLUE-based resource focused on Hepatitis C virus is HCV-GLUE.</p></div><div><h3><a href="https://bioinformatics.cvr.ac.uk/tanoti/" target="_blank" title="Tanoti">Tanoti</a></h3></div><div><p>Tanoti is a BLAST-guided, reference-based short-read aligner. It was developed to maximise alignment in highly variable next-generation sequence data sets (Illumina).</p></div><div><h3><a href="https://bioinformatics.cvr.ac.uk/victree/" target="_blank" title="VicTREE">ViCTree</a></h3></div><div><p>ViCTree is a bioinformatic framework that automatically selects new candidate virus sequences from GenBank, generates multiple sequence alignments, calculates a maximum likelihood phylogeny and integrates the sequences into the existing phylogenetic trees.&nbsp;<span>For more information click&nbsp;</span><a href="https://bioinformatics.cvr.ac.uk/victree_web/" target="_blank">here</a>.</p></div></div></div></div></div><div><div><div><div><div><h3><a href="https://bioinformatics.cvr.ac.uk/software/viral-host-predictor/" target="" title="Viral Host Predictor">Viral Host Predictor</a></h3></div><div><p>Viral Host Predictor provides a fast and simple way to predict the hosts and vectors of RNA viruses from viral sequences.</p></div><div><h3><a href="https://github.com/salvocamiolo/GRACy/releases/tag/v0.4.4" target="_blank" title="GRACy">GRACy</a></h3></div><div><p>GRACy is a bioinformatic tool designed for the analysis of Illumina data originating from human cytomegalovirus samples. 
GRACy can be used to perform read quality filtering, genotyping, de novo assembly, variant detection, annotation and data submission to public databases.</p></div><div><h3><a href="https://github.com/salvocamiolo/LoReTTA/releases/tag/v0.1" target="_blank" title="LoReTTA">LoReTTA</a></h3></div><div><p>LoReTTA (Long Read Template Targeted Assembler) is a reference-assisted de novo assembler specifically designed to deal with PacBio reads generated from viral genomes.&nbsp;</p></div><div><h3><a href="https://bioinformatics.cvr.ac.uk/software/bingleseq/" target="" title="BingleSeq">BingleSeq</a></h3></div><div><p>BingleSeq is an R package that enables user-friendly analysis of count tables obtained by both bulk RNA-Seq and single-cell RNA-Seq protocols. The development of BingleSeq focused on providing a flexible and intuitive user experience.</p></div></div></div></div></div>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42477/hifiasm-a-haplotype-resolved-assembler-for-accurate-hifi-reads</guid>
	<pubDate>Thu, 24 Dec 2020 10:03:36 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42477/hifiasm-a-haplotype-resolved-assembler-for-accurate-hifi-reads</link>
	<title><![CDATA[Hifiasm: a haplotype-resolved assembler for accurate Hifi reads]]></title>
	<description><![CDATA[<p><span>Hifiasm is a fast haplotype-resolved de novo assembler for PacBio Hifi reads. It can assemble a human genome in several hours and works with the California redwood genome, one of the most complex genomes sequenced so far. Hifiasm can produce primary/alternate assemblies of quality competitive with the best assemblers. It also introduces a new graph binning algorithm and achieves the best haplotype-resolved assembly given trio data.</span></p><p>Address of the bookmark: <a href="https://github.com/chhylp123/hifiasm" rel="nofollow">https://github.com/chhylp123/hifiasm</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

</channel>
</rss>