<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/33869?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/related/33869?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44375/phyloherb-a-high%E2%80%90throughput-phylogenomic-pipeline-for-processing-genome-skimming-data</guid>
	<pubDate>Wed, 06 Sep 2023 00:14:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44375/phyloherb-a-high%E2%80%90throughput-phylogenomic-pipeline-for-processing-genome-skimming-data</link>
	<title><![CDATA[PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data]]></title>
	<description><![CDATA[<p dir="auto"><span>Phylo</span>genomic Analysis Pipeline for&nbsp;<span>Herb</span>arium Specimens</p>
<p dir="auto"><span>What is PhyloHerb</span>: PhyloHerb is a wrapper program to process&nbsp;<span>genome skimming</span>&nbsp;data collected from plant materials. Its outputs include plastid genome (plastome) assemblies, mitochondrial genome assemblies, nuclear ribosomal DNAs (NTS+ETS+18S+ITS1+5.8S+ITS2+28S), alignments of gene and intergenic regions, and a species tree. It is designed as a high-throughput program that can handle lower-quality data. Examples include&nbsp;<span>low-coverage (5x cpDNA) plastome phylogeny, recycling plastid genes from target enrichment data, retrieving low-copy nuclear genes from medium coverage (5x nucDNA) genome skimming</span>.</p>
<p dir="auto"><span>License</span>: GNU General Public License</p>
<p dir="auto"><span>Citation</span>:</p>
<ul dir="auto">
<li>Cai, Liming, Hongrui Zhang, and Charles C. Davis. 2022. PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome‐skimming data. Applications in Plant Sciences 10(3): 1&ndash;9.&nbsp;<a href="https://doi.org/10.1002/aps3.11475">https://doi.org/10.1002/aps3.11475</a></li>
</ul><p>Address of the bookmark: <a href="https://github.com/lmcai/PhyloHerb/" rel="nofollow">https://github.com/lmcai/PhyloHerb/</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</guid>
	<pubDate>Fri, 04 Oct 2024 02:45:06 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/44672/libraries-or-management-tools-for-high-throughput-sequencing-data</link>
	<title><![CDATA[Libraries or management tools for high throughput sequencing data]]></title>
	<description><![CDATA[<ul>
<li><a href="http://gatb.inria.fr/"><span>GATB</span></a>&nbsp;Library.&nbsp;The&nbsp;<span>Genome Analysis Toolbox with de-Bruijn graph.&nbsp;</span>Many of the tools developed by the GenScale team are based on this library.<br />These methods enable the analysis of data sets of any size on multi-core desktop computers, including very large read sets from any kind of organism, such as bacteria, plants, and animals, and even complex samples (<em>e.g.</em>&nbsp;metagenomes). Among them are (the full list is available here:&nbsp;<a href="https://gatb.inria.fr/software/">https://gatb.inria.fr/software/</a>):</li>
<li><a href="https://github.com/morispi/LRez"><span>LRez</span></a>: C++ Library and toolkit for the barcode-based management and indexation of linked-read datasets.</li>
</ul><h2>Variant calling and/or genotyping</h2><ul>
<li><a href="https://gatb.inria.fr/software/discosnp/" title="DiscoSNP">DiscoSNP++ and&nbsp;discoSnpRAD</a>: Reference-free small variant discovery (SNPs and indels)</li>
<li><a href="https://gatb.inria.fr/software/mind-the-gap/" title="MindTheGap">MindTheGap</a>: Detection and assembly of large insertion variants</li>
<li><a href="https://gatb.inria.fr/software/takeabreak/" title="TakeABreak">TakeABreak</a>:&nbsp;reference-free inversion discovery tool</li>
<li><a href="https://github.com/llecompte/SVJedi">SVJedi</a>: Structural Variant genotyper with long read data</li>
<li><a href="https://github.com/SandraLouise/SVJedi-graph">SVJedi-graph</a>: Structural Variant genotyper with long read data using a variation graph</li>
</ul><h2>Sequence assembly</h2><ul>
<li><a href="https://github.com/cguyomar/MinYS">MinYS</a>: reference-guided genome assembly in metagenomics data</li>
<li><a href="https://github.com/anne-gcd/MTG-Link">MTG-link</a>: local assembly tool for linked-read data</li>
<li><a href="https://gatb.inria.fr/software/minia/" title="Minia">Minia</a>: De novo short read assembler</li>
<li><a href="https://gatb.inria.fr/de-novo-genome-assembly/">de-novo pipeline</a>:&nbsp;<em>de-novo</em>&nbsp;assembly pipeline (error correction / contigs / scaffolding) for genomes and meta-genomes</li>
<li><a href="https://gatb.inria.fr/software/mapsembler/" title="Mapsembler2">Mapsembler2</a>: Targeted assembly (not maintained)</li>
</ul><h2>Managing k-mers &amp; indexation</h2><ul>
<li><a href="https://github.com/lrobidou/findere">findere</a>:&nbsp;simple strategy for speeding up queries and for reducing false positive calls from any Approximate Membership Query data structure.
<ul>
<li><a href="https://github.com/lrobidou/fimpera">fimpera</a>&nbsp;extends findere by adding abundance information.</li>
</ul>
</li>
<li><a href="https://github.com/tlemane/kmtricks">kmtricks</a>:&nbsp;modular tool suite for counting kmers, and constructing Bloom filters or kmer matrices, for large collections of sequencing data.</li>
<li><a href="https://github.com/tlemane/kmindex">kmindex</a>&nbsp;is a tool for indexing and querying sequencing samples. It is built on top of kmtricks.</li>
<li><a href="https://github.com/pierrepeterlongo/back_to_sequences">back to sequences</a>: Find sequences (reads, unitigs, genes) related to a set of kmers in large datasets, in a matter of seconds.</li>
<li><a href="https://github.com/vicLeva/bqf">Backpack Quotient Filter</a>:&nbsp;k-mer indexing data structure with abundance</li>
<li><a href="http://github.com/GATB/rconnector">short read connector</a>:&nbsp;Detect similar reads in potentially large read sets</li>
<li><a href="https://gatb.inria.fr/software/dsk/" title="DSK">DSK</a>:&nbsp;Count k-mers in sequences</li>
</ul><h2>Pangenome graph manipulation</h2><ul>
<li><a href="https://github.com/Tharos-ux/pancat">Pancat</a>: Pangenome Comparison and Analysis Toolkit</li>
<li><a href="https://pypi.org/project/gfagraphs/">GFAGraphs</a>: a Python library to handle pangenome graph files in GFA format.</li>
</ul><h2>Comparative metagenomics with k-mers</h2><ul>
<li><a href="https://github.com/GATB/simka">Simka and SimkaMin</a>:&nbsp;Comparative metagenomics for large-scale datasets</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/compreads-metagenomic-data-analysis/">Compareads &amp; Commet</a>:&nbsp;comparison of metagenomic datasets</li>
</ul><h2>Species and bacterial strains identification</h2><ul>
<li><a href="https://github.com/gsiekaniec/ORI">ORI</a>: software using long nanopore reads to identify bacteria present in a sample at the strain level</li>
<li><a href="https://github.com/kevsilva/StrainFLAIR">StrainFLAIR</a>:&nbsp;STRAIN-level proFiLing using vArIation gRaph</li>
</ul><h2>General-purpose sequencing data manipulation</h2><ul>
<li><a href="https://team.inria.fr/genscale/ngs-software/gassst/">GASSST</a>:&nbsp;long read mapper</li>
<li><a href="https://gatb.inria.fr/software/leon/" title="Leon">Leon</a>: short read compressor (now included in GATB-core)</li>
<li><a href="https://gatb.inria.fr/software/bloocoo/" title="Bloocoo">Bloocoo</a>:&nbsp;short read corrector</li>
<li><a href="https://github.com/GATB/bcalm">BCALM</a>:&nbsp;Construct compacted de Bruijn graphs (unitigs)</li>
</ul><h2>Protein Structure</h2><ul>
<li><a href="https://team.inria.fr/genscale/protein-structure/a-purva-contact-map-overlap-solver/">A_Purva</a>:&nbsp;Contact Map Overlap solver</li>
<li><a href="https://team.inria.fr/genscale/protein-structure/md-jeep-distance-geomtry-solver/">MD-Jeep</a>:&nbsp;Distance Geometry solver</li>
<li><a href="https://team.inria.fr/genscale/csa-comparative-structural-alignment/">CSA</a>:&nbsp;Comparative Structural Alignment</li>
</ul><h2>Workflow</h2><ul>
<li><a href="https://team.inria.fr/genscale/workflows/slicee/">SLICEE</a>:&nbsp;parallel execution of bioinformatics workflows</li>
</ul><h2>Comparative Genomics</h2><ul>
<li><a href="https://team.inria.fr/genscale/comparative-genomics/cassis/">CASSIS</a>:&nbsp;detection of rearrangement breakpoints</li>
<li><a href="https://team.inria.fr/genscale/high-throughput-sequence-analysis/plast-intensive-sequence-comparison/">PLAST</a>:&nbsp;intensive bank-to-bank sequence comparison</li>
<li><a href="https://github.com/stephanierobin/DrjBreakpointFinder">DRJBreakpointFinder</a>: detection and precise localization of excision sites in proviral segments</li>
</ul>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38634/eyechrom-visualizing-chromosome-count-data-from-plants</guid>
	<pubDate>Tue, 08 Jan 2019 10:20:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38634/eyechrom-visualizing-chromosome-count-data-from-plants</link>
	<title><![CDATA[EyeChrom: Visualizing Chromosome Count Data From Plants]]></title>
	<description><![CDATA[<p><span>Its goal is to show chromosomal data per genus. Select a genus, and the plot will show the records found for it in the Chromosome Counts Database. Note: report issues via GitHub: github.com/roszenil/CCDBcurator and github.com/RodrigoRivero/EyeChrom</span></p>
<p>https://bsapubs.onlinelibrary.wiley.com/doi/pdf/10.1002/aps3.1207</p><p>Address of the bookmark: <a href="http://eyechrom.com:3838/EyeChrom/" rel="nofollow">http://eyechrom.com:3838/EyeChrom/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41328/deephic-a-generative-adversarial-network-for-enhancing-hi-c-data-resolution</guid>
	<pubDate>Tue, 03 Mar 2020 01:12:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41328/deephic-a-generative-adversarial-network-for-enhancing-hi-c-data-resolution</link>
	<title><![CDATA[DeepHiC: A Generative Adversarial Network for Enhancing Hi-C Data Resolution]]></title>
	<description><![CDATA[<p><strong>DeepHiC</strong> is a GAN-based model for enhancing Hi-C data resolution. We developed this server to help researchers enhance their own low-resolution data in a few clicks. <em>Ab initio</em> training can be performed with our published <a href="https://github.com/omegahh/DeepHiC">code</a>. We provide trained models for various depths of low-coverage Hi-C sequencing data. The depth of the input data is estimated by comparing its distribution with those of the downsampled Hi-C data we used in training.</p><p>Address of the bookmark: <a href="http://sysomics.com/deephic" rel="nofollow">http://sysomics.com/deephic</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43374/reference-sequence-resource</guid>
	<pubDate>Wed, 15 Sep 2021 21:15:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43374/reference-sequence-resource</link>
	<title><![CDATA[Reference Sequence Resource!]]></title>
	<description><![CDATA[<p><span>The ENCODE project uses Reference Genomes from&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/genome/browse/reference/">NCBI</a><span>&nbsp;or&nbsp;</span><a href="http://hgdownload.cse.ucsc.edu/downloads.html">UCSC</a><span>&nbsp;to provide a consistent framework for mapping high-throughput sequencing data.&nbsp;In general, ENCODE data are mapped consistently to 2 human (GRCh38, hg19) and 2 mouse (mm9/mm10) genomes for historical comparability.&nbsp;</span><em>Drosophila melanogaster</em><span>&nbsp;experiments are mapped to either dm3 or dm6 and&nbsp;</span><em>Caenorhabditis elegans&nbsp;</em><span>experiments are mapped to ce10 or ce11.</span></p><p>Address of the bookmark: <a href="https://www.encodeproject.org/data-standards/reference-sequences/" rel="nofollow">https://www.encodeproject.org/data-standards/reference-sequences/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</guid>
	<pubDate>Thu, 02 Jan 2025 11:26:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</link>
	<title><![CDATA[Large Language Models in Bioinformatics: Transforming Data Analysis and Interpretation]]></title>
	<description><![CDATA[<p>The integration of artificial intelligence (AI) into bioinformatics has ushered in a new era of computational biology. Among the most transformative advancements are large language models (LLMs), such as GPT and BERT, which leverage deep learning to process and interpret vast amounts of text data. These models are reshaping bioinformatics by enhancing data analysis, hypothesis generation, and literature mining.</p><h3>Understanding Large Language Models</h3><p>LLMs are AI systems trained on extensive datasets of natural language. Their ability to model context, identify patterns, and generate coherent language has proven invaluable across domains, including bioinformatics. By fine-tuning these models on biological datasets, researchers can unlock insights into molecular biology, systems biology, and beyond.</p><h3>Key Applications of LLMs in Bioinformatics</h3><h4>1. <strong>Annotating Biological Data</strong></h4><p>Annotating genomic and proteomic data is fundamental yet labor-intensive. LLMs streamline this process by extracting functional annotations from literature and databases, predicting gene and protein functions, and providing automated insights.</p><h4>2. <strong>Mining Scientific Literature</strong></h4><p>The exponential growth of publications presents a challenge for researchers to stay updated. LLMs can process large volumes of text to extract key findings, summarize papers, and identify trends, thereby facilitating efficient literature reviews.</p><h4>3. <strong>Predicting Gene and Protein Functions</strong></h4><p>By leveraging sequence data and annotations, LLMs can predict the functions of uncharacterized genes and proteins. This capability is particularly useful for studying non-model organisms and orphan genes.</p><h4>4. 
<strong>Drug Discovery and Repurposing</strong></h4><p>LLMs enable pattern recognition across chemical, genomic, and clinical datasets, identifying novel drug candidates and repurposing existing drugs for new therapeutic targets. They can simulate interactions between drugs and biological molecules, accelerating the discovery pipeline.</p><h4>5. <strong>Generating Hypotheses for Research</strong></h4><p>LLMs analyze complex datasets to propose testable hypotheses. For example, they can predict protein-protein interactions, identify regulatory motifs, or model evolutionary processes in genomes.</p><h3>Advantages of LLMs in Bioinformatics</h3><ul>
<li>
<p><strong>Scalability:</strong> LLMs process massive datasets rapidly, reducing the time required for data analysis.</p>
</li>
<li>
<p><strong>Versatility:</strong> These models adapt to diverse bioinformatics tasks, from genomic annotation to network analysis.</p>
</li>
<li>
<p><strong>Contextual Insights:</strong> By synthesizing information across disparate datasets, LLMs provide integrative insights into biological systems.</p>
</li>
</ul><h3>Challenges in Applying LLMs</h3><p>Despite their promise, LLMs face limitations:</p><ul>
<li>
<p><strong>Data Quality and Bias:</strong> Inaccurate or biased datasets can affect model predictions, necessitating rigorous data curation.</p>
</li>
<li>
<p><strong>Interpretability:</strong> Understanding the decision-making process of LLMs remains a critical challenge, especially in high-stakes fields like genomics and medicine.</p>
</li>
<li>
<p><strong>Resource Intensity:</strong> Training and deploying LLMs require substantial computational power, which can limit accessibility.</p>
</li>
<li>
<p><strong>Ethical Concerns:</strong> Handling sensitive genomic data raises privacy and security issues, emphasizing the need for ethical guidelines.</p>
</li>
</ul><h3>Future Prospects</h3><p>The continued development of LLMs tailored for bioinformatics promises exciting advancements. Specialized models trained on omics data, open-access platforms, and interdisciplinary collaborations will expand the utility of LLMs. Moreover, integrating LLMs with other AI technologies, such as graph neural networks and reinforcement learning, can unlock deeper biological insights.</p><h3>Conclusion</h3><p>Large language models are revolutionizing bioinformatics by addressing longstanding challenges in data annotation, literature mining, and function prediction. Their ability to analyze complex biological datasets efficiently positions them as indispensable tools for modern research. As bioinformatics embraces AI, the synergy between LLMs and biological sciences holds the potential to unravel the complexities of life with unprecedented precision and scale.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/924/try-r-online</guid>
	<pubDate>Tue, 16 Jul 2013 06:15:11 -0500</pubDate>
	<link>https://bioinformaticsonline.com/news/view/924/try-r-online</link>
	<title><![CDATA[Try R Online]]></title>
	<description><![CDATA[<p>One of the best R tutorial websites, which provides an online interactive interface for trying and learning the R language without any hassle.</p><p>Link @ http://tryr.codeschool.com/</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/3046/r-and-bioconductor-tutorial</guid>
	<pubDate>Fri, 23 Aug 2013 08:23:59 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/3046/r-and-bioconductor-tutorial</link>
	<title><![CDATA[R and Bioconductor Tutorial]]></title>
	<description><![CDATA[<p>This tutorial is intended to introduce users quickly to the basics of R, focusing on a few common tasks that biologists need for basic analysis: loading a table, plotting some graphs, and performing some basic statistics. More extensive tutorials can be found on the project website and via Bioconductor (not covered here).</p>
<p>If you find new tutorial pages, you can add the links in the comments.</p><p>Address of the bookmark: <a href="http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual" rel="nofollow">http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual</a></p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/11030/r-programming-and-jobs-website</guid>
	<pubDate>Sun, 25 May 2014 14:43:57 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/11030/r-programming-and-jobs-website</link>
	<title><![CDATA[R programming and Jobs website]]></title>
	<description><![CDATA[<p>Welcome to the R Jobs section of ProgrammingR.com. If your organization has an R employment opportunity that you would like to have posted here, submit it via the <a href="http://www.programmingr.com/contact" title="contact page">contact page</a>. Prospective employees: use the contact information provided in the position listing to apply or contact the hiring organization.</p><p>Address of the bookmark: <a href="http://www.programmingr.com/category/stype/r-job-listings/" rel="nofollow">http://www.programmingr.com/category/stype/r-job-listings/</a></p>]]></description>
	<dc:creator>Pragati Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/news/view/19087/dcgor</guid>
	<pubDate>Sat, 08 Nov 2014 14:54:28 -0600</pubDate>
	<link>https://bioinformaticsonline.com/news/view/19087/dcgor</link>
	<title><![CDATA[dcGOR]]></title>
	<description><![CDATA[<p>An R package for analysing ontologies and protein domain annotations has been published in PLoS Computational Biology (http://dx.doi.org/10.1371/journal.pcbi.1003929). The package is distributed via CRAN (http://cran.r-project.org/package=dcGOR) and is also available on GitHub for version control.<br /><br />The dedicated website is at http://supfam.org/dcGOR, which also provides several demos:<br /><br />1. Analysing SCOP domains: http://supfam.org/dcGOR/demo-Fang.html<br /><br />2. Analysing Pfam domains: http://supfam.org/dcGOR/demo-Basu.html<br /><br />3. Analysing InterPro domains: http://supfam.org/dcGOR/demo-Customisation.html</p>]]></description>
	<dc:creator>Martin Jones</dc:creator>
</item>

</channel>
</rss>