<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/36271?offset=80</link>
	<atom:link href="https://bioinformaticsonline.com/related/36271?offset=80" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/38634/eyechrom-visualizing-chromosome-count-data-from-plants</guid>
	<pubDate>Tue, 08 Jan 2019 10:20:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/38634/eyechrom-visualizing-chromosome-count-data-from-plants</link>
	<title><![CDATA[EyeChrom: Visualizing Chromosome Count Data From Plants]]></title>
	<description><![CDATA[<p><span>Its goal is to show chromosomal data per genus. Select a genus, and the plot will show the records found for it in the Chromosome Counts Database. Note: report issues via GitHub: github.com/roszenil/CCDBcurator and github.com/RodrigoRivero/EyeChrom</span></p>
<p>https://bsapubs.onlinelibrary.wiley.com/doi/pdf/10.1002/aps3.1207</p><p>Address of the bookmark: <a href="http://eyechrom.com:3838/EyeChrom/" rel="nofollow">http://eyechrom.com:3838/EyeChrom/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41328/deephic-a-generative-adversarial-network-for-enhancing-hi-c-data-resolution</guid>
	<pubDate>Tue, 03 Mar 2020 01:12:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41328/deephic-a-generative-adversarial-network-for-enhancing-hi-c-data-resolution</link>
	<title><![CDATA[DeepHiC: A Generative Adversarial Network for Enhancing Hi-C Data Resolution]]></title>
	<description><![CDATA[<p><strong>DeepHiC</strong> is a GAN-based model for enhancing Hi-C data resolution. We developed this server to help researchers enhance their own low-resolution data in a few clicks. <em>Ab initio</em> training can be performed using our published <a href="https://github.com/omegahh/DeepHiC">code</a>. We provide trained models for various depths of low-coverage Hi-C sequencing data. The depth of the input data is estimated by comparing its distribution with those of the downsampled Hi-C data we used in training.</p><p>Address of the bookmark: <a href="http://sysomics.com/deephic" rel="nofollow">http://sysomics.com/deephic</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43374/reference-sequence-resource</guid>
	<pubDate>Wed, 15 Sep 2021 21:15:22 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43374/reference-sequence-resource</link>
	<title><![CDATA[Reference Sequence Resource!]]></title>
	<description><![CDATA[<p><span>The ENCODE project uses reference genomes from&nbsp;</span><a href="http://www.ncbi.nlm.nih.gov/genome/browse/reference/">NCBI</a><span>&nbsp;or&nbsp;</span><a href="http://hgdownload.cse.ucsc.edu/downloads.html">UCSC</a><span>&nbsp;to provide a consistent framework for mapping high-throughput sequencing data.&nbsp;In general, ENCODE data are mapped consistently to 2 human (GRCh38, hg19) and 2 mouse (mm9/mm10) genomes for historical comparability.&nbsp;</span><em>Drosophila melanogaster</em><span>&nbsp;experiments are mapped to either dm3 or dm6, and&nbsp;</span><em>Caenorhabditis elegans&nbsp;</em><span>experiments are mapped to ce10 or ce11.</span></p><p>Address of the bookmark: <a href="https://www.encodeproject.org/data-standards/reference-sequences/" rel="nofollow">https://www.encodeproject.org/data-standards/reference-sequences/</a></p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</guid>
	<pubDate>Thu, 02 Jan 2025 11:26:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</link>
	<title><![CDATA[Large Language Models in Bioinformatics: Transforming Data Analysis and Interpretation]]></title>
	<description><![CDATA[<p>The integration of artificial intelligence (AI) into bioinformatics has ushered in a new era of computational biology. Among the most transformative advancements are large language models (LLMs), such as GPT and BERT, which leverage deep learning to process and interpret vast amounts of text data. These models are reshaping bioinformatics by enhancing data analysis, hypothesis generation, and literature mining.</p><h3>Understanding Large Language Models</h3><p>LLMs are AI systems trained on extensive datasets of natural language. Their ability to model context, identify patterns, and generate coherent language has proven invaluable across domains, including bioinformatics. By fine-tuning these models on biological datasets, researchers can unlock insights into molecular biology, systems biology, and beyond.</p><h3>Key Applications of LLMs in Bioinformatics</h3><h4>1. <strong>Annotating Biological Data</strong></h4><p>Annotating genomic and proteomic data is fundamental yet labor-intensive. LLMs streamline this process by extracting functional annotations from literature and databases, predicting gene and protein functions, and providing automated insights.</p><h4>2. <strong>Mining Scientific Literature</strong></h4><p>The exponential growth of publications presents a challenge for researchers to stay updated. LLMs can process large volumes of text to extract key findings, summarize papers, and identify trends, thereby facilitating efficient literature reviews.</p><h4>3. <strong>Predicting Gene and Protein Functions</strong></h4><p>By leveraging sequence data and annotations, LLMs can predict the functions of uncharacterized genes and proteins. This capability is particularly useful for studying non-model organisms and orphan genes.</p><h4>4. 
<strong>Drug Discovery and Repurposing</strong></h4><p>LLMs enable pattern recognition across chemical, genomic, and clinical datasets, identifying novel drug candidates and repurposing existing drugs for new therapeutic targets. They can simulate interactions between drugs and biological molecules, accelerating the discovery pipeline.</p><h4>5. <strong>Generating Hypotheses for Research</strong></h4><p>LLMs analyze complex datasets to propose testable hypotheses. For example, they can predict protein-protein interactions, identify regulatory motifs, or model evolutionary processes in genomes.</p><h3>Advantages of LLMs in Bioinformatics</h3><ul>
<li>
<p><strong>Scalability:</strong> LLMs process massive datasets rapidly, reducing the time required for data analysis.</p>
</li>
<li>
<p><strong>Versatility:</strong> These models adapt to diverse bioinformatics tasks, from genomic annotation to network analysis.</p>
</li>
<li>
<p><strong>Contextual Insights:</strong> By synthesizing information across disparate datasets, LLMs provide integrative insights into biological systems.</p>
</li>
</ul><h3>Challenges in Applying LLMs</h3><p>Despite their promise, LLMs face limitations:</p><ul>
<li>
<p><strong>Data Quality and Bias:</strong> Inaccurate or biased datasets can affect model predictions, necessitating rigorous data curation.</p>
</li>
<li>
<p><strong>Interpretability:</strong> Understanding the decision-making process of LLMs remains a critical challenge, especially in high-stakes fields like genomics and medicine.</p>
</li>
<li>
<p><strong>Resource Intensity:</strong> Training and deploying LLMs require substantial computational power, which can limit accessibility.</p>
</li>
<li>
<p><strong>Ethical Concerns:</strong> Handling sensitive genomic data raises privacy and security issues, emphasizing the need for ethical guidelines.</p>
</li>
</ul><h3>Future Prospects</h3><p>The continued development of LLMs tailored for bioinformatics promises exciting advancements. Specialized models trained on omics data, open-access platforms, and interdisciplinary collaborations will expand the utility of LLMs. Moreover, integrating LLMs with other AI technologies, such as graph neural networks and reinforcement learning, can unlock deeper biological insights.</p><h3>Conclusion</h3><p>Large language models are revolutionizing bioinformatics by addressing longstanding challenges in data annotation, literature mining, and function prediction. Their ability to analyze complex biological datasets efficiently positions them as indispensable tools for modern research. As bioinformatics embraces AI, the synergy between LLMs and biological sciences holds the potential to unravel the complexities of life with unprecedented precision and scale.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35907/alienness-rapid-detection-of-candidate-horizontal-gene-transfers-across-the-tree-of-life</guid>
	<pubDate>Mon, 12 Mar 2018 09:24:40 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35907/alienness-rapid-detection-of-candidate-horizontal-gene-transfers-across-the-tree-of-life</link>
	<title><![CDATA[Alienness: Rapid Detection of Candidate Horizontal Gene Transfers across the Tree of Life]]></title>
	<description><![CDATA[<p><span>Horizontal gene transfer (HGT) is the transmission of genes between organisms by means other than parent-to-offspring inheritance. While it is prevalent in prokaryotes, HGT is less frequent in eukaryotes, particularly in Metazoa. Here, we propose Alienness, a taxonomy-aware web application available at&nbsp;</span>http://alienness.sophia.inra.fr</p>
<p>http://www.mdpi.com/2073-4425/8/10/248</p><p>Address of the bookmark: <a href="http://alienness.sophia.inra.fr/cgi/index.cgi" rel="nofollow">http://alienness.sophia.inra.fr/cgi/index.cgi</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37241/remilo-reference-assisted-misassembly-detection-algorithm-using-short-and-long-reads</guid>
	<pubDate>Fri, 06 Jul 2018 04:27:49 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37241/remilo-reference-assisted-misassembly-detection-algorithm-using-short-and-long-reads</link>
	<title><![CDATA[ReMILO: reference assisted misassembly detection algorithm using short and long reads.]]></title>
	<description><![CDATA[ReMILO is a reference-assisted misassembly detection algorithm that uses both short reads and PacBio SMRT long reads. ReMILO aligns the initial short reads to both the contigs and the reference genome, and then constructs a novel data structure called a red-black multipositional de Bruijn graph to detect misassemblies. In addition, ReMILO aligns the contigs to the long reads and finds their differences from the long reads to detect more misassemblies.<p>Address of the bookmark: <a href="https://github.com/songc001/remilo" rel="nofollow">https://github.com/songc001/remilo</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41969/shadowcaster-a-hybrid-approach-for-the-detection-of-horizontal-gene-transfer-events-in-prokaryotes</guid>
	<pubDate>Tue, 14 Jul 2020 06:42:10 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41969/shadowcaster-a-hybrid-approach-for-the-detection-of-horizontal-gene-transfer-events-in-prokaryotes</link>
	<title><![CDATA[ShadowCaster: a hybrid approach for the detection of horizontal gene transfer events in prokaryotes]]></title>
	<description><![CDATA[<p><span>ShadowCaster implements an evolutionary model to calculate Bayesian likelihoods for each &lsquo;alien gene&rsquo; with an unusual sequence composition relative to the host genome background, in order to detect HGT events in prokaryotes.</span></p>
<p><a href="https://www.mdpi.com/2073-4425/11/7/756/htm">https://www.mdpi.com/2073-4425/11/7/756/htm</a></p>
<p><a href="https://shadowcaster.readthedocs.io/en/latest/">https://shadowcaster.readthedocs.io/en/latest/</a></p>
<p><a href="https://github.com/dani2s/ShadowCaster_testData">https://github.com/dani2s/ShadowCaster_testData</a></p><p>Address of the bookmark: <a href="https://github.com/dani2s/ShadowCaster" rel="nofollow">https://github.com/dani2s/ShadowCaster</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/36026/mmseqs20-ultra-fast-and-sensitive-protein-search-and-clustering-suite</guid>
	<pubDate>Thu, 22 Mar 2018 10:40:51 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/36026/mmseqs20-ultra-fast-and-sensitive-protein-search-and-clustering-suite</link>
	<title><![CDATA[MMseqs2.0: ultra fast and sensitive protein search and clustering suite]]></title>
	<description><![CDATA[<p>MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, macOS, and (as beta version, via Cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times the speed of BLAST, it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times the speed of PSI-BLAST.</p>
<p>The MMseqs2 user guide is available as a&nbsp;<a href="https://github.com/soedinglab/mmseqs2/wiki">GitHub Wiki</a>&nbsp;or as a&nbsp;<a href="https://mmseqs.com/latest/userguide.pdf">PDF file</a>&nbsp;(Thanks to&nbsp;<a href="https://github.com/jgm/pandoc">pandoc</a>!)</p>
<p>Please cite:&nbsp;<a href="https://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3988.html">Steinegger M and Soeding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, doi: 10.1038/nbt.3988 (2017)</a>.</p><p>Address of the bookmark: <a href="https://github.com/soedinglab/MMseqs2" rel="nofollow">https://github.com/soedinglab/MMseqs2</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44329/metabuli-%EB%B6%84%EB%A6%AC-improves-metagenomic-read-classification</guid>
	<pubDate>Sat, 03 Jun 2023 20:15:04 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44329/metabuli-%EB%B6%84%EB%A6%AC-improves-metagenomic-read-classification</link>
	<title><![CDATA[Metabuli 분리 improves metagenomic read classification]]></title>
	<description><![CDATA[<p><span>Metabuli 분리 improves metagenomic read classification through metamers, DNA-AA k-mers, to be sensitive and specific, recovering 99% and 98% of DNA or AA classifiers.</span></p>
<p>&nbsp;</p>
<p><span><span>Metabuli is a metagenomic classifier that jointly analyzes both DNA and amino acid (AA) sequences. DNA-based classifiers can make specific classifications, exploiting point mutations to distinguish close taxa. AA-based classifiers have higher sensitivity in detecting homology between query and reference sequences, leveraging the higher conservation of AA sequences. Metabuli combines the information of both sequence types using a novel k-mer structure,&nbsp;</span><em>metamer</em><span>, to enable both specific and sensitive characterization of metagenomic samples. In addition, it can classify reads against a database of any size as long as it fits on the hard disk.</span> </span></p><p>Address of the bookmark: <a href="https://github.com/steineggerlab/Metabuli" rel="nofollow">https://github.com/steineggerlab/Metabuli</a></p>]]></description>
	<dc:creator>Abhi</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/view/1926</guid>
	<pubDate>Sun, 11 Aug 2013 11:42:32 -0500</pubDate>
	<link>https://bioinformaticsonline.com/view/1926</link>
	<title><![CDATA[Want to know which genome assembler rules the world?]]></title>
	<description><![CDATA[<p><span><strong>Assemblathon 2</strong>: evaluating de novo methods of genome assembly&nbsp;</span></p><p><span><a href="http://www.gigasciencejournal.com/content/2/1/10/abstract">http://www.gigasciencejournal.com/content/2/1/10/abstract</a></span></p><p><span><a href="http://blogs.nature.com/news/2013/07/genome-assembly-contest-prompts-soul-searching.html">http://blogs.nature.com/news/2013/07/genome-assembly-contest-prompts-soul-searching.html</a></span></p><p><a href="http://assemblathon.org/post/44431915644/feedback-and-analysis-of-the-assemblathon-2-p">http://assemblathon.org/post/44431915644/feedback-and-analysis-of-the-assemblathon-2-p</a></p><p>&nbsp;</p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>

</channel>
</rss>