<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/40611?offset=30</link>
	<atom:link href="https://bioinformaticsonline.com/related/40611?offset=30" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35249/gpopsim-a-simulation-tool-for-whole-genome-genetic-data</guid>
	<pubDate>Wed, 17 Jan 2018 03:47:46 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35249/gpopsim-a-simulation-tool-for-whole-genome-genetic-data</link>
	<title><![CDATA[GPOPSIM: a simulation tool for whole-genome genetic data]]></title>
	<description><![CDATA[<p><span>GPOPSIM is a simulation tool for pedigree, phenotype, and genomic data, supporting a variety of population and genome structures and trait genetic architectures. It provides flexible parameter settings for users across a wide range of disciplines and, in particular, can simulate multiple genetically correlated traits with desired genetic parameters and underlying genetic architectures.</span></p><p>Address of the bookmark: <a href="https://github.com/SCAU-AnimalGenetics/GPOPSIM" rel="nofollow">https://github.com/SCAU-AnimalGenetics/GPOPSIM</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/37527/nanopack-visualizing-and-processing-long-read-sequencing-data</guid>
	<pubDate>Fri, 10 Aug 2018 18:41:34 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/37527/nanopack-visualizing-and-processing-long-read-sequencing-data</link>
	<title><![CDATA[NanoPack: visualizing and processing long-read sequencing data]]></title>
	<description><![CDATA[<p>The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License. The source code can be found at&nbsp;<a href="https://github.com/wdecoster/nanopack" target="">https://github.com/wdecoster/nanopack</a>, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at&nbsp;<a href="http://nanoplot.bioinf.be/" target="">http://nanoplot.bioinf.be</a>&nbsp;and command line tools.</p>
<p>&nbsp;https://academic.oup.com/bioinformatics/article/34/15/2666/4934939</p><p>Address of the bookmark: <a href="https://github.com/wdecoster/nanoQC" rel="nofollow">https://github.com/wdecoster/nanoQC</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42271/mcclintock-meta-pipeline-to-identify-transposable-element-insertions-using-next-generation-sequencing-data</guid>
	<pubDate>Tue, 27 Oct 2020 00:21:18 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42271/mcclintock-meta-pipeline-to-identify-transposable-element-insertions-using-next-generation-sequencing-data</link>
	<title><![CDATA[McClintock: Meta-pipeline to identify transposable element insertions using next generation sequencing data]]></title>
	<description><![CDATA[<p><span>McClintock (</span><a href="https://github.com/bergmanlab/mcclintock">https://github.com/bergmanlab/mcclintock</a><span>) is an integrated bioinformatics pipeline for the detection of transposable element (TE) insertions in whole-genome shotgun data. It automatically runs multiple TE detection methods and standardizes their output. The authors demonstrate the utility of McClintock by evaluating six TE detection methods using simulated and real genome data from the model microbial eukaryote&nbsp;</span><em>Saccharomyces cerevisiae</em><span>.&nbsp;</span></p><p>Address of the bookmark: <a href="https://github.com/bergmanlab/mcclintock" rel="nofollow">https://github.com/bergmanlab/mcclintock</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/44387/creating-genetic-maps-from-gbs-data</guid>
	<pubDate>Fri, 08 Sep 2023 06:31:24 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/44387/creating-genetic-maps-from-gbs-data</link>
	<title><![CDATA[Creating Genetic Maps from GBS data]]></title>
	<description><![CDATA[<p><span>A genetic map, as the name suggests, records the relative positions of specific sequences across the genome. There are various methods to generate one, but the most popular is to cross known parents and examine their progeny. A group of individuals of known ancestry created by such a cross is called a mapping population. Many types of mapping population exist. Here we will use data collected from a Recombinant Inbred Line (RIL) population (produced through selfing) to create a genetic map.</span></p><p>Address of the bookmark: <a href="https://bioinformaticsworkbook.org/dataAnalysis/GenomeAssembly/GeneticMaps/creating-genetic-maps.html" rel="nofollow">https://bioinformaticsworkbook.org/dataAnalysis/GenomeAssembly/GeneticMaps/creating-genetic-maps.html</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/41328/deephic-a-generative-adversarial-network-for-enhancing-hi-c-data-resolution</guid>
	<pubDate>Tue, 03 Mar 2020 01:12:47 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/41328/deephic-a-generative-adversarial-network-for-enhancing-hi-c-data-resolution</link>
	<title><![CDATA[DeepHiC: A Generative Adversarial Network for Enhancing Hi-C Data Resolution]]></title>
	<description><![CDATA[<p><strong>DeepHiC</strong> is a GAN-based model for enhancing Hi-C data resolution. We developed this server to help researchers enhance their own low-resolution data in a few clicks. <em>Ab initio</em> training can be performed using our published <a href="https://github.com/omegahh/DeepHiC">code</a>. We provide trained models for low-coverage Hi-C data at various sequencing depths. The depth of the input data is estimated by comparing its distribution with those of the downsampled Hi-C data we used in training.</p><p>Address of the bookmark: <a href="http://sysomics.com/deephic" rel="nofollow">http://sysomics.com/deephic</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/view/982</guid>
	<pubDate>Wed, 17 Jul 2013 15:25:09 -0500</pubDate>
	<link>https://bioinformaticsonline.com/view/982</link>
	<title><![CDATA[Is a reference genome necessary for gene expression studies in transcriptome sequencing or for variant discovery in genome sequencing?]]></title>
	<description><![CDATA[<p><span>Take plant genomes, for example, which are often too complex and too large for complete <em>de novo</em> assembly with current sequencing technology. What would be an alternative solution? Can we live in a reference-free world?</span></p>]]></description>
	<dc:creator>Rahul Agarwal</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29583/graph-genome-suite</guid>
	<pubDate>Fri, 28 Oct 2016 07:59:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29583/graph-genome-suite</link>
	<title><![CDATA[Graph Genome Suite]]></title>
	<description><![CDATA[<p><span>Seven Bridges is the biomedical data analysis company accelerating breakthroughs in genomics research for cancer, drug development and precision medicine. We build self-improving systems to analyze millions of genomes, including the&nbsp;</span><strong>Graph Genome Suite</strong><span>&nbsp;&mdash; the most advanced population genomics tools in the world.</span></p><p>Address of the bookmark: <a href="https://www.sbgenomics.com/graph/" rel="nofollow">https://www.sbgenomics.com/graph/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40613/genome-in-a-bottle-giab-consortium</guid>
	<pubDate>Sat, 25 Jan 2020 13:50:52 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40613/genome-in-a-bottle-giab-consortium</link>
	<title><![CDATA[Genome in a Bottle (GIAB) Consortium]]></title>
	<description><![CDATA[<p><span>The</span><a href="http://www.genomeinabottle.org/"> Genome in a Bottle (GIAB) Consortium</a><span> is a public-private-academic consortium hosted by </span><a href="http://www.nist.gov/" target="_blank">NIST</a><span> to develop the technical infrastructure (reference standards, reference methods, and reference data) to enable translation of whole human genome sequencing to clinical practice. </span></p>
<p><span><a href="https://www.nist.gov/news-events/news/2016/09/nist-releases-new-family-standardized-genomes">https://www.nist.gov/news-events/news/2016/09/nist-releases-new-family-standardized-genomes</a></span></p><p>Address of the bookmark: <a href="https://jimb.stanford.edu/giab/" rel="nofollow">https://jimb.stanford.edu/giab/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</guid>
	<pubDate>Thu, 02 Jan 2025 11:26:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</link>
	<title><![CDATA[Large Language Models in Bioinformatics: Transforming Data Analysis and Interpretation]]></title>
	<description><![CDATA[<p>The integration of artificial intelligence (AI) into bioinformatics has ushered in a new era of computational biology. Among the most transformative advancements are large language models (LLMs), such as GPT and BERT, which leverage deep learning to process and interpret vast amounts of text data. These models are reshaping bioinformatics by enhancing data analysis, hypothesis generation, and literature mining.</p><h3>Understanding Large Language Models</h3><p>LLMs are AI systems trained on extensive datasets of natural language. Their ability to model context, identify patterns, and generate coherent language has proven valuable across domains, including bioinformatics. By fine-tuning these models on biological datasets, researchers can gain insights into molecular biology, systems biology, and beyond.</p><h3>Key Applications of LLMs in Bioinformatics</h3><h4>1. <strong>Annotating Biological Data</strong></h4><p>Annotating genomic and proteomic data is fundamental yet labor-intensive. LLMs can streamline this process by extracting functional annotations from literature and databases and by predicting gene and protein functions.</p><h4>2. <strong>Mining Scientific Literature</strong></h4><p>The exponential growth of publications makes it hard for researchers to stay current. LLMs can process large volumes of text to extract key findings, summarize papers, and identify trends, thereby facilitating efficient literature reviews.</p><h4>3. <strong>Predicting Gene and Protein Functions</strong></h4><p>By leveraging sequence data and annotations, LLMs can predict the functions of uncharacterized genes and proteins. This capability is particularly useful for studying non-model organisms and orphan genes.</p>
<h4>4. <strong>Drug Discovery and Repurposing</strong></h4><p>LLMs enable pattern recognition across chemical, genomic, and clinical datasets, which helps identify novel drug candidates and repurpose existing drugs for new therapeutic targets. They can also help model interactions between drugs and biological molecules, accelerating the discovery pipeline.</p><h4>5. <strong>Generating Hypotheses for Research</strong></h4><p>LLMs analyze complex datasets to propose testable hypotheses. For example, they can predict protein-protein interactions, identify regulatory motifs, or model evolutionary processes in genomes.</p><h3>Advantages of LLMs in Bioinformatics</h3><ul>
<li>
<p><strong>Scalability:</strong> LLMs process massive datasets rapidly, reducing the time required for data analysis.</p>
</li>
<li>
<p><strong>Versatility:</strong> These models adapt to diverse bioinformatics tasks, from genomic annotation to network analysis.</p>
</li>
<li>
<p><strong>Contextual Insights:</strong> By synthesizing information across disparate datasets, LLMs provide integrative insights into biological systems.</p>
</li>
</ul><h3>Challenges in Applying LLMs</h3><p>Despite their promise, LLMs face limitations:</p><ul>
<li>
<p><strong>Data Quality and Bias:</strong> Inaccurate or biased datasets can affect model predictions, necessitating rigorous data curation.</p>
</li>
<li>
<p><strong>Interpretability:</strong> Understanding the decision-making process of LLMs remains a critical challenge, especially in high-stakes fields like genomics and medicine.</p>
</li>
<li>
<p><strong>Resource Intensity:</strong> Training and deploying LLMs require substantial computational power, which can limit accessibility.</p>
</li>
<li>
<p><strong>Ethical Concerns:</strong> Handling sensitive genomic data raises privacy and security issues, emphasizing the need for ethical guidelines.</p>
</li>
</ul><h3>Future Prospects</h3><p>The continued development of LLMs tailored for bioinformatics promises exciting advancements. Specialized models trained on omics data, open-access platforms, and interdisciplinary collaborations will expand the utility of LLMs. Moreover, integrating LLMs with other AI technologies, such as graph neural networks and reinforcement learning, can unlock deeper biological insights.</p><h3>Conclusion</h3><p>Large language models are revolutionizing bioinformatics by addressing longstanding challenges in data annotation, literature mining, and function prediction. Their ability to analyze complex biological datasets efficiently positions them as indispensable tools for modern research. As bioinformatics embraces AI, the synergy between LLMs and biological sciences holds the potential to unravel the complexities of life with unprecedented precision and scale.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/35292/pgap-x-extension-on-pan-genome-analysis-pipeline</guid>
	<pubDate>Tue, 23 Jan 2018 11:41:43 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/35292/pgap-x-extension-on-pan-genome-analysis-pipeline</link>
	<title><![CDATA[PGAP-X: Extension on pan-genome analysis pipeline]]></title>
	<description><![CDATA[<p>PGAP-X is a microbial comparative genomics analysis platform with a graphical interface. A series of algorithms and methods have been developed and integrated to analyze and visualize genome structure variation, gene distribution at different conservation levels, and genetic variation from a pan-genome perspective. Results from many other programs, including genome alignments and ortholog clusters, can also be loaded for further analysis or visualization in PGAP-X. The workflow and a feature snapshot of PGAP-X are shown in Fig. 1 and Fig. 2.</p>
<div><img src="https://pgapx.ybzhao.com/image/f1.jpg" alt="image" style="border: 0px;"></div>
<p>Address of the bookmark: <a href="https://pgapx.ybzhao.com/" rel="nofollow">https://pgapx.ybzhao.com/</a></p>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>

</channel>
</rss>