<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/30140?offset=200</link>
	<atom:link href="https://bioinformaticsonline.com/related/30140?offset=200" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32152/upsetr-shiny-app</guid>
	<pubDate>Fri, 14 Apr 2017 06:19:54 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32152/upsetr-shiny-app</link>
	<title><![CDATA[UpSetR Shiny App!]]></title>
	<description><![CDATA[<p>UpSetR generates static&nbsp;<a href="http://vcg.github.io/upset/?dataset=0&amp;duration=1000&amp;orderBy=subsetSize&amp;grouping=groupByIntersectionSize&amp;selection=">UpSet plots</a>. The UpSet technique visualizes set intersections in a matrix layout and introduces aggregates based on groupings and queries. The matrix layout enables the effective representation of associated data, such as the number of elements in the aggregates and intersections, as well as additional summary statistics derived from subset or element attributes.</p>
<h4>To begin, input your data using one of the three input styles.</h4>
<ol>
<li>"File" takes a correctly formatted .csv file.</li>
<li>"List" takes up to 6 different lists that contain unique elements, similar to those used in the web applications BioVenn&nbsp;<a href="http://www.biomedcentral.com/content/pdf/1471-2164-9-488.pdf">(Hulsen et al., 2008)</a>&nbsp;and jvenn&nbsp;<a href="http://www.biomedcentral.com/content/pdf/1471-2105-15-293.pdf">(Bardou et al., 2014)</a>.</li>
<li>"Expression" takes the input used by the venneuler R package&nbsp;<a href="https://cran.r-project.org/web/packages/venneuler/venneuler.pdf">(Wilkinson, 2015)</a>.</li>
</ol><p>Address of the bookmark: <a href="https://gehlenborglab.shinyapps.io/upsetr/" rel="nofollow">https://gehlenborglab.shinyapps.io/upsetr/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/28119/kraken-ultrafast-metagenomic-sequence-classification-using-exact-alignments</guid>
	<pubDate>Mon, 27 Jun 2016 11:01:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/28119/kraken-ultrafast-metagenomic-sequence-classification-using-exact-alignments</link>
	<title><![CDATA[Kraken: ultrafast metagenomic sequence classification using exact alignments]]></title>
	<description><![CDATA[<p>Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of <em>k</em>-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at <a href="http://ccb.jhu.edu/software/kraken/" target="pmc_ext">http://ccb.jhu.edu/software/kraken/</a>.</p>
<p>Krona</p>
<p>https://sourceforge.net/p/krona/home/krona/</p><p>Address of the bookmark: <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053813/" rel="nofollow">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053813/</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32483/cla-contig-layout-authenticator</guid>
	<pubDate>Fri, 05 May 2017 05:58:36 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32483/cla-contig-layout-authenticator</link>
	<title><![CDATA[CLA: Contig-Layout-Authenticator]]></title>
	<description><![CDATA[<p><span>To improve upon the shortcomings associated with the construction of draft genomes with Illumina paired-end sequencing, we developed Contig-Layout-Authenticator (CLA). The CLA pipeline can scaffold reference-sorted contigs based on paired reads, resulting in better assembled genomes. Moreover, CLA also hints at probable misassemblies and contaminations, for the users to cross-check before constructing the consensus draft. The CLA pipeline was designed and trained extensively on various bacterial genome datasets for the ordering and scaffolding of large repetitive contigs. The tool has been validated and compared favorably with other widely used scaffolding and ordering tools using both simulated and real sequence datasets. CLA is a user-friendly tool that requires a single command line input to generate ordered scaffolds.</span></p>
<p><span>Script&nbsp;https://sourceforge.net/projects/c-l-authenticator/files/</span></p><p>Address of the bookmark: <a href="http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0155459" rel="nofollow">http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0155459</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</guid>
	<pubDate>Thu, 02 Jan 2025 11:26:29 -0600</pubDate>
	<link>https://bioinformaticsonline.com/blog/view/44751/large-language-models-in-bioinformatics-transforming-data-analysis-and-interpretation</link>
	<title><![CDATA[Large Language Models in Bioinformatics: Transforming Data Analysis and Interpretation]]></title>
	<description><![CDATA[<p>The integration of artificial intelligence (AI) into bioinformatics has ushered in a new era of computational biology. Among the most transformative advancements are large language models (LLMs), such as GPT and BERT, which leverage deep learning to process and interpret vast amounts of text data. These models are reshaping bioinformatics by enhancing data analysis, hypothesis generation, and literature mining.</p><h3>Understanding Large Language Models</h3><p>LLMs are AI systems trained on extensive datasets of natural language. Their ability to model context, identify patterns, and generate coherent language has proven invaluable across domains, including bioinformatics. By fine-tuning these models on biological datasets, researchers can unlock insights into molecular biology, systems biology, and beyond.</p><h3>Key Applications of LLMs in Bioinformatics</h3><h4>1. <strong>Annotating Biological Data</strong></h4><p>Annotating genomic and proteomic data is fundamental yet labor-intensive. LLMs streamline this process by extracting functional annotations from literature and databases, predicting gene and protein functions, and providing automated insights.</p><h4>2. <strong>Mining Scientific Literature</strong></h4><p>The exponential growth of publications presents a challenge for researchers to stay updated. LLMs can process large volumes of text to extract key findings, summarize papers, and identify trends, thereby facilitating efficient literature reviews.</p><h4>3. <strong>Predicting Gene and Protein Functions</strong></h4><p>By leveraging sequence data and annotations, LLMs can predict the functions of uncharacterized genes and proteins. This capability is particularly useful for studying non-model organisms and orphan genes.</p><h4>4. 
<strong>Drug Discovery and Repurposing</strong></h4><p>LLMs enable pattern recognition across chemical, genomic, and clinical datasets, identifying novel drug candidates and repurposing existing drugs for new therapeutic targets. They can simulate interactions between drugs and biological molecules, accelerating the discovery pipeline.</p><h4>5. <strong>Generating Hypotheses for Research</strong></h4><p>LLMs analyze complex datasets to propose testable hypotheses. For example, they can predict protein-protein interactions, identify regulatory motifs, or model evolutionary processes in genomes.</p><h3>Advantages of LLMs in Bioinformatics</h3><ul>
<li>
<p><strong>Scalability:</strong> LLMs process massive datasets rapidly, reducing the time required for data analysis.</p>
</li>
<li>
<p><strong>Versatility:</strong> These models adapt to diverse bioinformatics tasks, from genomic annotation to network analysis.</p>
</li>
<li>
<p><strong>Contextual Insights:</strong> By synthesizing information across disparate datasets, LLMs provide integrative insights into biological systems.</p>
</li>
</ul><h3>Challenges in Applying LLMs</h3><p>Despite their promise, LLMs face limitations:</p><ul>
<li>
<p><strong>Data Quality and Bias:</strong> Inaccurate or biased datasets can affect model predictions, necessitating rigorous data curation.</p>
</li>
<li>
<p><strong>Interpretability:</strong> Understanding the decision-making process of LLMs remains a critical challenge, especially in high-stakes fields like genomics and medicine.</p>
</li>
<li>
<p><strong>Resource Intensity:</strong> Training and deploying LLMs require substantial computational power, which can limit accessibility.</p>
</li>
<li>
<p><strong>Ethical Concerns:</strong> Handling sensitive genomic data raises privacy and security issues, emphasizing the need for ethical guidelines.</p>
</li>
</ul><h3>Future Prospects</h3><p>The continued development of LLMs tailored for bioinformatics promises exciting advancements. Specialized models trained on omics data, open-access platforms, and interdisciplinary collaborations will expand the utility of LLMs. Moreover, integrating LLMs with other AI technologies, such as graph neural networks and reinforcement learning, can unlock deeper biological insights.</p><h3>Conclusion</h3><p>Large language models are revolutionizing bioinformatics by addressing longstanding challenges in data annotation, literature mining, and function prediction. Their ability to analyze complex biological datasets efficiently positions them as indispensable tools for modern research. As bioinformatics embraces AI, the synergy between LLMs and biological sciences holds the potential to unravel the complexities of life with unprecedented precision and scale.</p>]]></description>
	<dc:creator>LEGE</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29280/nemo-%E2%80%93-a-stochastic-individual-base-genetically-explicit-simulation-platform</guid>
	<pubDate>Sat, 01 Oct 2016 14:45:02 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29280/nemo-%E2%80%93-a-stochastic-individual-base-genetically-explicit-simulation-platform</link>
	<title><![CDATA[Nemo – A stochastic, individual-based, genetically explicit simulation platform]]></title>
	<description><![CDATA[<ul>
<li>
<p>A&nbsp;<strong>recombination map</strong>&nbsp;has been added for all multi-locus traits. The map positions (chromosomal) for neutral markers (e.g. SNPs) and loci under selection (QTLs, deleterious mutations, DMIs) can now be specified explicitly, or set at random. The map can hold an unlimited number of loci of different types jointly, at any recombination scale (cM or lower). The effects of linkage can thus be finely explored.</p>
</li>
<li>
<p>A new trait coding for (Bateson-)<strong>Dobzhansky-Muller incompatibility loci</strong>. Multiple haploid or diploid pairs of incompatible loci can be spread throughout the genome and affect individual fitness.</p>
</li>
<li>
<p><strong>Multi-type selection</strong>:&nbsp;<a href="http://nemo2.sourceforge.net/classIndividual.html" title="This class contains traits along with other individual information (sex, pedigree, etc. ).">Individual</a>&nbsp;fitness can be jointly determined by different types of loci under selection, such as QTLs coding for quantitative traits under spatially variable selection, universally deleterious mutations, and Dobzhansky-Muller incompatibility loci.</p>
</li>
<li>
<p><strong>An unlimited number of quantitative traits</strong>&nbsp;under different forms of selection can be modelled, based on universally pleiotropic loci with several bi- or multi-allelic models.</p>
</li>
<li>
<p><strong>Spatial and temporal variation of selection</strong>&nbsp;on quantitative traits is possible, modelling shifts of environmental conditions over time.</p>
</li>
<li>
<p>The dispersal matrix describing the movement of individuals among sub-populations can be replaced by a connectivity matrix and a reduced dispersal matrix describing migration only among the connected sub-populations. This offers a substantial gain in computing time and system memory when simulating very large grids.</p>
</li>
<li>
<p>Input parameters' arguments may be specified in separate files. This is particularly convenient when specifying large matrices.</p>
</li>
<li>
<p>Many adjustments have been made for refined control of the input of parameters and data output. See updates in the manual.</p>
</li>
</ul><p>Address of the bookmark: <a href="http://nemo2.sourceforge.net/index.html" rel="nofollow">http://nemo2.sourceforge.net/index.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29379/bbmap-help</guid>
	<pubDate>Mon, 10 Oct 2016 06:29:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29379/bbmap-help</link>
	<title><![CDATA[BBMap help]]></title>
	<description><![CDATA[<div>
<div>BBMAP <span> &bull; <span>a solution for everything</span></span></div>
<div>That content has been reformatted and is being expanded to include more information.</div>
</div>
<hr>
<p>Most programs in the BBMap suite share a common set of options, and the input/output format is chosen automatically from the file extension.</p>
<hr>
<h3>Using BBMap</h3>
<h4>Mapping Nanopore reads</h4>
<p>BBMap.sh has a read length cap of 6 kbp. Reads longer than this will be broken into 6 kbp pieces and mapped independently.</p>
<p>More at https://www.biostarhandbook.com/tools/bbmap/bbmap-help.html</p><p>Address of the bookmark: <a href="https://www.biostarhandbook.com/tools/bbmap/bbmap-help.html" rel="nofollow">https://www.biostarhandbook.com/tools/bbmap/bbmap-help.html</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/29574/beagle</guid>
	<pubDate>Thu, 27 Oct 2016 11:19:00 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/29574/beagle</link>
	<title><![CDATA[Beagle]]></title>
	<description><![CDATA[<p>Beagle is a software package that performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection.</p>
<p>Beagle version 4.1 has a more accurate genotype phasing algorithm and a very fast and accurate genotype imputation algorithm. Version 4.1 also has several changes to the command line arguments which are described in the&nbsp;<a href="http://faculty.washington.edu/browning/beagle/release_notes" target="_blank">release notes</a>. The "ped" argument has no effect in version 4.1. If your data contains nuclear families and you want to model the parent-offspring relationships when phasing genotypes, please use&nbsp;<a href="https://faculty.washington.edu/browning/beagle/b4_0.html">version 4.0</a>.</p>
<p>If you use Beagle 4.1 in a published analysis, please report the program version and cite the appropriate article.</p>
<p>The citation for Beagle's phasing algorithm is:</p>
<p>S R Browning and B L Browning (2007). Rapid and accurate haplotype phasing and missing data inference for whole genome association studies by use of localized haplotype clustering. Am J Hum Genet 81:1084-1097. <a href="http://dx.doi.org/doi:10.1086/521987" target="_blank">doi:10.1086/521987</a></p>
<p>The citation for Beagle's genotype imputation algorithm is:</p>
<p>B L Browning and S R Browning (2016). Genotype imputation with millions of reference samples. Am J Hum Genet 98:116-126. <a href="http://dx.doi.org/doi:10.1016/j.ajhg.2015.11.020" target="_blank">doi:10.1016/j.ajhg.2015.11.020</a></p>
<p>The citation for Beagle's IBD detection algorithm is:</p>
<p>B L Browning and S R Browning (2013). Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics 194(2):459-71. <a href="http://dx.doi.org/doi:10.1534/genetics.113.150029" target="_blank">doi:10.1534/genetics.113.150029</a></p><p>Address of the bookmark: <a href="http://faculty.washington.edu/browning/beagle/beagle.html" rel="nofollow">http://faculty.washington.edu/browning/beagle/beagle.html</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/29601/statistics-using-r-with-biological-examples</guid>
	<pubDate>Thu, 03 Nov 2016 04:55:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/file/view/29601/statistics-using-r-with-biological-examples</link>
	<title><![CDATA[Statistics Using R with Biological Examples]]></title>
	<description><![CDATA[<p>This book is a manifestation of my desire to teach researchers in biology a bit more about statistics than an ordinary introductory course covers and to introduce the utilization of R as a tool for analyzing their data. My goal is to reach those with little or no training in higher level statistics so that they can do more of their own data analysis, communicate more with statisticians, and appreciate the great potential statistics has to offer as a tool to answer biological questions. </p><p>This is necessary in light of the increasing use of higher level statistics in biomedical research. I hope it accomplishes this mission and encourage its free distribution and use as a course text or supplement.</p><p>K Seefeld, May 2007</p>]]></description>
	<dc:creator>Neel</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/29601" length="4581031" type="application/pdf" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/42559/sample-bandage-input-file-for-visual-analysis</guid>
	<pubDate>Wed, 06 Jan 2021 03:51:50 -0600</pubDate>
	<link>https://bioinformaticsonline.com/file/view/42559/sample-bandage-input-file-for-visual-analysis</link>
	<title><![CDATA[Sample bandage input file for visual analysis]]></title>
	<description><![CDATA[<p>Sample bandage input file for visual analysis ...</p>]]></description>
	<dc:creator>Jit</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/42559" length="112199" type="text/plain" />
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/31018/j-circos</guid>
	<pubDate>Fri, 17 Feb 2017 09:06:54 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/31018/j-circos</link>
	<title><![CDATA[J-Circos]]></title>
	<description><![CDATA[<p>J-Circos is an interactive visualization tool that can plot Circos figures, dynamically add data to a figure, and provide information for specific data points through mouse-hover display and zoom in/out functions. J-Circos is written in Java, which allows it to be used on most operating systems (Windows, macOS, Linux). Users can input data into J-Circos using flat data formats, as well as from the GUI. J-Circos will enable biologists to better study more complex chromosomal interactions and fusion transcripts that are otherwise difficult to visualize from next-generation sequencing data.</p><p>Address of the bookmark: <a href="http://www.australianprostatecentre.org/research/software/jcircos" rel="nofollow">http://www.australianprostatecentre.org/research/software/jcircos</a></p>]]></description>
	<dc:creator>Shruti Paniwala</dc:creator>
</item>

</channel>
</rss>