<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Related items]]></title>
	<link>https://bioinformaticsonline.com/related/30090?offset=990</link>
	<atom:link href="https://bioinformaticsonline.com/related/30090?offset=990" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/40099/contiguator</guid>
	<pubDate>Fri, 04 Oct 2019 01:27:58 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/40099/contiguator</link>
	<title><![CDATA[CONTIGuator !]]></title>
	<description><![CDATA[<p><span>CONTIGuator is a Python script for Linux environments whose purpose is to speed up the bacterial genome assembly process and to give a first insight into the genome structure using the well-known Artemis Comparison Tool (ACT).</span></p>
<p>&nbsp;</p><p>Address of the bookmark: <a href="https://sourceforge.net/projects/contiguator/" rel="nofollow">https://sourceforge.net/projects/contiguator/</a></p>]]></description>
	<dc:creator>BioStar</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/32631/barrnap-bacterial-ribosomal-rna-predictor</guid>
	<pubDate>Fri, 12 May 2017 09:24:41 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/32631/barrnap-bacterial-ribosomal-rna-predictor</link>
	<title><![CDATA[Barrnap: Bacterial ribosomal RNA predictor]]></title>
	<description><![CDATA[<p>Barrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).</p>
<p>It takes a FASTA DNA sequence as input and writes GFF3 as output. It uses the new NHMMER tool that comes with HMMER 3.1 for HMM searching in RNA:DNA style. NHMMER binaries for 64-bit Linux and Mac OS X are included and will be auto-detected. Multithreading is supported and one can expect roughly linear speed-ups with more CPUs.&nbsp;</p><p>Address of the bookmark: <a href="https://github.com/tseemann/barrnap" rel="nofollow">https://github.com/tseemann/barrnap</a></p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/view/34362</guid>
	<pubDate>Thu, 16 Nov 2017 08:47:52 -0600</pubDate>
	<link>https://bioinformaticsonline.com/view/34362</link>
	<title><![CDATA[Tryst with a Bioinformatician # Dr Altan Kara]]></title>
	<description><![CDATA[<p style="text-align: justify;">&nbsp;</p><p style="text-align: justify;"><a href="http://bioinformaticsonline.com/profile/altan"><strong>Dr Altan Kara</strong></a> is a bioinformatics specialist at the faculty of the Gene Engineering and Biotechnology Institute at TUBITAK MAM Research Center. His research interests revolve around cancer informatics and computer-aided drug design. I applaud Dr Altan for clearly setting out both his expectations of people who join his lab/university and his responsibilities to his research members at the TUBITAK MAM Research Institute. Hopefully, this interview will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.</p><p style="text-align: justify;"><img src="https://photos-4.dropbox.com/t/2/AACboDtsdWXl6WLM8ijWiKVTxcLCdQaHuOxglRGVSIYqlQ/12/85115969/jpeg/32x32/1/_/1/2/altanLondon.JPG/EOfXoUIYmJ8CIAcoBw/HYCj2M1qYATfPnq3Lg_ETCtxjGzDJ34mwQP0ycTpMMM?size=1280x960&amp;size_mode=3" alt="image" width="720" height="720" style="border: 0px; border: 0px;"></p><p style="text-align: justify;">You can find out more about Dr Altan by visiting his (well documented) lab page (<a href="http://gmbe.mam.tubitak.gov.tr/en">http://gmbe.mam.tubitak.gov.tr/en</a>) and BOL page <a href="http://bioinformaticsonline.com/profile/altan">http://bioinformaticsonline.com/profile/altan</a> . And now, on to the BOL:&ldquo;Tryst with a Bioinformatician&rdquo; interview series ...</p><ul>
<li>
<p style="text-align: justify;"><strong>What pushed you to join Computational Biology/Bioinformatics?</strong></p>
</li>
</ul><p style="text-align: justify;">In my view, bioinformatics is at the center of modern biological research. If a researcher wants to discover new biological insights by evaluating globally produced biological data and to derive unified solutions for specific biological problems, learning bioinformatics is the only way to achieve this goal.</p><ul>
<li>
<p style="text-align: justify;"><strong>What fascinates you about Computational Biology/Bioinformatics?</strong></p>
</li>
</ul><p style="text-align: justify;">Its flexibility. As is well known, highly diverse and complex biological questions are waiting to be answered, and it is impossible to address this diversity with a single approach. Thus, the method employed has to be tailored to the targeted biological problem, and bioinformatics tools make this easy to achieve.&nbsp;</p><ul>
<li>
<p style="text-align: justify;"><strong>What is the </strong><em><strong>one word</strong></em><strong> you would use to </strong><em><strong>describe yourself</strong></em><strong>?</strong></p>
</li>
</ul><p>Bioinformatician. :)</p><ul>
<li>
<p style="text-align: justify;"><strong>Can you please describe your research work in a nutshell for BOL users.</strong></p>
</li>
</ul><p style="text-align: justify;">At my current institute, I work in the field of cancer bioinformatics. Briefly, the overall aim of the project I am working on, AKMARK (project code: 5153403), is to apply bioinformatics-supported genome, transcriptome, proteome, and metabolome analysis to reveal the molecular profile of the disease through an integrated approach, and to develop an early diagnosis and screening kit based on this profile. The project will reveal alterations in the gene, transcript, protein, and metabolite profiles between normal tissue, normal tissue adjoining the tumor (reactive stroma), tumor tissue, lymph node metastasis, and blood samples taken from the same patient, as well as the reflection of these changes in some other selected body fluids. The molecular structures involved in the development and progression of NSCLC (non-small cell lung cancer) will be determined and related to clinical data, tumor-node-metastasis (TNM) staging, and histology. A diagnostic kit for immediate clinical use and an electrochemical biosensor for quick on-site applications are targeted through the development of a number of antibodies and aptamers raised against the most specific biomarker selected from the panel.</p><ul>
<li>
<p style="text-align: justify;"><strong>Is there anything else we should know about you and your research?</strong></p>
</li>
</ul><p style="text-align: justify;">Besides AKMARK, I am also preparing a side project that aims to develop a computational method for designing inhibitors of prokaryotic two-component systems. In this project, I will collaborate with Prof. Maria Kontoyianni, SIUE: Southern Illinois University Edwardsville, School of Pharmacy.</p><ul>
<li>
<p style="text-align: justify;"><strong>What was your greatest scientific disappointment in life till now?</strong></p>
</li>
</ul><p>So far I have not experienced any memorable scientific disappointment in my life. :)</p><ul>
<li>
<p style="text-align: justify;"><strong>What major research challenges and problems have you faced so far? How did you handle them?</strong></p>
</li>
</ul><p style="text-align: justify;">The major challenge I have faced so far in my scientific career was predicting the interactions between prokaryotic two-component proteins. To predict these interactions accurately, I created a meta-predictor using a support vector machine. With this technique I integrated six different protein-protein interaction prediction methods so that the disadvantages of one method are covered by the advantages of another. The meta-predictor I developed during this work is accessible via <a href="http://metapred2cs.ibers.aber.ac.uk/">http://metapred2cs.ibers.aber.ac.uk/</a>; for more detailed information about the system, see the articles with PMID: 27378293 and PMID: 26384938.</p><ul>
<li>
<p style="text-align: justify;"><strong>What's your all-time favourite bioinformatics package, and why?</strong></p>
</li>
</ul><p style="text-align: justify;">For me, the best bioinformatics package is R/Bioconductor. I like it because it provides many useful tools for comprehensive, integrated analysis and comparison of high-throughput experimental data. Beyond the many packages it offers, it is open source and open for development. As a result, it provides strong and flexible ways to do science.</p><ul>
<li>
<p style="text-align: justify;"><strong>In bioinformatics, do you see yourself in which of the following roles-scientist, analyst, developer, engineer or pure academician?</strong></p>
</li>
</ul><p>Scientist / Developer.</p><ul>
<li>
<p style="text-align: justify;"><strong>What would you like to accomplish in the next five to ten years?</strong></p>
</li>
</ul><p style="text-align: justify;">For my current research, I would like to design a pipeline that automatically integrates and analyses omics data for cancer research, aimed specifically at biomarker and novel drug target discovery. In addition, I would also like to develop another pipeline for prokaryotic TCS protein structure prediction and inhibitor design.</p><ul>
<li>
<p style="text-align: justify;"><strong>When you retire, what would you tell the next generation of bioinformaticians?</strong></p>
</li>
</ul><p style="text-align: justify;">Bioinformatics is not all about scripting and researchers who study in this field should never expect a tool to do their analyses for them. Besides computational skills, a bioinformatician must have a strong biological background in his/her research area which will allow them to understand if anything went wrong during their run by only looking at the results instead of just blindly trusting the output of the bioinformatics tools.</p><ul>
<li>
<p style="text-align: justify;"><strong>What will you miss most about bioinformatics when you are no longer working in this field?</strong></p>
</li>
</ul><p style="text-align: justify;">Bioinformatics is open to multidisciplinary research with scientists all around the world. As a result, while working in this field I can learn a lot, interactively, from a wide-ranging research community. I think this is the one thing I will miss the most.</p><ul>
<li>
<p style="text-align: justify;"><strong>If you owned a bioinformatics company in the future, what would its focus and aim be?</strong></p>
</li>
</ul><p style="text-align: justify;">With the increasing amount of data in databases, there is already a massive need for effective methods to eliminate manipulated data and reach clean, useful information. As time passes, data mining will become the first step of any research project. For this reason, the major goal of my bioinformatics company would be to develop effective tools that eliminate manipulated datasets and information in the literature and provide trustworthy, clean information and datasets for researchers.</p><ul>
<li>
<p style="text-align: justify;"><strong>How much will bioinformatics have changed by 2050, according to your wildest imagination?</strong></p>
</li>
</ul><p style="text-align: justify;">Bioinformatics is a field that changes constantly and dynamically. As bioinformatics progresses, new tools and methods become available; they either apply existing methods better or offer entirely new methods that provide alternative solutions to various biological problems. Along with these updates, developers also provide easy-to-use GUIs for most of the tools. If the field carries on developing like this, every researcher with a strong biological background will be able to perform bioinformatics analyses by themselves without needing professional help. As a result, almost all bioinformaticians will be responsible only for developing new methods and tools.</p><ul>
<li>
<p style="text-align: justify;"><strong>What one piece of advice would you give someone who is trying to reinvent themselves and enter the bioinformatics sector?</strong></p>
</li>
</ul><p style="text-align: justify;">Bioinformatics is a wide field with a lot of career options. Thus, a researcher who wants to step into this field should first be clear about which branch of bioinformatics they want to work in. Following this decision, they should learn at least one programming language and investigate how other researchers have employed that language in their research, and WHY. A researcher in this field should never create and use copy-paste scripts, but must always understand WHY the other researcher worked in that way. Knowing the answer to this question is the only way to learn bioinformatics. Besides, a researcher in any branch of bioinformatics must always be good at controlling their working environment. In other words, one should be able to easily manage input and output directories, modify files or directories, and annotate and modify the scripts employed during the research, without allowing any confusion between the different stages of the research. Finally, they should not blindly trust the output of a tool or software, but should run a benchmarking test for each tool they decide to use in their research. In addition, even if the tools pass the benchmarking, researchers should have a good biological background in their field so they can tell whether anything went wrong during the process just by looking at the output(s) of the employed pipelines/packages/tools.&nbsp;&nbsp;</p><p style="text-align: justify;">&nbsp;</p>]]></description>
	<dc:creator>Jitendra Narayan</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42160/vicuna-a-software-tool-that-enables-consensus-assembly-of-ultra-deep-sequence-derived-from-diverse-viral-or-other-heterogeneous-populations</guid>
	<pubDate>Tue, 25 Aug 2020 03:40:17 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42160/vicuna-a-software-tool-that-enables-consensus-assembly-of-ultra-deep-sequence-derived-from-diverse-viral-or-other-heterogeneous-populations</link>
	<title><![CDATA[VICUNA: a software tool that enables consensus assembly of ultra-deep sequence derived from diverse viral or other heterogeneous populations.]]></title>
	<description><![CDATA[<p><span>VICUNA</span><span>&nbsp;is a&nbsp;</span><em>de novo</em><span>&nbsp;assembly program targeting populations with high mutation rates. It creates a single linear representation of the mixed population on which intra-host variants can be mapped. For clinical samples rich in contamination (e.g., &gt;95%), VICUNA can leverage existing genomes, if available, to assemble only target-alike reads. After initial assembly, it can also use existing genomes to perform guided merging of contigs. For each data set (e.g., Illumina paired read, 454), VICUNA outputs consensus sequence(s) and the corresponding multiple sequence alignment of constituent reads. VICUNA efficiently handles ultra-deep sequence data with tens of thousands fold coverage.</span></p>
<p><a href="http://software.broadinstitute.org/viral/docs/vicuna_v1.0.pdf">http://software.broadinstitute.org/viral/docs/vicuna_v1.0.pdf</a></p><p>Address of the bookmark: <a href="https://www.broadinstitute.org/viral-genomics/vicuna" rel="nofollow">https://www.broadinstitute.org/viral-genomics/vicuna</a></p>]]></description>
	<dc:creator>biogeek</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/34814/bioinformatics-web-application-development-with-perl</guid>
	<pubDate>Tue, 26 Dec 2017 18:14:11 -0600</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/34814/bioinformatics-web-application-development-with-perl</link>
	<title><![CDATA[Bioinformatics Web Application Development with Perl]]></title>
	<description><![CDATA[<div><p>Perl's second wave of adoption came from the growth of the world wide web. Dynamic web pages&mdash;the precursor to modern web applications&mdash;were easy to create with Perl and CGI. Thanks to Perl's ubiquity as a language for system administrators and its power to manipulate text, it was the default choice for web programming. Its presence everywhere made it popular and, in some ways, the duct tape of the Internet.</p><h4>Web Application Development</h4><p>The old days of CGI programs, and the simple development style they represented, seem clunky. Web pages have become web applications. Development has moved from generating static HTML to both client and server side programming, with rich client interfaces and powerful backends.</p><p>Perl is still well suited for developing modern web apps. The language grows more powerful and easier to use every year, the available libraries are wonderful and keep getting better, and the inventions and discoveries available in modern Perl are unsurpassed.</p><p>In particular, a modern Perl developer can do amazing things with modern Perl tools. If you still think of Perl web development as a&nbsp;<em>cgi-bin</em>&nbsp;directory full of messy scripts that spew warnings to STDERR, you're a decade out of date. Better yet, you can replace that mess piecemeal, thanks to the new tools and techniques of modern Perl. See, for example, the ever-growing list of technologies&nbsp;<a href="http://www.builtinperl.com/">Built in Perl</a>.</p><h4>Modern Perl Web Frameworks</h4><p>While the old wave of web development may have made the CGI.pm module central, modern Perl web programming follows a stricter separation of business logic, URL and request routing, and output. The days of slinging a string here, an array there, a Perl hash yonder, declaring every variable at the top of the program, and maybe making a subroutine are gone. 
The Perl world has seen the value of abstraction and ways to mechanize away boilerplate. Perl has dozens of frameworks and toolkits designed to make web development and deployment simpler.</p><p>Any of a dozen of these frameworks will help you do great things, but three in particular stand out. You can build web sites and web applications of tremendous value with all three. These are neither the only good possibilities (think of POE or Jifty or Continuity or...) nor the only mechanisms for web programming with Perl (see Mechanize or LWP or Mojo::UserAgent for more). Yet if you want three good options to choose between, start here.</p><h4>Catalyst</h4><p>The&nbsp;<a href="http://catalystframework.org/">Catalyst</a>&nbsp;framework is a flexible and powerful system for building small to large web apps. It uses the&nbsp;<a href="http://moose.perl.org/">Moose</a>&nbsp;object system to provide great APIs for extension and further development. It's the most mature of the modern top Perl web frameworks, yet it retains its flexibility and vibrancy. In particular, its plugin and extension ecosystem allows it to evolve to provide new and essential features.</p><p>Catalyst has embraced the Plack/PSGI standard for Perl web deployment and recent versions are exploring high-scalability, event-based request handling models.</p><h4>Dancer</h4><p>The&nbsp;<a href="http://perldancer.org/">Dancer</a>&nbsp;framework is deliberately minimal in syntax and scope, but it also has a vibrant plugin ecosystem. Dancer particularly excels for smaller sites and applications, though good programmers can build larger things with it.</p><p>The first version of Dancer was easy to use. Dancer 2 continues that ease while improving the internals and robustness of applications.</p><h4>Mojolicious</h4><p>The&nbsp;<a href="http://mojolicio.us/">Mojolicious</a>&nbsp;(Mojo) framework has a real-time design based on high performance event handling. 
Its focus is solving new and interesting problems in simple and effective ways, and the project has produced a lot of new code that does old things in better ways.</p><p>In particular, Mojolicious goes to great lengths to support new web standards, such as CSS3, WebSockets, and HTTP/2.</p><p>Where Catalyst embraces the CPAN fully, Mojolicious by design provides most of what an average app might need in a single download. It's still fully compatible with the CPAN, but the intention is to provide good working defaults in a package that's easy to start with. Mojo's fans are quick to praise it as fun to develop.</p><p>A modern Perl web developer should be familiar with at least one of these frameworks.</p><h4>Modern Perl Storage Mechanisms</h4><p>Perl's venerable&nbsp;<a href="http://search.cpan.org/perldoc?DBI">DBI</a>&nbsp;module has been the focal point of database access since its invention. Its design allows it to provide the same interface to huge relational databases and flat files alike through its DBD extension mechanism. Yet the DBI by itself isn't the be-all, end-all of data storage and access in Perl.</p><h4>DBIx::Class</h4><p><a href="http://search.cpan.org/perldoc?DBIx::Class">DBIx::Class</a>&nbsp;sits on top of DBI to provide an API to your database based on the concept of queries and results. This is often sufficient to remove all but the most complicated of SQL from your code, leaving you to manipulate your business models instead of the small details of how a relational database works. The power and maintainability you receive is well worth the small cost of the learning curve.</p><p>Even better, DBIC can manage (and even generate) your database schema for you.</p><p>Recent versions of DBIC have demonstrated that a well-written ORM can perform much better than even clever hand-written code. 
Because it builds on the Perl DBI, it scales everywhere from SQLite to PostgreSQL, MySQL, Oracle, and more.</p><h4>Rose::DB</h4><p>The lesser-known but no less powerful&nbsp;<a href="http://search.cpan.org/perldoc?Rose::DB::Object">Rose::DB::Object</a>&nbsp;builds on&nbsp;<a href="http://search.cpan.org/perldoc?Rose::DB">Rose::DB</a>&nbsp;to provide an object-relational mapper for Perl. While its high level features most directly compare to those of DBIx::Class, it's often measurably faster.</p><h4>NoSQL on the CPAN</h4><p>Of course the&nbsp;<a href="http://search.cpan.org/">CPAN</a>&nbsp;has modules for almost any NoSQL database or job queue or persistence mechanism you could name, and several you have never heard of. Everything you need is a quick CPAN or cpanm away!</p><h4>Modern Perl Deployment Strategies</h4><p>In the early days of the web, deploying a Perl web application meant putting one or more&nbsp;<em>.cgi</em>&nbsp;or&nbsp;<em>.pl</em>&nbsp;files in a special directory and hoping that your system administrator had everything configured correctly. The execution model was often slow and cumbersome, and accessing shared resources such as databases was often tricky.</p><p>Modern Perl has better choices. While deployment strategies are the source of many arguments, the return on your investment from learning the modern way is impressive.</p><h4>Plack/PSGI</h4><p>The PSGI specification (as exemplified by&nbsp;<a href="http://plackperl.org/">Plack</a>) describes a strategy for building Perl web apps independent of server and with the possibility to share custom processing behaviors.</p><p>In other words, it's a standard for writing Perl apps to take advantage of the huge ecosystem of Perl development available on the CPAN without tying yourself to a server like Apache, Apache 2, nginx, or anything else.</p><p>Any good modern Perl web framework (including those listed here) supports PSGI. 
Several deployment mechanisms exist to meet various business needs which also support PSGI. In particular, you can deploy the same application with a local testing server on your own machine as you can to your production server or servers without changing your application at all.</p><h4>mod_perl</h4><p>The older but still viable mod_perl Apache httpd module embeds Perl into the web server. This was the first widespread persistence mechanism for Perl web applications themselves and it's still popular to this day, though PSGI compliance is often the choice for new development. (PSGI handlers to use mod_perl as the backend are available.)</p><p>Modern Perl developers should familiarize themselves with PSGI and the wealth of available Plack middleware.</p><h4>Perl Web Development</h4><p>Of course no discussion of Perl web development would be complete without mentioning the strength of the CPAN. Almost any project will benefit from the wealth of freely available libraries built to solve real problems. These distributions run the gamut from full-blown web frameworks and content management systems to APIs for web services, development tools, testing systems, and interfaces to document formats and external resources.</p><p>For example, if you need to write a web service which accepts JSON data and produces Excel spreadsheets, you can glue together a few CPAN distributions and get the job done early. If you need to consume XML from a remote service and emit a PDF, you're in luck.</p><p>Perl's prowess as a general purpose programming language as well as its flexibility and power in managing text and gluing systems together make it a wonderful fit for web development. The community's adoption of modern Perl standards such as PSGI and Plack only enhance your power.</p><p>Web application development in Perl is still viable, and modern Perl tools and techniques and libraries make it more powerful and pleasant than ever.</p></div>]]></description>
	<dc:creator>Jit</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/42806/graphunzip-phases-an-assembly-graph-using-hi-c-data-andor-long-reads</guid>
	<pubDate>Fri, 05 Feb 2021 21:22:24 -0600</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/42806/graphunzip-phases-an-assembly-graph-using-hi-c-data-andor-long-reads</link>
	<title><![CDATA[GraphUnzip: Phases an assembly graph using Hi-C data and/or long reads.]]></title>
	<description><![CDATA[<p>GraphUnzip is a fast, memory-efficient and accurate tool that unzips assembly graphs into their constituent haplotypes using long reads and/or Hi-C data. As GraphUnzip only connects sequences in the assembly graph that already had a potential link based on overlaps, it yields high-quality gap-less supercontigs. To demonstrate the efficiency of GraphUnzip, the authors tested it on a simulated diploid Escherichia coli genome, and on two real datasets for the genomes of the rotifer Adineta vaga and the potato Solanum tuberosum. In all cases, GraphUnzip yielded highly continuous phased assemblies.</p>
<p>https://www.biorxiv.org/content/biorxiv/early/2021/02/01/2021.01.29.428779.full.pdf</p><p>Address of the bookmark: <a href="https://github.com/nadegeguiglielmoni/GraphUnzip" rel="nofollow">https://github.com/nadegeguiglielmoni/GraphUnzip</a></p>]]></description>
	<dc:creator>Jit</dc:creator>
</item>

<item>
  <guid isPermaLink='true'>https://bioinformaticsonline.com/researchlabs/view/35552/the-brent-lab</guid>
  <pubDate>Fri, 09 Feb 2018 10:55:27 -0600</pubDate>
  <link>https://bioinformaticsonline.com/researchlabs/view/35552/the-brent-lab</link>
  <title><![CDATA[The Brent Lab]]></title>
  <description><![CDATA[
<p>The Brent Lab is developing and applying computational methods for mapping gene regulation networks, modeling them quantitatively, and engineering new behaviors into them.</p>
]]></description>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/bookmarks/view/43090/loretta-a-user-friendly-tool-for-assembling-viral-genomes-from-pacbio-sequence-data</guid>
	<pubDate>Wed, 23 Jun 2021 07:54:53 -0500</pubDate>
	<link>https://bioinformaticsonline.com/bookmarks/view/43090/loretta-a-user-friendly-tool-for-assembling-viral-genomes-from-pacbio-sequence-data</link>
	<title><![CDATA[LoReTTA, a user-friendly tool for assembling viral genomes from PacBio sequence data]]></title>
	<description><![CDATA[<p>LoReTTA (Long Read Template-Targeted Assembler) is a tool designed for performing <em>de novo</em> assembly of long reads generated from viral genomes on the PacBio platform. LoReTTA exploits a reference genome to guide the assembly process, an approach that has been successful with short reads.</p>
<p>https://academic.oup.com/ve/article/7/1/veab042/6248116</p><p>Address of the bookmark: <a href="https://academic.oup.com/ve/article/7/1/veab042/6248116" rel="nofollow">https://academic.oup.com/ve/article/7/1/veab042/6248116</a></p>]]></description>
	<dc:creator>Neel</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/pages/view/36197/bioinformatics-oneliner</guid>
	<pubDate>Tue, 10 Apr 2018 04:13:03 -0500</pubDate>
	<link>https://bioinformaticsonline.com/pages/view/36197/bioinformatics-oneliner</link>
	<title><![CDATA[Bioinformatics OneLiner]]></title>
	<description><![CDATA[<p>To remove all line ends (\n) from a Unix text file:</p><pre>sed ':a;N;$!ba;s/\n//g' filename.txt &gt; newfilename_oneline.txt</pre><p>To get the average for a column of numbers (here the second column $2):</p><pre>awk '{ sum += $2; n++ } END { if (n &gt; 0) print sum / n; }'</pre><p>To get sequence length for all sequences in a fasta file:</p><pre>awk '/^&gt;/ {if (seqlen){print seqlen}; print ;seqlen=0;next; } { seqlen = seqlen +length($0)}END{print seqlen}' \<br />filename.fasta</pre><p>To copy (move, rename, etc) files based on their list in a text file:</p><pre>while IFS= read -r line; do cp "$line" complete_dataset/"$line"; done &lt; file_list.txt</pre><p>To split bam files into sets with mapped and unmapped reads:</p><pre>samtools view -F4 sample.bam &gt; sample.mapped.sam<br />samtools view -f4 sample.bam &gt; sample.unmapped.sam</pre><p>To gzip all your fastq files using gnu parallel and gzip:</p><pre>parallel gzip ::: *.fastq</pre><p>To gzip all your fastq files using pigz:</p><pre>pigz *.fastq</pre><p>To count all sequences in a fasta file:</p><pre>grep -c "^&gt;" yourfile.fasta</pre><p>To count all sequences in all fasta files in your current directory:</p><pre>for a in *.fasta; do ls "$a"; grep -c "^&gt;" "$a"; done</pre><p>To keep only one copy of duplicated lines:</p><pre>awk '!seen[$0]++'</pre><p>To sum assembly size from SPAdes contigs.fasta or scaffolds.fasta file:</p><pre>grep "^&gt;" scaffolds.fasta | cut -f 4 -d '_' | paste -sd+ - | bc</pre><p>To remove everything after the first space on each line, e.g. 
to simplify fasta headers:</p><pre>cut -d' ' -f1 &lt; your_file</pre><p>To count reads in all .fastq.gz files in your current folder (fast, using gnu parallel):</p><pre>parallel "echo {} &amp;&amp; gunzip -c {} | wc -l | awk '{d=\$1; print d/4;}'" ::: *.gz</pre><p>To count reads in all .fastq.gz files in your current folder:</p><pre>zcat *.gz | echo $((`wc -l`/4))</pre><p>To count reads in all .fastq files in your current folder:</p><pre>cat *.fastq | echo $((`wc -l`/4))</pre><p>To count base pairs in all .fastq.gz files in your current folder:</p><pre>zcat *.fastq.gz | paste - - - - | cut -f 2 | tr -d '\n' | wc -c </pre><p>To split multifasta file into many fasta files:</p><pre>awk '/^&gt;/ {OUT=substr($0,2) ".fa"}; {print &gt;&gt; OUT; close(OUT)}' Input_File</pre><p>To convert Illumina FASTQ 1.3 to 1.8:</p><pre>sed -e '4~4y/@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghi/!"#$%&amp;'\''()*+,-.\/0123456789:;&lt;=&gt;?@ABCDEFGHIJ/' f.fastq</pre><p>To convert FASTQ to FASTA:</p><pre>sed -n '1~4s/^@/&gt;/p;2~4p' reads.fastq &gt; reads.fasta</pre><p>To get fastq read length distribution:</p><pre>cat reads.fastq | awk '{if(NR%4==2) print length($1)}' | sort -n | uniq -c</pre><p>To deinterleave interleaved fastq file:</p><pre>cat myf.fq | paste - - - - - - - - | tee &gt;(cut -f 1-4 | tr "\t" "\n" &gt; myfile_1.fq) | cut -f 5-8 | \<br />tr "\t" "\n" &gt; myfile_2.fq </pre><p>To filter and sort contig identifiers from SPAdes assembly (e.g. 
here length &gt;= 4000 and coverage &gt;= 100):</p><pre>grep "^&gt;" scaffolds.fasta | sed s"/_/ /"g | awk '{ if ($4 &gt;= 4000 &amp;&amp; $6 &gt;= 100) print $0 }' | sort -k 4 -n | \<br />sed s"/ /_/"g</pre><p>To append something to all headers of your fasta files:</p><pre>sed 's/&gt;.*/&amp;YOURSTRING/' filename.fasta &gt; new_filename.fasta</pre><p>To replace/squeeze multiple adjacent spaces by only one space:</p><pre>tr -s " " &lt; file</pre><p>To filter fastq based on length (here larger than or equal to 21, but smaller than or equal to 25):</p><pre>cat your.fastq | paste - - - - | awk -F'\t' 'length($2) &gt;= 21 &amp;&amp; length($2) &lt;= 25' | sed 's/\t/\n/g' &gt; filtered.fastq</pre><p>To print difference between the last and first row in 5th column:</p><pre>awk 'NR==1 {first=$5} {last=$5} END {print last-first}' myfile.txt</pre><p>To sample only 200 first bases from all sequences in a multifasta file (e.g. from assembly scaffolds.fasta file here):</p><pre>awk '/^&gt;/{ seqlen=0; print; next; } seqlen &lt; 200 { if (seqlen + length($0) &gt; 200) $0 = substr($0, 1, 200-seqlen);\<br /> seqlen += length($0); print }' scaffolds.fasta &gt; 200bp_scaffolds.fasta</pre><p>To pipe a compressed fasta file directly into makeblastdb (here as a nucleotide database; makeblastdb needs -dbtype, plus -title and -out when reading from stdin, and the name mydb is only a placeholder):</p><pre>gunzip -c fasta.gz | makeblastdb -in - -dbtype nucl -title mydb -out mydb</pre><p>To remove sequences with duplicate fasta headers from a fasta file:</p><pre>awk '/^&gt;/{f=!d[$1];d[$1]=1}f' in.fasta &gt; out.fasta</pre>]]></description>
	<dc:creator>Rahul Nayak</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/file/view/43732/spades-tutorial-pdf</guid>
	<pubDate>Tue, 01 Feb 2022 04:56:43 -0600</pubDate>
	<link>https://bioinformaticsonline.com/file/view/43732/spades-tutorial-pdf</link>
	<title><![CDATA[Spades tutorial PDF]]></title>
	<description><![CDATA[<p>SPAdes&mdash;St. Petersburg genome Assembler&mdash;was originally developed for de novo assembly of genome sequencing data produced for cultivated microbial isolates and for single-cell genomic DNA sequencing. With time, the functionality of SPAdes was extended to enable assembly of IonTorrent data, as well as hybrid assembly from short and long reads (PacBio and Oxford Nanopore). In this article we present protocols for five different assembly pipelines that comprise the SPAdes package and that are used for assembly of metagenomes and transcriptomes as well as assembly of putative plasmids and biosynthetic gene clusters from whole-genome sequencing and metagenomic datasets.&nbsp;</p>]]></description>
	<dc:creator>Abhimanyu Singh</dc:creator>
	<enclosure url="https://bioinformaticsonline.com/file/download/43732" length="268093" type="application/pdf" />
</item>

</channel>
</rss>