In the bioinformatics sector we mostly spend time on computational analysis of huge amounts of data and try to make sense of it, biologically. But, most of the newbie bioinformaticians are faced with dilemma when they receive biological sequence data for the first time. They mostly found confusing over open source, user friendly GUI, and commercial bioinformatics software. Don’t be surprise this is true and also not an easy task to decide, because analytical step is the most crucial part and believe to be the biggest bottleneck in publishing paper in high impact journals. Through this blog I would like to address the pros and cons of both kind of software/tools and try to assist (Hmmm not really, It looks convince) you to make decision on your software selections.
The most common newbie questions are:
Should I try to use these free open source programs? Why are we not trying GUI software for computational analysis? Should I use commercial bioinformatics programs/software?”
1. Let’s be open
We generally think free and cheap are useless. But this concept is not applicable when we discuss open source software. Mostly, the bioinformatics software is developed by highly competitive biological programmers who believe in open sharing of knowledge. They come under Open Bioinformatics Foundation or O|B|F which is a non-profit, volunteer run organization focused on supporting open source programming in bioinformatics. The best part about open source tools/software is that they’re free to download the source code and read exactly what the program does. If you are so inclined, you can view all of the parts of the program and see the logical flow of the pipeline. In addition, open source makes an excellent learning tool for any beginning bioinformatician. Moreover, you can modify existing open source programs to deal with cutting-edge problems or to customize your pipeline. Apart from your computational and analysis work, most of the reviewer also prefers the open source based results so that they can validate the results if validation required.
2. Code headache
As a bioinformatician you are supposed to know the basics of programming languages, and if you are not good at it, then please learn it as soon as possible because you are not a bio-analyst but biological programmers. The open source programs usually lack dedicated service and support teams (often because they were the product of an overworked doc/postdoc!) so you are responsible for troubleshooting your own errors most of the time. We commonly receive the HELP email to support and assist to setup the pipeline; you can also find this kind of request on any QA forum. I personally believe this coding horror brings the biggest downside of open-source programs; where you need some programming skills in order to implement the program in your pipeline. But, if you are not able to fix the pipeline and modify the open source code according to your requirements them you should re-think on your bioinformatician name tag!!!
3. Dive into the codes
Some of the biologist turn bioinformatician says “if you can do the same thing with commercial software then why to get migraine with weird codes”, well this statement looks to me that guys are keen to learn swimming but still don’t like to get wet. If you are still using paid software and doing your work by customer support and clicking some of the well-designed GUI button then perhaps you are not interested in learning and trying new and challenging bioinformatics works. You are missing the basic flavour of bioinformatics. Let’s dive into the coding world, I am sure your will enjoy it. I recommend your to swim freely in code’s sea, and enjoy the journey; do not merely watch it from the outside.
4. Paid does not mean better
The bioinformatics company which are specializes in bioinformatics solutions develop well designed/packed, user friendly software by using a large number of specialised scientist, programmers and support staff. They also provide good services to accomplice your biological analysis work. This means that if you hit a ‘snag’ with your data, help is likely only a phone call away! These companies price their products competitively against the cost of a dedicated bioinformatician. You may be able to afford the program, but not the additional staff! Additionally, most of the functionality that you need in your analysis is already coded into the program. Need to plot a graph? Just click this button right here. It is that easy. But, as a bioinformatician this is not generally well encouraged approach in biological analysis work, because the software is not available to everyone and your data can’t be validated. Moreover, there is very less chances that anyone will repeat your work or love to do similar kind of research (because not all the labs in the world are rich like yours).
5. Take a caution
In biological analysis work, in which you deal GB/TB of data are having maximum chances of getting errors, so please be careful and always cross check your data before coming to any conclusion. Even an error in two line code can alter your entire analysis and display weird results. Some of the scientist blindly believes on commercial software, which is entirely wrong. Using proprietary tools does not absolve you of the need to actually read and research the type of analysis that you are doing. This is particularly true in the case of genome assembly and annotation.
At the end, I would like to tell only one think that open source solutions allows you to do more cutting edge analysis than the commercial tools. So let’s go for it.
This is my personal view. I have nothing to do with any company or open source community. The views expressed on these pages are mine alone and not those of my current/past employers. I do reserve the right to remove comments left by spammers or off-topic comments.
This paper "Buying in to bioinformatics: an introduction to commercial sequence analysis software" provide an in-depth overview.