Earlier this summer, I drove down to the Southeast Linux Fest in Spartanburg, South Carolina. One of the talks that stood out to me was given by Heather Holl, a bioinformaticist and Slackware Linux team member. She talked about the open source tools she uses most in her work in equine genomics. I was especially impressed by how she used standard, open source Linux command-line tools to get her job done.
As the resident Linux fanatic at 5AM, I got to wondering: which open source tools are most valuable to our scientific research staff? What are we doing that is really cool (read: geeky) in the open source arena with bioinformatics? After a massively unscientific poll of some of our staff scientists, a few trends emerged.
At least at 5AM, R is king.
According to the R Language Definition (http://cran.r-project.org/doc/manuals/R-lang.html), “R is a system for statistical computation and graphics. It provides, among other things, a programming language, high level graphics, interfaces to other languages and debugging facilities.” Our scientists use R for those same tasks. With data sets as crazy (and huge) as they get in genomic analysis projects, R is a must-have for efficient processing. R even has its own IDE these days: RStudio, another open source project headed by a couple of ex-Microsoft employees (among others).
When we're not doing math in R, Python is the language du jour for bioinformatics at 5AM.
When I speak with bioinformatics students who are just finishing up their academic careers and starting to look for “real work,” they talk about all of the Perl they’ve had to use. While 5AM does use Perl on occasion, Python is a much more popular choice, both internally and for our customers. Being a Python fanboy, I of course wondered what specialized Python modules we are using. Our current favorites are:
This is in no way a full summary of the open source tools we use at 5AM to process scientific data, but I do think it represents some of the more interesting ones. With some of the Python modules especially (I’m thinking of PyTables in particular), we are out on the leading edge of what’s available in the Python community, and it’s an exciting place to be.
-Jamie Duncan, 5AM Solutions
Python? R? Or are you a fan of another tool? Weigh in using the comment section below.