Up at 5AM: The 5AM Solutions Blog

Can this really be the state of scientific programming?

Posted on Tue, Sep 01, 2009 @ 12:47 PM

I came across an article yesterday that really shocked me. No, not that a pharmaceutical company ghostwrote articles for a scientific publication. And no, not that there is an ongoing debate on how long humans and chimps interbred after they evolved into separate species. It was, in fact, an article about how to organize computational biology projects.

I am absolutely flabbergasted that an article could be published that advocates such amazingly basic things as version control and how to organize directory structures. A few choice quotes:

I will focus on relatively mundane issues such as organizing files and directories and documenting progress.

... a reasonable rule of thumb is that someone should be able to understand what you are doing solely from reading the comments.

... write robust code to detect errors.

I find version control software to be invaluable for managing computational experiments.

But should I be flabbergasted that it was published, or that it needed to be published? Once I got past the shock that this kind of blindingly obvious stuff warranted publication, I came to the depressing realization that there are probably lots of people out there who've gotten their degree in bioinformatics and don't have any background in software engineering. Maybe they really don't know about version control or error detection.

One more quote:

... principles behind organizing and documenting computational experiments are often learned on the fly, and this learning is strongly influenced by personal predilections as well as by chance interactions with collaborators or colleagues.

It seems to me that articles like this are not going to fix this problem, however. People like me who read them will nod knowingly and say "of course," but others will ignore them completely. Maybe bioinformatics degree programs should include courses on software engineering to specifically address this kind of issue. But it also seems to me that people who supervise computational scientists need to be aware of these issues. If those supervisors have software project management experience, then I'd imagine some of these practices might be implemented. But if the supervisors are bench scientists, then they might not be. I think one of the things this argues for is an organizational structure that places computational scientists in a group rather than scattering them throughout various scientific groups. That way they can be part of a culture that supports good practices and focuses not just on science but on engineering.

Lastly, it should serve as a reminder that things that are obvious to those of us with some engineering experience are not so obvious to those with a more scientific background. Just as engineers in the biomedical domain should be interested in learning more of the science, scientists should be interested in the engineering. And anything we can do to bridge that gap in a constructive way will benefit both ourselves and our clients.

