Maybe because I’m (relatively) recently departed from academia, I don’t have much preference in programming languages; or perhaps it’s because I’ve had a fair share of advisors, professors, and managers who did have preferences. I’ve had to work with everything from C and Java to Perl, Python, Matlab, and R. For quite some time, I even did bioinformatics and software development in C# (shocking, I know). In the end, though, I’m left rather undecided on the question of the best language for the field; I guess one could say I’m language agnostic. That works for me, though, especially here at 5AM, where we want to customize solutions to the customers’ data and needs.
However, when the customer just needs the job done and isn’t particularly interested in the details, like the language of choice, we can cater to that too. Many of us have our favorites, but I believe that different languages should be used for different reasons (and there are plenty of diverse reasons in this field). A survey by bioinformatics.org ranked the most useful languages to learn: Python, Perl, Java, C/C++, and the .NET framework, which includes C# (more information about the survey is available at http://www.bioinformatics.org/benchmark/index.html). Although Python is ranked first, there was a time when Perl was avidly argued to be the be-all and end-all, the king/queen of languages for biological data. With its large support and development base, its aptitude for string detection, manipulation, and processing, and its overall flexibility and user friendliness, the claim is understandable. It’s a powerful language, and a good, quick solution to many bioinformatics problems.
However, similar positives can be said of Python as well. Not only is it easy and quick to pick up (like Perl), but many would say the language is even better equipped for larger, more complex, integrated projects. Python has Django for database-backed web applications, NumPy for numerical data, and Matplotlib for charts. Plus, unlike with Perl, Python’s “rules” prevent some unexpected behavior: strings are immutable, so there are no more hard-to-find errors from accidentally changing a string with in-place operations. It’s also a strongly typed and whitespace-sensitive language, forcing us to write more readable, pretty code (and also annoying us when we edit that code in a different text editor, causing the whitespace to no longer match). It is still dynamically typed, so it maintains that loose, dynamic quality of Perl that we bioinformatics people love, while its strong typing offers some of the rigor that many traditional software engineers are used to with statically typed languages such as Java or C++.
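To make those “rules” a little more concrete, here is a minimal sketch in Python. The `reverse_complement` helper and the sample sequence are made up for illustration and are not taken from any particular project; the point is only that a string operation returns a new string, and that Python refuses to mix a string and a number silently.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence as a new string.

    Hypothetical helper for illustration; handles only upper-case A, C, G, T.
    """
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


original = "ATGC"
revcomp = reverse_complement(original)  # builds a new string; original is untouched

# Strings are immutable: assigning to a position raises TypeError
# instead of quietly changing the sequence.
try:
    original[0] = "G"
    mutated = True
except TypeError:
    mutated = False

# Strong typing: adding a str and an int raises TypeError
# rather than coercing one operand into the other.
try:
    mixed = "4" + 2
    coerced = True
except TypeError:
    coerced = False
```

In Perl, by contrast, `"4" + 2` evaluates to 6 without complaint, which is convenient right up until the string was never supposed to be a number.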
Obviously there are pluses and minuses to any language, and people aren't afraid to point them out. However, it seems we are moving toward a compromise between the opposing views. Perhaps this shift will allow us to get the best of all worlds and find better, more efficient, valid solutions with fewer hoops to jump through.