It isn’t exactly a problem, but it is a thing.
Up at 5AM: The 5AM Solutions Blog
Yesterday at Health Datapalooza, United States CTO Todd Park announced that the U.S. Department of Health and Human Services (HHS) and the Centers for Medicare & Medicaid Services (CMS) had launched initiatives that make tons of much-requested but hard-to-get health data accessible to the public.
A few days ago I attended talks by Phil Bourne and John Wilbanks, both of whom are working on ways to make scientific data, including genetic information, more freely available for research purposes. They were speaking at an NIH conference called ‘Open Science: The Transparency Revolution’ (see agenda and videocast). Dr. Bourne is Associate Director for Data Science at NIH, and Wilbanks is Chief Commons Officer at Sage Bionetworks. Bourne has been in his position for only about a month, so he outlined his goals, one of which is to create a data commons at NIH where researchers can go to discover, access, and analyze scientific data.
This was the fourth time I've attended the Conference on Semantics in Healthcare and Life Sciences (CSHALS), and every time I come back with new ideas. This year's conference had a much greater emphasis on implementation than in the past. Given that the conference has been running for seven years, that marks a clear evolution from its more speculative origins. Organized by the International Society for Computational Biology (ISCB), it's perhaps the best blend I've seen of people from industry and academia focused on applying semantic technologies and strategies to biomedical research.
This is the first in a series of blog posts that will dive more deeply into the nuts and bolts of Data Science. Today we will talk a bit about statistics, but we will also be talking about tools, visualization strategies, and data representation in the context of specific problems.
Biobanking has seen many changes over the past decade. Decentralized biobanks managed by spreadsheet have given way to institution-wide efforts managed through large-scale information systems that can interoperate with laboratory information management systems (LIMS) and with international databases that publish the resulting research. This trend continues with tools like biolocator, which aggregate information about biospecimens from many institutions so that researchers around the world can build effective sample sizes for even some of the rarest diseases.