Up at 5AM: The 5AM Solutions Blog

Advancing Biorepositories with Data Science - Free eBook

Posted on Tue, Feb 11, 2014 @ 06:00 AM

Biobanking has seen many changes over the past decade. Decentralized biobanks managed by spreadsheets have given way to institution-wide efforts managed through large-scale information systems that interoperate with laboratory information management systems (LIMS) and with international databases that publish the resulting research. This trend continues with tools like biolocator, which aggregate information about biospecimens from many institutions so that researchers around the world can build effective sample sizes for even some of the rarest diseases.




Data Standards

All of this is being driven by the adoption of data standards in these information systems. Data standards need to address not only how data is expressed (data models, class names, and data formats) but also what the data means. Clearly expressing the semantic components of the data allows people and systems to interpret it without consulting the originator.
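To make the distinction between format and meaning concrete, here is a minimal Python sketch of two sites that record the same specimen types under different local labels. Everything in it is an assumption made for illustration: the site names, the local labels, and the `EX:` term IDs are hypothetical placeholders, not codes from UMLS or any other real vocabulary. Mapping each local label to a shared term is what lets the records be counted together without asking either site what its labels mean.

```python
from dataclasses import dataclass

# Hypothetical shared vocabulary: term ID -> preferred label.
# These IDs are placeholders, not codes from any real standard.
SHARED_VOCAB = {
    "EX:0001": "whole blood",
    "EX:0002": "formalin-fixed paraffin-embedded tissue",
}

# Hypothetical per-site mappings from local labels to shared term IDs.
SITE_MAPPINGS = {
    "site_a": {"WB": "EX:0001", "FFPE": "EX:0002"},
    "site_b": {"blood, whole": "EX:0001", "paraffin block": "EX:0002"},
}


@dataclass(frozen=True)
class Specimen:
    site: str
    local_type: str


def normalize(specimen):
    """Return the shared term ID for a specimen, or None if unmapped."""
    return SITE_MAPPINGS.get(specimen.site, {}).get(specimen.local_type)


def count_by_shared_term(specimens):
    """Count specimens per shared term, skipping unmapped records."""
    counts = {}
    for specimen in specimens:
        term = normalize(specimen)
        if term is not None:
            counts[term] = counts.get(term, 0) + 1
    return counts


specimens = [
    Specimen("site_a", "WB"),
    Specimen("site_b", "blood, whole"),
    Specimen("site_b", "paraffin block"),
]
counts = count_by_shared_term(specimens)
for term, n in counts.items():
    print(SHARED_VOCAB[term], n)
```

Unmapped labels are skipped rather than guessed at, so a gap in a site's mapping shows up as a missing count instead of a silently wrong one.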

Free eBook

In this eBook we discuss the issues around data standards and interoperability in biobanking, including existing best practices and gaps in current practice, the use of controlled vocabularies, how semantics improves the effective power of biobanks, different ways of expressing those semantics, and some thoughts about the complexities of identifying cell lines. Our aim is to inspire a discussion of future directions and set the stage for wider and more efficient use of biospecimens.




Topics Addressed in this eBook Include:

  • Tips on advancing biorepositories, from business workflow and best practices to data standardization and data sources.

  • Why data standards matter to biobanking and how they are driven by interoperability.

  • Data standard recommendations from OBBR and ISBER.

  • Exploration of controlled vocabularies, including UMLS and the CAP standard for clinical pathology.

  • Data standard initiatives in biobanking, from the NCI to BRISQ and SPREC.

  • A review of data ontologies, with a focus on catalogs, glossaries, controlled libraries, taxonomies, and formal constraints.

  • The importance of data standardization and controlled vocabularies in biobanking.

  • Linking and authenticating cell lines.


As biorepositories have moved from isolated resources to more interconnected ones, data standards and semantics have become increasingly important. Different methods of expressing semantics, from data schemas and controlled vocabularies to formal ontologies, each provide different benefits and can be employed in complementary ways. The biorepository is a key piece of the translational medicine pipeline. In the future, the ability of biorepositories to integrate with electronic healthcare records, laboratory information management systems, and research databases will amplify their value and help to create a coherent data strategy for translational medicine.




- Jim McCusker (Lead Author)
- Greg Gurley (Contributing Author)
- Hannes Neidner (Contributing Author)
- Leigh Boone (Design)








Tags: biobanking, Data Science, ISBER, Controlled Libraries



FREE Biobanking Ebook

Get this 29-page PDF document on how data science can be used to advance biorepositories.
