Biobanking has changed considerably over the past decade. Decentralized biobanks managed by spreadsheet have given way to institution-wide efforts managed through large-scale information systems that interoperate with laboratory information management systems (LIMS) and with the international databases that publish the resulting research. This trend continues with tools like biolocator, which aggregate information about biospecimens from many institutions so that researchers around the world can build effective sample sizes for even some of the rarest diseases.
All of this is being driven by the adoption of data standards in these information systems. Data standards must address not only how data is expressed, through data models, class names, and data formats, but also what the data means. Clearly expressing the semantic components of the data allows people and systems to interpret it without consulting the originator.
In this book we discuss the issues surrounding data standards and interoperability in biobanking, including existing best practices and gaps in current practice, the use of controlled vocabularies, how semantics improves the effective power of biobanks, different ways of expressing those semantics, and some thoughts about the complexities of identifying cell lines. Our aim is to inspire a discussion of future directions and set the stage for wider and more efficient use of biospecimens.
Topics Addressed in this eBook Include:
- Tips on advancing biorepositories, from business workflows and best practices to data standardization and data sources.
- Why data standards matter to biobanking and how they are driven by interoperability.
- Data standard recommendations from OBBR and ISBER.
- An exploration of controlled vocabularies, including UMLS and the CAP standard for clinical pathology.
- Data standard initiatives in biobanking, from the NCI to BRISQ and SPREC.
- A review of data ontologies with a focus on catalog, glossary, controlled libraries, taxonomy and formal constraints.
- The importance of data standardization and controlled vocabularies in biobanking.
- Linking and authenticating cell lines.
As biorepositories have moved from being isolated resources to being more interconnected, data standards and semantics have become increasingly important. Different methods of expressing semantics, from data schemas and controlled vocabularies to formal ontologies, each provide different benefits and can be employed in complementary ways. The biorepository is a key piece of the translational medicine pipeline. In the future, the ability of biorepositories to integrate with electronic healthcare records, laboratory information management systems, and research databases will amplify their value and help to create a coherent data strategy for translational medicine.
- Jim McCusker (Lead Author)
- Greg Gurley (Contributing Author)
- Hannes Neidner (Contributing Author)
- Leigh Boone (Design)