Organisms are pretty complicated things, especially multicellular organisms. But what if I told you that, in some ways, single-celled organisms are even more complicated than multicellular ones? When we talk about identifying things, this is exactly the case.
Multiphoton fluorescence image of cultured HeLa cells with a fluorescent protein targeted to the Golgi apparatus (orange), microtubules (green) and counterstained for DNA (cyan). Nikon RTS2000MP custom laser scanning microscope. (Via Wikipedia)
A lot of this comes from the way identifiers relate to the things they identify. When we talk about things, we are usually relating three things together: a symbol (a word, identifier, image, number, or something else), a thought (a concept or idea), and the thing itself, called a referent. According to Ogden's Semiotic Triangle, symbols symbolize thoughts, thoughts refer to referents, and through that symbols stand for referents. When dealing with computer systems, it's generally helpful to have a symbol (like an identifier) refer to only one thing, because computers don't do well with ambiguity. Identifying a cell line is complicated because, well, what exactly are we identifying?
A cell line is a colony of cells that have been immortalized from a piece of tissue taken from a multicellular organism. Immortalization reverts part of the cells' behavior back to that of things like bacteria and slime molds, but only part. HeLa is the most famous cell line, and is the most popular both in knowledge share and in cell count. All that popularity means that there are many thousands of cell colonies that have evolved in different ways over the years. Is this still the same HeLa that was extracted from Henrietta Lacks?
In many situations, we would want to say "yes, HeLa is HeLa". But in just as many situations, it's important to distinguish between the modern HeLa cell line and the one originally created in 1951. Further, we need to identify many different strains of HeLa cells, and when we're managing a laboratory, we need to tell the difference between one colony of cells that have been treated with a particular compound and one that's been left as a control. The most careful position might just say that each and every cell is an identifiable thing, and it is, but in practice individual cells are rarely identified unless a single cell has been isolated from a population, and biologists rarely work on this scale. Instead, we need a way of identifying colonies of cells and relating them back to the most general cell line.
I wrote a couple of papers years back that discuss this problem in more detail. The fallout of this has highlighted the differences in perspective that scientists and philosophers have about how to model information, but an outcome has been the addition of two relations to the World Wide Web Consortium's Provenance Ontology (PROV-O), called alternateOf and specializationOf. We can use these relations to link different cell colonies together, and also to talk about more abstract representations of cell colonies, like cells that all share a parent colony as their source, even back to the first cells to be immortalized. There are even methods now of authenticating cell lines, so these relationships can even be verified and rediscovered.
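To make that a little more concrete, here is a minimal sketch in plain Python of how colonies might be linked with those two relations. The PROV namespace URI is the real one, but everything else is an assumption made for illustration: the `http://example.org/cells/` identifiers, the colony names, and the helper functions are all hypothetical, and a real system would more likely use an RDF library than a set of tuples.

```python
# Real W3C PROV namespace; the two relations below are defined in PROV-O.
PROV = "http://www.w3.org/ns/prov#"
SPECIALIZATION_OF = PROV + "specializationOf"
ALTERNATE_OF = PROV + "alternateOf"

# Hypothetical namespace and identifiers, invented for this example.
EX = "http://example.org/cells/"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "hela-1951", SPECIALIZATION_OF, EX + "hela"),
    (EX + "hela-lab-stock", SPECIALIZATION_OF, EX + "hela"),
    (EX + "colony-treated", SPECIALIZATION_OF, EX + "hela-lab-stock"),
    (EX + "colony-control", SPECIALIZATION_OF, EX + "hela-lab-stock"),
    # Modelling choice: treated and control are two aspects of the same stock.
    (EX + "colony-treated", ALTERNATE_OF, EX + "colony-control"),
}


def generalizations(entity, graph):
    """Follow specializationOf links upward, nearest generalization first."""
    found = []
    frontier = [entity]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in sorted(graph):
            if (subject == current and predicate == SPECIALIZATION_OF
                    and obj not in found):
                found.append(obj)
                frontier.append(obj)
    return found


def are_alternates(a, b, graph):
    """Check for an explicit alternateOf link, treated as symmetric."""
    return (a, ALTERNATE_OF, b) in graph or (b, ALTERNATE_OF, a) in graph
```

With this data, asking for the generalizations of the treated colony walks up through the lab stock to the general HeLa cell line, while `are_alternates` still lets us tell the treated colony and its control apart as distinct but related things.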
Biology is a messy business, but it never helps to ignore the mess if instead you can document it.