Up at 5AM: The 5AM Solutions Blog

Leaky Abstractions 2: Airing the Dirty Laundry

Posted on Tue, Nov 10, 2009 @ 12:59 PM

In my previous post I gave two examples of leaky abstractions. In this post I want to talk about what makes leaky abstractions likely to occur. This will require a perhaps unlikely detour to a laundromat.

First, consider what abstraction is. In the general sense of the word, Wikipedia defines it as "the process or result of generalization by reducing the information content of a concept or an observable phenomenon". So we start with some concrete item or idea and divide its properties into two categories: those that are important and those that are incidental. We discard the latter and declare the former to be the abstraction. The original concrete item or idea is one example that fits the abstraction, but hopefully there are others (otherwise the abstraction would not be very useful).

One place where this process can go wrong is the choice of which properties to include in the abstraction and which to treat as incidental. The subtlety is that this choice depends on the context in which the abstraction is to be used. For example, consider clothing. In the context of placement within a store, the important aspects are type (shirt, pants, etc.) and perhaps the designer - we have a section for jeans, or a section for Levi's - and we can abstract away other details. However, in the context of doing laundry, the important aspects are color and material, and the others can be ignored.
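The clothing example can be sketched in a few lines of Python. This is only an illustration: the Garment class and the two helper functions are names I made up for it. Each function keeps the properties that matter in one context and discards the rest, so two garments that are interchangeable in the store are not interchangeable in the wash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Garment:
    """The concrete item, with all of its properties (hypothetical model)."""
    kind: str
    designer: str
    color: str
    material: str


def store_section(garment):
    """Store context: only type and designer matter."""
    return (garment.kind, garment.designer)


def laundry_load(garment):
    """Laundry context: only color and material matter."""
    return (garment.color, garment.material)


dark_jeans = Garment("jeans", "Levi's", "indigo", "denim")
white_jeans = Garment("jeans", "Levi's", "white", "denim")

# Same shelf in the store, different loads in the laundromat.
same_shelf = store_section(dark_jeans) == store_section(white_jeans)
same_load = laundry_load(dark_jeans) == laundry_load(white_jeans)
```

Here same_shelf is True and same_load is False: neither abstraction is wrong, but each holds only in the context it was chosen for.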

To bring this back to the realm of software architecture, take the distributed object paradigm. Here, the important aspects of an object are declared to be its fields and method signatures. How fast the methods execute is an implementation detail that has been abstracted away. As a result, one can indeed reason about the correctness of a program using distributed objects without breaking the abstraction - e.g. without caring whether a given object instance is local or remote. However, it turns out that in the real world performance is as important as correctness in determining whether a program is useful - and here the abstraction breaks down, because the difference between local and remote objects is quite large indeed.
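A rough sketch of that leak, in Python. Nothing here is a real distributed object framework: PriceSource and both implementations are hypothetical, and the "remote" object simply sleeps for an assumed 5 ms per call to stand in for a network round trip.

```python
import time
from abc import ABC, abstractmethod


class PriceSource(ABC):
    """Hypothetical interface: only the method signature is part of the abstraction."""

    @abstractmethod
    def price(self, item):
        ...


class LocalPriceSource(PriceSource):
    """In-process implementation: a plain dictionary lookup."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price(self, item):
        return self._prices[item]


class RemotePriceSource(PriceSource):
    """Stand-in for a remote object; the sleep simulates a network round trip."""

    def __init__(self, prices, latency_s=0.005):  # 5 ms is an assumed figure
        self._prices = dict(prices)
        self._latency_s = latency_s
        self.calls = 0

    def price(self, item):
        self.calls += 1
        time.sleep(self._latency_s)
        return self._prices[item]


def timed_total(source, items):
    """Client code written against the abstraction; returns (total, seconds)."""
    start = time.perf_counter()
    total = sum(source.price(item) for item in items)
    return total, time.perf_counter() - start


prices = {"shirt": 20.0, "jeans": 40.0}
basket = ["shirt", "jeans"] * 10

local_total, local_seconds = timed_total(LocalPriceSource(prices), basket)
remote = RemotePriceSource(prices)
remote_total, remote_seconds = timed_total(remote, basket)
```

Both calls return the same total, so the abstraction holds for correctness. The elapsed times are another matter: the remote version pays the simulated round trip once per item, which is exactly the property the interface left out.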

The choice of context is fundamental. As Joel Spolsky and Jeff Atwood point out, all abstractions are leaky to some extent. What matters is that the creators of an abstraction clearly define the context within which it can be expected to hold, and that its consumers do not use it outside of that context. The irony in software is that you can, in principle, reuse any component for a purpose completely unanticipated by its creator; in reality, this makes meaningful reuse harder. In other engineering disciplines, the set of feasible uses for a component tends to be much more circumscribed, and the component is therefore more reliable within those tight boundaries.

So how can reuse be controlled in software? In the end it comes down to explicitly documenting the assumptions and preconditions under which the abstraction is expected to be valid. A good framework for doing so is the set of architectural quality attributes used in ATAM (the Architecture Tradeoff Analysis Method). Using patterns helps as well, since they carry a shared understanding of how an abstraction will behave.
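One way to make such documentation concrete is to record the assumptions as data that a consumer can check. The sketch below is hypothetical throughout: DocumentedAssumptions and UsageContext are names invented for illustration, the two attributes are just examples of performance-related assumptions, and none of this is part of ATAM itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageContext:
    """How a consumer intends to use the component (hypothetical fields)."""
    remote: bool
    calls_per_request: int


@dataclass(frozen=True)
class DocumentedAssumptions:
    """Context in which the component's author expects the abstraction to hold."""
    allows_remote: bool
    max_calls_per_request: int

    def violations(self, context):
        """Return the documented assumptions that the given context breaks."""
        problems = []
        if context.remote and not self.allows_remote:
            problems.append("component assumes in-process calls")
        if context.calls_per_request > self.max_calls_per_request:
            problems.append(
                f"component assumes at most {self.max_calls_per_request} "
                f"calls per request, got {context.calls_per_request}"
            )
        return problems


# Assumptions the author of a chatty, fine-grained component might publish.
assumptions = DocumentedAssumptions(allows_remote=False, max_calls_per_request=100)

intended_use = UsageContext(remote=False, calls_per_request=20)
unanticipated_use = UsageContext(remote=True, calls_per_request=500)

ok = assumptions.violations(intended_use)
leaks = assumptions.violations(unanticipated_use)
```

In this sketch ok is an empty list, while leaks names both broken assumptions. The check itself is trivial; the point is that the context is written down where the consumer will see it.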

