This post is the first of several in which I would like to talk about leaky abstractions in software engineering. "Leaky abstraction" is a broad and somewhat vague notion, but the essential idea is this: you create a model of the behavior of a complex system, and the model simplifies and hides some of the system's details, yet at some point the simplification breaks down and those details poke through. When this happens it is a big problem, because the simplification is what lets programmers get their job done by working at the level of complexity appropriate to the task at hand. In this post I want to describe two examples of leaky abstractions I came across recently in my own programming.
The first is a fairly simple one. The product I work on has an API exposed as a set of remote EJBs; clients that invoke these EJBs can optionally provide login credentials so that the operations are invoked with the security privileges of the corresponding user. It turns out that, if the login credentials are incorrect, the client gets a ClassNotFoundException for a low-level MySQL class. The reason is that the server throws a FailedLoginException whose cause chain ultimately contains an exception that references a MySQL class. When that exception travels back to the client, the client has to deserialize the whole chain, and since it does not have MySQL's JDBC jar on its classpath, the result is the ClassNotFoundException.
This is a clear case of a leaky abstraction: the client is programming to the JAAS security API, so it should not matter that the identity store behind it is ultimately backed by a MySQL database, and the client should not need the MySQL jar in this scenario. The fix here is simple: the thrown FailedLoginException should not include the underlying exception as a cause.
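To make the fix concrete, here is a minimal sketch of what the server-side translation could look like. The class and method names are hypothetical and not taken from the product, and a plain SQLException stands in for whatever driver-specific exception actually occurred. The point is only that the low-level exception gets logged on the server and is never attached to what the client receives.

```java
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.security.auth.login.FailedLoginException;

/** Hypothetical helper; the names are illustrative, not from the actual product. */
final class LoginFailureTranslator {
    private static final Logger LOG =
            Logger.getLogger(LoginFailureTranslator.class.getName());

    /**
     * Turns a low-level failure into an exception that uses only JDK classes,
     * so a remote client can deserialize it without any driver jar.
     */
    static FailedLoginException toClientSafe(String username, Exception lowLevel) {
        // Keep the full detail on the server, where the driver classes exist.
        LOG.log(Level.FINE, "Login failed for " + username, lowLevel);
        // Deliberately no initCause(lowLevel): a cause would be serialized
        // along with the exception and drag its classes over to the client.
        return new FailedLoginException("Login failed for user " + username);
    }

    public static void main(String[] args) {
        // Stand-in for a driver-specific exception raised while checking credentials.
        Exception driverError = new SQLException("stand-in for a driver error");
        FailedLoginException e = toClientSafe("alice", driverError);
        System.out.println(e.getMessage() + ", cause=" + e.getCause());
    }
}
```

The trade-off is that the client loses the diagnostic detail, which is arguably what you want for a failed login anyway; the server log keeps it.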
The second example is more far-reaching and involves a Hibernate query with a limit clause. The limit was set to 50 results, but the query would return only 49 results, even though I knew that there were in fact more than 50 matching records in the database. After some frustrating back and forth between the debugger and DbVisualizer, I found the root cause: The target class for the Hibernate query had a collection with an eager fetch strategy. This implied that, when the Hibernate query was actually translated into SQL and executed, there were potentially multiple rows corresponding to a single instance of the class, one for each element in the collection. The limit of 50 results, however, was passed down into the generated SQL layer as-is, which meant that the SQL result set had 50 rows, but by the time those were translated back into class instances, there were fewer than 50.
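The mismatch is easy to reproduce without Hibernate or a database. The sketch below is a plain-Java simulation, not Hibernate's actual code: it models the joined result set as a list with one parent id per row, applies a limit to the rows, and then counts how many distinct parents survive. The data is made up to mirror the symptom: 60 parents, one of which has two collection elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simulation only: shows a row limit turning into a smaller object count. */
final class RowLimitSketch {

    /**
     * Models the joined result set: one row per (parent, collection element)
     * pair, ordered by parent. Each row is represented by its parent id.
     */
    static List<Integer> joinedRows(int[] childCounts) {
        List<Integer> rows = new ArrayList<>();
        for (int parent = 0; parent < childCounts.length; parent++) {
            for (int child = 0; child < childCounts[parent]; child++) {
                rows.add(parent);
            }
        }
        return rows;
    }

    /** Applies the limit to rows, then collapses the rows back into objects. */
    static int distinctParents(List<Integer> rows, int rowLimit) {
        Set<Integer> parents = new LinkedHashSet<>();
        for (int parentId : rows.subList(0, Math.min(rowLimit, rows.size()))) {
            parents.add(parentId);
        }
        return parents.size();
    }

    public static void main(String[] args) {
        int[] childCounts = new int[60];
        Arrays.fill(childCounts, 1);
        childCounts[0] = 2; // one parent contributes two rows to the join
        List<Integer> rows = joinedRows(childCounts);
        // 50 rows come back, but two of them belong to the same parent.
        System.out.println(distinctParents(rows, 50)); // prints 49
    }
}
```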
In fairness to Hibernate, its documentation recognizes this and explicitly warns that limit queries and eager fetching of collections do not work together. But this is really a cop-out. Hibernate is meant to be an abstraction that allows me to think of my data store in terms of class instances and not worry, once I've done the mapping, about the actual rows in the database. Given that, when I specify a limit for a query, I want to think of that limit in terms of objects, not rows, and at least in this case Hibernate doesn't let me do that. I can sympathize, as fixing this would definitely be non-trivial. Hibernate could conceivably detect that the number of object results was less than the limit and issue an additional SQL query (or queries, since there is no way to predict how many more rows would be needed to reach the right number of objects). Alternatively, it could skip the limit at the database level entirely and limit the results on the Java side. Either option would likely make performance worse.
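For illustration, the "issue additional queries" idea might look roughly like the sketch below. This is not how Hibernate works, and fetchPage is a hypothetical stand-in for a row-limited SQL query that returns the parent id of each row. The loop keeps asking for more rows until it has seen the requested number of distinct objects or the data runs out. A real implementation would also have to finish reading the rows of the last object so that its collection is complete, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

/** Sketch of an object-level limit built on top of row-limited page fetches. */
final class ObjectLimitSketch {

    /**
     * fetchPage.apply(offset, size) is a hypothetical stand-in for a
     * row-limited query; it returns fewer than size rows when data runs out.
     */
    static List<Integer> firstDistinctParents(
            BiFunction<Integer, Integer, List<Integer>> fetchPage,
            int objectLimit, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        Set<Integer> parents = new LinkedHashSet<>();
        int offset = 0;
        while (parents.size() < objectLimit) {
            List<Integer> page = fetchPage.apply(offset, pageSize);
            for (int parentId : page) {
                if (parents.size() == objectLimit) {
                    break; // enough objects; ignore the remaining rows
                }
                parents.add(parentId);
            }
            if (page.size() < pageSize) {
                break; // short page: no more rows to fetch
            }
            offset += page.size();
        }
        return new ArrayList<>(parents);
    }

    public static void main(String[] args) {
        // Same made-up data as before: 61 rows for 60 parents.
        List<Integer> rows = new ArrayList<>();
        rows.add(0);
        for (int parent = 0; parent < 60; parent++) {
            rows.add(parent);
        }
        BiFunction<Integer, Integer, List<Integer>> fetchPage = (offset, size) ->
                rows.subList(Math.min(offset, rows.size()),
                             Math.min(offset + size, rows.size()));
        // Needs a second page fetch to get from 49 objects to 50.
        System.out.println(firstDistinctParents(fetchPage, 50, 50).size()); // prints 50
    }
}
```

The second round trip is exactly the performance cost mentioned above, and with larger collections the number of extra fetches grows unpredictably.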
These are just two recent examples that I ran into directly, but others abound. The entire distributed object paradigm, with its notion that one could treat local and remote objects the same, is one giant leaky abstraction; it was quickly realized that the only way to write performant code in this paradigm is to care very much about which objects are local and which are remote. The various web frameworks that try to shoehorn a Swing-like component model on top of HTTP are another case where the abstraction can feel akin to putting lipstick on a pig.
In future posts I would like to discuss what makes leaky abstractions more likely to occur, how one can try to avoid them, and what makes the software engineering domain particularly prone (or not) to leaky abstractions.