Rail Rash

I’m not supposed to be writing this essay right now. I’m supposed to be well into a three-hour bike ride. Instead, five minutes into the ride, I hit one of the rails of an old freight siding at too shallow an angle while traveling at a non-trivial velocity and ate some pavement. Though I was apparently not sufficiently skilled to avoid the fall, if there’s one thing I know how to do it’s hit the ground under control, a skill perhaps learned through some combination of falling while skiing, diving while playing volleyball, and slide tackling while serving as a soccer goalie. I had the presence of mind to shoot my hands out; they took the brunt of the fall yet were unscathed thanks to the gloves I destroyed. My left elbow, left forearm, left knee and left hip took the secondary impact, as did various portions of my bike, as evidenced by assorted things being scuffed or knocked out of alignment. My arm got pretty shredded and bled a bit, my hip was almost fully protected from abrasion (though bruised) by my cycling pants, which did not give way, and my knee managed to tear through the pants and pick up a minor abrasion but suffered more from the impact. I’m happy to report that my instinct to arch my back and pull my head backward means I still have all of my teeth and original facial characteristics (for better or worse). I also had occasion to remember why I carry a folding set of Allen wrenches, which proved quite useful in doing field realignments.

But that’s not the main focus of this story. It’s supposed to be a tale of frustration with another kind of rail, specifically Ruby on Rails. For a while I’ve been meaning to document some of my experiences with RoR’s ActiveRecord, its module for object/relational mapping. While AR has not caused me substantial physical pain in the fashion that another kind of rail recently did, it has been the source of much recent anguish.

In the course of writing a Rails app intended to be part user interface, part data warehouse, and part task management system, I’ve repeatedly stretched AR to its breaking point or at the very least stumbled into some of its darker corners. I’ve managed to get it to fail in two ways, firstly by generating syntactically invalid SQL, and secondly by generating semantically invalid SQL. Syntax fail is annoying, but in a very immediate sort of way. You know you’ve got a problem the moment you try to run the code. Semantics fail, however, is far more nefarious, causing subtle misbehavior that may not manifest right away or even fly under the radar when it does.

AR performs reasonably well when loading from a single table or when managing simple relationships. I was, however, enticed by one of AR’s slicker recent offerings, named_scope. A named_scope gives you an auto-generated method that returns a collection of entities narrowed by the “scope”, which can be defined with one or more conditions. There’s even a nifty plug-in that auto-generates the negated version of every named_scope. And, most excitingly, you can chain invocations of named_scopes, and AR narrows the returned collection by all of them by building one comprehensive SQL “WHERE” clause under the hood. This sounds great until you realize that AR is not nearly smart enough to do this correctly.
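To make the chaining idea concrete, here is a toy sketch in plain Ruby of how chained scopes can accumulate conditions into one “WHERE” clause. It is emphatically not ActiveRecord’s actual implementation; the ToyScope class, the tasks table, and the active and overdue scopes are all made up for illustration.

```ruby
# Toy model of scope chaining. NOT ActiveRecord's real implementation;
# the class, table, and scope names are hypothetical.
class ToyScope
  attr_reader :table, :conditions

  def initialize(table, conditions = [])
    @table = table
    @conditions = conditions
  end

  # Returns a new scope narrowed by one more SQL condition fragment.
  def where(fragment)
    ToyScope.new(table, conditions + [fragment])
  end

  # Two hypothetical "named scopes", each contributing one condition.
  def active
    where("#{table}.active = 1")
  end

  def overdue
    where("#{table}.due_on < CURRENT_DATE")
  end

  # All accumulated conditions are ANDed into a single WHERE clause.
  def to_sql
    sql = "SELECT #{table}.* FROM #{table}"
    sql += " WHERE " + conditions.map { |c| "(#{c})" }.join(" AND ") unless conditions.empty?
    sql
  end
end

puts ToyScope.new("tasks").active.overdue.to_sql
# prints: SELECT tasks.* FROM tasks WHERE (tasks.active = 1) AND (tasks.due_on < CURRENT_DATE)
```

With conditions on a single table this is easy to get right. The trouble described in the rest of this essay starts once the scopes also carry joins.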

One major failure comes from the named_scope implementation not being well synchronized with other query logic. If you have the misfortune of using a named_scope with a :joins attribute and calling “find” with an :include attribute that pulls relatives from one or more of the same tables via a has_one declaration, AR passes syntactically invalid SQL to the database with duplicate table aliases. This results from Rails defaulting to loading has_one relatives with a LEFT OUTER JOIN whereas has_many is done with a distinct query with a “WHERE id IN (…)” clause. There is a workaround for this, but it is undocumented and extremely obscure. You have to override X class method Y always to return Z. Doing so makes AR always use the “WHERE id IN (…)” style, thus avoiding the problem. At least this bug was generating syntactically invalid SQL and thus causing an immediate crash.

Another obvious failure to synchronize regions of the AR code base manifests when you have a :joins clause in a named_scope and also in an invocation of “find” with the intent to apply an :order parameter to joined entities. This somehow confuses AR under certain circumstances and causes the result set to contain duplicate rows. The only solution to this, short of ditching the :joins in one place or the other, is apparently to add a :select parameter to the “find” invocation of the form “DISTINCT table_name.*”. Unlike the former bug, this one generates semantically invalid SQL which the database is happy to execute for you and then return a subtly incorrect result set.
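A minimal sketch of that workaround, written as a plain Ruby helper that adds the DISTINCT :select to a hash of “find” options. The helper name and the tasks and projects tables are hypothetical, and the “find” call in the comment assumes the Rails 2 style options hash.

```ruby
# Hypothetical helper: adds the "DISTINCT table_name.*" :select workaround
# to a hash of find options. A :select supplied by the caller wins.
def distinct_find_options(table_name, options = {})
  { :select => "DISTINCT #{table_name}.*" }.merge(options)
end

options = distinct_find_options("tasks", :joins => :project, :order => "projects.name")
puts options[:select]
# prints: DISTINCT tasks.*

# Assumed usage with a hypothetical Task model and overdue scope:
#   Task.overdue.find(:all, options)
```

One caveat: some databases (PostgreSQL, for one) reject SELECT DISTINCT when the ORDER BY column is not in the select list, so whether this works unmodified depends on your database.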

As if this subtle misbehavior were not bad enough, it gets worse. Suppose you declare multiple named_scopes on an entity, several of those scopes have :joins attributes that reference the same tables, and those scopes also have :conditions that reference the joined table. If you use just one such scope, things work as you expect. If you use two or more of them, however, AR performs multiple distinct joins, leaves the table name in the first join unaliased, and aliases each subsequent join in an unpredictable fashion. This leads you into the trap of having the :conditions on different joins all reference the table from the first join, causing very unpredictable results (semantics fail again). There does not appear to be any good resolution for this if your conditions are not simple equality tests (meaning that you have no choice but to use the string form of :conditions). Your only apparent option is to rewrite the :conditions to use a subquery instead of a join. I would have thought that the :conditions parameter would have allowed the string form to contain a macro reference to the joined table that would be expanded by AR once the alias had been determined. No such luck…
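Here is a rough sketch of the subquery rewrite, as a plain Ruby helper that builds the string form of :conditions. The helper and the tasks and comments tables are hypothetical. Because each subquery names its table in its own scope, two such conditions can be combined without any alias for AR to get wrong.

```ruby
# Hypothetical helper: builds a string condition that filters the outer table
# through a subquery instead of a join. The inner condition is interpolated
# verbatim, so it must come from trusted code, never from user input.
def subquery_condition(outer_table, inner_table, foreign_key, inner_condition)
  "#{outer_table}.id IN (" \
    "SELECT #{inner_table}.#{foreign_key} FROM #{inner_table} " \
    "WHERE #{inner_condition})"
end

recent  = subquery_condition("tasks", "comments", "task_id", "comments.created_at > '2009-01-01'")
flagged = subquery_condition("tasks", "comments", "task_id", "comments.flagged = 1")

# Both conditions mention "comments", but each does so inside its own
# subquery, so ANDing them together needs no table aliases at all.
puts [recent, flagged].map { |c| "(#{c})" }.join(" AND ")

# Assumed usage in a hypothetical model:
#   named_scope :with_flagged_comments, :conditions => flagged
```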

Another maddening limitation of AR is that the symbolic form of the :joins parameter can only be used for inner joins. If you wish to perform an outer join, then instead of referencing other tables by their symbolic identifiers, you’re stuck making your :joins parameter a literal string that contains one or more join operations. Theoretically you’re supposed to be able to pass an array that mixes symbolic table references and literal strings, but due to an apparent bug that causes the documentation to disagree with the code’s execution, passing an array that mixes strings and symbols causes AR to treat them all as symbols, which results in error messages like “ActiveRecord cannot find relationship ‘LEFT OUTER JOIN bar on bar.id = foo.bar_id’”. The solution to this problem is to use one big string :joins that contains both your inner and outer joins. The problem with this, however, is that AR fails to realize that it needs to alias table names in a way that does not collide, meaning that an unfortunate combination of a named_scope and a “find” operation will generate (semi-thankfully) syntactically invalid SQL. Again, the best solution appears to be to rewrite the :conditions in your named_scope to use a subquery instead of a join.
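For what it’s worth, the “one big string” approach might look something like the sketch below. The combined_joins helper and the baz table are made up; foo and bar are borrowed from the error message above.

```ruby
# Hypothetical helper: glues inner and outer join fragments into the single
# literal string that the :joins parameter will accept.
def combined_joins(*fragments)
  fragments.map(&:strip).join(" ")
end

joins = combined_joins(
  "INNER JOIN baz ON baz.foo_id = foo.id",
  "LEFT OUTER JOIN bar ON bar.id = foo.bar_id"
)
puts joins

# Assumed usage with a hypothetical Foo model:
#   Foo.find(:all, :joins => joins)
```

Nothing here protects you from the alias collisions described above. The string goes into the SQL as written, so it is on you to make sure no scope in the chain joins the same tables.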

ActiveRecord has many charms, but it also falls flat on its face in many circumstances. At this point, I feel sufficiently well versed in its foibles that I can avoid them while leveraging its strengths. This means leaning more heavily on the database than I would have expected to be necessary with Rails. Subqueries and database views can come to your rescue, and perhaps for many things those are the more elegant solutions, but being forced to use them, and more to the point being forced to discover their necessity by suffering repeated AR failures to behave as one might expect, is a long way from ideal. AR has enough escape hatches that you can work around its limitations by using various string forms of arguments instead of symbolic forms, but when you do this AR becomes unsure of what you’re doing and the two of you start to trip over one another, necessitating more “fixes”, in a fashion that often feels like a race to the bottom in which you don’t really get to use many of AR’s niceties.

To far too great an extent, ActiveRecord feels like it is slapping together SQL, throwing it over the wall to the RDBMS, and just hoping for the best, then washing its hands of the matter if things go less than perfectly.

  — AWG
