sofer.com » blog » tech
09 October 2009

Using :joins in Active Record

has_many :through relationships cannot be used when there is more than one model sitting in the middle of a many-to-many relationship. In that case, a method that uses custom :joins (or, where eager loading is wanted, :include) is the preferred find option. There is no need to hand-craft your SQL. Not that there’s anything wrong with SQL, but conditions on a join table expressed as a ruby hash have been available since June 2008 and they are a lot tidier.

For example, assume the Active Record classes: User, Message, Posting, Group and Membership, where each user can belong to several groups and each group can have several users, defined by a membership record. A message can be posted by a user to several groups and each group can have several messages posted to it.

These relationships are easy to express as follows:

u = User.first
u.groups
u.messages
g = u.groups.first
g.messages
g.users
m = g.messages.first
m.groups
m.user
m.user.groups

and so on. If you want to see all the messages posted by users to all the groups of which a particular user is a member, you might (absent-mindedly) be tempted to try something like:

u.groups.messages

But u.groups returns an array, which will not under normal circumstances respond to a request for messages, so this is not going to work. Instead we can construct a method in the User model using the :joins option.


Message.all(:joins => { :groups => :memberships }, :conditions => { :memberships => {:user_id => id } }).uniq

This quite complex join is expressed almost entirely as hashes. The :joins hash defines the join of memberships, groups and memberships. This is worth looking at in a bit more detail. Here are some (not entirely useful) variants:

Message.all(:joins => :groups)
Message.all(:joins => [:groups, :user]) # actually, this is an unlikely one
Message.all(:joins => { :groups => :memberships })
Message.all(:joins => { :groups => { :memberships => :users }}) #actually, :groups => :users would do here

All of these method calls express joins on the Message model. The exact form of the arguments depends on the relationship between the different models and they can be nested as deeply as necessary. The most deeply-nested model is the one that is likely to be of most interest and will be used in the :conditions.

As can be seen from the examples above, :groups and :users are accessible directly to the Message model, but :memberships are accessible only to through the Group model and :users are only accessible through both the Group and Membership model (actually, the Membership model is superfluous here, but the example demonstrates the possibility of multi-level nesting).

In this case, we are looking for those messages that should be visible to a specific user. So, having joined the Message model to the Membership model, we want to restrict our results to those groups of which the user is a member. In other words, where the memberships.user_id is the same as the id of that user. This is again achieved using a hash, this time to determine the scope of the conditions (in this case, the scope is the Membership model).

the .uniq method ensures that no duplicates are returned.

:joins do not do eager loading of model data. In cases where eager loading would boost performance, :joins can be replaced by :include (minus the :uniq method), like this:


Message.all(:include => { :groups => :memberships }, :conditions => { :memberships => {:user_id => id } })

For comparison, the SQL generated by the :joins call above looks like this:

SELECT "messages".* FROM "messages" 
INNER JOIN "postings" ON "postings".post_id = "messages".id 
INNER JOIN "groups" ON "groups".id = "postings".group_id 
INNER JOIN "memberships" ON memberships.group_id = groups.id WHERE ("memberships"."user_id" = id) 

And the SQL for the :include call is significantly more complex. For readability, hash-style :joins and :include are the way to go.

For more information on the difference between :joins and :include, see Ryan Bates’ screencast on Include vs Joins