I ran into a thicket of hard-to-solve Ruby on Rails problems in the past couple of weeks. I want to write up the answers in the hope of helping someone else – and to help cement the knowledge I picked up.
A colleague, traveling for the day, Slacked me early one morning to tell me that there was some weird new bug in our app that was deleting records that… well… weren’t ever supposed to be deleted.
We work on the ProPublica Represent app, which gathers data about what’s going on in the U.S. Congress. For some reason, the records representing each member of Congress’s term were getting deleted. Our database suddenly said that, say, Rep. Michael Capuano had left Congress at the beginning of this year and that his seat was vacant. I knew this was gonna be a good one because that’s not ever supposed to happen – the app isn’t even supposed to have code to delete those objects.
(We represent mid-term retirements differently; even if Capuano had quit, we’d still retain a record of his service so far this term.)
I dug into the logs, which, luckily, told me which web requests were associated with these deletions. More puzzlingly – anytime my colleague just looked at the page for a member in our internal admin, that member’s record got deleted.
We hadn’t been making serious changes to each member’s dashboard page, certainly not within the past handful of days. The last serious change was months old, and if it had caused this, we would’ve noticed members mysteriously seeming to quit our representation of Congress all that time. That change had added a dropdown menu listing all the other members of Congress, as options for whom you could compare this member to.
Still befuddled, I stared for a while at the few short lines that made up the show method for members, then stared a while at git blame. Then I stared a while at the Member model and looked at git blame for that file.
Turns out, someone had made a small change a day earlier to how the list of members serving in a given Congress was generated. Just a utility method, used all over the app. As part of that change, they had replaced a sort_by operation with an order operation.
This is where Rails vets are like… I know what’s going on here!
To list all the other members of Congress, you have to get a list of all the members… then remove the one member you’re looking at. How did we remove that member from the list? The delete method.
What happens when you call sort_by on an ActiveRecord::Relation, the datatype for the results of Rails database requests? It sorts it, of course, but it also returns the result as an Array, not an ActiveRecord::Relation.
order, on the other hand, adds an ORDER BY clause to the SQL that gets sent to the database, and what you get back is still an ActiveRecord::Relation.
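Here's a minimal plain-Ruby sketch of that difference. It doesn't touch Rails at all: FakeRelation is a made-up stand-in for ActiveRecord::Relation, and the two sample rows are invented. The only thing it relies on is that sort_by comes from Enumerable (which, as far as I know, is also where a real relation gets it) and that Enumerable#sort_by always hands back an Array.

```ruby
# Hypothetical stand-in for ActiveRecord::Relation (not the real class).
# Like the real thing, it is Enumerable, and `order` returns another
# relation-like object rather than an Array.
class FakeRelation
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end

  # Real Rails would add ORDER BY to the SQL. This toy version sorts in
  # memory and wraps the result so the return type stays FakeRelation.
  def order(key)
    FakeRelation.new(@rows.sort_by { |row| row[key] })
  end
end

# Invented sample rows, purely for illustration.
members = FakeRelation.new([
  { id: 2, last_name: "B" },
  { id: 1, last_name: "A" }
])

sorted_array    = members.sort_by { |m| m[:last_name] } # Enumerable#sort_by
sorted_relation = members.order(:last_name)

puts sorted_array.class    # Array
puts sorted_relation.class # FakeRelation
```

Both results come back in the same order, so anything that just iterates over them can't tell the difference. Only the class changed.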
What happened was that a method that had previously returned an Array now was returning an ActiveRecord::Relation. And because of duck-typing gone array, the delete method – the exact same code that had worked the same for months – now not only removed an element from an array, but actually deleted that element’s corresponding record in the MySQL database!
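To make the failure mode concrete, here's a toy reproduction in plain Ruby. Nothing here is our actual code: TABLE is a Hash pretending to be the database, FakeRelation is a hypothetical stand-in whose delete removes the row from that Hash (mimicking what the real relation did to us), and others_for_dropdown is an invented name for the "everyone except this member" helper.

```ruby
# Simulated database table keyed by id (a Hash standing in for MySQL).
# Ids and names other than Capuano are invented for illustration.
TABLE = { 1 => "Capuano", 2 => "Member B", 3 => "Member C" }

# Hypothetical stand-in for a relation: `delete` removes the row from the
# simulated table instead of just dropping it from an in-memory list.
class FakeRelation
  include Enumerable

  def each(&block)
    TABLE.keys.each(&block)
  end

  def delete(id)
    TABLE.delete(id) # the row is gone for everyone, not just this list
  end
end

# The calling code is identical in both cases. Duck typing lets it run
# against anything that responds to `delete` and `to_a`.
def others_for_dropdown(all_members, current_id)
  all_members.delete(current_id)
  all_members.to_a
end

from_array = others_for_dropdown(TABLE.keys, 1) # Array#delete: local copy only
rows_after_array = TABLE.size

from_relation = others_for_dropdown(FakeRelation.new, 1) # hits the "database"
rows_after_relation = TABLE.size

puts from_array.inspect    # [2, 3]
puts rows_after_array      # 3
puts from_relation.inspect # [2, 3]
puts rows_after_relation   # 2
```

The dropdown gets the same list either way, which is exactly why nothing looked wrong on the page. The only visible difference is the row count afterward.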
Tests maybe could’ve caught this, but you can only test what you anticipate (and deletions after viewing a record are hard to anticipate). A type-safe language would’ve prevented this problem, but at great cost to developer productivity. Rails – perhaps – ought not to have matched up the Array#delete method with the ActiveRecord::Relation#delete method, perhaps renaming one of them.