
Twitter, Rails, and scaling - An uninformed commentary

This post originally appeared as a comment on Shel Israel's Global Neighborhoods post, "An Open Letter to the Twitter Guys". I've copied it here so I can find it in the future, and I've made some edits to connect links and make it read more like a blog post than a comment. In the process, it's been removed from the context of its conversation, so click back through to the original post and read the thread. It's a good blog post and there are some excellent comments.

----

The only problem with the speculation on the technical underpinnings of Twitter's scaling issues going on here is that we might be (probably are?) wrong. That said, allow me to engage in some speculation!

I think in this case Shel and his technical contacts are probably on the wrong track. This conversation has already occurred once, with the base accusation that Rails can't talk to multiple databases. Once aired, the problem was solved relatively quickly and easily via a code contribution from the community: http://drnicwilliams.com/2007/04/12/magic-multi-connections-a-facility-in-rails-to-talk-to-more-than-one-database-at-a-time/

So even at the beginning, the problem was not the Rails application server architecture, but an issue of database contention. David Heinemeier Hansson's (DHH's) response to the criticism may have seemed a bit defensive, but at root he was correct: the best way to engage the community is to air your issues in community forums, rather than work through your problems silently and then, out of frustration, accuse the product of a shortcoming during an interview. It certainly seems that the whole thing could have been handled better.

DHH also seems to be correct that at the application server level, Rails scales easily (though expensively) by simply throwing hardware at the problem. However, Twitter wasn't dealing with an application server problem and therefore wasn't dealing with a specifically "Rails" problem.

Moving on to the current discussion, it seems the root issue is still database contention, possibly in conjunction with a client polling architecture that may be reaching the limit of scale. See http://www.highscalability.com/scaling-twitter-making-twitter-10000-percent-faster and http://www.readwriteweb.com/archives/xmpp_web.php for decent overviews.
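To put rough numbers on why a polling architecture runs into a wall, here is a back-of-the-envelope sketch in Ruby. The client count, poll interval and update rate are made-up figures for illustration, not anything Twitter has published, and the method names are my own.

```ruby
# Back-of-the-envelope polling load. All figures are hypothetical.

# Requests per second when every client polls once per interval.
def polling_requests_per_second(clients, poll_interval_seconds)
  clients.to_f / poll_interval_seconds
end

# Fraction of polls that come back with nothing new, assuming each
# update is picked up by exactly one poll.
def empty_poll_fraction(updates_per_client_per_hour, poll_interval_seconds)
  polls_per_hour = 3600.0 / poll_interval_seconds
  useful_polls = [updates_per_client_per_hour, polls_per_hour].min
  1.0 - useful_polls / polls_per_hour
end

# Assumed: 120,000 connected clients, each polling once a minute.
request_rate = polling_requests_per_second(120_000, 60)

# Assumed: each client has about 6 new updates waiting per hour.
wasted = empty_poll_fraction(6, 60)

puts "Polling load: #{request_rate.round} requests/second"
puts "Polls returning nothing new: #{(wasted * 100).round}%"
```

Under those assumptions most of the traffic is clients asking "anything new?" and being told no, and every one of those questions can end up as a database query. A push model such as XMPP only sends traffic when something actually happens, which is why it keeps coming up in these discussions.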

Having used Rails, Ruby and PHP (though not having scaled anything), I can say that as far as I can see there is nothing inherent in any of them that solves or significantly exacerbates either of these issues. On the database side, applications built with either stack will generate SQL statements that query the database. In pure PHP, you usually write the SQL yourself. Rails does a lot of the SQL work for you, but it is certainly possible to bypass that help and write the SQL yourself if you reach a point where further optimization is required.
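As a rough illustration of the difference, here is a plain-Ruby sketch. `ToyFinder` is a hypothetical stand-in I made up for this post, not ActiveRecord's real API, and the `statuses` table and its columns are assumptions. In Rails itself the escape hatch for hand-written SQL is `find_by_sql`.

```ruby
# Toy stand-in for a framework-style finder. NOT ActiveRecord; it only
# builds the SQL string so the two approaches can be compared.
class ToyFinder
  def initialize(table)
    @table = table
  end

  # Builds the catch-all query a framework typically generates:
  # every column, filtered by simple equality conditions.
  def where(conditions)
    clauses = conditions.map do |column, value|
      # Integer() rejects non-numeric input in this toy example.
      "#{column} = #{Integer(value)}"
    end
    "SELECT * FROM #{@table} WHERE #{clauses.join(' AND ')}"
  end
end

# What the framework-style helper produces for you.
generated_sql = ToyFinder.new("statuses").where(user_id: 42)

# What you might write by hand once the generated query becomes a
# bottleneck: only the needed columns, newest first, capped at 20 rows.
# (Hypothetical table and column names.)
handwritten_sql = "SELECT id, body FROM statuses " \
                  "WHERE user_id = 42 ORDER BY id DESC LIMIT 20"

puts generated_sql
puts handwritten_sql
```

Either way the database ends up receiving a SQL string. The language or framework that assembled it matters much less than what the query asks the database to do.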

In short, all factors seem to point to inherent architectural issues that Twitter is struggling with. There are always ways to approach these issues, but it's far more complex than switching from one language or framework to another.
