ruby

Introducing jsglue

Last weekend I pushed the very first version of something I'm calling jsglue to GitHub. It now lives here: http://github.com/esjewett/jsglue/tree/master

jsglue is in essence a framework for implementing web-connective applications a la Yahoo! Pipes and Tarpipe. Currently it is at best a complement to those programs and at worst totally useless. In the future I would like to see it or something like it become an alternative to these tools, for a few reasons that I'll get into in later posts.

jsglue does three things:

  • It allows you to register a handler to a path.
    • The handler consists of a path and two pieces of javascript - one that constructs a response to a request sent to that path, and one that constructs one request (and in the future multiple requests, optionally) to another URL.
  • It accepts HTTP requests to paths with registered handlers.
    • When this happens, it creates a response using the handler javascript for this purpose, and it adds a job to a stack that will be processed later.
  • It provides a program that can be run periodically to process the stack of jobs that has built up, sending off new requests as specified by the javascript in the handler associated with each job.

That's it.
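
As a rough illustration of those three steps, here is a toy, in-memory sketch. It is not the jsglue code: the names (`Glue`, `register`, `receive`, `process_jobs`) are hypothetical, plain Ruby lambdas stand in for the two pieces of javascript, and an array stands in for the database-backed job stack.

```ruby
# Toy model of the three steps. All names here are hypothetical; the real
# jsglue keeps handlers in a database and evaluates javascript instead of
# calling Ruby lambdas.
Handler = Struct.new(:path, :respond, :build_request)

class Glue
  attr_reader :jobs

  def initialize
    @handlers = {}
    @jobs = []
  end

  # Step 1: register a handler to a path.
  def register(path, respond:, build_request:)
    @handlers[path] = Handler.new(path, respond, build_request)
  end

  # Step 2: accept a request, queue a job for later, and answer right away.
  def receive(path, request)
    handler = @handlers[path]
    return nil unless handler

    @jobs.push([handler, request])
    handler.respond.call(request)
  end

  # Step 3: run periodically. Drains the stack and returns the outgoing
  # requests as plain hashes (a real version would send them over HTTP).
  def process_jobs
    outgoing = []
    while (job = @jobs.pop)
      handler, request = job
      outgoing << handler.build_request.call(request)
    end
    outgoing
  end
end

glue = Glue.new
glue.register('/hook',
              respond: ->(req) { "received #{req[:body].length} bytes" },
              build_request: ->(req) { { url: 'http://example.com/target', body: req[:body].upcase } })

puts glue.receive('/hook', { body: 'hello' }) # prints "received 5 bytes"
p glue.process_jobs
```

The point of the split is that the response goes back immediately, while the outgoing request is built later from the full contents of the original request.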

Why do I care?

Well, hopefully that will become clear of its own accord. But the key is that the full contents of the original request are exposed to the javascript processing script that constructs the new request. As such, you can do pretty much any kind of processing you like within this handler code, which is user-defined.

So why do you care? Let me count the ways:

1. Receive a request in JSON and spit it back out multi-part form-encoded (in fact, right now this is pretty much the only thing you can do). Ever tried to connect up Yahoo! Pipes with Tarpipe? It doesn't work. With this, it can.

2. Webhooks are great. Webhooks are the facility to have a web application issue an HTTP request to an arbitrary URL when some event happens in the web application. That sounds boring, but it's actually awesome. Webhooks are great, except that no one speaks the same language so every webhook-based solution is bespoke. Bespoke is great in a suit or a coffee mug, but it's bad in web infrastructure. Yahoo! Pipes can't understand webhook calls. Tarpipe usually can't understand them. Most other web applications can't understand them. There needs to be some sort of middle-person.

3. If you're going to run a ton of your personal data through some middleman web application, you should have the option of running that web application yourself. I'm not saying you will, but I think it would be nice if you could.

Okay, 3 ways is enough for now. We'll get to more later.
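
To make the first item concrete, here is a minimal sketch of turning a JSON request body into a multi-part form-encoded one. It assumes a flat JSON object (no nesting, no file uploads), and `json_to_multipart` is a hypothetical helper name, not something jsglue provides. In jsglue this kind of transformation would live in the handler's javascript; Ruby is used here only to show the shape of it.

```ruby
require 'json'
require 'securerandom'

# Convert a flat JSON object into a multipart/form-data body.
# Returns the Content-Type header value and the encoded body.
# Assumes simple scalar values; nested objects are not handled.
def json_to_multipart(json_text, boundary = SecureRandom.hex(16))
  fields = JSON.parse(json_text)

  # One part per JSON key, each introduced by the boundary line.
  parts = fields.map do |name, value|
    "--#{boundary}\r\n" \
      "Content-Disposition: form-data; name=\"#{name}\"\r\n" \
      "\r\n" \
      "#{value}\r\n"
  end

  # The closing boundary carries two trailing dashes.
  body = parts.join + "--#{boundary}--\r\n"
  ["multipart/form-data; boundary=#{boundary}", body]
end

content_type, body = json_to_multipart('{"title":"hello","count":3}')
puts content_type
puts body
```

The returned content type and body can then be attached to whatever outgoing HTTP request the job ends up sending.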

Why do you *not* care?

Well, there are lots of reasons for that too.

1. This is dorky. No, there is not a UI. No, it doesn't do much of interest. It's an infrastructure prototype more than anything else. The idea is really that we need infrastructure for building applications that can do this sort of thing. I don't have a lot of time to spare, so I'm willing to just put a framework out there, and maybe a REST-only web-application if I can get the components running on a hosting service (harder than it sounds). I'll leave it to someone else to put the UI on top of it. I'm not convinced that the "pipe" metaphor is correct (I'm partial to "tubes" myself), but I don't have a better idea, so someone else will have to have that idea.

2. This code sucks. Yes it does. I urge you to fork it, improve it, or throw up your hands in disgust and start over. I just want something that does this. I don't really care if it's written by me.

3. There's no way this execution model will fly on a public site, and no one is going to run this on their own server. It is sort of a feature of this design that it allows your users to execute arbitrary javascript on your server. As such, I think this will primarily find use on private servers, or as a back-end engine for a public site where the inputs are carefully cleansed. Not a recipe for ultra-popularity, I'll grant. But that's not really the point either.

So what's it made out of?

Currently, there are only four main ingredients:

  • Ruby is the implementation language. Its role in jsglue is to serve as duct-tape for the other components.
  • Datamapper is the database interface, allowing you to use pretty much any supported database (I'm using SQLite at the moment).
  • Sinatra for the HTTP web-service interfaces. These interfaces are pretty much a direct mapping onto the database. (REST-ful? Maybe.) (Incidentally, how is it that a minuscule Ruby web-framework beats out FRANK SINATRA in the Google rankings?)
  • Johnson for the Javascript processing.

That's it. It's a couple hundred lines of code. I haven't really counted, or put it on Ohloh.com for that matter, but it can't be more than that. It's got some unit tests. It's going to be changing quickly as I make it more multi-purpose.

I'll document and post examples as they become available.

A tour of testing with an SAP focus (in the end)

As might have been assumed from my post on automated testing in SAP systems a couple of months ago, I've been delving into testing in the SAP landscape. I'm beginning to put together a series of presentations and workshops on the subject, the first of which I was delighted to deliver last week.

Being the first in the series, this presentation focuses on an overview of the leading edge of the field. It would be nice to be able to emulate the sort of testing techniques we can use in Ruby in SAP BI and EPM application development. Nice, but not necessarily realistic. I can dream!

How to get Johnson built

Johnson is a Ruby wrapper of the SpiderMonkey JavaScript interpreter. In practice, this means you can use Johnson to evaluate JavaScript statements within a Ruby program.

It works really nicely, but there is no released gem and building the development gem is a bit of a headache. The instructions in the readme at the Johnson Github site appear simple but have several prerequisites that must be fulfilled before the whole process works. Here's what I did on Leopard. I'll try recreating on Windows tomorrow.

[sudo] gem sources -a http://gems.github.com
[sudo] gem install jbarnette-johnson

Multiple errors occur - if you see gems missing (like hoe, for example) then install them.

[sudo] gem install jbarnette-johnson

Errors occur, probably due to a failure to compile the native gem. Usually something about a missing rake/extensiontask.

[sudo] gem install rake-compiler
[sudo] gem install jbarnette-johnson

Yet another error occurs - a missing Manifest.txt file. Download it from GitHub at http://github.com/jbarnette/johnson/tree/master and put it into the directory named in the error message.

[sudo] gem install jbarnette-johnson

Success!
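
Once the install reports success, a quick check from Ruby can confirm that the gem actually loads. This is only a sketch: `johnson_available?` is a made-up helper, and the `Johnson.evaluate` call follows what the Johnson readme appears to show, so treat it as an assumption if your build behaves differently.

```ruby
# Returns true if the johnson gem can be loaded, false otherwise.
# Older RubyGems setups may need `require 'rubygems'` first.
def johnson_available?
  require 'johnson'
  true
rescue LoadError
  false
end

if johnson_available?
  # Assumption: Johnson.evaluate takes a string of javascript and
  # returns the result as a Ruby value.
  puts Johnson.evaluate('4 + 4')
else
  puts 'johnson did not load; re-check the install steps above'
end
```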

[Apr. 4, 2009 - Fixed typo, changing "rake-compile" to "rake-compiler"]

Twitter, Rails, and scaling - An uninformed commentary

This post originally appeared as a comment on Shel Israel's Global Neighborhoods post, "An Open Letter to the Twitter Guys". I've copied it here to make it available to me in the future, and I've made some edits to connect links and bring it more into line with a blog post than a comment. In the process, it's been removed from the context of its conversation, so click back through the above links and read the thread. It's a good blog post and there are some excellent comments.

----

The only problem with the speculation on the technical underpinnings of Twitter's scaling issues going on here is that we might be (probably are?) wrong. That said, allow me to engage in some speculation!

I think in this case Shel and his technical contacts are probably on the wrong track. This conversation has already occurred once, with the base accusation that Rails can't talk to multiple databases. Once aired, the problem was solved relatively quickly and easily via a code contribution from the community: http://drnicwilliams.com/2007/04/12/magic-multi-connections-a-facility-in-rails-to-talk-to-more-than-one-database-at-a-time/

So even at the beginning, the problem was not the Rails application server architecture, but an issue of database contention. David HH's response to the criticism may have seemed a bit defensive, but at root he was correct that the best way to engage the community is to air your issues in community forums rather than try to work through your problems silently and then accuse the product of a shortcoming out of frustration during an interview. It certainly seems that the whole thing could have been handled better.

DHH also seems to be correct that at the application server level, Rails scales easily (though expensively) by simply throwing hardware at the problem. However, Twitter wasn't dealing with an application server problem and therefore wasn't dealing with a specifically "Rails" problem.

Moving on to the current discussion, it seems the root issue is still database contention, possibly in conjunction with a client polling architecture that may be reaching the limit of scale. See http://www.highscalability.com/scaling-twitter-making-twitter-10000-percent-faster and http://www.readwriteweb.com/archives/xmpp_web.php for decent overviews.

Having used both Ruby on Rails and PHP (though not having scaled anything), I can say that as far as I can see there is nothing inherent in either that solves or significantly exacerbates either of these issues. On the database side, applications built using either language/framework will generate SQL statements that query the database. In pure PHP, you usually write the SQL yourself. Rails does a lot of the SQL work for you, but it is certainly possible to override that assist and do it yourself if you've developed a situation where further optimization is required.

In short, all factors seem to point to inherent architectural issues that Twitter is struggling with. There are always ways to approach these issues, but it's far more complex than switching from one language or framework to another.
