api

What's the deal with JAX-RS and Lift?

There has been some talk lately (from the Java performance maven @kohlerm and others) about JAX-RS in the context of the API of a Lift application - specifically ESME. JAX-RS is a Java annotation framework for programming RESTful web services (APIs for the non-enterprisey out there).

The real goal of the talk about JAX-RS with regards to ESME (as I see it, and I'm not the only point of view on this by a long shot), is to create an API that is as RESTful as reasonably possible, and which (for most resources) is indistinguishable from a JAX-RS implementation from the perspective of a client consuming the API.

It appears at first glance that the easiest way to achieve the goal of indistinguishability from a JAX-RS implementation is to do a JAX-RS implementation. I'm not convinced it is so straightforward.

Here are the requirements for a platform for implementing an HTTP API for ESME, in my view:

  1. Access to ESME instance objects as instantiated in the Lift application (preferably we are not using the database as the way that the API and the application communicate, especially since ESME doesn't necessarily even have a DB in my understanding)
  2. Respect the ESME security architecture (meaning that the API must act as a specific user)
  3. Support delta/streaming collections over HTTP in addition to REST-only resources and collections

Meanwhile, I've found a few instances of people attempting to use JAX-RS in the context of a Lift application. Unfortunately, I don't think they actually meet these requirements.

Here (http://blogs.sun.com/sandoz/entry/using_scala_s_closures_with) we have a discussion of how we can implement JAX-RS provider classes in Scala instead of in Java. This is, admittedly, very attractive, but I'm not seeing a way to import the entire Lift application context into these provider classes. Or rather, enough of the context to satisfy my requirements 1 & 2 above.

I had a fleeting moment of hope when I saw lift-jersey and this thread where we have James Strachan writing a Lift module that appears to allow us to use the Lift templating language within a JAX-RS implementation in Scala. However, it looks to me like this only allows using the Lift templating language, and by and large replaces the Lift request-handling stack with JAX-RS. This is sort of like what we want to do, but not really, and I think we're going to have the same struggles with integrating the Lift application itself that we would have had with the first example.

Where we'll really start running into trouble, even if we can satisfy requirements 1 & 2, is in requirement 3. In ESME, we have some resource collections that need to be provided in a "delta" or "streaming" format to our clients, but that also need to provide more orthodox RESTful HTTP interfaces. JAX-RS doesn't appear to be very friendly to the delta/streaming problem-space, so we will actually be forced to implement the streaming parts of these resources in Lift/Scala directly. This means that we can't just cordon off a portion of the URI-space of the ESME application to be served by Jersey or another JAX-RS container, outside of the context of the Lift application. The two containers would need to be very closely intertwined in the URL-space of the ESME application. In other words, we'd need to pattern-match before the request even hits Jersey and JAX-RS, which kind of defeats the purpose.

So, that little exploration was relatively fruitless, but maybe this blog will attract some comment setting me straight on how to do this.

Meanwhile, I had the thought "What's so great about annotations anyway?" My first impression is that there is nothing so great about annotations. First impressions are usually wrong, but it got me thinking along the lines that what JAX-RS does with annotations is suspiciously similar to what the Lift request dispatcher does with pattern matching. Here, let's loosely borrow an example from the lift-jersey Github page.

import javax.ws.rs.{Produces, Path, GET}
/**
* @version $Revision: 1.1 $
*/
@Path("/resourceReturningTemplateView")
class ResourceReturningTemplateView{
  @GET
  def view() = <xml_container>Some Text</xml_container>
}

So, all this says is that when we receive a GET request to the path "/resourceReturningTemplateView" in our application, return the results of the view() method, which in this case is a string of XML.

What does this look like in native Lift? Well, ignoring the other Lift incantations that need to be done (which are really just one line in Boot.scala, a class definition, and an implicit function that converts from a NodeSeq or Elem to a LiftResponse), all that's required is something like

def dispatch: LiftRules.DispatchPF = {
  case Req("resourceReturningTemplateView" :: Nil, _, GetRequest) =>
    <xml_container>Some Text</xml_container>
}

Lift request matching appears to be pretty powerful, so I started looking at whether it can cover the use cases of JAX-RS. The upshot is that I think it does pretty much everything we need, so I'd prefer to stick with the Lift pattern matching over JAX-RS annotations for the time being. Some things may be a bit more complicated in Lift, like the @Consumes annotation and the Allow header value, or direct handling of form fields as specified in the @FormParam annotation. However, I think these are all doable and just require good patterns be developed.
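To make the comparison a bit more concrete without dragging in Lift itself, here is a minimal plain-Scala sketch of the same idea. Everything in it (the `Request` case class, the `Method` objects, `route`, and `handle`) is a hypothetical stand-in I made up for illustration, not Lift's `Req` or `LiftRules` API. The name bound in the second and third cases plays roughly the role that a `{id}` template in a JAX-RS `@Path` would play.

```scala
object DispatchSketch {
  // Hypothetical stand-ins for an HTTP method and a request; not Lift's types.
  sealed trait Method
  case object Get extends Method
  case object Post extends Method
  case object Delete extends Method

  final case class Request(method: Method, path: List[String])

  // A partial function plays the role of the annotated resource classes.
  val route: PartialFunction[Request, String] = {
    case Request(Get, "resourceReturningTemplateView" :: Nil) =>
      "<xml_container>Some Text</xml_container>"
    // The bound name `id` acts like a path parameter.
    case Request(Get, "messages" :: id :: Nil) =>
      s"<message id='$id'/>"
    case Request(Delete, "messages" :: id :: Nil) =>
      s"<deleted id='$id'/>"
  }

  // Requests that match no case fall through to a 404-style answer.
  def handle(req: Request): String =
    route.applyOrElse(req, (_: Request) => "404 Not Found")
}
```

The method and the path are matched in one expression, which is the part that JAX-RS spreads across `@GET` and `@Path`.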

Final thought: I'm pretty sure that JAX-RS isn't really an API for RESTful web services. It's an API for pattern-matching HTTP requests in a language (Java) that doesn't have native pattern-matching.

Final disclaimer: No actual code was harmed, or tested for that matter, in the writing of this blog. In other words, that code up there probably doesn't work, and that's nobody's fault but my own.

Why use OAuth in clients? (OAuth FAQ Part 3)

There is an ongoing meme that OAuth provides little or no security benefit in client applications. I believe that this assertion is incorrect.

Client applications are applications that run on my computer or phone and so are implicitly "trusted" by me. (I use quotes because I think that the assumption of user trust for client applications is wrong, but that is neither here nor there.)

I already addressed this somewhat obliquely in my OAuth FAQ Part 1 and Part 2, but it bears repeating, which I just did on the OAuth Google Group here.

To quote myself:

One major security benefit of using OAuth for client apps is that the client is only provided with an access token for the service and not the user's password.

If the access token and consumer secret are compromised, then either can be revoked, either by the user (in the case of the access token) or by the provider (in the case of the consumer secret). In most provider implementations a request authorized with an access token is not allowed to update certain aspects of the user's account, such as the password.

If the client requires the user to input their password (for example, Google's ClientLogin protocol) and the client becomes compromised, then the password is exposed, allowing full access to the user's account. Game over.

In my mind, this is *the* reason I want clients I use to use OAuth rather than a username/password login scheme.

I would be happy with another token-based login scheme as well, but OAuth is a perfectly good, publicly reviewed standard and I see no reason why a provider should cook up a bespoke token-based authorization scheme when OAuth is available and works for clients.

In other words, developers should use OAuth in their phone and desktop applications and users should demand it. I do.

OAuth Q&A Part 2

This is the continuation of my ongoing OAuth Q&A, now with fewer links and more editorial commentary. See Part 1 as well.

If I have access to an authorized token and the consumer key and secret, I can make "authorized" API requests galore. So OAuth is no more secure than the username/password pattern.

That's not a question. Way to start off part 2 with a big fail!

[Ed. - Ahem . . . You're asking the questions as well as answering them.]

[Me - I deny that. But okay, let's try again at a real answer.]

This is only partly correct.

First, OAuth provides a pattern for providers to scope access much more granularly than when a single username/password pair gives total access to an app. For example, most OAuth providers do not allow access to administrative functions when authorizing with an OAuth token, so a rogue client cannot change a user's password. Because of this, the user's access to the app can be guaranteed and damage can be limited to the scope of access granted to the client.

Second, because the user's access can be guaranteed and because client access can be managed at the level of the token, authorization for a given token can be revoked in a self-service manner. The scenario here is that a user grants access to a malicious app that vandalizes the account. The user realizes this, logs in to the provider, revokes the consumer's access, and repairs (hopefully) the damage. Because the malicious consumer cannot change the password on the account, the provider does not have to become involved in the initial response to the vandalism. Follow-on activities that the provider may need to become involved in (like account restoration or consumer key revocation) still exist, but the need for instantaneous response is lowered.
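The two points above (scoped access and self-service revocation) can be sketched in a few lines of Scala. This is only an illustration under my own assumptions: the `Scope` values, `AccessToken`, and `Provider` are hypothetical names, and nothing here comes from the OAuth specification or from any particular provider's implementation.

```scala
object TokenSketch {
  // Hypothetical scopes a provider might define; not taken from the OAuth spec.
  sealed trait Scope
  case object ReadMessages extends Scope
  case object PostMessages extends Scope
  case object ChangePassword extends Scope

  final case class AccessToken(value: String, user: String, scopes: Set[Scope])

  // Immutable registry of currently valid tokens, keyed by token value.
  final case class Provider(tokens: Map[String, AccessToken] = Map.empty) {
    // Administrative scopes are stripped, so a token can never change the password.
    def grant(value: String, user: String, requested: Set[Scope]): Provider =
      copy(tokens = tokens + (value -> AccessToken(value, user, requested - ChangePassword)))

    // Self-service revocation: the user removes the token from the registry.
    def revoke(value: String): Provider = copy(tokens = tokens - value)

    // A request is allowed only if the token exists and carries the scope.
    def isAllowed(value: String, action: Scope): Boolean =
      tokens.get(value).exists(_.scopes.contains(action))
  }
}
```

Because the password never leaves the provider and the token cannot touch it, revoking the token is enough to lock the malicious client out while the user keeps control of the account.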

Hey, you just said "consumer key revocation", but back in Part 1 you said that the provider can't assume a consumer key uniquely identifies a particular class of consumers. What gives?

It is correct that the provider can't make this assumption based on the OAuth specification. However, the provider can impose requirements around the consumer key on consumer developers, and the provider can reserve the right to revoke a consumer key for whatever reason they choose. This could be required in the case of denial of service attacks or malicious consumers using a particular key. Providers should work closely with consumer developers to clearly lay out what actions the provider will take in these cases and what options consumer developers have.

Providers should recognize that desktop and web apps are significantly different cases when it comes to key revocation, and they should plan accordingly. In the case of web apps, it should be relatively easy to replace a compromised consumer key and secret. In the case of desktop apps, the consumer developer may face the task of updating thousands or millions of installed applications in the case of a key revocation.

Let's assume that a user has installed a malicious desktop app that wants to use the API of my web app. What's the point of requiring OAuth, since this malicious desktop app already owns the user's system and will have access to anything the user types?

I'm not sure where this assumption comes from. Well, let me amend that. I know exactly where this assumption comes from, but it is still incorrect. Let me assure the reader that there are many platforms where the installation of a malicious application does not translate into ownership of the host system. The reader will find this sort of behavior in any properly configured Unix-like operating system including Linux, on the iPhone, in any modern browser, in Mac OS X, and even in some locked down versions of the Windows operating system.

On these platforms, even if the user is tricked into installing a malicious client or going to a malicious website, the user may still be careful to verify that they are only entering their password into the real site of the provider. There are a lot of options for verifying this, but the best way is for the user to make sure that he or she always enters their password into their provider's site using a modern browser with anti-phishing technology and using an https connection.

OAuth Q&A - Part 1

I've been seeing a lot of misinformation about OAuth in discussions lately. Mostly, this is because there has been a lot of activity around OAuth due to the announcement that the Twitter API will be supporting OAuth and eventually (probably, and less officially) moving to OAuth-only authentication and dropping support for basic auth entirely. Generally this misinformation is rebutted somewhere else on the web but it can be somewhat inconvenient to track down all of the call-and-response blog postings on the topic.

As such, I thought an O'Grady-style Q&A might be in order, based on my notes and recollections. Said Q&A turned out to be a little longer than expected, so I'm doing it in a number of parts that will be determined by the cessation of misinformation or the end of my attention span, whichever comes first. If there are any inaccuracies included in these posts that I become aware of (On the web? About OAuth? From a non-expert? Impossible!), I'll post corrections inline.

What was OAuth designed to do?

OAuth was designed to do delegated authorization, which is a fancy way of saying that it was designed to ensure that a user will never have to give the password for a service to any entity other than the service provider. If the user wishes to use an application that is not the provider itself in order to access the provider (for example, using Twhirl to access Twitter, or Plaxo to access your Google contacts), then that application can use OAuth to request authorization, which is then granted in a transaction that occurs directly between the user and the provider and does not involve giving the password to the application.

You can read lots about it on the OAuth site, the OAuth wiki, the OAuth mailing list, on Eran Hammer-Lahav's site, and the Google identity and authorization site.

OAuth doesn't work in desktop applications because there is no way to protect the consumer key or secret, so there's no point in using it here, right?

No, OAuth does what it was designed to do just fine in desktop apps.

But doesn't OAuth rely on a consumer secret and key to identify applications that are allowed to access protected resources and services? How do I protect that in a desktop application?

OAuth neither requires nor assumes that the consumer key and secret are hidden from other apps (though there are some nice payoffs if you can manage this). The consumer secret is simply not sent over the wire. Providers should not assume that consumer keys or secrets identify specific consumers or classes of consumers. Note that this may require providers to treat desktop consumers differently than web apps where requests come from a single IP address and the consumer has more control over the key than is the case with desktop apps.
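To illustrate what "not sent over the wire" means in practice, here is a simplified Scala sketch of HMAC-SHA1 request signing using the standard `javax.crypto` classes. It is an assumption-laden sketch, not a conforming OAuth implementation: it skips the percent-encoding and the normalization of the signature base string that the OAuth specification requires, and the `sign` and `verify` helpers are names I made up.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

object SigningSketch {
  // Sign a description of the request with HMAC-SHA1. The key is built from
  // the two secrets; only the resulting signature travels with the request.
  def sign(baseString: String, consumerSecret: String, tokenSecret: String): String = {
    val key = new SecretKeySpec(s"$consumerSecret&$tokenSecret".getBytes(UTF_8), "HmacSHA1")
    val mac = Mac.getInstance("HmacSHA1")
    mac.init(key)
    Base64.getEncoder.encodeToString(mac.doFinal(baseString.getBytes(UTF_8)))
  }

  // The provider recomputes the signature from its own copy of the secrets
  // and compares the two values.
  def verify(baseString: String, signature: String,
             consumerSecret: String, tokenSecret: String): Boolean =
    MessageDigest.isEqual(
      sign(baseString, consumerSecret, tokenSecret).getBytes(UTF_8),
      signature.getBytes(UTF_8))
}
```

The consumer sends the consumer key, the token, and the signature; both secrets stay where they are, and the provider only checks that the signature could have been produced by someone holding them.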

My company has a web app and we also provide a desktop client. Because we provide the client, we can just use username/password authentication in the client and that's safe, right?

No, and there are two reasons for this. First, doing this teaches your users that responding to the password anti-pattern is acceptable behavior and makes your users more susceptible to malicious API clients that ask for their password. Second, this client behavior necessarily results in the user's password being stored by the client in a recoverable format, which is a significant security risk.

OAuth is really just a standard for token negotiation and request signing. My application already generates API access tokens and I have a scheme for sending the token along with the request, so why do I need OAuth?

The most important reason is that there are a lot of things that can go wrong with API access tokens, especially when used over an insecure transport (http instead of https). Using a public, peer-reviewed standard lets you rest easy knowing that lots of experienced thought has gone into avoiding security issues.

Well, okay, but I've actually got the best security minds in the business working on my access token scheme, so I'm pretty comfortable there. What are the other reasons?

The next biggest reason is interoperability. There are lots of OAuth client libraries that make it relatively easy to inter-operate with systems that use OAuth as their auth delegation protocol. For example, Shindig (OpenSocial, Google Gadgets, iGoogle, etc.) has built in support for OAuth API access for gadgets running on the platform. I expect that within a year there will be at least one web "glue" application (like Yahoo! Pipes, or Tarpipe) that will support reads from and writes to arbitrary OAuth endpoints. If you use a custom scheme, you will miss out on this interoperability.

A RESTful ESME API

The Enterprise Social Messaging Experiment (ESME) project grew out of a collaboration in the SAP world and is now a project in the Apache Incubator. The project itself is an interesting and inspiring demonstration of the organizing power of so-called web 2.0 tools, which allowed the project to go from an idea to an Incubator project in about half a year through the efforts of a wide-spread group of individuals, many of whom have never met. I've been on the sidelines of the project, looking on, but I've been fascinated by some conversations that have taken place around the API.

The ESME API is often described as a "REST" API or a "RESTful" API. "REST" refers to the design principle of Representational State Transfer. Wikipedia has a good overview. REST is often seen as an alternative to RPC APIs, or "Remote Procedure Call" APIs. At root, the difference as described by Wikipedia is that RPC is about telling an application to do something while REST is about changing the state of the resources of an application.

The upshot is that RPC can be thought of as interfacing in verbs, while REST can be thought of as interfacing in nouns. To grossly oversimplify, let's pretend that I'm updating my location in the application http://www.foobar.com/.

In RPC I would do something like

POST HTTP request http://www.foobar.com/api/update_location?lat=1234&long=5678

In the context of a REST API, I would do something like

PUT or POST HTTP request http://www.foobar.com/location?lat=1234&long=5678

A subtle distinction to be sure, but let's look at the difference. In the RPC version, we see the verb in the URI. There is a separate URI for every possible action we might want to take with regards to a location (update_location, get_location, create_location, delete_location). In a REST API, the resources of the program are addressed directly (that is, we send the request to the same URI that we would use to display our location in a web browser), and the "verb" that we want to apply to the resource is embedded in the HTTP request. REST is a very HTTP-oriented design approach, but it is an approach that makes sense because HTTP is a protocol for handling resources. The Wikipedia page has more on this, and may very well contradict me, as I am by no means an API expert!
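The same distinction can be sketched in Scala. This is a toy model under my own assumptions: the `Location` class and the `rpc` and `rest` functions are hypothetical, the `/api/get_location` and `/api/delete_location` paths simply follow the naming pattern listed above, and each function just returns the stored location after the request has been applied.

```scala
object RestVsRpcSketch {
  final case class Location(lat: Int, long: Int)

  // RPC style: the verb lives in the path, so there is one URI per action.
  def rpc(path: String, current: Option[Location], arg: Option[Location]): Option[Location] =
    path match {
      case "/api/get_location"    => current
      case "/api/update_location" => arg
      case "/api/delete_location" => None
      case _                      => current // unknown call: state unchanged
    }

  // REST style: one URI for the resource; the verb is the HTTP method.
  def rest(method: String, path: String,
           current: Option[Location], body: Option[Location]): Option[Location] =
    (method, path) match {
      case ("GET", "/location")          => current
      case ("PUT" | "POST", "/location") => body
      case ("DELETE", "/location")       => None
      case _                             => current // unknown request: state unchanged
    }
}
```

In the REST version the path never changes; only the method does.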

To get to the point, what would an ESME REST API look like? Let's first look at the current ESME API, which is described as "REST", but which I think we can safely conclude is actually RPC.

http://code.google.com/p/esmeproject/wiki/REST_API_Documantation

Note how each API command is a verb. This is the hallmark of an RPC API. (ESME is part of a tradition here. The Twitter "REST" API is also primarily written in an RPC style, where the verb is part of the path of the URI and is not assumed based on the HTTP method. The Twitter API does assume that a GET HTTP request maps to a read, except when they also have a specific "show" verb, but now I'm just nitpicking.)

There is nothing wrong with this, and RPC is actually more in line with enterprise API design standards than a REST API, but I'd like to get at what a real ESME REST API would look like. I provide here the current ESME API method along with the REST equivalent that comes to mind.

I'm listing arguments here as URL-encoded, but they could easily be form-encoded, XML, JSON or all of the above. On the REST side, where a portion of the URL is in all caps, this would be substituted by the unique ID of that particular resource. This is, of course, not a well-thought-out proposal, but rather a suggestion to illustrate what a more RESTful API might look like.

Current (RPC): GET /api/status
REST: GET api/sessions
(It might make sense for this to return only the current session, but in theory it would return all sessions that the current session is allowed to access, so for an administrator, it might return all open sessions. An individual session would be accessed at GET api/sessions/SESSIONID.)

Current (RPC): POST /api/login
    token=API_TOKEN
REST: POST api/sessions?token=API_TOKEN

Current (RPC): GET /api/logout
REST: DELETE api/sessions/SESSIONID or
    DELETE api/sessions?session=SESSIONID
(get SESSIONID from api/sessions)

Current (RPC): GET /api/get_msgs
REST: GET api/users/USERID/messages
(get USERID from api/session)

Current (RPC): GET /api/wait_for_msgs
REST: GET api/users/USERID/messages (long-poll?)

Current (RPC): (none)
REST: GET api/messages/MESSAGEID
Gets a particular message.

Current (RPC): POST /api/send_msg
    message=messagebody
    via=client
    tags=tags
    metadata=XML_data
    replyto=message_id
REST: POST api/messages?message=MESSAGE_BODY&via=CLIENT&tags=TAGS&metadata=XML&replyto=MESSAGEID

Current (RPC): (none)
REST: PUT api/messages/MESSAGEID (payload the same as POST)

Current (RPC): (none)
REST: DELETE api/messages/MESSAGEID

Current (RPC): GET /api/get_following
REST: GET api/users/USERID/followees

Current (RPC): GET /api/get_followers
REST: GET api/users/USERID/followers

Current (RPC): POST /api/follow
    user=id_of_user
REST: POST api/users/USERID/followees/USERID2 or
    POST api/users/USERID/followees?user=USERID2

Current (RPC): POST /api/unfollow
    user=id_of_user
REST: DELETE api/users/USERID/followees/USERID2 or
    DELETE api/users/USERID/followees?user=USERID2

Current (RPC): GET /api/all_users
REST: GET api/users

Current (RPC): GET /api/get_tagcloud
    numTags=optional_no_of_tags
REST: GET api/tags
(This doesn't really seem like an appropriate API method. It should really return all of the tags, or user-specific tags (GET api/tags/USERID) and let the front-end decide what to do with it.)

Current (RPC): GET /api/get_tracking
REST: GET api/users/USERID/tracks

Current (RPC): POST /api/add_tracking
    track=text
REST: POST api/users/USERID/tracks?track=TEXT_TO_TRACK

Current (RPC): POST /api/remove_tracking
    trackid=id_of_tracking_item
REST: DELETE api/users/USERID/tracks/TRACKID

Current (RPC): GET /api/get_conversation
    conversationid=Conversation_id
REST: GET api/conversations/CONVERSATIONID

Current (RPC): GET /api/get_actions
REST: GET api/users/USERID/actions
(Actions probably don't make sense outside of the context of a specific user.)

Current (RPC): POST /api/add_action
    name=name
    test=trigger
    action=action
REST: POST api/users/USERID/actions?name=NAME&test=TEST&action=ACTION

Current (RPC): POST /api/enable_action
    id=action_id
    enabled=true|false
REST: PUT api/users/USERID/actions/ACTIONID?enabled=true|false
(This is actually a general outlet to update any attribute of an action, including whether or not it is enabled.)

Current (RPC): POST /api/delete_action
    actionid=action_id
REST: DELETE api/users/USERID/actions/ACTIONID

One point to note is that most HTTP clients do not currently support the "PUT" or "DELETE" methods, so these have to be simulated through POST methods with an extra parameter. I think that, because of the close mapping to resource verbs, it is worth using these methods in the specification and defining the simulation method for the entire API separately.
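One way such a simulation could be defined once for the whole API is sketched below in Scala. The `_method` parameter name and the `effectiveMethod` helper are my own assumptions for illustration; nothing in the current ESME API specifies them.

```scala
object MethodOverrideSketch {
  // Work out which method the API should act on. Only a POST may be
  // reinterpreted, and only as PUT or DELETE. `_method` is an assumed
  // parameter name, not something defined by ESME.
  def effectiveMethod(httpMethod: String, params: Map[String, String]): String =
    (httpMethod.toUpperCase, params.get("_method").map(_.toUpperCase)) match {
      case ("POST", Some(m @ ("PUT" | "DELETE"))) => m
      case (m, _)                                 => m
    }
}
```

With something like this sitting in front of the dispatcher, the rest of the API can be specified purely in terms of GET, POST, PUT, and DELETE, and a client that can only POST sends, for example, `_method=delete` along with its request.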

The above is based on a rough object hierarchy as follows:

  • ESME API instance (api/)
    • Sessions (api/sessions)
    • Users (api/users)
      • Messages posted by a user (api/users/USERID/messages)
      • Users followed by a user (api/users/USERID/followees)
      • Users following a user (api/users/USERID/followers)
      • Trackers belonging to a user (api/users/USERID/tracks)
      • Actions belonging to a user (api/users/USERID/actions)
    • Messages (api/messages)
    • Tags (api/tags)
    • Conversations (api/conversations)

Each of these bullets represents a set of objects. The resource representing an individual object lives at api/objects/OBJECTID. For example, api/sessions/SESSIONID. As much as is reasonable, one would expect to be able to GET (read), POST (create), PUT (update/amend), or DELETE (delete) any individual member of each of these object sets. Going through each of these objects to ask what it would mean to create, read, update, or delete that object may reveal holes in the existing API, some of which I have filled in above.
