user interface

OAuth Q&A - Part 1

I've been seeing a lot of misinformation about OAuth in discussions lately. Mostly, this is because there has been a lot of activity around OAuth due to the announcement that the Twitter API will be supporting OAuth and eventually (probably, and less officially) moving to OAuth-only authentication and dropping support for basic auth entirely. Generally this misinformation is rebutted somewhere else on the web but it can be somewhat inconvenient to track down all of the call-and-response blog postings on the topic.

As such, I thought an O'Grady-style Q&A might be in order, based on my notes and recollections. Said Q&A turned out to be a little longer than expected, so I'm doing it in a number of parts that will be determined by the cessation of misinformation or the end of my attention span, whichever comes first. If there are any inaccuracies included in these posts that I become aware of (On the web? About OAuth? From a non-expert? Impossible!), I'll post corrections inline.

What was OAuth designed to do?

OAuth was designed to do delegated authorization, which is a fancy way of saying that it was designed to ensure that a user will never have to give the password for a service to any entity other than the service provider. If the user wishes to use an application that is not the provider itself in order to access the provider (for example, using Twhirl to access Twitter, or Plaxo to access your Google contacts), then that application can use OAuth to request authorization, which is then granted in a transaction that occurs directly between the user and the provider and does not involve giving the password to the application.

You can read lots about it on the OAuth site, the OAuth wiki, the OAuth mailing list, on Eran Hammer-Lahav's site, and the Google identity and authorization site.

OAuth doesn't work in desktop applications because there is no way to protect the consumer key or secret, so there's no point in using it here, right?

No, OAuth does what it was designed to do just fine in desktop apps.

But doesn't OAuth rely on a consumer secret and key to identify applications that are allowed to access protected resources and services? How do I protect that in a desktop application?

OAuth neither requires nor assumes that the consumer key and secret are hidden from other apps (though there are some nice payoffs if you can manage this). The consumer secret is simply not sent over the wire. Providers should not assume that consumer keys or secrets identify specific consumers or classes of consumers. Note that this may require providers to treat desktop consumers differently than web apps where requests come from a single IP address and the consumer has more control over the key than is the case with desktop apps.
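To make "not sent over the wire" a little more concrete, here is a minimal sketch of OAuth 1.0-style request signing. It assumes the HMAC-SHA1 signature method, and the function names are mine rather than from any particular library. It also skips URL normalization, nonces, and timestamps, so treat it as an illustration and not a working client. The thing to notice is that the consumer secret only ever appears as the HMAC key, never in the string that gets signed or in the request itself.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # OAuth 1.0 leaves only unreserved characters (letters, digits, - . _ ~) unescaped.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode and sort the parameters, join them as k=v pairs, then join
    # the method, the encoded URL, and the encoded parameter string with "&".
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(method: str, url: str, params: dict,
         consumer_secret: str, token_secret: str = "") -> str:
    # The secrets form the HMAC key; only the resulting signature is transmitted.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

A desktop app that ships with the secret baked in can still produce valid signatures this way. Anyone who extracts the secret can too, which is exactly why providers shouldn't treat it as proof of which app is calling.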

My company has a web app and we also provide a desktop client. Because we provide the client, we can just use username/password authentication in the client and that's safe, right?

No, and there are two reasons for this. First, doing this teaches your users that responding to the password anti-pattern is acceptable behavior and makes your users more susceptible to malicious API clients that ask for their password. Second, this client behavior necessarily results in the user's password being stored by the client in a recoverable format, which is a significant security risk.

OAuth is really just a standard for token negotiation and request signing. My application already generates API access tokens and I have a scheme for sending the token along with the request, so why do I need OAuth?

The most important reason is that there are a lot of things that can go wrong with API access tokens, especially when used over an insecure transport (http instead of https). Using a public, peer-reviewed standard lets you rest easy knowing that lots of experienced thought has gone into avoiding security issues.

Well, okay, but I've actually got the best security minds in the business working on my access token scheme, so I'm pretty comfortable there. What are the other reasons?

The next biggest reason is interoperability. There are lots of OAuth client libraries that make it relatively easy to interoperate with systems that use OAuth as their auth delegation protocol. For example, Shindig (OpenSocial, Google Gadgets, iGoogle, etc.) has built-in support for OAuth API access for gadgets running on the platform. I expect that within a year there will be at least one web "glue" application (like Yahoo! Pipes, or Tarpipe) that will support reads from and writes to arbitrary OAuth endpoints. If you use a custom scheme, you will miss out on this interoperability.

Medical problem listing, vocabulary, taxonomies, and user interfaces

Fie Song and R. William Soukoreff (1994). "A Cognitive Model for the Implementation of Medical Problem Lists", Proceedings of the First Congress on Computational Medicine, Public Health and Biotechnology. Austin, Texas.

Problem listing is a way to organize the myriad of medical information compiled in charts and medical records. It was originally conceived in a paper-based medical records system but the linked article discusses the adaptation of the technique to a system using electronic medical records. The article itself is a bit out of date, having been written in 1994, but the state of the art today in many areas (medical records, information storage and retrieval, library science, cataloging, ERP systems used in businesses) is surprisingly similar to that described in the paper.

Because of this the paper can serve as a useful prompt for discussion and criticism of the current state of the art. I've spent some time discussing this area with my father as he has a particular interest in the area, so hopefully this is somewhat useful as a summary and continuation of our discussions.

On the activity of problem listing

The paper tends to take the tack that if problems can be properly entered into a well-designed computer system then the computer can handle much of the task of categorizing and relating problems to each other. This is fair enough and in the intervening years since '94 several techniques for establishing these sorts of relationships have matured in the data mining (relation through coincidence of terms or other statistically relevant coincidences) and semantic modeling (relation by way of a manually or automatically maintained model of relationships between terms) spaces.

One interesting point that came up in our discussion is that problem-listed patients have been shown in some studies to receive better and more predictable treatment. We theorized that a reason for this difference may be the engagement of the physician in the review of the medical record and the creation of the problem list. This activity may help the physician to take the current encounter with the patient in the context of the overall medical history. If the computer takes care of building the problem list then some of this benefit may be lost.

On the permanence of the record

An overarching concern throughout the article is with maintaining the permanence of the record. This is one area in which the article shows its age. In 1994 the only widely deployed systems I'm aware of that maintained a versioned, audit-worthy record of changes to a document were source control systems used in software projects. Even these had not gained wide adoption.

Today various wiki-like patterns are available that meet the requirements for permanence and audit-worthiness outlined in the article. These systems also maintain a very robust ability to make edits and changes over time and to easily see the differences between a document at two different times as well as the people responsible for the changes.

Today, permanence is not as much of a concern or a technically interesting problem. It is a problem that has been largely solved and examples of this solution (like Wikipedia) exist on a scale that dwarfs that of almost any individual medical records system.

On standardized vocabulary and taxonomies

In contrast to the previous area, this is an area where information science (unfortunately) hasn't changed much in the past decade-plus. The article espouses the need for a standardized vocabulary for describing medical problems as well as standardized taxonomies for categorizing problems. This type of approach is peppered throughout the article, but especially on pages 4-6, with the concise statement being that

The physicians should be forced to use some standardized medical vocabulary to describe the problems, however enough flexibility should exist within the problem descriptions to satisfy the physicians.

There are a lot of problems with this demand for standardization and I should write more about it. Problems mostly center around a probable lack of expressiveness and the threat of obsolescence.

Granted, demanding standardization significantly reduces the complexity of these types of systems on the technical side. But in my experience over-standardization can (and often does) make systems unusable. At the very least it significantly increases complexity on the user-interface side of the house by presenting UI designers with the problem of creating flexibility and expressiveness while enforcing adherence to a very confined taxonomy or vocabulary.

My main objection to the "standardization" approach to taxonomy is that it is rarely justified, and this article is no exception. If a standard taxonomy is really better in a particular area in the medical field, then this can and should be shown in a properly controlled study. A medical records system is as much a part of the medical approach to patient care as a drug or a hospital procedure and as such it should be subject to the same scrutiny through scientific study. In a non-medical and potentially less scientific discipline, I would at least like to see an argument for standardization, but the benefits of a standard approach are usually assumed.

What are the alternatives?

The article mentions the possibility of free-form input fields while holding that the technology to support this type of input as the primary input mode has not yet arrived. Today the handling of unstructured text in database applications has improved considerably from 1994, especially in the area of indexed search. Coupled with modern data mining techniques or a reasonable semantic model, unstructured input might be usable.

Another option is semi-structured input like what we see in tagging systems. Semi-structured input is often not as onerous in its limits on expression and is less of an obsolescence threat than a totally structured taxonomy because the semantics of the tagging system are user generated and can easily change with the times. (Of course, it remains a challenge to link up old terms with newer ones, but the common practice of over-tagging with redundant terms can help ease this problem.)

Predictably, semi-structured input delivers some of the advantages of structured data and taxonomies such as simple processing and matching, fast access with limited need for indexing, and a somewhat predictable semantic structure that allows programs to make assumptions about the data that could not be made with unstructured input. For example, in the tagging situation, a program could make the assumption that tags are atomic terms that directly describe the object being tagged.
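As a rough illustration of the "simple processing and matching" point, here is a small sketch of a tag index. The record IDs and tags are made up for the example, and the only assumption the code makes about the data is the one mentioned above: each tag is an atomic term describing the record it is attached to.

```python
from collections import defaultdict


def build_tag_index(records):
    # Map each normalized tag to the set of record IDs that carry it.
    index = defaultdict(set)
    for record_id, tags in records.items():
        for tag in tags:
            index[tag.strip().lower()].add(record_id)
    return index


def find(index, *tags):
    # Return the records that carry every one of the requested tags.
    matches = [index.get(tag.strip().lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


# Hypothetical problem-list entries. Note the over-tagging with a
# redundant term ("htn" alongside "hypertension") on one record.
records = {
    "visit-001": ["Hypertension", "htn", "follow-up"],
    "visit-002": ["hypertension", "diabetes"],
    "visit-003": ["diabetes", "follow-up"],
}
index = build_tag_index(records)
```

With this data, `find(index, "hypertension")` picks up both hypertension visits, while `find(index, "htn")` only finds the one that was over-tagged, which is the old-term/new-term linking problem in miniature.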

I've got a draft blog post lying around somewhere that outlines some alternatives to standardized vocabularies and taxonomies but haven't gotten around to digging it out and posting yet. Oh well.

On user interface

The article gives a nice description of the strength of paper-based systems in the user interface area, mentioning that in a paper-based medical charting system

it is possible to overview several pages at the same time, and to rapidly browse through a large number of pages. The speed an experienced user can achieve in 'zooming-in' to the relevant parts is remarkable, and the amount of information covered by a glance is enormous.

The article then goes on to talk about a "graphical interface that allows a user to directly manipulate the retrieval requests and rapidly browse through a medical record". Keep in mind that this is 1994 and graphical interfaces that do much of anything are still pretty few and far between. This is an apt description of a problem that the information visualization community is still working through today.

Another one of the most insightful points in the article is in the user interface area. This is the requirement that

Different formats of the problem list will be useful to different people, and hence several different views of the problem list should be supported.

This requirement is spot on and should be a requirement for any information storage and retrieval system. Moreover, we are learning today that it is important to provide not only multiple views or interfaces to the data, but a system for accessing the data directly for the purpose of building custom views and interfaces that the designers of the software might not have envisioned. This type of system is usually referred to as an application programming interface, or API, and is a key component of the explosion of creativity and interoperability that is occurring right now in the web-based application space.

A thought on interest and attention

Laura Fitton's post on tools she'd like to see for Twitter got me thinking about a related problem in aligning attention and interest, so I'd like to add to her list. The problem: It's hard, though not impossible, to align the attention a given blog/tweet/IM/email demands to the interest we have in it.

As an example, take myself and three of the interests that I occasionally mention on Twitter.

The set of people that are actually interested in all three of these things is pretty minimal, and that's just a sampling of three things I mention on a regular basis.

Twitter's approach to this issue is to minimize the attention that everything receives. That seems to work pretty well for Twitter. Other systems (I'm looking at you Email) tend to maximize the attention that every message receives. Some systems make an attempt to match attention to interest and I'd like to see more of that type of work either as built-in features or as add-ons facilitated by an API.

We're starting to see some movement in this area among collaboration and knowledge management vendors. There are also tentative steps from social sites like Ma.gnolia.

Let's also remember that this is personal. The relative importance of any given piece of information is different for you than it is for me. So this approach is especially important in our personal communications software, be that software a feed reader, an email client, or a service like Twitter.

A suggestion for Twitter auto-translation

Twitter is a big ol' international hodgepodge of communities, conversations, and observations to the tune of 700,000+ active users.  Being international, it also tends to be multi-lingual.  Herein lies a problem: I can't understand, say, Arabic.  Or French or Japanese for that matter.  In our brave new world of computerized, automated everything, there's no technical reason these conversations can't be translated on the fly, if only we had a little metadata.

Metadata, in this case, would be data about the language someone is tweeting in.  You see, from a consumer perspective the real problem of automating translation isn't the translation part.  Google and Yahoo! have that pretty well nailed on the level of 140-character tweets that often appear to have been run through a translation program anyway.  The problem is figuring out, automatically, the language of the post and the language it should be translated into.

Enter Twitter nanoformats.  Nanoformats are closely related to microformats.  So closely related that they even live on the Microformats wiki.  Anyway, nanoformats are little pieces of text you can insert into tweets and other things to provide some sort of metadata, like location, tags, or (surprise) language.  The language nanoformat consists of the string "lang:" followed by the ISO 639-1 code for whatever language the Tweet is tweeted in.
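For the programmers in the audience, picking a lang nanoformat out of a tweet might look something like the sketch below. I'm assuming the nanoformat is a whitespace-delimited token with a two-letter code, which may not match the wiki's definition exactly, and `extract_lang` is just a name I made up.

```python
import re

# "lang:" followed by a two-letter ISO 639-1 code, standing alone as a token.
LANG_TAG = re.compile(r"(?<!\S)lang:([a-z]{2})(?!\S)", re.IGNORECASE)


def extract_lang(text):
    # Return (language code or None, text with the nanoformat removed).
    match = LANG_TAG.search(text)
    if match is None:
        return None, text
    remainder = text[:match.start()] + text[match.end():]
    # Collapse the whitespace left behind where the nanoformat used to be.
    return match.group(1).lower(), " ".join(remainder.split())
```

So `extract_lang("Hablo español lang:es")` gives back the code "es" along with the tweet text minus the nanoformat.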

In order to enable auto-translation, a Twitter user would have to do two things.  First, our user would insert a lang nanoformat in their own bio to indicate the language they normally tweet in.  This bio nanoformat will also be used to figure out the language our user would like to receive other tweets in, so choose wisely.  Second, our user would insert a lang nanoformat in tweets they make that are not in the language their bio indicates.

So, for example, I normally twitter in English, so my profile indicates that with the "lang:en" nanoformat.  But if I tweet something in Spanish, I would add the "lang:es" nanoformat to that tweet.  (Don't worry, I won't subject you to any Spanish tweets, notwithstanding the example below.)

Okay, here's how it works

Now that we've done all that groundwork, the magic can begin.

Warning, the following paragraphs contain forward-looking statements.  No client actually does this stuff yet, though it's conceivable that I am under-informed on that point.

When I fire up my Twitter client and log in, the client checks out my bio and sees that I want my posts in "en", which means "English" in ISO-ese.  We'll call this imaginary client Litter, but we could just as well call it Twhirl, Snitter, Tweetr, or Twitterific.

Now, whenever Litter sees a tweet, it checks the tweet for the lang nanoformat.  If it finds one and it isn't "lang:en", it makes a quick call to http://translate.google.com to translate the tweet, sans nanoformat.  If the client doesn't find a lang nanoformat in the tweet, it should check the bio of the person the tweet belongs to for a lang nanoformat and perform the same call.
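The decision Litter makes here is simple enough to sketch. This is only my guess at how a client might do it: the function names are hypothetical, and the nanoformat pattern assumes a whitespace-delimited "lang:xx" token. The tweet's own nanoformat wins, the author's bio is the fallback, and nothing gets translated when the language is unknown or already matches mine.

```python
import re

# "lang:" followed by a two-letter ISO 639-1 code, standing alone as a token.
LANG_TAG = re.compile(r"(?<!\S)lang:([a-z]{2})(?!\S)", re.IGNORECASE)


def lang_of(text):
    # Return the language code found in the text, or None if there isn't one.
    match = LANG_TAG.search(text or "")
    return match.group(1).lower() if match else None


def translation_pair(tweet, author_bio, reader_bio):
    # Decide whether a tweet needs translating, and between which languages.
    target = lang_of(reader_bio)
    # Prefer the tweet's own nanoformat, then fall back to the author's bio.
    source = lang_of(tweet) or lang_of(author_bio)
    if source is None or target is None or source == target:
        return None  # nothing to go on, or already in the reader's language
    return source, target
```

For the Spanish example, `translation_pair("Hablo español lang:es", "I write code lang:en", "Reader bio lang:en")` comes back as the pair ("es", "en").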

So the tweet "Hablo español lang:es" would result in the following call in my Litter client:

http://translate.google.com/translate_t?text=Hablo español&langpair=es|en

Click that and see what comes out.
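Building that URL in code is mostly a matter of escaping the text and the langpair properly. A minimal sketch, assuming the translate_t endpoint and the text and langpair parameters shown above (the endpoint may well change or disappear, and the function name is mine); nothing is actually fetched here.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above; treat it as an assumption.
TRANSLATE_ENDPOINT = "http://translate.google.com/translate_t"


def translate_url(text, source, target):
    # urlencode escapes spaces, non-ASCII characters, and the "|" separator.
    query = urlencode({"text": text, "langpair": f"{source}|{target}"})
    return f"{TRANSLATE_ENDPOINT}?{query}"


print(translate_url("Hablo español", "es", "en"))
# prints http://translate.google.com/translate_t?text=Hablo+espa%C3%B1ol&langpair=es%7Cen
```

The escaped form looks uglier than the hand-written URL, but it is the same request.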

What happens next involves some screen scraping of dubious legality, but maybe some agreement can be reached.  Anyway, Litter then pulls the result of this query out of the HTML mishmash it receives back.  It provides that translated result as the content of the tweet.

A little discussion goes a long way

There are a lot of reasons this won't work.

First of all, there is a 70-request-per-hour limit on the Twitter API.  Litter would exhaust that limit in about 5 minutes if it were trying to translate every tweet in the stream I receive.  As a result, I suggest that the translation function be implemented with a manual trigger, at least initially.  The imaginary Litter allows a user to right-click on a tweet and choose "Translate" from the context menu.

It also appears Google's terms of use don't allow this sort of use of the Google Translate service.  Similarly Yahoo! doesn't appear to allow this use of Babelfish.  However, there are a lot of dubious uses of these sorts of tools, so maybe we can come to some sort of agreement.  In the meantime, a working proof of concept would be pretty neat.

Lastly, there should always be some way to easily get at the original tweet within the client, just in case something goes horribly wrong in the translation.  This is technically an opt-in system, since it won't work on a person's tweets if they don't use the nanoformat, but we should still be considerate.

Despite these hurdles I'm certainly looking forward to this type of functionality becoming common, and not just on Twitter.  I know others are as well.  What do you think?  How can we make this happen?

Privacy controls - a UI design problem

Privacy controls are an area of continuing concern in social applications. Michael Krigsman has hit on the point in a sensationalist manner on his IT Project Failures blog in a post titled Twitter is Dangerous. I think Michael is off-base because I think Twitter gets it right by keeping it simple: Your stuff is either private or public. It's hard to misunderstand that.

The important thing is that users of a given system clearly understand the privacy implications of entering data into that system. In order for this to happen, the user interface and explanation within the application have to correspond with the terms of service and technical platform. As I said, I think Twitter gets it right, but others may disagree.

We see a more muddled case with the recent addition of social sharing in Google Reader. Google could have done better with the initial launch of this feature, but to the credit of the Reader team, they got the message and fixed the problem. For the interaction design nerds out there, I think that the problem here was actually not with the sharing feature per se, but with a user/developer disconnect around the privacy implications of "Sharing" an item via a publicly indexed feed with an obfuscated URL. Many users thought of the feed as private while I'm pretty sure the Reader developers thought of the Shared items feed as public.

Let's wend our way to the point of this post, which amazingly enough isn't Facebook's Beacon. No, I'm going to talk about the Plaxo Pulse and a conversation I've been having with John McCrea on Twitter. The situation is that I, apparently, had a misconception about how Plaxo shares my Pulse items when I elect to make them "Public".

Specifically, I thought that making a feed "Public" would broadcast it to all of my Plaxo Connections. Connections on Plaxo are like Facebook Friends: both they and I have to agree to being connections. When you make a connection, you categorize the contact as any combination of business, friend, or family. As such, I assumed that I had exact knowledge of who would receive my Plaxo Pulse broadcasts, though I recognize that the material being broadcast is totally public for anyone who actually looks for it.

I was wrong. What actually happens is (apparently) that anyone who uses Plaxo and has me in their address book receives a broadcast in their Pulse feed of entries I set as "Public". In other words, I have no idea who I'm spamming with these entries.

At the risk of sounding self-righteous, I'm going to blame the Plaxo user interface for this one. (Whew, I feel better already.) Here's why:

  1. The copy reading "choose which of your connections ..." clearly indicates that I'm choosing which Connections have permission to see the feed. Plaxo creates a strong division between the Pulse feature and the Address Book feature in their documentation (for example) so when Plaxo says "Connections", I assume they mean it. There is an expectation of consistency in copy across a site.
  2. When "Public" is checked, it automatically checks and grays out the business, friends, and family options, creating the possibility that "Public" corresponds to "All of the above" or "All connections". The checkbox behavior is clearly ambiguous, and my interpretation draws on my (fairly common) experience with HTML check-boxes and multiple choice tests, but any ambiguity is probably a bug in a scenario like this one.

I'll be clear that no damage was done here and all of the Pulse feeds I share are public information, so Michael Krigsman can rest easy. For me this is an academic exercise. But it is conceivable, for example, that a client who has me in his or her address book and uses Plaxo was getting spammed by my tweets and del.icio.us bookmarks 10-15 times per day. I'm a pretty public person, at least online, but I prefer to allow people the choice of whether they want to subscribe to a steady stream of my random thoughts. As such, I've un-Publicized all of my Plaxo Pulse feeds. If you want to see 'em, Connect with me.

I think this is a good example of one thing that can go wrong while designing and changing privacy settings. The complexities are similar to other UI problems, but the ramifications in this area are often invisible to the user and the consequences of misunderstanding could be severe. Combined, these two ingredients put the user at the mercy of the developer and this trust should be respected by taking care in design decisions and testing against real user expectations.
