Sunday, May 5, 2013

Idempotence

Idempotence. The subtle killer. The heartbreak. The agony. But it's not what you think.

It's a painfully simple principle that sounds, on its face, like no more than common sense. Like a health problem, it can be the source of a lot of pain, if ignored. And yet some web services violate it.

An idempotent operation is one that can be executed multiple times without changing the result of the initial execution. Examples of idempotent mathematical operations would be x * 0 = 0, or |(|x|)| = |x|.
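
To make the definition concrete, here is a minimal Python sketch that checks whether applying an operation a second time changes the result of the first application. The helper names (is_idempotent, times_zero, increment) are made up for illustration; increment is included only as a counterexample.

```python
def is_idempotent(op, value):
    """Return True if applying op twice gives the same result as applying it once."""
    once = op(value)
    twice = op(once)
    return once == twice


def times_zero(x):
    # Corresponds to the example x * 0 = 0.
    return x * 0


def increment(x):
    # Counterexample: every application changes the result.
    return x + 1


samples = [-3, 0, 7]
abs_results = [is_idempotent(abs, x) for x in samples]          # |(|x|)| = |x|
zero_results = [is_idempotent(times_zero, x) for x in samples]
inc_results = [is_idempotent(increment, x) for x in samples]
```

For these samples, abs and times_zero pass the check for every value, while increment fails it for every value.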

It seems rather obvious that a read-only RESTful web service request should behave in a similar fashion. For example, if you went to an ATM and requested a balance inquiry, you wouldn't expect the inquiry to change any data in your account. In "RESTful Web services: The basics," Alex Rodriguez maintains that REST APIs that use an HTTP method inappropriate for its intended purpose constitute poor design.

By "the intended purpose" Rodriguez means the appropriate CRUD operation effected by each HTTP method:
  • POST creates a resource.
  • GET retrieves a resource.
  • PUT updates a resource, that is, changes its state.
  • DELETE removes or deletes a resource.
We expect GET to be read-only and therefore idempotent. That is, it only retrieves resource data and never changes it. To be clear, "resource data" means data stored and maintained by a web service, for example, contact information stored in a database. It doesn't mean metadata--data used only by the web service to operate--for example, a counter that tracks how many times a resource was viewed, or the timestamp when the resource was last accessed. GET could change metadata and still be considered idempotent with respect to the underlying resources.
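
The distinction between resource data and metadata can be sketched with a small in-memory stand-in for a service. Everything here (the AddressBookService class, its contacts and view_counts attributes) is hypothetical and only illustrates the idea: a GET may bump a view counter, yet the resource data it returns stays the same no matter how often it is called.

```python
import copy


class AddressBookService:
    """Hypothetical in-memory service, used only to illustrate the distinction."""

    def __init__(self):
        # Resource data: what clients store and retrieve.
        self.contacts = {"Arthur": {"first_name": "Arthur"}}
        # Metadata: bookkeeping used only by the service itself.
        self.view_counts = {}

    def get(self, key):
        """Return a copy of the contact; only the view counter changes."""
        self.view_counts[key] = self.view_counts.get(key, 0) + 1
        return copy.deepcopy(self.contacts[key])


service = AddressBookService()
before = copy.deepcopy(service.contacts)
first = service.get("Arthur")
second = service.get("Arthur")
```

After two calls, first and second are equal and service.contacts still matches before; only service.view_counts has changed.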

GET is meant to retrieve resource data, not to create objects or change their state. Here are a few ways in which it could be abused:

Abusing GET for POST

It's hard to imagine why someone would use GET to create a new resource--after all, that's what POST is for. But you could do it with a request like this:

GET /addressbook/contact?first_name=Arthur HTTP/1.1

An unintended consequence could be that someone would invoke this request thinking that it will query the database and return the contact record(s) having the first name 'Arthur'. Instead, the service will create a new record for 'Arthur'.

Instead of using GET to create a record, the appropriate request would use POST to send the data in the request body:

POST /addressbook HTTP/1.1
Host: address_server

<?xml version="1.0"?>
  <contact>
    <first_name>Arthur</first_name>
  </contact>
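
From the client side, the POST request above could be assembled roughly as follows with Python's standard urllib.request module. This is a sketch under assumptions: the http:// scheme and the Content-Type header are not part of the original example, and address_server is the placeholder host from the request. The request object is only built here, not sent.

```python
import urllib.request

# XML payload carried in the request body rather than in the URL.
body = (
    b'<?xml version="1.0"?>\n'
    b"<contact>\n"
    b"  <first_name>Arthur</first_name>\n"
    b"</contact>\n"
)

request = urllib.request.Request(
    url="http://address_server/addressbook",  # assumed scheme; placeholder host
    data=body,
    method="POST",
    headers={"Content-Type": "application/xml"},  # assumed header
)

# Sending would be: urllib.request.urlopen(request)
# It is left out because address_server is not a real host.
```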

Abusing GET for PUT

You could also use GET to update an existing record--which you really should do using PUT. But let's look at how you might make the request with GET:

GET /addressbook/contact?first_name=Arthur&new_first_name=Michael HTTP/1.1

This looks like a strange GET request, but you might expect it to retrieve a record with fields matching the strings 'Arthur' and 'Michael'. Instead, the service changes the first name of the existing record from 'Arthur' to 'Michael'.

Instead of using GET to update a record, the appropriate request would use PUT to send the new data in the request body:

PUT /addressbook/Arthur HTTP/1.1
Host: address_server

<?xml version="1.0"?>
  <contact>
    <first_name>Michael</first_name>
  </contact>
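
HTTP also defines PUT itself as idempotent: sending the same PUT twice should leave the resource in the same state as sending it once. The sketch below illustrates this with a hypothetical in-memory AddressBook class; the class, its methods, and the numeric ids are assumptions made for the example, not part of any real service.

```python
class AddressBook:
    """Hypothetical in-memory stand-in for the address book service."""

    def __init__(self):
        self.contacts = {}
        self.next_id = 1

    def post(self, data):
        """Create a new contact on every call and return its id."""
        contact_id = self.next_id
        self.next_id += 1
        self.contacts[contact_id] = dict(data)
        return contact_id

    def put(self, contact_id, data):
        """Replace the contact stored under contact_id with data."""
        self.contacts[contact_id] = dict(data)


book = AddressBook()
arthur_id = book.post({"first_name": "Arthur"})

# The same PUT, sent twice.
book.put(arthur_id, {"first_name": "Michael"})
after_first_put = dict(book.contacts)
book.put(arthur_id, {"first_name": "Michael"})
after_second_put = dict(book.contacts)
```

The state after the second PUT equals the state after the first, and there is still exactly one contact. Calling post twice with the same data, by contrast, would create two records.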

Heresy and bad practice

Such unorthodox uses of HTTP requests constitute bad practice in the following ways:

  • They violate idempotence. You should be able to call GET an unlimited number of times, and expect that no record data will ever be changed.
  • They present misleading expectations. If you make a GET request, you expect it only to retrieve data without changing the state of the data. Creating a record or updating a record is not the expected function of a GET request.
  • They pass data as parameters instead of as structured data, and the ability to pass structured data is a major advantage of REST. The case above, which uses GET to update an existing record, uses REST in an almost SOAP-like manner by passing data via parameters. Passing data in this way might be appropriate for some applications, such as query parameters in search engines, but it's a problem with hierarchically structured data. Instead, you should pass structured data (such as XML or JSON) in the request body of a POST or PUT method.
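
The problem with hierarchical data in parameters can be seen in a small sketch using Python's standard json and urllib.parse modules. The contact below, including its nested address and all field values, is invented for illustration. Naively encoding it as query parameters flattens the nested part into an opaque string, while a JSON body round-trips with its structure intact.

```python
import json
from urllib.parse import parse_qs, urlencode

# Hypothetical contact with a nested address; names and values are illustrative.
contact = {
    "first_name": "Arthur",
    "address": {"city": "Springfield", "postal_code": "00000"},
}

# Query parameters: the nested dict is converted to a plain string.
query_string = urlencode(contact)
from_query = parse_qs(query_string)

# Request body: JSON keeps the hierarchy.
json_body = json.dumps(contact)
from_body = json.loads(json_body)
```

Here from_query["address"] holds a single string, whereas from_body["address"] is still a dictionary with its own fields. You could invent a naming convention to flatten nested fields into parameters, but that is exactly the kind of ad hoc encoding a structured request body avoids.
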
But there are cases where you might not use the "appropriate" HTTP method for its intended function. I've seen this implemented in a few APIs. For example, you can use POST to initiate a state-changing operation. I'm not a fan of doing this. Yes, it can work if implemented carefully. Perhaps I'm a purist, but it breaks the mapping of each HTTP method to its intended CRUD operation. However, this implementation makes sense if:
  • You intend to create a unique new object each time POST is called, or
  • You know that the service will only be called once for each set of data sent in the request.
For example, you might have a service that authenticates a user, then sends the user a one-time passcode (nonce). Any objects that you might have created are discarded. In this case, POST would make sense, because the "virtual" resource data is both created and deleted within the same operation.

The Man from C.R.U.D.

While there are strong arguments for making it a best practice to hew to the "intended purpose" of the HTTP methods for each CRUD operation, inflexibly insisting on such rigidity might earn you the moniker "The Man from C.R.U.D." I believe there are exceptional cases in which an unorthodox implementation is sound, and possibly even more appropriate than the orthodox usage.

The guidelines are reasonable and clear: never implement methods in a way that their expected use might be misinterpreted, and certainly never implement requests that alter resource data where idempotence is expected.