Scale This!: Do it Again!

This is the last in my mini-series about the fallacies of network computing before moving on to deal with the practical side of dealing with these things.

The final fallacy to address, states that "the network is reliable". As you know if you ever used AOL in its, erm... heyday? - it's not. It may go a bit faster now, but it's the same old network that it used to be.

What kinds of operations can be safely performed on something so unreliable? The answer is: "Idempotent" ones. Gosh, Latin! How old is this internet?

idem - meaning "the same". You'll see this used in legal papers abbreviated to "id." as lawers do tend to repeat themselves a lot, and are by far the biggest fans of dead languages (outwith Delphi programming circles).

potent - meaning power

So: "having the same power". But what does that mean? Essentially it's a big fancy word for something that's very simple (in theory at least).

Idempotent operations are essentially repeatable, and have no side effects.

Due to the infrastructure of the net, it is often the case on the web that messages between end points may be sent and received more than once, so it's important that messages processed multiple times do not cause side effects. The clichéd example of this is the browser refresh, and while this is an example of idempotence, I feel it's much clearer to explain by showing what it's not.

Voting systems are the common example of failing to take idempotence in to account.

Even though the "replay attack" is one of the oldest forms of internet abuse, it is still surprisingly common to come across sites that contain polls which are easily rigged. Essentially the web page presents "Vote for A" and "Vote for B" as functions. You click "Vote for B" and a counter is incremented. If you fire the request again, the counter is incremented again.

Write...

for (int i = 0; i < 10000; i++) { VoteForB(); }

... and zap! In a fraction of a second, "B" becomes the clear winner.

Increment() IS NOT an idempotent function. It has side effects on application state and is not repeatable.

An idempotent operation is both repeatable and passive.

So, a toggle() function is not idempotent. A Sqaure() Function or the evaluation of a regular expression against given input is idempotent. It does not change any value given to it and has no side effects.

A * A will always equal A squared.

You can of course deconstruct many non-idempotent operations in to a number of idempotent equivalents, e.g. SwitchOff() and SwitchOn() functions rather than Toggle().

Calling SwitchOff() SwitchOff() SwitchOff() SwitchOff() SwitchOff() SwitchOff() will always result in the Switch being Off.

Idempotence is often confused with method safety. A "Safe" method has absolutely no side effects, for example a read operation is Safe. A Delete() operation is not "Safe" but it can be made to be idempotent. i.e. repeatable with no side effects.

This type fault tolerance is actually designed in to HTTP itself. See:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.2

Fundamentally, the RESTful-style of the web is different from the CRUD style of most database-oriented systems.

Under RESTful systems using HTTP, calling PUT multiple times will result in the resource existing at a particular address. Calling DELETE multiple times will result in the item no-longer-existing.

You can call PUT PUT PUT or DELETE DELETE DELETE and the end result will be the same.

The comparable operations using something like SQL will result in errors beyond their first call: "The record already exists" or "The record does not exist", respectively.

On the web, you ignore idempotence at your peril. Famously, Google Web Accelerator wreaked havoc and was quickly withdrawn due to web developers widely ignoring idempotence and method safety.

Almost overnight, millions of web developers learned about idempotence the hard way.

While idempotence seems fairly simple on the face of it, the reality of it can quickly become very, very complicated, particularly since, in most distributed systems, you need to take in to account the idea of sequential idempotence. If you're looking at a system with multiple workers of some sort, either threads, VM instances, or WorkerRoles in Azure, and if you're using lock-free structures (as you should be), the possibility exists that you may have to process the same message not just more than once, but also out of order.

For example, suppose that someone updates a record, while another simultaneously deletes it.
Here, I have an Idempotent PUT and an idempotent DELETE verb. I can call PUT PUT PUT or DELETE DELETE DELETE, and have the same end result. But now I have an extra problem: I now require two independent operations to have combined idempotence.

The following sequences (and even all of these if run in parallel together)

PUT DELETE
PUT PUT DELETE PUT DELETE PUT
DELETE PUT

should all have the same end result. What is that end result?

Now you can see how idempotence works, you can think about how to solve this. And if you want a clue see this article:
http://www.spamdailynews.com/publish/Google_subpoenaed_to_turn_over_deleted_Gmail_messages.shtml
The important thing to remember is to write code that is repeatable and has no side effects. The result of this is that you enable fault tolerance, using the simple policy of "If you're not sure it worked, just do it again".

Thursday, 30 July 2009

Do it Again!

No comments:

Post a Comment