Tuesday, 10 November 2009
To be doomed to know the future and not to be believed
Just came across this. Digg are using Facebook's (now Apache's) "Cassandra" Distributed Datastore. Interesting.
Labels:
data,
distributed,
Eventual Consistency
Thursday, 5 November 2009
Why would you do that, Larry? - Reconciliation, Compensation and Eventual Consistency
In my previous post, I introduced the idea of "Eventual Consistency" and we saw how, due to the CAP theorem, it helps ensure global availability of data by sacrificing its consistency.
In most modern day line of business software development, we are used to dealing with the opposite; "Strong" or "Write" Consistency. If I update "Value 1" in the database to say "Good Morning", then the next time someone reads that value, it says "Good Morning". The system’s state is consistent at the end of a write.
Relational databases such as Oracle or SQL Server are in general "Write consistent" stores. However, if using an “Eventually consistent” such as Azure Table Storage or Amazon SimpleDB, the data is not guaranteed to be consistent at the end of a write, due to the replication delay discussed in the previous post.
These systems will take an amount of time to become consistent. This is referred to as the Inconsistency Window. It’s up to the application to decide how to deal with this inconsistency.
The Reconciliation process can be performed manually or automatically, depending on the circumstances.
The simplest way to demonstrate this as a manual process is by querying a search engine such as Google or Yahoo or Bing. Their indices are "eventually consistent" as you sometimes... brace yourself... come across broken hyperlinks in results! Despite what an Oracle sales person might have you believe, the world does not come to an end if your data is inconsistent. In this case, the end user can reconcile the data manually, by clicking the back button and moving on to the next result. No harm done, maybe a little trade-off in terms of usability, but worth it for access to a vast amount of data.
Removing the requirement for consistent data (by removing locking) means that you perform writes in parallel. Probably the most obvious example of this in action is the auction site eBay. In an online auction, the critical moments are as the end of the auction approaches. You could have a relatively quiet period where people are just watching a series of regular bids, followed by a fairly sudden building of activity as the end nears with many overlapping transactions. A bit like Gustav Holst's "Saturn" which I saw performed live by the RSNO last week. Brilliant! Classical music for nerds!
Anyway, unlike a piece of music, in orchestrating an auction, the order in which bids are written is not actually important. You only need to find the highest bid at reconciliation time. As long as you wait until the inconsistency window has closed, you will have the right answer.
This "right answer" is referred to as the Real state of the system.
When dealing with an eventually consistent store, you need to divide all your data into two kinds of state: Provisional and Real.
Any application transactions should only ever read or write to the provisional state, which is a local view that each of these transactions has - no global viewing allowed! Since there is no shared state, and no global locks, transactions can zoom away unimpeded by each other.
The Real state of the system represents the real state of the world. This always lags behind the provisional state, as the real and provisional states are reconciled periodically normally (although not always) by a background process - in Windows Azure terminology, a "Worker Role".
A solid reconciliation process requires that any actions the sytem performs are revocable. Unlike a transactional system, where we can just call "ROLLBACK", in an asynchronous system some things, particularly those that trigger real-world actions, are "unrollbackable". So how do we revoke them?
We execute what's known in SOA circles as a compensation. An important thing to remember about compensations is that they can be very very expensive. As an extreme example, if I've booked a hotel and discount flight and there are no hotel rooms available, I would have to apply a compensation to cancel the flight, but they might offer no refunds due to the discount. So the compensation cost is not only the cost of the compensation itself, but the sum of both that and any "chained" compensations that it triggers; so, the aim here is to minimise the probability of having to run a high-cost compensation.
The point of reconciliation is that the real and provisional states must be in synch with each other at the end of the process - NOT that they should necessarily contain the same data, but answers to queries against each should return the same thing.
In order to minimize the cost of running any compensations, we may need to do the reconciliation more frequently (at a higher cost).
So once again it's all a balancing act, but the key take aways here are: an understanding of Real vs. Provisional State; and the roles of Reconciliation and Compensation in resolving these states.
NB: Theoretically, distributed systems cannot specify an inconsistency window, as there could always be a catastrophic network failure between geographically distributed replicas. So there are still issues hidden away in there, but in practice most resolve fairly quickly. There are still more patterns for dealing with these things, all based on the not-yet-famous "Well I didn't hear anything" design pattern, also not-yet-known as the “tree falling in the woods” pattern.
Until next time... Soupy Twist
In most modern day line of business software development, we are used to dealing with the opposite; "Strong" or "Write" Consistency. If I update "Value 1" in the database to say "Good Morning", then the next time someone reads that value, it says "Good Morning". The system’s state is consistent at the end of a write.
Relational databases such as Oracle or SQL Server are in general "Write consistent" stores. However, if using an “Eventually consistent” such as Azure Table Storage or Amazon SimpleDB, the data is not guaranteed to be consistent at the end of a write, due to the replication delay discussed in the previous post.
These systems will take an amount of time to become consistent. This is referred to as the Inconsistency Window. It’s up to the application to decide how to deal with this inconsistency.
The Reconciliation process can be performed manually or automatically, depending on the circumstances.
The simplest way to demonstrate this as a manual process is by querying a search engine such as Google or Yahoo or Bing. Their indices are "eventually consistent" as you sometimes... brace yourself... come across broken hyperlinks in results! Despite what an Oracle sales person might have you believe, the world does not come to an end if your data is inconsistent. In this case, the end user can reconcile the data manually, by clicking the back button and moving on to the next result. No harm done, maybe a little trade-off in terms of usability, but worth it for access to a vast amount of data.
Removing the requirement for consistent data (by removing locking) means that you perform writes in parallel. Probably the most obvious example of this in action is the auction site eBay. In an online auction, the critical moments are as the end of the auction approaches. You could have a relatively quiet period where people are just watching a series of regular bids, followed by a fairly sudden building of activity as the end nears with many overlapping transactions. A bit like Gustav Holst's "Saturn" which I saw performed live by the RSNO last week. Brilliant! Classical music for nerds!
Anyway, unlike a piece of music, in orchestrating an auction, the order in which bids are written is not actually important. You only need to find the highest bid at reconciliation time. As long as you wait until the inconsistency window has closed, you will have the right answer.
This "right answer" is referred to as the Real state of the system.
When dealing with an eventually consistent store, you need to divide all your data into two kinds of state: Provisional and Real.
Any application transactions should only ever read or write to the provisional state, which is a local view that each of these transactions has - no global viewing allowed! Since there is no shared state, and no global locks, transactions can zoom away unimpeded by each other.
The Real state of the system represents the real state of the world. This always lags behind the provisional state, as the real and provisional states are reconciled periodically normally (although not always) by a background process - in Windows Azure terminology, a "Worker Role".
A solid reconciliation process requires that any actions the sytem performs are revocable. Unlike a transactional system, where we can just call "ROLLBACK", in an asynchronous system some things, particularly those that trigger real-world actions, are "unrollbackable". So how do we revoke them?
We execute what's known in SOA circles as a compensation. An important thing to remember about compensations is that they can be very very expensive. As an extreme example, if I've booked a hotel and discount flight and there are no hotel rooms available, I would have to apply a compensation to cancel the flight, but they might offer no refunds due to the discount. So the compensation cost is not only the cost of the compensation itself, but the sum of both that and any "chained" compensations that it triggers; so, the aim here is to minimise the probability of having to run a high-cost compensation.
The point of reconciliation is that the real and provisional states must be in synch with each other at the end of the process - NOT that they should necessarily contain the same data, but answers to queries against each should return the same thing.
In order to minimize the cost of running any compensations, we may need to do the reconciliation more frequently (at a higher cost).
So once again it's all a balancing act, but the key take aways here are: an understanding of Real vs. Provisional State; and the roles of Reconciliation and Compensation in resolving these states.
NB: Theoretically, distributed systems cannot specify an inconsistency window, as there could always be a catastrophic network failure between geographically distributed replicas. So there are still issues hidden away in there, but in practice most resolve fairly quickly. There are still more patterns for dealing with these things, all based on the not-yet-famous "Well I didn't hear anything" design pattern, also not-yet-known as the “tree falling in the woods” pattern.
Until next time... Soupy Twist
Subscribe to:
Posts (Atom)