Thursday 5 November 2009

Why would you do that, Larry? - Reconciliation, Compensation and Eventual Consistency

In my previous post, I introduced the idea of "Eventual Consistency" and we saw how, due to the CAP theorem, it helps ensure global availability of data by sacrificing its consistency.

In most modern-day line-of-business software development, we are used to dealing with the opposite: "Strong" or "Write" consistency. If I update "Value 1" in the database to say "Good Morning", then the next time someone reads that value, it says "Good Morning". The system’s state is consistent at the end of a write.

Relational databases such as Oracle or SQL Server are in general "Write consistent" stores. However, if you are using an "Eventually consistent" store such as Azure Table Storage or Amazon SimpleDB, the data is not guaranteed to be consistent at the end of a write, due to the replication delay discussed in the previous post.

These systems will take an amount of time to become consistent. This is referred to as the Inconsistency Window. It’s up to the application to decide how to deal with this inconsistency.

The Reconciliation process can be performed manually or automatically, depending on the circumstances.

The simplest way to demonstrate this as a manual process is by querying a search engine such as Google or Yahoo or Bing. Their indices are "eventually consistent" as you sometimes... brace yourself... come across broken hyperlinks in results! Despite what an Oracle sales person might have you believe, the world does not come to an end if your data is inconsistent. In this case, the end user can reconcile the data manually, by clicking the back button and moving on to the next result. No harm done, maybe a little trade-off in terms of usability, but worth it for access to a vast amount of data.

Removing the requirement for consistent data (by removing locking) means that you perform writes in parallel. Probably the most obvious example of this in action is the auction site eBay. In an online auction, the critical moments are as the end of the auction approaches. You could have a relatively quiet period where people are just watching a series of regular bids, followed by a fairly sudden building of activity as the end nears with many overlapping transactions. A bit like Gustav Holst's "Saturn" which I saw performed live by the RSNO last week. Brilliant! Classical music for nerds!

Anyway, unlike a piece of music, in orchestrating an auction, the order in which bids are written is not actually important. You only need to find the highest bid at reconciliation time. As long as you wait until the inconsistency window has closed, you will have the right answer.

This "right answer" is referred to as the Real state of the system.

When dealing with an eventually consistent store, you need to divide all your data into two kinds of state: Provisional and Real.

Any application transactions should only ever read or write to the provisional state, which is a local view that each of these transactions has - no global viewing allowed! Since there is no shared state, and no global locks, transactions can zoom away unimpeded by each other.

The Real state of the system represents the real state of the world. This always lags behind the provisional state, as the real and provisional states are reconciled periodically, normally (although not always) by a background process - in Windows Azure terminology, a "Worker Role".
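To make that concrete, here is a minimal sketch (in C#) of what a reconciliation pass might look like for the auction example above. The Bid record and the "inconsistency window closed" cut-off are hypothetical stand-ins for whatever provisional state your system actually keeps, not any real Azure API.

using System;
using System.Collections.Generic;
using System.Linq;

public record Bid(Guid AuctionId, string Bidder, decimal Amount, DateTime PlacedAt);

public class AuctionReconciler
{
    // Runs periodically on a background worker (a "Worker Role" in Azure terms).
    public Bid? Reconcile(IEnumerable<Bid> provisionalBids, DateTime inconsistencyWindowClosed)
    {
        // Only consider bids old enough that every replica must have seen them by now.
        var settled = provisionalBids.Where(b => b.PlacedAt <= inconsistencyWindowClosed);

        // The order the bids were written in never mattered; the Real state
        // is simply whichever settled bid is highest.
        return settled.OrderByDescending(b => b.Amount).FirstOrDefault();
    }
}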

A solid reconciliation process requires that any actions the system performs are revocable. Unlike a transactional system, where we can just call "ROLLBACK", in an asynchronous system some things, particularly those that trigger real-world actions, are "unrollbackable". So how do we revoke them?

We execute what's known in SOA circles as a compensation. An important thing to remember about compensations is that they can be very, very expensive. As an extreme example, if I've booked a discount flight and the hotel booking then fails because no rooms are available, I have to apply a compensation to cancel the flight, but the airline might offer no refunds on discount fares. So the compensation cost is not only the cost of the compensation itself, but the sum of that and any "chained" compensations it triggers; the aim, then, is to minimise the probability of having to run a high-cost compensation.
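As a sketch of how compensations hang together, here is the holiday example in code. The booking services and exception type are hypothetical; the point is only the shape of the logic: every completed step registers its undo, and a failure runs the undos in reverse.

using System;
using System.Collections.Generic;

public interface IFlightService { Guid Book(); void Cancel(Guid bookingId); }
public interface IHotelService { Guid Book(); void Cancel(Guid bookingId); }
public class BookingFailedException : Exception { }

public class BookHolidaySaga
{
    // Each completed step registers an "undo" action (its compensation).
    private readonly Stack<Action> _compensations = new();

    public void Run(IFlightService flights, IHotelService hotels)
    {
        try
        {
            var flight = flights.Book();                        // real-world action...
            _compensations.Push(() => flights.Cancel(flight));  // ...and its (possibly costly) undo

            var room = hotels.Book();
            _compensations.Push(() => hotels.Cancel(room));
        }
        catch (BookingFailedException)
        {
            // We can't ROLLBACK real-world actions, so we run the
            // compensations instead, most recent first.
            while (_compensations.Count > 0)
                _compensations.Pop().Invoke();
            throw;
        }
    }
}

Note how the non-refundable flight sits at the bottom of the stack: ordering the steps so that the most expensive compensation is the least likely to run is one practical way of minimising that cost.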

The point of reconciliation is that the real and provisional states must be in sync with each other at the end of the process - NOT that they must necessarily contain the same data, but that queries against each should return the same answers.

In order to minimise the cost of running any compensations, we may need to do the reconciliation more frequently (at a higher cost).

So once again it's all a balancing act, but the key takeaways here are: an understanding of Real vs. Provisional State; and the roles of Reconciliation and Compensation in resolving these states.

NB: Theoretically, distributed systems cannot specify an inconsistency window, as there could always be a catastrophic network failure between geographically distributed replicas. So there are still issues hidden away in there, but in practice most resolve fairly quickly. There are still more patterns for dealing with these things, all based on the not-yet-famous "Well I didn't hear anything" design pattern, also not-yet-known as the “tree falling in the woods” pattern.

Until next time... Soupy Twist

Thursday 1 October 2009

A river runs through it

Hopefully by now you are becoming acclimatised to some of the best practices for designing large scale systems. Now that we've examined the "application layer" in a bit of detail (implement only Atomic, Idempotent, Stateless operations), we'll move on to look at the "back-end" - the data storage / state / persistence layer.

Fundamentally, all scalable architectures are about replication, be it caching, db replication, or clustering. They are about mitigating the problems of having one central authoritative system, which simply does not scale: as the number of users and transactions grows, so does contention on the central resource. The only way to deal with this is to add replicas: replica applications, databases, machines, graphics, etc. - whatever your application is built from.

Way back in the "DotCom Boom (tm)" days of the internet, Eric Brewer, chief scientist of Inktomi (the company that ran Yahoo!’s search engine before a young upstart called Google got its big break by displacing them), came up with an idea known as "Brewer’s Conjecture", or sometimes the "CAP theorem". It basically says that in a system that uses distributed data, you can have at most two of the following (never all three):

Consistency, Availability, or Partition-tolerance

Having all three is impossible, as there is a delay that occurs during the creation of a replica. To deal with this delay, you can either lock access to each copy until all are in sync, or put up with the fact that these replicas may be out of sync with each other during the replication process. One of the most interesting demonstrations of this was what used to be known as the “Google Dance”: when Google pushed out their monthly index updates, search results across the globe would all go out of sync (nowadays they use much more frequent updates, and a lot more in the way of personalisation technologies, so you can’t really see this effect any more).

At the compute / application layer, problems of state management are largely solved by adopting the styles of programming discussed previously in this blog (Implement only Atomic, Idempotent, Stateless operations).

However, the most common backend state management system for most web applications is the Relational Database Management System, or RDBMS; usually MySQL, Oracle, or MS SQL Server. These databases provide partition tolerance (to a degree) and a consistent view of data, so logically they must be sacrificing availability.

In his most famous blog post, Werner Vogels, Chief Technology Officer of Amazon, takes the view (and I agree) that not only should we sacrifice consistency for the sake of Availability and Partition-tolerance, but that this is the only reasonable option as a system scales (as the uptime of a whole system is the product of the uptime of all its component parts).

Other people have different views. StackOverflow, a popular programming community site, have chosen to sacrifice availability. You often (in my case a lot! maybe I'm breaking it??) see the message "StackOverflow is currently offline for maintenance. Routine maintenance usually takes less than an hour. If this turns in to an extended outage, we will post details on the blog." This may cause them some interesting challenges as they grow, but certainly at the moment, all things considered, they will not necessarily lose any cold hard cash if their site is down for 20 minutes, whereas Amazon’s losses would be massive.

Buried deep in the small print of Amazon’s SimpleDB service, the documentation for the Amazon SimpleDB PutAttributes method has the following note:

"Because Amazon SimpleDB makes multiple copies of your data and uses an eventual consistency update model, an immediate GetAttributes or Query request (read) immediately after a DeleteAttributes or PutAttributes request (write) might not return the updated data."

Windows Azure’s Table Storage, and both Amazon’s and Azure’s queue services, share this read-after-write characteristic too, and so cannot guarantee message ordering (FIFO).

This model of "Eventual Consistency" is also one of the core characteristics of the new breed of NoSQL databases appearing. And whilst this makes a lot of sense to many Web Developers who have spent their lives dealing with stateless cached data, it's probably the single most confusing part of distributed application development to a traditional RDBMS or Client/Server developer.

The very idea! Eugh! Yuk! You mean you can update data, only for the system to "sometimes give you the wrong result"? This makes RDBMS-centric developers feel very nauseous. But fear not! Eventual consistency is part of daily life. As I've said in a previous post, it's odd that the canonical example of the transaction is the bank transaction, when in fact this process operates in an eventually consistent manner (and is actually far more complicated than a transaction).

We've been indoctrinated by the Big RDBMS vendors that only transactional write consistency will do. They commonly use the banking metaphor to imbue their products with an implied stable, trustworthy state that is simply not there.

People do not generally need to know exactly when someone else did something, just what they've done by the time they come to deal with it. The mother of all transactional systems, the stock exchange, is eventually consistent. However, a LOT of money is spent to make the inconsistency window as short as possible; for example, locating trading offices as close as possible to exchanges, with high speed connections.

There is absolutely no doubt that dealing with an eventually consistent store is much more complicated than dealing with a central RDBMS, but as multi-tenant systems become the norm over the next 5-10 years, the RDBMS will look increasingly creaky as a core on which to build an application.

Over the next couple of posts I'll look in more detail at eventually consistent data stores and queues, and some patterns commonly used to deal with them.

Tuesday 15 September 2009

Intermission

Please stay tuned for some important messages.

I've been on holiday this week. So you'll just have to wait a bit for my next thrilling post :-).

In the meantime... Some interesting holiday reading...

http://research.microsoft.com/en-us/um/people/horvitz/leskovec_horvitz_www2008.pdf

And this made me laugh...

Normal service will resume shortly.

Monday 24 August 2009

Unleash the beasts!

Following my last post on caching patterns, I was asked a question about how to scale cache repopulation. The question was "In a read through caching scenario... assuming your cached object takes 30 seconds to calculate, and you have 60 processes accessing the cached value every half a second each. This suddenly expires, resulting in 120 misses per second until the value is re-calculated and re-cached successfully, swamping your server with repopulation requests. Right?" Well, erm, yes. Yes it does. Within some of the more nerdy circles, this is often called the Dog Pile effect, and has brought many websites down.

When caching quick-to-create items, this rarely causes problems. However in the case of large data sets that take a while to calculate, this can prove "the straw that breaks the camel’s back".

From Wiktionary: v. to dogpile

  1. To jump on top of someone, usually in a group.
    2003, Nancy Holder, Buffy the Vampire Slayer: Chosen[3], ISBN 0743487923, page 657:
    A vampire got her around the neck from behind; then more, dogpiling her.
  2. To pile on or overwhelm, such as with criticism or praise.
    2005, Craig Spector, Underground[4], ISBN 0765306603, page 169:
    But this guy was serious, using online payment services and dogpiling her e-mail box within minutes, requesting expedited shipping.

You shouldn't really have anything in your system taking so long to repopulate. If you do, it's likely to indicate a more fundamental architectural problem, as whatever you’re caching does not have a high degree of disposability.

This is a situation where we have hundreds of workers looking for the same information, so we don’t want to swamp our data source with the requests for that same information.

What we want to do, in theory at least, is use a lock. OMG. I know. I said never use locks, but here we are using them. What gives?

Well, this is really an optimisation for a particularly unusual case. Here, cached items are immutable, so we have no side effects to worry about; the process for repopulating them is just not very efficient. Locks CAN be used in a distributed system, but only very carefully (and normally on a single, solely responsible background process). For example, some data reconciliation processes may require locks on data, but by putting these processes on background workers, we try to ensure they do not block user activities.

You also need to be aware of the problems of using locks in your particular environment. (We're OK here as our write operations are idempotent).

In this case we would write a wrapper around the cache, to check on its expiry, and force only a single thread to repopulate the cache.

So in pseudo code...

function GET(key)
{
    object = Cache.Get(key) //try to get the value from the cache
    if (object is null)
    {
        lock //SERIALISE ACCESS
        {
            object = Cache.Get(key) //double check: another thread may have repopulated while we waited
            if (object is still null)
            {
                object = GetItemToCache() //takes 30 seconds
                PUT(key, object, now + 15 minutes) //expires in 15 minutes' time
            }
        }
    }
    return object
}

function PUT(key, value, expires)
{
    Cache.Insert(key, value, expires)
}

Now you still have processes queuing up at the point where access is serialised, so in the 30 seconds required to repopulate the cache, you have 120 processes per second blocking. This solves one problem but creates another.

We've offloaded the traffic from GetItemToCache(), but forced everyone to wait. So better, but still not too great. How can we improve this to avoid blocking even further? Well... we can temporarily re-cache the "expired" value!

Here, we know that our cached object takes (about) 30 seconds to repopulate, so we can estimate when the item will go "stale" and expire it that little bit earlier. Users can then continue to read the slightly stale data whilst the update is prepared, avoiding contention on the object loading process.

.NET's cache, for example, allows you to specify a callback which runs when an item falls out of the cache. Most caching platforms have a similar mechanism for handling dropped cache items.

In ASP.NET, expiry callback functions are delegates that support the following signature:

void CacheItemRemoved(string key, object val, CacheItemRemovedReason reason);

So our newly refactored pseudocode is...

constant ItemKey = "Item_That_Goes_Stale" //a key to identify the cached item

function GET()
{
    object = Cache.Get(ItemKey) //try to get the item from the cache
    if (object is null)
    {
        lock //if we have a miss, we serialise access to repopulate the cache here
        {
            object = Cache.Get(ItemKey) //double check after the lock (another thread could have jumped in)
            if (object is still null)
            {
                object = POPULATE_CACHE() //if it's still not in the cache... cache it
            }
        }
    }
    return object
}

function PUT(value, expires)
{
    //Here we add a callback to retain the value in the cache in the case of expiry.
    //ASP.NET will call MARK_AS_STALE just before the item is dropped from the cache.
    Cache.Insert(ItemKey, value, expires - 30 seconds, MARK_AS_STALE)
}

function MARK_AS_STALE(string key, object val, CacheItemRemovedReason reason)
{
    if (reason for removal is Removed or Underused) then return
    //do not re-cache the item if it was removed (written over) or underused

    //The cache is stale... re-cache the old value straight away, so anyone calling
    //GET in the next 30 seconds gets the old value rather than a miss.
    Cache.Insert(ItemKey, val) //this value will be replaced when the following POPULATE_CACHE call completes
    POPULATE_CACHE() //REPOPULATE THE CACHE
}

function POPULATE_CACHE()
{
    object = GetItemToCache() //takes 30 seconds
    PUT(object, now + 15 minutes) //expires in 15 minutes' time
    return object
}

If you are not using a platform which supports this type of callback pattern, you can use the cache itself to check for staleness by caching a "still fresh" flag for an object which expires shortly before the real item. You can then check the cache for both the real item and the "still fresh" item. If the "still fresh" flag is not in the cache, we know the real item is stale. This does require a double read of the cache.
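Here's a sketch of that flag-based approach in C#, against a hypothetical cache interface rather than any particular product. The flag expires 30 seconds before the real item, signalling that a background refresh should start while readers still get the old value.

using System;

public interface ICache
{
    object? Get(string key);
    void Insert(string key, object value, TimeSpan timeToLive);
}

public class StaleAwareCache
{
    private readonly ICache _cache;
    public StaleAwareCache(ICache cache) => _cache = cache;

    public void Put(string key, object value, TimeSpan timeToLive)
    {
        _cache.Insert(key, value, timeToLive);
        // The flag carries no data; only its presence matters.
        _cache.Insert(key + ":fresh", true, timeToLive - TimeSpan.FromSeconds(30));
    }

    // Returns the item, plus a hint that it is going stale and needs refreshing.
    public (object? item, bool needsRefresh) Get(string key)
    {
        var item = _cache.Get(key);                  // first read: the real item
        var stillFresh = _cache.Get(key + ":fresh"); // second read: the flag
        return (item, item != null && stillFresh == null);
    }
}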

Either way, your application should remain much more responsive and stable if this technique is employed to freshen the cache ahead of time. Again, I must say this is an optimisation that is hopefully not needed too often. If you’ve designed your systems around distributed architectural principles, you should never need to do this. The vast majority of the time, the standard read/write, through/behind caching techniques should be more than enough to achieve acceptable system responsiveness.

The central principle and widespread use of caching in large scale web applications, and the new post-relational NoSQL databases (used in the likes of Amazon's Web Services platform and Microsoft Azure Queues), lead to some interesting and common problems with data consistency, which I will start to look at in my next post.

Friday 7 August 2009

Warning: This internet may contain traces of nuts

To summarise my last few posts, you should now have at least a basic understanding of how the web is put together (DNS, TCP/IP, HTTP, HTML) and the problems you are likely to encounter when writing code for distributed systems. You should also have an idea of the major architectural constraints to consider when creating robust code in this type of environment....

  • Atomicity - Write atomic operations that pass or fail in their entirety.
  • Statelessness - Maintain a high degree of disposability
  • Concurrency - Avoid using locks wherever possible.
  • Idempotence - Write code that is repeatable and has no side effects

Great. We can now write a working distributed application. However, while this application may be effective, it may not be the most responsive. All this data still has to move all over some very s_l_o_w wires. In all distributed applications, there is a rule of thumb that you should remember: data is always more expensive to move than it is to store. You only have to look at mobile phone operators' data tariffs to see that.

So, how do we stop moving stuff around unnecessarily? Well, the answer is actually modelled directly on the world of squirrels. Well maybe not directly. Both squirrels and distributed applications use caches for storing stuff around their domain until it's needed.

In practical terms most caches have interfaces like a hash table, so you effectively have a big bucket in which you can store objects and identify them using a key. You can add an item to a cache by calling PUT(key, objectToCache), and retrieve items by calling GET(key).

Most caches are of limited size, so we have to decide how to throw things away if the cache is full and we want to add new items. This is normally done by popularity. So those items "hit" the most stay in the cache, and less popular items are thrown away.

Also in common with squirrels, we have an issue with staleness. We don't want to be retrieving nuts only to find that they've gone all mouldy. Yuk! So most caches also support some sort of data expiration strategy. These can be defined in a number of ways, though normally they tend to support sliding and absolute expiry policies. The former is where you specify after what period of inactivity you want cached items to expire, i.e. "If this isn't used for 5 minutes, throw it away". The latter, absolute expiry, is just the same as a use-by date... "this data is no good after midnight on Wednesday."
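In ASP.NET's cache, for example, the two policies look something like this. The Cache.Insert overload is real; the keys and values are just for illustration, and the cache instance would typically come from HttpContext.Current.Cache.

using System;
using System.Web.Caching;

public static class ExpiryExamples
{
    public static void Demonstrate(Cache cache)
    {
        // Sliding expiry: "if this isn't used for 5 minutes, throw it away".
        cache.Insert("recentSearches", new[] { "nuts" },
            null, Cache.NoAbsoluteExpiration, TimeSpan.FromMinutes(5));

        // Absolute expiry: "this data is no good after midnight".
        cache.Insert("todaysOffers", new[] { "acorns" },
            null, DateTime.Today.AddDays(1), Cache.NoSlidingExpiration);
    }
}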

Now an important thing about caches, is that you should not make any assumptions about their consistency. They could have been pillaged by some other hungry squirrel in the middle of the night without your knowledge.

As the cache makes decisions about how to optimise its use, it is quite possible that you can call PUT(itemA, myItem), then call GET(itemA), and get NULL back.

In technical jargon...
If you call GET(itemA) and it returns myItem, this is called a "hit"
If you call GET(itemA) and it returns NULL this is called a "miss".

Caches come in all shapes and sizes:
In-Memory Caches, Distributed Caches, Proxy Caches, and Large Area Caching Networks, but they all work in more or less the same way.

So how do we make best use of a cache? Essentially, it's all to do with understanding the cost of a miss: you target your slowest, most frequently repeated operations.

There are effectively 4 caching patterns that you see on a day to day basis: Read Through, Write Through, Read Ahead and Write Behind.

For reading, it's all about how you handle a cache miss. Using the Read Through pattern, in the case of a cache miss you query your data source, populate the cache, and continue. Next time, you get a hit.

With Read Ahead you should never get a miss. You query your data source and populate the cache before any requests are expected. But you may be holding things in memory that are not required.

Populating a cache on a write is more unusual outside distributed applications, but can again boost performance given the right set of conditions. Using Write Through caching, whenever a user submits data, this data is cached as it is written to the data source, speeding up any following reads without having to query the data source again.

And finally, Write Behind. Now, this one is seen particularly often in distributed systems. Most of the "post-relational" databases out there use this pattern, in which the cache is firstly committed to, then some other process writes to the data source at some unspecified later time. This has the advantage of handling writes very very quickly, and leaves the complicated business of reconciling data until later. However this means you have to tolerate a degree of inconsistency across nodes, albeit momentary in many cases.
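A minimal sketch of the write-behind shape, with hypothetical in-memory stand-ins for the cache, the queue and the data source:

using System;
using System.Collections.Concurrent;

public class WriteBehindCache
{
    private readonly ConcurrentDictionary<string, object> _cache = new();
    private readonly ConcurrentQueue<(string key, object value)> _pendingWrites = new();

    // Fast path: the write is acknowledged as soon as it is cached and queued.
    public void Put(string key, object value)
    {
        _cache[key] = value;
        _pendingWrites.Enqueue((key, value));
    }

    public object? Get(string key) => _cache.TryGetValue(key, out var v) ? v : null;

    // Slow path: run periodically by a background worker. Until it runs,
    // readers going straight to the data source see stale data - this is
    // the inconsistency window again.
    public void FlushTo(Action<string, object> writeToDataSource)
    {
        while (_pendingWrites.TryDequeue(out var write))
            writeToDataSource(write.key, write.value);
    }
}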

So what pattern should I be using? Well, as usual there is no straightforward answer. "It all depends".

Read-Ahead gives a speed advantage over Read-Through, but can only be used if you can predict which cache items are going to be needed in the future. If your guess is accurate, read-ahead means reduced latency and no added overhead. The poorer your guess, the worse your performance will be (as more unnecessary requests are sent to your data source). This could even result in worse overall performance if your data source falls behind with its request processing backlog. Read Ahead caches are also more expensive to initialise.

If Write-Behind caching can be used, it can allow much lower latency and higher throughput than Write-Through caching. Additionally, Write-Behind caching can sometimes improve the performance of the data source (as there are fewer writes, and they can be delayed until resources are available).

So, caching can be used to improve the efficiency of distributed systems. You often see caching treated as some kind of silver bullet for speed, but it is just another form of optimisation, and should be viewed as such. The trade-off for speed is the use of more resources (and the management of those resources). More components mean more complexity, but applied carefully, caching can greatly improve the responsiveness of an application, especially a distributed one.

Thursday 30 July 2009

Do it Again!

This is the last in my mini-series about the fallacies of network computing before moving on to deal with the practical side of dealing with these things.

The final fallacy to address, states that "the network is reliable". As you know if you ever used AOL in its, erm... heyday? - it's not. It may go a bit faster now, but it's the same old network that it used to be.

What kinds of operations can be safely performed on something so unreliable? The answer is: "Idempotent" ones. Gosh, Latin! How old is this internet?

idem - meaning "the same". You'll see this used in legal papers abbreviated to "id.", as lawyers do tend to repeat themselves a lot, and are by far the biggest fans of dead languages (outwith Delphi programming circles).

potent - meaning power

So: "having the same power". But what does that mean? Essentially it's a big fancy word for something that's very simple (in theory at least).

Idempotent operations are essentially repeatable: running one many times has the same effect as running it once, with no additional side effects.

Due to the infrastructure of the net, it is often the case on the web that messages between end points may be sent and received more than once, so it's important that messages processed multiple times do not cause side effects. The clichéd example of this is the browser refresh, and while this is an example of idempotence, I feel it's much clearer to explain by showing what it's not.

Voting systems are the common example of failing to take idempotence in to account.

Even though the "replay attack" is one of the oldest forms of internet abuse, it is still surprisingly common to come across sites that contain polls which are easily rigged. Essentially the web page presents "Vote for A" and "Vote for B" as functions. You click "Vote for B" and a counter is incremented. If you fire the request again, the counter is incremented again.

Write...

for (int i = 0; i < 10000; i++) { VoteForB(); }

... and zap! In a fraction of a second, "B" becomes the clear winner.

Increment() IS NOT an idempotent function. It has side effects on application state and is not repeatable.
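One way to fix the poll, sketched here with a hypothetical in-memory store: record who voted for what, instead of incrementing a counter. Replaying the same request then changes nothing.

using System.Collections.Generic;

public class Poll
{
    // voterId -> choice. A set keyed on voter, not a counter.
    private readonly Dictionary<string, string> _votes = new();

    public void Vote(string voterId, string choice)
    {
        // Calling this once or 10,000 times leaves the same state.
        _votes[voterId] = choice;
    }

    public int CountFor(string choice)
    {
        int count = 0;
        foreach (var v in _votes.Values)
            if (v == choice) count++;
        return count;
    }
}

(A real poll would also need to stop voters minting new identities, but that's a different attack.)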

An idempotent operation is repeatable without further consequence: call it once or a hundred times, and the system ends up in the same state.

So, a Toggle() function is not idempotent. A Square() function, or the evaluation of a regular expression against a given input, is idempotent: it does not change any value given to it and has no side effects.

A * A will always equal A squared.

You can of course deconstruct many non-idempotent operations in to a number of idempotent equivalents, e.g. SwitchOff() and SwitchOn() functions rather than Toggle().

Calling SwitchOff() SwitchOff() SwitchOff() SwitchOff() SwitchOff() SwitchOff() will always result in the Switch being Off.

Idempotence is often confused with method safety. A "Safe" method has absolutely no side effects; a read operation, for example, is Safe. A Delete() operation is not "Safe", but it can be made idempotent, i.e. repeatable with no additional effect beyond the first call.

This type of fault tolerance is actually designed in to HTTP itself. See:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.2

Fundamentally, the RESTful-style of the web is different from the CRUD style of most database-oriented systems.

Under RESTful systems using HTTP, calling PUT multiple times will result in the resource existing at a particular address. Calling DELETE multiple times will result in the item no-longer-existing.

You can call PUT PUT PUT or DELETE DELETE DELETE and the end result will be the same.

The comparable operations using something like SQL will result in errors beyond their first call: "The record already exists" or "The record does not exist", respectively.

On the web, you ignore idempotence at your peril. Famously, Google Web Accelerator wreaked havoc and was quickly withdrawn due to web developers widely ignoring idempotence and method safety.

Almost overnight, millions of web developers learned about idempotence the hard way.

While idempotence seems fairly simple on the face of it, the reality of it can quickly become very, very complicated, particularly since, in most distributed systems, you need to take in to account the idea of sequential idempotence. If you're looking at a system with multiple workers of some sort, either threads, VM instances, or WorkerRoles in Azure, and if you're using lock-free structures (as you should be), the possibility exists that you may have to process the same message not just more than once, but also out of order.

For example, suppose that someone updates a record, while another simultaneously deletes it.
Here, I have an Idempotent PUT and an idempotent DELETE verb. I can call PUT PUT PUT or DELETE DELETE DELETE, and have the same end result. But now I have an extra problem: I now require two independent operations to have combined idempotence.

The following sequences (and even all of these if run in parallel together)

PUT DELETE
PUT PUT DELETE PUT DELETE PUT
DELETE PUT

should all have the same end result. What is that end result?

Now that you can see how idempotence works, you can think about how to solve this. And if you want a clue, see this article:
http://www.spamdailynews.com/publish/Google_subpoenaed_to_turn_over_deleted_Gmail_messages.shtml
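For the impatient, here is a sketch of the answer many big systems arrive at (and which that article hints at): nothing is ever really deleted. Writes and deletes are timestamped, deletes become "tombstones", and the latest timestamp wins, whatever order things arrive in. The types here are illustrative, not any particular product's API, and the timestamps are assumed to come from the original client request, not from processing time.

using System;
using System.Collections.Generic;

public class LastWriterWinsStore
{
    private record Entry(string? Value, DateTime Timestamp, bool IsTombstone);
    private readonly Dictionary<string, Entry> _entries = new();

    public void Put(string key, string value, DateTime requestTime) =>
        Apply(key, new Entry(value, requestTime, IsTombstone: false));

    public void Delete(string key, DateTime requestTime) =>
        Apply(key, new Entry(null, requestTime, IsTombstone: true));

    private void Apply(string key, Entry incoming)
    {
        // Replays and reorderings are harmless: only a strictly newer
        // timestamp changes the stored state.
        if (!_entries.TryGetValue(key, out var current) || incoming.Timestamp > current.Timestamp)
            _entries[key] = incoming;
    }

    public string? Get(string key) =>
        _entries.TryGetValue(key, out var e) && !e.IsTombstone ? e.Value : null;
}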
The important thing to remember is to write code that is repeatable and has no side effects. The result of this is that you enable fault tolerance, using the simple policy of "If you're not sure it worked, just do it again".

Thursday 23 July 2009

For Sale: One slightly worn RDBMS

Web and cloud apps have to be able to scale as close to linearly as possible, in a hostile environment. Network topology and number of users can fluctuate wildly. Contrast this with a desktop application, which just has to handle a single user. Or with a client/server app, which may have to handle a couple of hundred users usually on the same network. If millions of users start using our distributed system, the experience of each should be no different from what it would be if only 5 or 6 people were using it.

While web developers are used to dealing with concurrency, either consciously or just by knowing that calling Application.Lock() from VBScript will cripple their app, it is also becoming a wider issue for desktop developers, as chip manufacturers squeeze more and more cores on to their chips.

The arch-enemy of concurrency is shared data. Any data that two people have to update can cause concurrency problems. This is actually made slightly simpler in web programming by virtue of HTTP being stateless, and is one reason why the functional style of programming as seen on the web is becoming more popular in other applications.

If we can maintain the qualities of statelessness and atomicity in our operations, we are already well on our way to making our code safe to run concurrently. It's only when someone attempts to force a stateful model that you run into problems.

If shared data is the enemy of concurrency, locks are its atomic bombs. Locks work by serialising access to shared data, so that only one process / thread / fibre can write shared data at any one time. This means that anyone else just has to wait in line. Serialisation is the arch enemy of scalability.

So, NEVER hold editable shared data in your application layers. I will of course break this absolute rule in a future post for grandstanding purposes, but I'm allowed to - It's my blog! And of course, it's to illustrate a good point.

This issue doesn't tend to be a problem for those accustomed to web programming, since most web frameworks either enforce, or heavily imply, avoidance of global state. In most real systems today, shared state is handled by a backend RDBMS, such as Oracle or SQL Server, which offer a gigantic improvement over file-based DBs such as Access or Paradox. These do a very good job, at the moment, of handling medium to large scale apps. However, if we scale to the web, where fast global availability is required, then "Houston, we have a problem." We realise all we’ve done is offload the problem further down the chain. Because these client-server style databases have transactions, and therefore locks, embedded firmly at the heart of their architecture, they will not scale globally. You can partition them; you can put them on ever larger machines, or even clusters of machines; but they will always require a global lock to perform a commit.

So if RDBMS' such as SQL Server don't scale well, then what does?

Whilst still a drop in the ocean in terms of the commercial enterprise database market, databases based on distributed hash tables, such as CouchDB, Google’s BigTable, Voldemort, AWS SimpleDB and similar, are gaining a lot of traction.

Microsoft’s forthcoming Azure Table Storage falls in to this category too, unlike their slightly clumsy and poorly positioned "old man in a new hat" - SQL Azure Services.

These types of database characteristically emphasise Decentralization, Scalability and Fault Tolerance, with nodes continuously joining, leaving, and failing within the system.

Sound familiar? Well these are all databases designed to work not only on the web, but in accordance with the fundamental architecture of the web. They emphasise availability over transactional consistency which can cause serious headaches for people stuck in an RDBMS world view. (The idea that the database will sometimes return the wrong result is horrifying to them.)

Scalable transactions are EXTREMELY complex (actually impossible, as previously discussed) in a distributed system, so support for transactions in distributed databases ranges from rare to limited. If you look at Amazon's SimpleDB or Azure Table Store, a "BatchPut" or "EntityGroupUpdate" respectively does succeed or fail as a unit. However, they are subject to limitations well below the expectations of developers who are used to developing against RDBMS systems. Maximum request sizes of 4MB, or limits on the number of items in a transaction (25 for SDB), seem almost laughable. However, these distributed document style databases are a fundamentally different type of application from the RDBMS.

Unlike SQL Server or Oracle, however, they are reliable, cheap, and MASSIVELY scalable. Therefore, while they account for a tiny fraction of the database market at the moment, they do offer a very serious cost-based competitive advantage over large scale installations of RDBMS systems. I will write a future post about lock-free data management patterns and post-relational DBs, but once again, for now the key takeaway for concurrency is:

Locks are mindblowingly expensive in a distributed system.

And they get more expensive the more they are used. If you really MUST use them, stand well back and wear very dark sunglasses.

Next and last up: Idempotence

Monday 13 July 2009

"It Worked."

In my last post I briefly discussed statelessness. The key point about statelessness is that between HTTP calls, whether to fetch web pages, or to access web services or other HTTP based services, each part of the system should be capable of being restarted without causing the others to fail.

Ideally, services should require no state. There are ways of maintaining a "session state" in many web platforms, but these should very much be seen as an optimisation or workaround, rather than a first port of call. I will look at the various session state patterns in a later post.

One side effect of statelessness is that we do not really have the context of a conversation available to us, so all the state required to complete a transaction must be available during a single request/response exchange, with no dependencies outside it. This characteristic is referred to as "Atomicity", and normally produces a particularly interesting expression on the face of a developer when realised for the first time. Usually it's the result of attempting to use a component that has adopted what I would call the "Barefaced Liar" pattern - all seemed fine when the code ran locally.

This is where a component is designed to expose a simple interface, common to UI development, but actually impossible to guarantee in a distributed environment. There are lots of examples of this but often it's a Distributed Transaction Coordinator of some sort - although session state providers, and WCF’s "Bi-Directional" channel, can cause similar nauseating effects.

So... we have no state available here in which to hold stateful conversations.

The mother of all conversations, in software circles at least, is the "2-phase commit". This protocol ensures that when I ask for work to be done by, say, three co-operating participants, then either all of it is done, or none is. The canonical example of this is the bank transaction. We want money to be taken out of one account and put in to another, but if one of those actions fails, we can't just revert that one action and pretend nothing untoward has happened: we don't want both (or neither!) of the accounts to have the money.

It essentially works like this. A coordinator asks each participant to do work and waits until it has received a confirmation reply from all participants.

The participants do their work up to the point where they will be asked to commit to it. Each remembers how to undo what they've done so far. They then reply with an agreement to commit their work if the work was done, or a "failed" message if they could not perform the work.

If the coordinator received an agreement message from all participants, then it sends a commit message to them all, asking them to commit their work. On the other hand, if the coordinator received a fail message from any of the participants, then it sends a "rollback" message to all, asking them to undo their work.

Then, it waits again for responses back from all participants, confirming they've committed or rolled back. Only when all have confirmed successful committal, does the co-ordinator inform whoever is interested, that the transaction has succeeded.
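To make the message flow concrete, here is a sketch of a coordinator. The IParticipant interface is hypothetical, and a real coordinator would also need timeouts, retries and crash recovery.

using System.Collections.Generic;
using System.Linq;

public interface IParticipant
{
    bool Prepare();   // do the work, remember how to undo it, and vote
    void Commit();
    void Rollback();
}

public static class TwoPhaseCommit
{
    public static bool Run(IReadOnlyList<IParticipant> participants)
    {
        // Phase 1: every participant does its work and votes.
        var votes = participants.Select(p => (participant: p, agreed: p.Prepare())).ToList();
        bool allAgreed = votes.All(v => v.agreed);

        // Phase 2: commit everywhere, or roll back everywhere.
        foreach (var (participant, _) in votes)
        {
            if (allAgreed) participant.Commit();
            else participant.Rollback();
        }
        return allAgreed;
    }
}

Notice that the coordinator has to remember where it is between messages: it is a stateful conversation, which is exactly the problem.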

This is a staple pattern of Client Server Systems, particularly of the RDBMS kind. However, it is not guaranteed to work in a stateless system as conversations require context and so are not stateless. This breaks our previous advice about maintaining disposability. You cannot invisibly dispose of a participant in a transaction who is waiting to commit with no side effects.

So... Operations exposed on the web should all be Atomic. They should fail or pass as a single unit.

To a client/server developer this of course seems at first sight to be monumentally impractical in many real world scenarios. But in fact most activities of any real-world value cannot use transactions as we know them.

In a standard web services integration, the BookFlight(), BookHotel() and BookCar() methods in a BookHoliday() operation may well be run by completely different organisations. Each of these organisations are likely to have differing policies and data platforms.

I've always found it funny that the poster boy for the 2-phase commit, The Bank Account Transfer, in reality cannot - and does not - use the 2-phase commit protocol.

So what does it use? Well the answer lies in a fairly pungent concoction of asynchronous co-ordination, reconciliation and compensation that is definitely an acquired taste. I will come back to look at specific ingredients you can use in a later post but for now, the important thing to remember is...

At the service boundary, operations should fail or pass in their entirety.

Next up: Concurrency

Thursday 2 July 2009

The state I am in

One of the fallacies of distributed computing states that "Network topology doesn't change."

Well, yes it does, particularly on the internet. Pieces of hardware and software are constantly being added, upgraded and removed. The network is in a constant state of flux. So how do you perform reliable operations in such a volatile environment?

Well, the answer lies in part with the idea of statelessness. HTTP is what’s known as a stateless protocol. For those coming from a non-distributed platform background, statelessness is one of the most misunderstood aspects of web development.

Partly this is due to attempts to hide statelessness by bolting on some - often clunky - state management system. This is because statelessness is often viewed as a kind of deficiency. Framework developers all over the world say, "Oh yuk. Statelessness. This is soOOo difficult to work with. I want my developers to be able to put a button on a form and say OnClick => SayHello(). It's 2009! I don't want to have to worry about state issues on a button click!".

If you're dealing with a Windows Forms style of application, your button is more or less the same "object" that is shown on the user’s screen; but in many web frameworks, a "button" and its on-screen representation in a user’s browser may be thousands of miles apart. The abstraction of "a button" assumes a single object, with a state - but this is not the reality. What appears to be one button is in fact two buttons: a logical one on a server, and a UI representation in a browser, with a "click" message being sent between them. The concept of "a button" is a leaky abstraction, and the reason that it's leaky is that it does not take in to account the network volatility that is a standard feature of the web.

So, what do I mean by statelessness?

When you use the web, it enlists the resources required to fulfil your request dynamically. Once the request has been completed, the various physical components that took part are disconnected, and neither know nor care anything for each other anymore. How sad :-(

Let's look at it in terms of a conversation,

Customer: Hello
Teller: Hello, what can I do for you today?
Customer: I'd like to know my bank balance please.
Teller: Can I have your account number please?
Customer: It's 09-F9-11-02-9D-74-E3-5B-D8-41-56-C5-63-56-88-C0 ;-)
Teller: Well. Your balance is only $1,000,000.
Customer: Ok, Can I please transfer all of it to my Swiss bank account please?
Teller: Yeah sure, what's the account number?
Customer: 12345678.
Teller: Ok, I'll just do that now for you. Would you like a receipt?
Customer: Yes please.
Teller: Here you go => Receipt Provided.
Customer: Bye.
Teller: Bye.

This conversation contains state. Each role of Customer and Teller must know the context of the messages being transferred.

The stateless version of this would go more like...

Customer: Hello, I'd like to know the Balance for account 09-F9-11-02-9D-74-E3-5B-D8-41-56-C5-63-56-88-C0,
Teller: It's $1,000,000

Customer: Could you please transfer $1,000,000 from account 09-F9-11-02-9D-74-E3-5B-D8-41-56-C5-63-56-88-C0 to Swiss bank account number 12345678 and provide a receipt please.
Teller: Yup. Here you go => Receipt Provided.

The end results are the same. However in the first example, "fine-grained operations" are used. The Customer and the Teller need to know where they are in the conversation to understand the context of each statement. In the second, each individual part of the conversation contains all of the information required to perform the requested task.

If we think of these as client and server message exchanges, then in the second example it doesn't matter if the server (teller) is restarted, reinstalled or is blown-up and replaced between the two requests. This is invisible to the end user. You would never be aware of the swap.

However, in the first example, if the server was restarted it would "forget" where you were in the conversation.

Customer: Yes please.
Teller: Sorry? What?

...or in pseudocode...

Client: Yes please.
Server: Sorry? What? WTF? Ka-Boom!

This is a fundamental difference between programming in a “stateful” way, and distributed programming.

Understanding statelessness, its implications, and how to manage state in a stateless environment, is a core skill of a web developer. Without understanding the implications of various state management systems, you simply cannot develop large scale applications. In a web application there is ALWAYS a client, and a server, and a potentially lengthy request/response message exchange between them.
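As a sketch, the stateless teller conversation above might look like this as a message contract. The types are illustrative, not a real API; the point is that each request carries everything the server needs.

public record BalanceRequest(string AccountNumber);
public record BalanceResponse(decimal Balance);

public record TransferRequest(
    string FromAccountNumber,
    string ToAccountNumber,
    decimal Amount,
    bool ReceiptRequired);
public record TransferResponse(string ReceiptId);

public interface ITellerService
{
    // Any server instance can answer either message, in any order;
    // restart or replace the server between calls and nobody notices.
    BalanceResponse GetBalance(BalanceRequest request);
    TransferResponse Transfer(TransferRequest request);
}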

Most programmers have a fairly firm grip on Object-Oriented programming, and one of the keys to this is the packaging of state and behaviour. As a result, they get locked in to a system of thinking that says state and behaviour should always go together, hence the popularity of technologies such as CORBA, DCOM, and Remoting. These technologies make it easy for people who are used to dealing with "objects" to deal with these without worrying directly about the messaging process that goes on behind the scenes.

However, you often hear complaints about the scalability of these types of remote object technologies. The reason for this often involves assumptions about state. Since remoting interfaces are often automatically generated by whatever remoting framework you are using, and the objects used are very often deliberately designed to be re-usable (although not necessarily performant) in both distributed and non-distributed environments, you end up with chatty, context-bound, fine-grained interfaces, which have a hard time keeping up when you have a large number of conversations going on at once.

Ceteris paribus, if you imagine the two conversations above in a remote system where each message exchange took a second, the first would take 7 seconds, the second 2.

What's also interesting is that the two message exchanges in the second conversation can be made either way round or, if required, in parallel. This of course works only if you are able to make some assumptions regarding the availability of cash to you, for example, that you have an overdraft facility of $1,000,000 which you are currently not using. I will come back to this when I discuss Concurrency and Data Consistency, but it is perfectly reasonable to run these operations in parallel. This, along with the rapid growth in multi-core and cloud computing, is one reason for the recent resurgence of interest in functional/stateless programming languages such as Erlang and F#.

The important thing to remember about statelessness is that it is all about maintaining a high degree of disposability. You should be able to throw everything away between message exchanges without a user noticing a server restart. Doing this will enable your application to handle server restarts and changes in network topology gracefully.

Next up: Atomicity.

Monday 29 June 2009

If you're not confused, you're not paying attention

In the last post we looked at the myths of infinite latency and bandwidth. I'd like quickly to cover some of the other fallacies before coming back to these, and how they result in some fundamental architectural constraints for distributed systems.

So let’s quickly check off a couple of these fallacies and their solutions.

The network is secure - It's not. The internet is a public network. Anything sent over it by default is equivalent to writing postcards. You wouldn't write your bank account details on a post card. Don't send them openly over the web.

Solution: Secure Sockets Layer (SSL), the famous “padlock” icon in your browser. Now my postcard is locked in a postcard sized safe. It can still go missing, but no-one can open it without the VERY long secret combination.

There is one administrator. - There is not. There are lots of people, all speaking different languages, with different skills, running different platforms, with entirely different needs, requirements and implementations. The people deploying code to the web are often entirely separate from those responsible for keeping that code running; often these groups have nothing to do with each other.

Solution: When designing software for this type of environment, you should make absolutely no assumptions about access to or availability of particular resources. Always write code using a "lowest common denominator" approach. Try to ensure that when your system must fail or degrade, it does so gracefully. I will come back to this in a future post.

These problems resulting from the heterogeneity of distributed systems are mostly solved or worked around by the adoption of the various web standards. The point of the standards is that messages can be exchanged without the requirement for a specific platform, or all the baggage that goes along with that.

Right... take a deep breath. Some of the jargon used in the next couple of paragraphs may cause eye/brain-strain, but do not fear! All will become clear, hopefully in 6 posts’ time. The next 3 posts will cover what I would regard as the most important considerations in building a large web based system, and are directly related to the remaining fallacies.

Fallacy → Consideration/Constraint
  • Network topology doesn't change. → Statelessness
  • Transport cost is zero. → Atomic/Transactionless Interfaces
  • The network is reliable. → Idempotence (yes, really!)


Following that, I’ll discuss some of the characteristics and issues that emerge from systems built within these constraints: specifically, Concurrency, Horizontal Scalability and Distributed Data Consistency (or inconsistency, as more accurately describes the issue).


So next time... Statelessness.

Wednesday 17 June 2009

Faster than a speeding bullet

Over the next couple of posts I'm intending to go in to just a little more depth about the fallacies of network computing I mentioned last time. During the past week I have been doing some work using Amazon’s Web Services Platform to speed up one of our websites.

While demonstrating the delivery speed gained by using Amazon’s CloudFront web service, I encountered a surprising (to me at least) reaction. Though impressed by the visible improvement, many non-web developers were raising points such as, "Why does where our site is hosted matter? I thought the web is globally available", "I thought we were paying for unlimited bandwidth with our ISP?"...

Well, this reaction fell straight in to the path of fallacies number 2 and 3!

Latency is zero.
Bandwidth is infinite.

WRONG!!!

Many people assume that bandwidth and latency are the same thing. They're not, and the common metaphor is a system of roads: the number of lanes on the road is equivalent to its bandwidth, and its length is equivalent to its latency.

A high-bandwidth connection usually has lower latency, since it has less chance to get congested at peak times. However, latency and bandwidth are NOT directly related. They normally go together (i.e. high bandwidth usually means better ping times), but they don't have to.

Even if we could buy "unlimited" bandwidth from our supplier, then if our website is served up from a single location, no matter how fast the connection, it will still suffer from the effects of network latency/lag when viewed at a global level.

It's all to do with the speed of light. No, really! It's Physics time. Hooray!

With the exception of some new-fangled weird physics theories, as far as normal physics goes, the speed of light is constant. Nothing can travel faster than it, and this "nothing" rule therefore applies to the messages you send around the internet.

Light goes very "fastly" (as my 2 year old son says) - about 300K km/second. This is roughly a million times faster than sound, and fast enough to zoom around the Earth more than 7 times in one second. This sounds really quick, and it is for day to day things like phoning someone (or using smoke signals); you wouldn't perceive it. However, over distances of thousands of miles, the assumption that you never need to worry about the speed starts to break down.

You can sometimes see this effect if you have satellite TV and the same programme is being broadcast on both terrestrial and satellite channels. If you flick between the channels you'll see the satellite picture arrives a moment later than the one broadcast from the ground.

Broadcasting uses "unreliable" protocols, so if messages are held up, they are assumed missing in action and your signal/picture will break up. TCP/IP however is a "reliable" protocol, so if messages are held up, you have to wait for them to arrive, and your experience is that your signal will appear to slow down.

If you're a computer trying to send a message to the other side of the world, it takes about 100 milliseconds to "ping" a computer in Australia from here in the UK (assuming your message travels at the speed of light).
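That figure is just the physics floor. A quick back-of-the-envelope check shows where it comes from (distances are rough, and real packets travel slower than light in a vacuum, over routes that are never straight lines):

using System;

class PingPhysics
{
    static void Main()
    {
        const double speedOfLightKmPerSec = 300_000;   // vacuum, roughly
        const double ukToAustraliaKm = 17_000;         // very roughly

        double roundTripSec = (2 * ukToAustraliaKm) / speedOfLightKmPerSec;
        Console.WriteLine($"Theoretical minimum ping: {roundTripSec * 1000:F0} ms");
        // Prints ~113 ms - and that's before routers, congestion and the
        // slower speed of light in fibre add their share.
    }
}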

Try this!... Ping a well known server in Australia:

> ping wwf.org.au

...and a well know server here in the UK:

> ping direct.gov.uk

Depending upon where you are located, you'll probably see very different times. I'm seeing ping times of between 340 and 360 ms to Australia, and just 14 to 21 ms pinging locally (from Scotland to London).

So what does this mean to me as a developer? Well if my users are in London and my site is hosted in Australia, it's going to take a long time to load.

The most common way to mitigate this problem is to use the smallest possible message size, compress your graphics, and so on. However, for rich content like large flash or silverlight files, or streaming video, this is often impossible. An alternative is to put your content as close as possible, in network terms, to your end user.

Whatever your application does, there are some important design decisions to be made in terms of how to partition, deploy and manage it to ensure an optimum experience for end users.

The most common design practices to work around these 2 assumptions are...

1. Optimise your message sizes carefully.
2. Compress any large content.
3. Make a clear separation between types of content, so that your application can be partitioned either logically or geographically if necessary.

At a basic level you might separate static vs. dynamically generated content, but could also be partitioning based on expected usage, data size, or resource locale, for example.

Whatever you need to take in to account for your specific application, the one rule to rule them all is
"Never assume that the network is fast enough by default."

Friday 5 June 2009

The limits of my language are the limits of my world

In my last couple of posts, I described the web and how it is built. The reason for this is to introduce the web to non-web developers - primarily Windows ones, as Windows developers "see" the world differently from web developers. This is as a result of the underlying operating environment.

I am about to make some massive generalisations for the purposes of illustrative clarity.

Windows developers tend to see things in terms of "objects", "containers" and "references" which all communicate via "events" that are raised and responded to. Web developers tend to see things in terms of "resources", "relations" and "links" which all communicate via "messages" that are sent and received.

Windows developers "create new objects" where web developers "post resources". This may all be argued to be a case of "You say tomato", but they represent fundamental differences between traditional Windows development and web development.

An "event" is an abstraction that performs poorly on the web. A windows developer new to web development will see an event in a way that is appropriate to windows development, but if you deal with events in a web application the same way you deal with events in a windows application, you will run in to trouble. However, If you see don't see an event, but a pair of messages, you can store, edit, copy, cache and queue a message. If you think of things in terms of events, it's difficult conceptually to get your head around the idea of a "cached event".

This mismatch of understanding between non-distributed and distributed development results in the famous "Fallacies of Network Computing".

These fallacies are a list of flawed assumptions made by programmers when first developing web, or in fact any distributed, applications.

They are...

1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.

Over the next few posts I will explain these incorrect assumptions, the problems they have caused (and continue to cause) for people new to web development, and the considerations, patterns and constructs that should be taken into account to work around or overcome them.

Monday 1 June 2009

The journey of 1000 miles (part 2)

In the previous post, we looked briefly at the protocols that make the web work. A browser starts by making an HTTP request. The server's name is resolved to an address by DNS, the request is sent over the network via routers to its destination, and the destination produces a response message and sends that back.

Now we know how messages are sent around the web, what kind of things are we able to send? Well... Enter HYPERTEXT! (Wow!). The web was designed to send "Hypertext" around. But what is Hypertext? Essentially, it's a body of text with an in-place reference to another body of text. If you ever had one of those adventure game books as a child (perhaps giving away my long-held nerdiness there), you've used a form of hypertext: "If you want to open the treasure chest, go to page 134; if not, go to page 14."

So, this is not necessarily a computery idea. In fact, the idea of hypertext is generally agreed to originate with an engineer with the best name in computing history... Vannevar Bush, way back in 1945. Whilst there are some very well-known implementations of hypertext (Adobe PDF and Microsoft's original Encarta, for example), it was the creation of the web in the early 1990s that eventually led to wide-scale adoption of hypertext.

The core language of the web, HTML (Hypertext Markup Language), provides a way of "marking up" text so that it can contain references to other bodies of text. It does this using a mechanism called a "hyperlink", which describes the reference to another piece of text.

For example...I write a document and say in it...

The Vancouver Sun's statement that "The 2011 movie TRON 2 could become the most expensive film ever made with a reported budget of $300 million" has been debunked.

I can augment this information by pointing at references, without breaking the flow of the text. I do this using an HTML "element" (a label with angle brackets around it) to point at a "hypertext reference". HTML elements can contain "attributes", which are values associated with the elements.

So for example... The "A" (anchor) element has an "href" (hypertext reference) attribute.
The basic format of this is...

The Vancouver Sun's statement that "The 2011 movie TRON 2 could become the most expensive film ever made with a reported budget of $300 million" has been <a href="http://www.slashfilm.com/2009/04/13/">debunked</a>

A web browser knows not to display the anchor element directly, but to underline the contained text and allow the user to "click" on it. The browser knows that when the user clicks, it should GET the document specified in the href attribute.
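In other words, when the user clicks that link, the browser sends a request along these lines (a sketch; the Host header tells the server which site the request is for):

GET /2009/04/13/ HTTP/1.1
Host: www.slashfilm.com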

This Request/Response messaging pattern is core not only to the web but to the underlying TCP protocol over which the web (HTTP) runs.

Unlike the internet protocols used for something like Skype or RealPlayer, for a web browser to receive any information, it must first ask for it.

EVERYTHING you do on the web is shaped by this architectural constraint, and it is one of the most important factors affecting the nature of the web and the design of applications that run on it.
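You can see the "ask first" principle at the lowest level in this minimal Python sketch: it opens a connection to a web server, and nothing at all arrives until the request has been sent.

import socket

# open a TCP connection to a web server...
s = socket.create_connection(("www.example.com", 80))

# ...nothing arrives until we ask for it
s.sendall(b"GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n")

# the response exists only because we requested it
print(s.recv(4096).decode("utf-8", errors="replace"))
s.close()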

This system of documents, links, requests and responses provides the fundamental application platform on which every single web application is built.

Interestingly, the bright sparks who designed HTTP decided that whilst HTML could provide a natural format for hypertext documents, HTTP should not REQUIRE documents to be in HTML format. They can in fact be in any format, including many that you'd maybe not think of as even being documents... text, images, videos, data and files containing programming code.
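How does the browser know what it has been sent? The server declares the format of each document in the response's Content-Type header. For example (a sketch of just the relevant lines):

HTTP/1.1 200 OK
Content-Type: text/html

...for an HTML page, or...

HTTP/1.1 200 OK
Content-Type: image/gif

...for an image.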

Not only can documents be "linked", they can also be "embedded". So, for example... using HTML you can embed an image with the IMG "tag" (which has a "src" attribute). Instead of forcing the end user to request the resource explicitly, the browser will fetch it automatically and render it in place where the tag appears.

so this....

<img src="http://pbskids.kids.us/images/sub-square-barney.gif" />

...will render as...

[the Barney image, embedded inline in the page]

GOSH!

Interesting historical note: the ability to "embed" was actually opposed by Tim Berners-Lee (inventor of the web), presumably because he'd been "barneyed". IMG started life as a custom tag used only by Marc Andreessen's "Mosaic" browser - the pre-pre-precursor to today's Firefox - and it's certainly my view that if IMG had not made it into HTML, you'd not be reading this now.

In addition to linking and embedding documents, HTML has all sorts of tags for formatting and describing documents.
There are about 100 tags in version 4 of HTML (HTML5 will introduce roughly another 25-30), which enable all kinds of display, embedding and linking of document elements.

If you view the source code of this page, you will see a huge collection of tags. This is the "markup" used to describe how this page should be displayed.

Of course, complex designs necessitate complex descriptions, so you'll witness some fairly complicated-looking HTML out there in the wild. There is no fast way to learn all these tags, but as long as you understand the concepts of documents, elements and attributes, you should be able to build on this.

What I've tried to do over the last two posts is get everyone to a common definition of the web. So, we all now understand that the web "is made of"...


  • DNS - The Domain Name System and protocol. Used to convert server names like "www.myserver.com" into IP addresses like 127.0.0.1

  • TCP/IP - The Transmission Control Protocol / Internet Protocol suite - Used to send messages around the network

  • HTTP - The Hypertext Transfer Protocol - Used to specify the purpose of a message sent over TCP/IP

  • HTML - Hypertext Markup Language - Used to define the messages sent.

The web is centred around the "Request/Response" messaging pattern and the linking and embedding of resources.

Cool. So now we're all on the same page. Onwards and upwards! Only 998 miles to go!

Monday 18 May 2009

The journey of 1000 miles

Right. First things first. Before you can have any reasonable discussion about how best to design applications for the web/cloud, you have to define what the web/cloud actually is, so I'll start with the web, as that's how most things in the world now talk to each other.

Everyone knows what the web is. You type in www.google.com, and enter "lolcats". Your browser connects to Google’s servers, sends "lolcats" as a search query, and Google searches its database and sends you lots of stuff back about, well erm... "cats with unusual grammar". Simple enough. However, as you may suspect, getting a list of cats (or whatever) to a screen thousands of miles away is more complicated than it may at first appear.

To see how this works in practice, we'll start with a simple example: fetching my first blog post. The commonly understood model of the web is that you connect to a server and download the requested information. This is a useful abstraction, but not entirely accurate.

The reality is more complicated, and involves a number of layers. Each layer builds upon the one below it. Each passes messages to the next layer via progressively more abstract protocols. So, the basic process works like this...

An end user using a browser asks for a resource in the form of a web address - a URL (Uniform Resource Locator)
http://scalethis.blogspot.com/2009/05/hello-world.html

This URL specifies the protocol (http://), domain (scalethis.blogspot.com) and resource (/2009/05/hello-world.html) requested.
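If you want to see this split programmatically, Python's standard library will pick a URL apart for you (a minimal sketch):

from urllib.parse import urlparse

url = urlparse("http://scalethis.blogspot.com/2009/05/hello-world.html")
print(url.scheme)  # 'http'
print(url.netloc)  # 'scalethis.blogspot.com'
print(url.path)    # '/2009/05/hello-world.html'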

We can pack this information up in an HTTP (Hypertext Transfer Protocol) request message which looks like this...

GET /2009/05/hello-world.html HTTP/1.1
Host: scalethis.blogspot.com

Your browser then needs to find a machine that is capable of dealing with this message. To do this, it uses another Internet system called DNS (Domain Name System) to translate the domain into an actual machine to send the message to. This works like a telephone directory lookup. DNS finds that the name "scalethis.blogspot.com" is associated with the actual IP (Internet Protocol) address, 209.85.227.191.

You can see how this works by using "ping" from your command line.

C:\>ping scalethis.blogspot.com
Pinging blogspot.l.google.com [209.85.227.191] with 32 bytes of data
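The same lookup is available programmatically through the operating system's resolver; a one-line sketch in Python (the address returned will vary as Google moves things around):

import socket

# ask DNS, via the OS resolver, for the address behind the name
print(socket.gethostbyname("scalethis.blogspot.com"))  # e.g. 209.85.227.191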


Now the browser knows...
what we are looking for (/2009/05/hello-world.html)
from where (209.85.227.191)
and how to ask for it (http)

Now, unlike some other networks, the internet's big trick is that - despite appearing as if you connect to a remote machine - the underlying IP protocol is in reality "connectionless" (TCP merely creates the illusion of a connection on top of it). Instead of connecting directly to the remote machine, your computer basically packages up your request in the form of a message and writes an address on it: "Please send this to 209.85.227.191". It then sends this on to its nearest router, which forwards it on to another router, and another... until it reaches its destination.

You can see how this works by using "tracert" from your command line:

C:\>tracert scalethis.blogspot.com

Tracing route to blogspot.l.google.com [209.85.227.191]
over a maximum of 30 hops:
1 10.0.0.1
2 195.224.48.153
3 195.224.185.40
4 62.72.142.5
5 62.72.137.9
6 62.72.139.118
7 209.85.255.175
8 66.249.95.170
9 72.14.236.191
10 209.85.243.101
11 209.85.227.191


Here you can see all the machines through which your message has passed before finally reaching 209.85.227.191, where scalethis.blogspot.com can be found.

The clever part of this is that if one of the machines in the middle is suddenly unavailable, by way of either nuclear war or coffee spillage, the previous router can simply send the message to another router and so navigate around the problem, much in the same way that your SatNav would re-route you around Birmingham at rush hour. All this business of finding the shortest path and routing around traffic blackspots is a bit of rocket-science handled by various routing protocols, but we'll save that for another day.

So now your message has reached 209.85.227.191! Hooray!

Now, what to do with it? Well, the server knows it's an HTTP message, which is a good thing, because 209.85.227.191 is a web server and knows how to understand the message...

GET /2009/05/hello-world.html HTTP/1.1
Host: scalethis.blogspot.com

It can see that you're asking to "GET" /2009/05/hello-world.html. "GET" is only one of a number of HTTP "verbs", some of which I'll describe in my next post. For now, all we need is a reply. The HTTP server knows that the resource "/2009/05/hello-world.html" is held physically at "F:\Users\Temp\Backup\PleaseDontDeleteThis\ScaleThis\2009\05\hello-world.html", loads it up, packages it into a response message, and sends it back using the same forwarding technique.
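That response is itself just another HTTP message, roughly along these lines (a sketch - real responses carry rather more headers, and the body is elided here):

HTTP/1.1 200 OK
Content-Type: text/html

<html> ...the post's markup... </html>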

Your browser reads this message, which contains HTML - like text, but with lots of angle brackets - and displays it to you in a nicely formatted way! Yippee!

And it does all of this within a second or two (unless you're using AOL ;-)).

I've deliberately avoided talking in any detail about the higher level languages of the web (HTML, XML, SOAP, CSS, ECMA/JavaScript etc.) and what I'd refer to as the "overweb" made up of plug-ins (Flash/Silverlight, Java Applets, RealPlayer etc.), as I'll be discussing these quite a lot in the future. So for now the key takeaways are...

  • The web is a massive global network which uses messages to send information between computers, using routers.
  • These messages are all in standard formats (protocols) so that any software or hardware that sticks to those standards can understand them.
  • There is a fair amount of communication required to co-ordinate delivery of these messages, so the internet can be slow compared to networks that require a direct connection, such as traditional telephone networks. However, this co-ordinated exchange means that the web is by its nature flexible, reliable and highly resilient to change.

If you're a developer, you should get an overall idea of how the protocols work. You don't necessarily need to understand the syntax of the SYN/ACK handshake in TCP, but you should at least know what the major protocols are, what they are for, and have an understanding of how they work, at least at a Wikipedia level. Your starting point can be found here...

http://en.wikipedia.org/wiki/Internet_Protocol_Suite

but the ones of particular concern to the functioning of the web are DNS, TCP, IP and HTTP. Whilst not strictly part of the internet protocol suite, you should have a read up on routing protocols too, particularly the Border Gateway Protocol (BGP).

Next time, I'll take a closer look at HTTP and some basic HTML/XML, and that should give us a reasonable common frame of reference on which we can build.

Thursday 14 May 2009

Hello World

Welcome to my blog. There are plenty of software blogs around (according to one colleague, the world's first write-only medium), so why add yet another one? Well, I have a specific target audience: the people within my own company... although, as I am talking about things that may be of interest to the wider community, I decided to post on the web.

I work as a developer for a relatively small software company that is starting to get to grips with the ideas of web applications, Software as a Service and the currently trendy term "Cloud Computing".

This company is a Microsoft partner, selling primarily a client-server style Windows Forms-based application, so even some of our most experienced developers don't really have much experience of the more esoteric subjects involved in web development. The environment, culture, architectural style and languages used on the web are quite different from the (traditionally) more centrally planned monoculture of enterprise software.

Later this year our company will begin exploring what we're referring to internally as "vNext", which is (possibly) the next major version of our core product (or possibly a new product or collection of products). The redpills amongst us know (although not all of us are entirely comfortable with the fact) that this really needs to be a scalable web application, or at the very least an application with web architecture at its core. Those not yet convinced have increasingly flimsy reasons for keeping our core product as a client/server style application.

My aim is this: if our company is moving in this direction (as I believe it has to), then as a team we all need a better understanding of the nature of the web and the services that run on it - not only from a technical perspective, but from business, economic and environmental ones too - if we are to thrive in the future.

Over the coming weeks/months, I will be writing about all kinds of fascinating topics :-) but with the main aim of clarifying some of the concepts, patterns, practices, languages, dialects, rituals and sacrifices involved in producing large scale web applications.

My overall plan is to start with a general background to the Web, then to look at the principles, general constraints and architecture of distributed applications. I can then start to look at specific design patterns and practices that can be adopted to create web scale services and applications.

I've not written a blog before so any feedback would be greatly appreciated. (Particularly if you fundamentally disagree with anything I'm saying!)

Thanks for reading,

Chris.