Wednesday, February 24, 2010

I love eventual consistency but there are some applications that are much easier to implement with strong consistency. Many like eventual consistency because it allows us to scale-out nearly without bound but it does come with a cost in programming model complexity. For example, assume your program needs to assign work order numbers uniquely and without gaps in the sequence.  Eventual consistency makes this type of application difficult to write.

 

Applications built upon eventually consistent stores have to be prepared to deal with update anomalies like lost updates. For example, assume there is an update at time T1 where a given attribute is set to 2. Later, at time T2, the same attribute is set to a value of 3. What will the value of this attribute be at a subsequent time T3?  Unfortunately, the answer is we’re not sure. If T1 and T2 are well separated in time, it will almost certainly be 3. But it might be 2. And it is conceivable that it could be some value other than 2 or 3 even if there have been no subsequent updates. Coding to eventual consistency is not the easiest thing in the world. For many applications its fine and, with care, most applications can be written correctly on an eventually consistent model. But it is often more difficult.

 

What I’ve learned over the years is that strong consistency, if done well, can scale to very high levels. The trick is implementing it well. The naïve approach to achieve full consistency is to route all updates through a single master server but clearly this won’t scale. Instead divide the update space into a large number of partitions, each with its own master. That scales but there is still a tension between the number of partitions and the cost of maintaining many partitions and avoiding hot spots.  The obvious way to avoid hot spots is to use a large number of partitions but this increases partition management overhead.  The right answer is to be able to dynamically repartition to maintain a sufficient number of partitions and to be able to adapt to load increases on any single server by further spreading the update load.

 

There are many approaches to support dynamic hot sport management. One is to divide the workload into 10 to 100x more partitions than expected servers and make these fixed-sized partitions be the unit of migration. Servers with hot partitions will end up serving less partitions while servers with cold partitions will manage more. The other class of approaches, is to dynamically repartition. Start with large partitions and divide hot partitions to multiple smaller partitions to spread the load over multiple servers.

 

There are many variants of these techniques with different advantages and disadvantages. The constant is that full consistency is more affordable than many think. Clearly, eventual consistency remains a very good thing for workloads that don’t need full consistency and for workloads where the overhead of the above techniques is determined to be unaffordable. Both higher consistency models are quite useful.

 

This morning SimpleDB announced support for two new features that make it much easier to write many classes of applications: 1) consistent Reads, 2) Conditional put and delete.  Consistent reads allows applications that need full consistency to be easily written against SimpleDB. So, for example, if you wanted to implement an inventory management system that didn’t lose parts in the warehouse, doesn’t sell components twice, or place multiple orders, it would now be trivial to write this application against SimpleDB using the consistent read support. Consistent read is implemented as an optional Boolean flag on SimpleDB GetAttributes or select statements. Absence of the flag continues to deliver the familiar eventually consistent behavior with which many of you are very familiar with. If the flag is present and set, you get strong consistency. 

 

SimpleDB conditional PutAttributes and DeleteAttributes are a related feature that makes it much easier to write applications where the new value of an attribute are functionally related to the old value. Conditional update support allows a programmer to read the value of an attribute, operate upon it, and then write it back only if the value hasn’t changed in the interim which would render the planned update invalid. For example, say you were implementing a counter (+1). If the value of the counter at time T0 was 0, and subsequently an increment was applied at time T1 and another at increment was applied at time T2, what is the value of the counter? Using eventual consistency and, for simplicity, assuming no concurrent updates, the resulting value is probably is 2. Unfortunately, the value might be 1. With conditional updates, it will be 2.  Again, conditional puts and deletes are just another great tool to help write correct SimpleDB applications quickly and efficiently.

 

For more information on consistent reads and conditional put and delete, see SimpleDB Consistency Enhancements.

 

These two SimpleDB features have been in the works for some time and so it is exciting to see them announced and available today. It’s great to now be able to talk about these features publically. If you are interested in giving them a try, you can for free.  There is no charge for SimpleDB use for database sizes under 1GB (and silly close to free above that level). Go for it.

 

                                                                                --jrh

 

James Hamilton

e: jrh@mvdirona.com

w: http://www.mvdirona.com

b: http://blog.mvdirona.com / http://perspectives.mvdirona.com

 

Wednesday, February 24, 2010 3:17:57 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2] - Trackback
Services
Saturday, February 27, 2010 10:53:19 AM (Pacific Standard Time, UTC-08:00)
That looks really neat and I imagine getting all the pieces in place was no simple undertaking. Congratulations!
Sunday, February 28, 2010 6:39:52 AM (Pacific Standard Time, UTC-08:00)
Your totally right Andrew. Its only two features in the programming model but the implementation work was substantial on these two.

jrh@mvdirona.com
Comments are closed.

Disclaimer: The opinions expressed here are my own and do not necessarily represent those of current or past employers.

Archive
<February 2010>
SunMonTueWedThuFriSat
31123456
78910111213
14151617181920
21222324252627
28123456
78910111213

Categories
This Blog
Member Login
All Content © 2014, James Hamilton
Theme created by Christoph De Baene / Modified 2007.10.28 by James Hamilton