Monte Carlo-ing Your Eventual Consistency Bets
One of the features of not-only-SQL (NoSQL) data storage systems is the concept of eventual consistency (via Wikipedia):
Eventual Consistency… means that given a sufficiently long period of time over which no changes are sent, all updates can be expected to propagate eventually through the system and all the replicas will be consistent.
For those of us coming from a transactional system point of view, eventual consistency can be mind-boggling at first. Data presented in an inconsistent manner is usually seen as a data quality failure, something to be avoided. But in non-transactional systems it’s often worth the trade-off for speed and scalability. Think about your Facebook page for a minute: how bad would it be if one of your friends’ updates was not visible to you at the same time it was visible to someone else, but eventually you’d be able to see that update?
Paul Cannon has a great write-up on using tools to estimate your eventual consistency with Cassandra:
"The best part is that they also provided the world with an interactive demo, which lets you fiddle with N, R, and W, as well as parameters defining your system’s read and write latency distributions, and gives you a nice graph showing what you can expect in terms of consistent reads after a given time.
See the interactive demo here.
This terrific tool actually runs thousands of Monte Carlo simulations per data point (turns out the math to create a full, precise formulaic solution was too hairy) to give a very reliable approximation of consistency for a range of times after a write."
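To make the idea concrete, here is a minimal sketch of what one of those Monte Carlo data points might look like in Python. It is not the code behind the demo. It assumes every network hop has an independent, exponentially distributed latency, and the function name, parameter names and default latencies are all made up for illustration. N is the number of replicas, R is how many replica responses a read waits for, and W is how many acknowledgements a write waits for.

```python
import random


def consistent_read_probability(n, r, w, t_ms, write_mean_ms=5.0,
                                read_mean_ms=2.0, trials=10_000, seed=0):
    """Estimate the chance that a read started t_ms after a write is
    acknowledged returns the new value. Illustrative model only:
    exponential latencies are an assumption, not measured data."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # When the write reaches each replica, and when its ack returns.
        write_arrival = [rng.expovariate(1.0 / write_mean_ms) for _ in range(n)]
        ack_return = [wa + rng.expovariate(1.0 / write_mean_ms)
                      for wa in write_arrival]
        # The write is acknowledged once W acks have come back.
        commit_time = sorted(ack_return)[w - 1]
        read_start = commit_time + t_ms

        responses = []
        for i in range(n):
            read_arrival = read_start + rng.expovariate(1.0 / read_mean_ms)
            response_time = read_arrival + rng.expovariate(1.0 / read_mean_ms)
            # A replica answers with the new value only if the write
            # reached it before the read request did.
            has_new_value = write_arrival[i] <= read_arrival
            responses.append((response_time, has_new_value))

        # The read uses the first R responses to come back.
        responses.sort()
        if any(has_new for _, has_new in responses[:r]):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for t in (0, 5, 10, 20, 50):
        p = consistent_read_probability(n=3, r=1, w=1, t_ms=t)
        print(f"t = {t:>2} ms: estimated P(consistent read) = {p:.3f}")
```

In this toy model, any setting where R + W is greater than N always reads the new value, because the replicas that answer the read must overlap with the ones that acknowledged the write. The interesting bets are the settings where R + W is N or less, and that is where a simulation like this earns its keep.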
Being able to plan your architecture to best fit the business need is what is important, not necessarily data purity at the cost of speed or reliability. Again, that sounds weird to a profession that has focused on fighting to keep data integrity on the radar of management, but the best design decisions are made by balancing cost, benefit and risk. Those of us in the data world need to understand that eventual consistency is often the best solution. Even if it feels weird.
Having tools that help us understand how to best architect the trade-offs is the first step in delivering the right data consistency for what the business needs.