This morning I came across an article written by Sid Anand, an architect at Netflix that is super interesting. I liked it for two reasons: 1) it talks about the move of substantial portions of a high-scale web site to the cloud, some of how it was done, and why it was done, and 2) its gives best practices on AWS SimpleDB usage.
I love articles about how high scale systems work. Some past postings:
- FriendFeed use of MySQL
- Facebook Cassandra Architecture and Design
- Wikipedia Architecture
- MySpace Architecture and .Net
- Flickr DB Architecture
- Geo-Replication at Facebook
- Scaling at LucasFilms
- Facebook: Needle in a Haystack: Efficient Storage of Billions of Photos
- Scaling LinkedIn
- Scaling at MySpace
The article starts off by explaining why Netflix decided to move their infrastructure to the cloud:
Circa late 2008, Netflix had a single data center. This single data center raised a few concerns. As a single-point-of-failure, it represented a liability – data center outages meant interruptions to service and negative customer impact. Additionally, with growth in both streaming adoption and subscription levels, Netflix would soon outgrow this data center – we foresaw an imminent need for more power, better cooling, more space, and more hardware.
Our option was to build more data centers. Aside from high upfront costs, this endeavor would likely tie up key engineering resources in data center scale out activities, making them unavailable for new product initiatives. Additionally, we recognized the management of multiple data centers to be a complex task. Building out and managing multiple data centers seemed a risky distraction.
Rather than embarking on this path, we chose a more radical one. We decided to leverage one of the leading IAAS offerings at the time, Amazon Web Services. With multiple, large data centers already in operation and multiple levels of redundancy in various web services (e.g. S3 and SimpleDB), AWS promised better availability and scalability in a relatively short amount of time.
By migrating various network and back-end operations to a 3rd party cloud provider, Netflix chose to focus on its core competency: to deliver movies and TV shows.
I’ve read considerable speculation over the years on the difficulty of moving to cloud services. Some I agree with – these migrations do take engineering investment – while other reports seem to less well thought through focusing mostly on repeating concerns speculated upon by others. Often, the information content is light.
I know the move to the cloud can be done and is done frequently because, where I work, I’m lucky enough to see it happen every day. But the Netflix example is particularly interesting in that 1) Netflix is a fairly big enterprise with a market capitalization of $7.83B – moving this infrastructure is substantial and represents considerable complexity. It is a great example of what can be done; 2) Netflix is profitable and has no economic need to make the change – they made the decision to avoid distraction and stay focused on the innovation that made the company as successful as it is; and 3) they are willing to contribute their experiences back to the community. Thanks to Sid and Netflix for the later.
For more detail, check out the more detailed document that Sid Anand has posted: Netflix’s Transition to High-Availability Storage Systems.
For those SimpleDB readers, here’s a set of SimpleDB best practices from Sid’s write-up:
· Partial or no SQL support. Loosely-speaking, SimpleDB supports a subset of SQL
o Do GROUP BY and JOIN operations in the application layer
o One way to avoid the need for JOINs is to denormalize multiple Oracle tables into a single logical SimpleDB domain.
· No relations between domains
o Compose relations in the application layer
· No transactions
o Use SimpleDB’s Optimistic Concurrency Control API: ConditionalPut and ConditionalDelete
· No Triggers
o Do without
· No PL/SQL
o Do without
· No schema – This is non-obvious. A query for a misspelled attribute name will not fail with an error
o Implement a schema validator in a common data access layer
· No sequences
o Sequences are often used as primary keys
§ In this case, use a naturally occurring unique key. For example, in a Customer Contacts domain, use the customer’s mobile phone number as the item key
§ If no naturally occurring unique key exists, use a UUID
o Sequences are also often used for ordering
§ Use a distributed sequence generator
· No clock operations
o Do without
· No constraints. Specifically, no uniqueness, no foreign key, no referential & no integrity constraints
o Application can check constraints on read and repair data as a side effect. This is known as read-repair. Read repair should use the CondtionalPut or ConditionalDelete API to make atomic changes
If you are interested high sale web sites or cloud computing in general, this one is worth a read: Netflix’s Transition to High-Availability Storage Systems.
–jrh
Update: Netflix architects Adrian Cockcroft and Sid Anand are both presenting at QconSF between November 1st and 5th in San Francisco: http://qconsf.com/sf2010.
b: http://blog.mvdirona.com / http://perspectives.mvdirona.com
I understand your worry on pricing but a few factors should dampen the concerns: 1) Amazon.com has a fairly long history of being a low-price player in the market. Its just part the DNA; 2) AWS has a steady, history of doing nothing but lower prices. Search for "AWS price" and you’ll see a steady series of declines; and 3) if history is reversed, and the price does go up, move the workload to another provider. There are many competitors and its likely there will be more going forward.
I would recommend reflecting upon what services you depend upon today. Most companies use services for a wide variety of important company functions. Have a look at payroll, security, electrical supply, facility lease including the data center (often collocation or leased), telco, and many others. You likely are a heavy user of services today.
Looking at other domains, using services for some non-core business functions seems to lower costs and allow companies to focus on the core value of their respective businesses. If the services is good value today, there is a history of price reductions, and there are competitors, I don’t see the risk.
And what happens when AWS raises prices? Do you just chalk it up to the cost of doing business and force customers to pay. That would seem to invalidate the whole business model of Netflix, which is efficient low cost delivery of popular movies. I wouldn’t put my business in someone else hands. AWS would essentially be another Oracle and we know where that gets you.
From my relational days, average row sizes of operational systems typically run small. A couple of hundred bytes is common. a couple of kilobytes is getting very large for a row. So, 500k per user seems reasonable to me but I sense you have a specific use case in mind. A common approach is to store large objects in S3 and SimpleDB. Drop me a note if you have a specific use case in mind. You know the deal after being around high-scale storage for a long time. Generally, there doesn’t exist a store that is right for all use cases and, as a consequence, we’re seeing more stores emerging. Its a good thing.
–jrh
James Hamilton, jrh@mvdirona.ocm
Okay, now I’m really confused. What’s the maximum number of domains SimpleDB can handle? Is it hundreds? Thousands? Or tens of thousands?
If the maximum is hundreds of domains, as I thought you implied by saying that people have not wanted anywhere near 10k domains, then the maximum size of the data someone can store in SimpleDB is just a few terabytes, and so people like Netflix would have to be very careful to only store a few terabytes, right?
Netflix, for example, with its 10M subscribers could only store 500k per subscriber in SimpleDB if they only have 500 domains, which isn’t a lot of data per user. They could, of course, use SimpleDB as an index into blobs stored in S3, but that is kind of inconvenient to give up row level access to data, so that is what I mean that they have to carefully manage usage of SimpleDB and can’t use it like a real database if it can only hold a few terabytes. Or do I still have this wrong?
I’m not sure what data is leading to that conclusion Greg.
–jrh
Ah, sorry, I think I had a misunderstanding. Sounds like Netflix is carefully managing its storage in SimpleDB to work with the limitations of a database that can only handle a few terabytes. Got it, sorry, my mistake.
Where are you getting the 10k domain requirement Greg? I’ve not heard of anyone wanting anywhere approaching that number of domains. Tell me more about what you have in mind.
SimpleDB is generally not used for bulk storage applications. Some folks store bulk objects in S3 and use SimpleDB as a metadata store for large objects. Its commonly used for structured data, semi-structured data, and as metastore for blobs in S3. Generally, its not the engine I would recommend for very large objects.
James Hamilton
jrh@mvdirona.com
Hi, Adrian and James. Sorry, I might not have been clear. What surprised me is that the limit on SimpleDB domains easily can be as high as 10k+ (so that Netflix could be storing 100T+ in it). My expectation when I saw a 100 domain limit with the ability to have it increased was that I could apply to have it increased by a factor of x2 or x3 to 200 or 300, not that I could easily ask to have it increased by two orders of magnitude or more and that Amazon would be able to offer and support that.
I’m sure other people looked at the 1T limit on SimpleDB and rejected it for petabyte-sized data. If SimpleDB easily can handle data of that size, it might be wise to raise the default limit well beyond 100 domains. And, in general, if SimpleDB can work with petabyte-sized data, you might want to make that more clear, since I doubt that is widely known.
Adrian got there ahead of me and nailed it.
Greg, if you are running at scale and have a need for more than 100 domains, fill out the the "Request to Increase Allocation of Amazon SimpleDB Domains" form at: http://aws.amazon.com/contact-us/simpledb-limit-request/.
James Hamilton
jrh@mvdirona.com
I’d like to point out that Netflix currently runs a pair of redundant datacenters, we added one during 2009 and decided that we didn’t want to add any more. We currently run 60-70% of our total capacity on AWS (depending on how and when you measure it).
We do have over 100 SimpleDB domains in a single account. We have a premium support account, and when we hit an AWS limit we discuss it with Aamazon, just as you would expect with any vendor. We have also worked with Amazon to improve the performance and scalability of SimpleDB, and to optimize the way we use it, to make it a better product for everyone.
Yo Adrian, good to hear from you. Thanks for letting me know that you guys are presenting at QConSF. I’ll update the blog with a pointer to it.
–jrh
jrh@mvdirona.com
Very interesting, James. Thanks for pointing it out!
I take it Netflix has a very big exemption to the 100 domain limit in SimpleDB?
http://aws.amazon.com/simpledb/faqs/#How_much_data_can_I_store
That was a problem for me when I was looking at it in the past. It means a 1T database limit (100 domain shards of max 10G each), which is pretty small for a database, yes? Does Netflix have tens or hundreds of thousands of domains then? And does Amazon have plans to lift that 100 domain limit more broadly?
Thanks James,
Sid’s paper is the basis of his presentation at QConSF on Nov 5th. My own paper at QConSF on Nov 3 gives the big picture overview, and Sid follows with his in depth advice on migrating from a conventional SQL model to a NoSQL model.
We will both be at the Silicon Valley Cloud Computing Meetup this coming Thursday, when I’m presenting a version of the same presentation.
Cheers Adrian