I/O Performance (no longer) Sucks in the Cloud

Many workloads have high I/O rate data stores at the core. The success of the entire application is dependent upon a few servers running MySQL, Oracle, SQL Server, MongoDB, Cassandra, or some other central database.

The best design patter for any highly reliable and scalable application whether on-premise or in cloud hosted, is to shard the database. You can’t be dependent upon a single server being able to scale sufficiently to hold the entire workload. Theoretically, that’s the solution and all workloads should run well on a sufficiently large fleet even if that fleet has a low individual server I/O performance. Unfortunately, few workloads scale as badly as database workloads. Even scalable systems such as MongoDB or Cassandra need to have a per-server I/O rate that meets some minimum bar to host the workload cost effectively with stable I/O performance.

The easy solution is to depend upon a hosted service like DynamoDB that can transparently scale to order 10^6 transactions per second and deliver low jitter performance. For many workloads, that is the final answer. Take the complexity of configuring and administering a scalable database and give it to a team that focuses on nothing else 24×7 and does it well.

Unfortunately, in the database world, One Size Does Not Fit All. DynamoDB is a great solution for some workloads but many workloads are written to different stores or depend upon features not offered in DynamoDB. What if you have an application written to run on sharded Oracle (or MySQL) servers and each database requires 10s of thousands of I/Os per second? For years, this has been the prototypical “difficult to host in the cloud” workload. All servers in the application are perfect for the cloud but the overall application won’t run unless the central database server can support the workload.

Consequently, these workloads have been difficult to host on the major cloud services. They are difficult to scale out to avoid needing very high single node I/O performance and they won’t yield a good customer experience unless the database has the aggregate IOPS needed.

Yesterday an ideal EC2 instance type was announced. It’s the screamer needed by these workloads. The new EC2 High I/O Instance type is a born database machine. Whether you are running Relational or NoSQL, if the workload is I/O intense and difficult to cost effectively scale-out without bound, this instance type is the solution. It will deliver a booming 120,000 4k reads per second and between 10,000 and 85,000 4k random writes per second. The new instance type:

· 60.5 GB of memory

· 35 EC2 Compute Units (8 virtual cores with 4.4 EC2 Compute Units each)

· 2 SSD-based volumes each with 1024 GB of instance storage

· 64-bit platform

· I/O Performance: 10 Gigabit Ethernet

· API name: hi1.4xlarge

If you have a difficult to host I/O intensive workload, EC2 has the answer for you. 120,000 read IOPS and 10,000 to 85,000 write IOPS for $3.10/hour Linux on demand or $3.58/hour Windows on demand. Because these I/O workloads are seldom scaled up and down in real time, the Heavy Utilization Reserved instance is a good choice where the server capacity can be reserved for $10,960 for a three year term and usage is $0.482/hour.

· Amazon EC2 detail page: http://aws.amazon.com/ec2/

· Amazon EC2 pricing page: http://aws.amazon.com/ec2/#pricing

· Amazon EC2 Instance Types: http://aws.amazon.com/ec2/instance-types/

· Amazon Web Services Blog: http://aws.typepad.com/aws/2012/07/new-high-io-ec2-instance-type-hi14xlarge.html

· Werner Vogels: http://www.allthingsdistributed.com/2012/07/high-performace-io-instance-amazon-ec2.html

Adrian Cockcroft of Netflix wrote an excellent blog on this instance type where he gave benchmarking results from Netflix: Benchmarking High Performance I/O with SSD for Cassandra on AWS.

You can now have 100k IOPS for $3.10/hour.

–jrh

James Hamilton
e: jrh@mvdirona.com
w: http://www.mvdirona.com
b: http://blog.mvdirona.com / http://perspectives.mvdirona.com

Services

Perspectives

I/O Performance (no longer) Sucks in the Cloud

Leave a Reply Cancel reply