Wednesday, August 01, 2012

In I/O Performance (no longer) Sucks in the Cloud, I said

 

Many workloads have high I/O rate data stores at the core. The success of the entire application is dependent upon a few servers running MySQL, Oracle, SQL Server, MongoDB, Cassandra, or some other central database.

 

Last week a new Amazon Elastic Compute Cloud (EC2) instance type based upon SSDs was announced that delivers 120k reads per second and 10k to 85k writes per second. This instance type with direct attached SSDs is an incredible I/O machine ideal for database workloads, but most database workloads run on virtual storage today. The administrative and operational advantages of virtual storage are many. You can allocate more storage with a call of an API. Blocks are redundantly stored on multiple servers. It’s easy to checkpoint to S3. Server failures don’t impact storage availability.

 

The AWS virtual block storage solution is the Elastic Block Store (EBS).  Earlier today two key features were released to support high performance databases and other random I/O intensive workloads on EBS. The key observation is that these random I/O-intensive workloads need to have IOPS available whenever they are needed. When a database runs slowly, the entire application runs poorly. Best effort is not enough and competing for resources with other workloads doesn’t work. When high I/O rates are needed, they are needed immediately and must be there reliably.

 

Perhaps the best way to understand the two new features is to look at how demanding database workloads are often hosted on-premise. Typically large servers are used so the memory and CPU resources are available when needed. Because a high performance storage system is needed and because it is important to be able to scale the storage capacity and I/O rates during the life of the application, direct attached disk isn’t the common choice. Most enterprise customers put these workloads on Storage Area Network devices which are typically connected to the server by a Fiber Channel network (a private communication channel used only for storage).

 

The aim of the announcement today is to take some of what has been learned from 30+ years of on-premise storage evolution. Customers want virtualized storage but, at the same time, they need the ability to reserve resources for demanding workloads. In this announcement, we take some of the best aspects what has emerged in on-premise storage solutions and  give EC2 customers the ability to scale high-performance storage as needed, reserve and scale the available I/Os per Second (IOPS) as needed, and reserve dedicated network bandwidth to the storage device. The latter is perhaps the most important and the combination allows workloads to reserve both the IOPS rates at the storage as well as the network channel to get to the storage and be assured it will be there when they need it.

 

The storage, IOPS, and network capacity is there even if you haven’t used it recently. It’s there even if your neighbors are also busy using their respective reservations. It’s even there if you are running full networking traffic load to the EC2 instance. Just as when an on-premise customer allocates a SAN volume with a Fiber Channel attach that doesn’t compete with other network traffic, allocated resources stay reserved and they stay available. Let’s look at the two features that deliver a low-jitter, virtual SAN solution in AWS.


Provisioned IOPS is a feature of Amazon Elastic Block Store. EBS has always allowed customers to allocate storage volumes of the size they need and to attach these virtual volumes to their EC2 instances. Provisioned IOPS allows customers to declare the I/O rate they need the volumes to be able to deliver, up to 1,000 I/Os per second (IOPS) per volume. Volumes can be striped together to achieve reliable, low-latency virtual volumes of 20,000 IOPS or more. The ability to reliably configure and reserve over 10,000 IOPS means the vast majority of database workloads can be supported. And, in the near future, this limit will be raised allowing increasingly demanding workloads to be hosted on EC2 using EBS.

 

EBS-Optimized EC2 instances are a feature of EC2 that is the virtual equivalent of installing a dedicated network channel to storage. Depending upon the instance type, 500 Mbps up to a full 1Gbps are allocated and dedicated for storage use only. This storage communications channel is in addition to the network connection to the instance. Storage and network traffic no longer compete and, on large instance types, you can drive full 1Gbps line rate network traffic while, at the same time, also be consuming 1Gbps to storage. Essentially EBS Optimized instances have a dedicated storage channel that doesn’t compete with instance network traffic.

 

From the EBS detail page:

 

EBS standard volumes offer cost effective storage for applications with light or bursty I/O requirements.  Standard volumes deliver approximately 100 IOPS on average with a best effort ability to burst to hundreds of IOPS.  Standard volumes are also well suited for use as boot volumes, where the burst capability provides fast instance start-up times.

 

Provisioned IOPS volumes are designed to deliver predictable, high performance for I/O intensive workloads such as databases.  With Provisioned IOPS, you specify an IOPS rate when creating a volume, and then Amazon EBS provisions that rate for the lifetime of the volume.  Amazon EBS currently supports up to 1,000 IOPS per Provisioned IOPS volume, with higher limits coming soon.  You can stripe multiple volumes together to deliver thousands of IOPS per Amazon EC2 instance to your application. 

To enable your Amazon EC2 instances to fully utilize the IOPS provisioned on an EBS volume, you can launch selected Amazon EC2 instance types as “EBS-Optimized” instances.  EBS-optimized instances deliver dedicated throughput between Amazon EC2 and Amazon EBS, with options between 500 Mbps and 1000 Mbps depending on the instance type used.  When attached to EBS-Optimized instances, Provisioned IOPS volumes are designed to deliver within 10% of the provisioned IOPS performance 99.9% of the time.  See Amazon EC2 Instance Types to find out more about instance types that can be launched as EBS-Optimized instances. 

 

Providing scalable block storage at-scale, in 8 regions around the world is one of the most interesting combinations of distributed systems and storage problems we face. The problem has been well solved in high-cost on-premise solutions. We now get to apply what has been learned over the last 30+ years to solve the problem at cloud-scale with low-cost and 100s of thousands of concurrent customers. An incredible number of EC2 customers depend upon EBS for their virtual storage needs, the number is growing daily, and we are really only just getting started. If you want to be part of the engineering effort to make Elastic Block Store the virtual storage solution for the cloud, send us a note at ebs-jobs@amazon.com.

 

With the announcement today, EC2 customers now have access to two very high performance storage solutions. The first solution is the EC2 High I/O Instance type announced last week which delivers a direct attached, SSD-powered 100k IOIPS for $3.10/hour. In today’s announcement this direct attached storage solution is joined by a high-performance virtual storage solution. This new type of EBS storage allows the creation of striped storage volumes that can reliably delivery 10,000 to 20,000 IOPS across a dedicated virtual storage network.

 

Amazon EC2 customers now have both high-performance, direct attached storage and high-performance virtual storage with a dedicated virtual storage connection.

 

                                                --jrh

 

James Hamilton 
e: jrh@mvdirona.com 
w: 
http://www.mvdirona.com 
b: 
http://blog.mvdirona.com / http://perspectives.mvdirona.com

 

Wednesday, August 01, 2012 4:57:52 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2] - Trackback
Services
Tracked by:
"http://planetcassandra.org/blog/post/what-is-the-story-with-aws-storage/" (http... [Pingback]
Friday, August 03, 2012 1:34:54 AM (Pacific Standard Time, UTC-08:00)
"Providing scalable block storage at-scale, in 8 regions around the world is one of the most interesting combinations of distributed systems and storage problems we face. The problem has been well solved in high-cost on-premise solutions."

There are 2 problems here - the first is the technical one of providing real, on-demand block storage with the ability to choose whether you need high performance or just somewhere to throw files. With the release of both provisioned IOPS and EBS optimised instances, combined you can easily get extremely good performance - guaranteed disk i/o and the network bandwidth to use it.

However, I'm not sure the second problem is solved yet - cost. The advantage of EBS over on-premise solutions is you can dynamically spin things up anywhere AWS has a point of presence, but the cost is still relatively high for non-transient use cases e.g. powering an app's primary data store vs high intensity, short term data processing. In US East that starts at $3.10/hr + the per GB EBS charge + the per i/o charge + the provisioned IOPS charge (both based on IOPS and storage space). To get to the level of dedicated storage (whether high end utility storage systems or even just buying SSDs in a dedicated server), it's currently looking very expensive.

I'm going to be interested to see how pricing changes over time, as it has done for other AWS services.
Saturday, August 04, 2012 6:57:13 AM (Pacific Standard Time, UTC-08:00)
Thanks for the comment David.

I screwed up a bit on my pricing description. In your summary of prices you are including the costs of both solutions. The $3.10/hour is a 100k IOPS direct attached SSD EC2 instance type that can do roughly 100k IOPS. That is one complete, very high I/O solution and it doesn't depend upon EBS.

Another approach is to use EBS in which case you are paying $0.12/GB + $0.10 per provisioned IOPS/month.

--jrh
Comments are closed.

Disclaimer: The opinions expressed here are my own and do not necessarily represent those of current or past employers.

Archive
<November 2014>
SunMonTueWedThuFriSat
2627282930311
2345678
9101112131415
16171819202122
23242526272829
30123456

Categories
This Blog
Member Login
All Content © 2014, James Hamilton
Theme created by Christoph De Baene / Modified 2007.10.28 by James Hamilton