All new technologies go through an early phase when everyone initially is completely convinced the technology can’t work. Then for those that actually do solve interesting problems, they get adopted in some workloads and head into the next phase. In the next phase, people see the technology actually works well for some workloads and they generalize this outcome to a wider class of workloads. They get convinced the new technology is the solution for all problems. Solid State Disks (SSDs) are now clearly in this next phase.
Well intentioned people are arguing emphatically that SSDs are great because they are “fast”. For the most part, SSDs actually are faster than disks both in random reads, random writes and sequential I/O. I say “for the most part” since some SSDs have been incredibly bad at random writes. I’ve seen sequential write rates as low as ¼ that of magnetic HDDs but Gen2 SSD devices are now far better. Good devices are now delivering faster than HDD results across random read, write, and sequential I/O. It’s no longer the case that SSDs are “only good for read intensive workloads”.
So, the argument that SSDs are fast is now largely true but “fast” really is a misleading measure. Performance without cost has no value. What we need to look at is performance per unit cost. For example, SSD sequential access performance is slightly better than most HDDs but the cost MB/s is considerably higher. It’s cheaper to obtain sequential bandwidth from multiple disks than from a single SSD. We have to look at performance per unit cost rather than just performance. When you hear a reference to performance as a one dimensional metric, you’re not getting a useful engineering data point.
When do SSDs win when looking at performance per unit dollar on the server? Server workloads requiring very high IOPS rates per GB are more cost effective on SSDs. Online transaction systems such as reservation systems, many ecommerce systems, and anything with small, random reads and writes can run more cost effectively on SSDs. Some time back I posted When SSDs make sense in server applications and the partner post When SSDs make sense in client applications. What I was looking at is where SSDs actually do make economic sense. But, with all the excitement around SSDs, some folks are getting a bit over exuberant and I’ve found myself in several arguments where smart people are arguing that SSDs make good economic sense in applications requiring sequential access to sizable databases. They don’t.
It’s time to look at where SSDs don’t make sense in server applications. I’ve been intending to post this for months and my sloth has been rewarded. The Microsoft Research Cambridge team recently published Migrating Server Storage to SSDs: Analysis of Tradeoffs and the authors save me some work by taking this question on. In this paper the authors look at three large server-side workloads:
1. 5000 user Exchange email server
2. MSN Storage backend
3. Small corporate IT workload
The authors show that these workloads are far more economically hosted on HDDs and I agree with their argument. They conclude:
…across a range of different server workloads, replacing disks by SSDs is not a cost effective option at today’s price. Depending on the workload, the capacity/dollar of SSDs needs to improve by a factor of 3 – 3000 for SSDs to replace disks. The benefits of SSDs as an intermediate caching tier are also limited, and the cost of provisioning such a tier was justified for fewer than 10% of the examined workloads
They have shown that SSDs don’t make sense across a variety of server-side workloads. Essentially that these workloads are more cost effectively hosted on HDDs. I don’t quite agree with the generalization of this argument that SSDs don’t make sense on the server-side for any workloads. They remain a win for very high IOPS OLTP databases but it’s fair to say that these workloads are a tiny minority of server-side workloads. The right way to make the decision is to figure out the storage budget for the workload to be hosted on HDD and compare that with the budget to support the workload on SSDs and make the decision on that basis. This paper argues that the VAST majority of workloads are more economically hosted on HDDs.
Thanks to Zach Hill who sent this my way.
James Hamilton, Amazon Web Services
1200, 12th Ave. S., Seattle, WA, 98144W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 | email@example.com
H:mvdirona.com | W:mvdirona.com/jrh/work | blog:http://perspectives.mvdirona.com
Disclaimer: The opinions expressed here are my own and do not
necessarily represent those of current or past employers.