All new technologies go through an early phase when everyone is convinced the technology can’t work. Those that actually do solve interesting problems get adopted in some workloads and head into the next phase, where people see the technology working well for some workloads, generalize that outcome to a much broader class of workloads, and become convinced the new technology is the solution for all problems. Solid State Disks (SSDs) are now clearly in this second phase.
Well-intentioned people are arguing emphatically that SSDs are great because they are “fast”. For the most part, SSDs actually are faster than disks in random reads, random writes, and sequential I/O. I say “for the most part” since some SSDs have been incredibly bad at sequential writes. I’ve seen sequential write rates as low as ¼ that of magnetic HDDs, but Gen2 SSD devices are now far better. Good devices are now delivering faster-than-HDD results across random read, random write, and sequential I/O. It’s no longer the case that SSDs are “only good for read-intensive workloads”.
So the argument that SSDs are fast is now largely true, but “fast” really is a misleading measure. Performance without cost has no value; what we need to look at is performance per unit cost. For example, SSD sequential access performance is slightly better than that of most HDDs, but the cost per MB/s is considerably higher. It’s cheaper to obtain sequential bandwidth from multiple disks than from a single SSD. We have to look at performance per unit cost rather than just performance. When you hear a reference to performance as a one-dimensional metric, you’re not getting a useful engineering data point.
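To make that concrete, here’s a minimal sketch of the comparison in Python. Every price and performance figure below is an illustrative assumption for parts of roughly this era, not a measurement or a vendor quote:

    # Rough cost-per-unit-of-performance comparison. All device numbers are
    # illustrative assumptions, not quotes from any vendor.
    devices = {
        #            price($)     capacity(GB)   seq MB/s        random IOPS
        "SSD": dict(price=700.0, capacity=64,  seq_mbps=250.0, iops=10000.0),
        "HDD": dict(price=200.0, capacity=300, seq_mbps=150.0, iops=200.0),
    }

    for name, d in devices.items():
        print(f"{name}: ${d['price'] / d['capacity']:.2f}/GB, "
              f"${d['price'] / d['seq_mbps']:.2f} per MB/s sequential, "
              f"${d['price'] / d['iops']:.3f} per random IOPS")

    # With numbers like these, the SSD wins decisively on $/IOPS, loses badly
    # on $/GB, and is comparable or worse on $/(MB/s), which is why sequential-
    # bandwidth-bound workloads are cheaper to host on spinning disks.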
When do SSDs win when looking at performance per dollar on the server? Server workloads requiring very high IOPS rates per GB are more cost effective on SSDs. Online transaction systems such as reservation systems, many e-commerce systems, and anything with small, random reads and writes can run more cost effectively on SSDs. Some time back I posted When SSDs make sense in server applications and the partner post When SSDs make sense in client applications, where I looked at where SSDs actually do make economic sense. But, with all the excitement around SSDs, some folks are getting a bit over-exuberant, and I’ve found myself in several arguments where smart people are arguing that SSDs make good economic sense in applications requiring sequential access to sizable databases. They don’t.
It’s time to look at where SSDs don’t make sense in server applications. I’ve been intending to post this for months and my sloth has been rewarded: the Microsoft Research Cambridge team recently published Migrating Server Storage to SSDs: Analysis of Tradeoffs, and the authors save me some work by taking this question on. In the paper, the authors look at three large server-side workloads:
1. 5,000-user Exchange email server
2. MSN Storage backend
3. Small corporate IT workload
The authors show that these workloads are far more economically hosted on HDDs and I agree with their argument. They conclude:
…across a range of different server workloads, replacing disks by SSDs is not a cost effective option at today’s price. Depending on the workload, the capacity/dollar of SSDs needs to improve by a factor of 3 – 3000 for SSDs to replace disks. The benefits of SSDs as an intermediate caching tier are also limited, and the cost of provisioning such a tier was justified for fewer than 10% of the examined workloads
They have shown that SSDs don’t make sense across a variety of server-side workloads; essentially, these workloads are more cost-effectively hosted on HDDs. I don’t quite agree with generalizing this argument to say SSDs don’t make sense for any server-side workload. They remain a win for very high IOPS OLTP databases, but it’s fair to say that these workloads are a tiny minority of server-side workloads. The right way to make the decision is to figure out the storage budget to host the workload on HDDs, compare that with the budget to support the workload on SSDs, and decide on that basis. This paper argues that the VAST majority of workloads are more economically hosted on HDDs.
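That decision procedure is easy to sketch: size the workload on HDDs, size it on SSDs, and compare the two budgets. The workload and device figures below are again illustrative assumptions, chosen only to show how the capacity-bound and IOPS-bound cases flip the answer:

    import math

    def drives_needed(capacity_gb, iops, seq_mbps, drive):
        # A workload must be provisioned for whichever resource binds first.
        return max(math.ceil(capacity_gb / drive["capacity_gb"]),
                   math.ceil(iops / drive["iops"]),
                   math.ceil(seq_mbps / drive["seq_mbps"]))

    hdd = dict(price=200.0, capacity_gb=300, iops=200, seq_mbps=150)
    ssd = dict(price=700.0, capacity_gb=64, iops=10000, seq_mbps=250)

    workloads = {
        # name:                 (capacity GB, required IOPS, required MB/s)
        "hot OLTP database":     (500, 50000, 100),
        "bulk sequential store": (5000, 2000, 1200),
    }

    for name, (cap, req_iops, req_mbps) in workloads.items():
        hdd_cost = drives_needed(cap, req_iops, req_mbps, hdd) * hdd["price"]
        ssd_cost = drives_needed(cap, req_iops, req_mbps, ssd) * ssd["price"]
        winner = "SSD" if ssd_cost < hdd_cost else "HDD"
        print(f"{name}: HDD ${hdd_cost:,.0f} vs SSD ${ssd_cost:,.0f} -> {winner}")

With assumptions like these, the IOPS-dense OLTP workload lands on SSDs and the large sequential workload lands on HDDs, which is exactly the split argued for above.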
Thanks to Zach Hill who sent this my way.
–jrh
James Hamilton, Amazon Web Services
1200, 12th Ave. S., Seattle, WA, 98144
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 | james@amazon.com
H:mvdirona.com | W:mvdirona.com/jrh/work | blog:http://perspectives.mvdirona.com
“I say ‘for the most part’ since some SSDs have been incredibly bad at random writes.” Looks like a typo — should be “bad at sequential writes”, as that is what the following sentence is mentioning.
You’re right, SM. Thanks for pointing that out.
Nice post, thanks for sharing.
Thanks for the comment, Mehul. Your conclusion was “So, it’s true — today — SSDs don’t make sequential sense. But, given these trends and the possibility of hybrid systems (SSDs + HDDs), the answer may not be so clear-cut.” I agree that there is great change happening in the SSD world, and any technology driven by semiconductor improvement rates deserves careful watching. But when deciding which workloads can be hosted cost-effectively today, we should use today’s data. It’s going to be a fun next few years but, at current price points, SSDs for sequential workloads of substantial size don’t make economic sense.
I agree hybrids or future price/performance improvements may change this, but that’s where we currently are with today’s components and pricing.
–jrh
James Hamilton, jrh@mvdirona.com
Certainly, the price point for SSDs is not quite there yet for sequential workloads. However, I urge you to consider two points:
1) Most SSDs are priced based on potential consumer value rather than underlying HW cost, though this is changing. Nevertheless, underlying NAND costs are dropping nearly 2x per year: http://www.slideshare.net/nomathjobs/TheFutureOfMemoryStorageKlein. At some point, we can imagine that SSD prices will not be far from the underlying NAND costs.
2) Most sequential workloads are sequential because they were designed to work on HDDs. Assuming #1 eventually happens, SW vendors will realize that leveraging the random I/O capability of commodity SSDs could provide both improved performance and improved price-performance. In that case, sequential workloads will become less sequential. In fact, we have a paper in SIGMOD ’09, “Query Processing Techniques for Solid State Drives”, that shows how one can do this for traditional query processing.
So, it’s true — today — SSDs don’t make sequential sense. But, given these trends and the possibility of hybrid systems (SSDs + HDDs), the answer may not be so clear-cut.
– Mehul
Igor, it’s sort of hard to create an up-to-date paper in this field – things are moving fast. It is more apt to look at the trends rather than at fixed points in time. The trend I find most intriguing about flash is the size and price reduction. In great part this results from cross-subsidy by CE manufacturers: every time you buy an iPhone or a digital camera, a little bit of money trickles down to the flash manufacturers, allowing greater economies of scale. HDDs, however, have likely exhausted their economies of scale and will soon suffer from the reverse effect – in a year or so consumers will start switching from HDDs to SSDs in their laptops, and the total size of the HDD market will begin to shrink. The research budget for producing higher-density platters will shrink as well. I find this fascinating.
James, the pleasure is mine. I think the idea of cold data is consistent with the “disk is tape” vision. I am probably underestimating the magnitude of cold storage needs – my place of work might have something to do with it. :-)
Largely agreed on O/S differences being mostly uninteresting. The main challenge with the OS, of course, is in tuning it to your needs so that you get whatever abstraction advantages it can provide over working with “bare metal” while keeping it out of your way when you don’t want it to do anything. I have had some terrible experiences with “clever” read-ahead logic, poor software RAID-0 implementations, and the like, which have resulted in more than a factor of two degradation in performance for usage patterns that were important to me. A poorly configured filesystem can hamper or even cripple your performance — HDD or SSD — and if you are unable to tune the appropriate parameters, then you are out of luck. And you will almost certainly want to make sure things are tuned differently between SSDs and HDDs.
Chad Walters
I agree that the SSD choice isn’t great. Both Samsung and Intel are producing wonderful price-performing parts, and many others produce high-performance parts but at high cost. Much better SSDs are available. Better price/performing SSDs would improve the test, but it won’t change the outcome: for a given application-required performance level, most workloads can be more economically hosted on HDDs. Some very hot OLTP workloads and other apps that require very high IOPS rates per unit of storage are cost-effectively hosted on SSDs.
The point I want to make is that these workloads do exist but they are a tiny minority. Most server-side workloads don’t belong on SSDs. Some do, but I’m seeing deployments all the time that don’t make economic sense.
I don’t buy the argument that the results would be different with a different O/S. The differences between O/Ss are lost in the noise in this sort of discussion. In fact, I find it hard to get excited at all about O/Ss these days. That’s just not where the interesting issues are in high-scale distributed systems.
–jrh
jrh@mvdirona.com
This paper is already OBSOLETE due to a very poor choice of drives and some methodology errors: the only SSD drive they analyzed is VERY expensive and has slow write IOPS performance (for an SSD), and for HDDs they seem to have totally ignored the cost, SIZE, and power consumption of storage enclosures for the 3.5" Cheetah drives they chose as representative – the 106 spindles in the first line of Table 1 will use at least seven 16-drive 3U enclosures (which will cost more than the drives themselves), etc.
Why they’re comparing a 2.5" SSD to 3.5" Cheetah HDDs is rather strange as well – except possibly to justify a predetermined conclusion. Also, it seems they’ve limited the analysis of large workloads to Windows only – results for Linux or Solaris could be entirely different.
A version 2 of this paper that would:
– use SSDs (one high-end and one midrange) with the best price/performance available
– use realistic hard drives (all 2.5", or properly account for enclosure cost/size/power if 3.5")
– clearly state which OS the traces were collected on
– include analysis of large non-Windows (Linux, Solaris, etc.) traces
may be very useful indeed.
DISCLAIMER: I’m NOT associated with any SSD (or storage or OS) manufacturer, reseller, etc. in any way, shape, or form.
I agree with Denis’ points 1 & 2 and also recommend the article from anandtech.com he linked. Not sure I should prognosticate on flash killing HDDs — it’ll likely take a while and probably something else will come along and blow them both out of the water before then.
One problem with this paper is that this is such a fast-moving space. I am not sure when the paper was written, but there are better performing SSDs on the market now than the one they looked at. Also, the price has dropped substantially on some of the best SSDs in the last few months. The Intel X25-E and X25-M are better than the SSD they selected by a factor of 2 on most dimensions in Table 4 — particularly if you have heavily read-oriented workloads and can go with the MLC-based X25-M. Combined with the price drops, the minimum 3x improvement they needed to see before any of their workloads would benefit from SSDs has already been exceeded.
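As a back-of-envelope check on that 3x capacity/dollar bar (the low end of the paper’s 3–3000x range), here is the arithmetic with assumed prices; both price points are illustrative, not quoted street prices:

    # Capacity/dollar improvement from an older-generation enterprise SSD to a
    # newer MLC part. Both capacity and price figures are assumptions.
    old_ssd = dict(price=740.0, capacity_gb=32)
    new_ssd = dict(price=340.0, capacity_gb=80)

    improvement = (new_ssd["capacity_gb"] / new_ssd["price"]) / \
                  (old_ssd["capacity_gb"] / old_ssd["price"])
    print(f"capacity/dollar improved {improvement:.1f}x")  # ~5.4x with these assumptions

    # Whether that clears the 3x bar for a given workload still depends on the
    # prices on the day you buy and on the workload's actual requirements.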
I can also personally vouch for the fact that there are workloads where an SSD is already the most cost-effective means to performance improvement.
That said, there are certainly workloads where SSDs do not make sense yet and may never. Understanding the specific workload you care about and determining the crossover point for that workload is the key — not following hype or looking at a cross-section of workloads.
Always good to hear from you Denis.
I generally agree with what you have above. Our one strong point of difference is that, although I expect to see SSDs heavily used in server applications, I believe disk will remain. Here’s the argument: the vast majority of enterprise data is ice cold. In fact, a recent study shows that most of it never gets looked at. You and I work on DBs which take care of the most important data in the world. Most of the data — well over 95% — never gets to a DB. It rarely if ever gets accessed. And, I firmly believe that 10 years from now, it’ll still be stored on HDDs.
I agree that OLTP data will be on flash — I’m just arguing that most server-side data won’t be there a decade from now.
–jrh
I think the Gartner Hype Cycle can accurately describe the irrational exuberance (and the other phases of technology adoption): http://en.wikipedia.org/wiki/Hype_cycle
As to the subject matter, I want to make a few points:
1. Tables 3 & 4 in the linked paper lead me to believe they only considered acquisition cost, and even then only the disk itself. It’s not clear to me that this is the right way to measure, and I can see where SSDs would have lower costs in other parts of the system – less power => smaller power supplies and less cooling capacity; smaller physical dimensions => higher density. The method of measuring cost is not an open-and-shut case (a rough sketch of this fuller comparison follows at the end of this comment).
2. We have yet to see software actually optimized for SSDs.
3. To paint a broader picture, I’d like to go back to the words of Jim Gray – “flash is disk, disk is tape, and tape is dead”. Despite being witty, this point also seems to provide an accurate predictive framework – as time goes by, SSDs will build on their advantages, gradually displacing HDDs one step at a time, just as HDDs have gradually displaced tape. The cost is currently higher, but that only matters for so long.
In other words, server-side SSDs may still be climbing the hill of inflated expectations, but the actual displacement is nonetheless progressing at its own pace, and given the inherent latency limitations of spinning drives, it will eventually complete (as in, they will be as rare as tape is today).
On a related note, consumer SSDs are already past the peak and descending: http://www.anandtech.com/storage/showdoc.aspx?i=3531
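On point 1 above, here is a rough sketch of what folding power, cooling, and enclosure costs into the comparison looks like. Every figure is an assumption for illustration only, not data from the paper:

    def three_year_cost(n_drives, drive_price, watts_per_drive,
                        drives_per_enclosure, enclosure_price,
                        dollars_per_watt_3yr=6.0):  # assumed power + cooling, amortized
        enclosures = -(-n_drives // drives_per_enclosure)  # ceiling division
        return (n_drives * drive_price                     # drives
                + enclosures * enclosure_price             # shelf/enclosure hardware
                + n_drives * watts_per_drive * dollars_per_watt_3yr)

    # The same hypothetical workload provisioned two ways (drive counts assumed):
    hdd_total = three_year_cost(60, drive_price=350.0, watts_per_drive=12.0,
                                drives_per_enclosure=16, enclosure_price=4000.0)
    ssd_total = three_year_cost(30, drive_price=700.0, watts_per_drive=2.0,
                                drives_per_enclosure=24, enclosure_price=2000.0)
    print(f"HDD: ${60 * 350.0:,.0f} drives only, ${hdd_total:,.0f} with enclosures and power")
    print(f"SSD: ${30 * 700.0:,.0f} drives only, ${ssd_total:,.0f} with enclosures and power")

With these assumed figures the two configurations tie on bare-drive acquisition cost but diverge once enclosures and power are counted, which is the point: the choice of cost model can move the crossover.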