It’s a clear sign that the Cloud Computing market is growing fast and the number of cloud providers is expanding quickly when startups begin to target cloud providers as their primary market. It’s not unusual for enterprise software companies to target cloud providers as well as their conventional enterprise customers but I’m now starting to see startups building products aimed exclusively at cloud providers. Years ago when there were only a handful of cloud services, targeting this market made no sense. There just weren’t enough buyers to make it an interesting market. And, many of the larger cloud providers are heavily biased to internal development further reducing the addressable market size.

A sign of the maturing of the cloud computing market is there now many companies interested in offering a cloud computing platform not all of which have substantial systems software teams. There is now a much larger number of companies to sell to and many are eager to purchase off the shelf products. Cloud providers have actually become a viable market to target in that there are many providers of all sizes and the overall market continues to expand faster than any I have seen any I’ve seen over the last 25 years.

An excellent example of this new trend of startups aiming to sell to the Cloud Computing market is SolidFire which targets the high performance block storage market with what can be loosely described as a distributed Storage Area Network. Enterprise SANs are typically expensive, single-box, proprietary hardware. Enterprise SANs are mostly uninteresting to cloud providers due to high cost and the hard scaling limits that come from scale-up solutions. SolidFire implements a virtual SAN over a cluster of up to 100 nodes. Each node is a commodity 1RU, 10 drive storage server. They are focused on the most demanding random IOPS workloads such as database and all 10 drives in the SolidFire node are Solid State Storage devices. The nodes are interconnected by up 2x 1GigE and 2x10GigE networking ports.

In aggregate, each node can deliver a booming 50,000 IOPS and the largest supported cluster with 100 nodes can support 5m IOPS in aggregate. The 100 node cluster scaling limit may sound like a hard service scaling limit but multiple storage clusters can be used to scale to any level. Needing multiple clusters has the disadvantage of possibly fragmenting the storage but the advantage of dividing the fleet up into sub-clusters with rigid fault containment between them limiting the negative impact of software problems. Reducing the “blast radius” of a failure makes moderate sized sub-clusters a very good design point.

Offering distributed storage solution isn’t that rare – there are many out there. What caught my interest at SolidFire was 1) their exclusive use of SSDs, and 2) an unusually nice quality of service (QoS) approach. Going exclusively with SSD makes sense for block storage systems aimed exclusively at high random IOPS workload but they are not a great solution for storage bound workloads. The storage for these workloads is normally more cost-effectively hosted on hard disk drives. For more detail on where SSDs are win an where they are not:

· When SSDs Make Sense in Server Applications

· When SSDs Don’t Make Sense in Server Applications

· When SSDs make sense in Client Applications (just about always)

The usual solution to this approach is do both but SolidFire wanted a single SSD optimized solution that would be cost effective across all workloads. For many cloud providers, especially the smaller ones, a single versatile solution has significant appeal.

The SolidFire approach is pretty cool. They exploit the fact that SSDs have abundant IOPS but are capacity constrained and trade off IOPS to get capacity. Dave Wright the SolidFire CEO describes the design goal as SSD performance at a spinning media price point. The key tricks employed:

· Multi-Layer Cell Flash: They use MLC Flash Memory storage since it is far cheaper than Single Level Cell, the slightly lower IOPS rate supported by MLC is still more than all but a handful of workloads require and they can solve the accelerated wear issues with MLC in the software layers above

· Compression: Aggregate workload dependent gains estimated to be 30 to 70%

· Data Deduplication: Aggregate workload dependent gains estimated to be 30 to 70%

· Thin Provisioning: Only allocate blocks to a logical volume as they are actually written to. Many logical volumes never get close to the actual allocated size.

· Performance Virtualization: Spread all volumes over many servers. Spreading the workload at a sub-volume level allows more control of meeting the individual volume performance SLA with good utilization and without negatively impacting other users

The combination of the capacity gains of thin provisioning, duplication, and compression bring the dollars per GB of the SolidFire solution very near to some hard disk based solutions at nearly 10x the IOPS performance.

The QoS solution is elegant in that they have three settings that allow multiple classes of storage to be sold. Each logical volume has 2 QoS settings: 1) Bandwidth, and 2) IOPS. Each setting has a min, max, and burst capacity setting. The min setting sets a hard floor where capacity is reserved to ensure this resource is always available. The burst is the hard ceiling that prevents a single user for consuming excess resource. The max is the essentially the target. If you run below the max you build up credits that allow a certain time over the max. The Burst limits the potential negative impact of excursions above max on other users.

This system can support workloads that need dead reliable, never changing I/O requirements. It can also support dead reliable average case with rare excursions above (e.g. during a database checkpoint). Its also easy to support workloads that soak up resources left over after satisfying the most demanding workloads without impacting other users. Overall, a nice simple and very flexible solution to a very difficult problem.


James Hamilton



b: /