Sunday, October 31, 2010

I did a talk earlier this week on the sea change currently taking place in datacenter networks. In Datacenter Networks are in my Way, I start with an overview of where the costs are in a high-scale datacenter. With that backdrop, we note that networks are fairly low power consumers relative to total facility consumption and not even close to the dominant cost. Are they actually a problem? The rest of the talk argues that networks are actually a huge problem across the board, including cost and power. Overall, networking gear lags behind the rest of the high-scale infrastructure world, blocks many key innovations, and actually is both a cost and a power problem when we look deeper.

 

The overall talk agenda:

·         Datacenter Economics

·         Is Net Gear Really the Problem?

·         Workload Placement Restrictions

·         Hierarchical & Over-Subscribed

·         Net Gear: SUV of the Data Center

·         Mainframe Business Model

·         Manually Configured & Fragile at Scale

·         New Architecture for Networking

 

In a classic network design, there is more bandwidth within a rack and more within an aggregation router than across the core. This is because the network is over-subscribed. Consequently, instances of a workload often need to be placed topologically near other instances of the workload, near storage, near app tiers, or on the same subnet. All these placement restrictions make the already over-constrained workload placement problem even more difficult. The result is that either the constraints are not met, which yields poor workload performance, or the constraints are met but overall server utilization is lower as a consequence of accepting them. What we want is all points in the datacenter equidistant and no constraints on workload placement.
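To make the over-subscription point concrete, here is a minimal sketch of how the ratios compound from the rack up to the core. The topology numbers (servers per rack, link speeds, uplink counts) are hypothetical and chosen only for illustration; they are not from the talk.

```python
def oversubscription(downlink_gbps, uplink_gbps):
    """Ratio of bandwidth offered below a switch to bandwidth available above it."""
    return downlink_gbps / uplink_gbps

# Hypothetical top-of-rack switch: 40 servers at 1 Gbps, 2 x 10 Gbps uplinks.
tor_ratio = oversubscription(40 * 1.0, 2 * 10.0)

# Hypothetical aggregation layer: 20 racks of 20 Gbps each, 4 x 10 Gbps to the core.
agg_ratio = oversubscription(20 * 20.0, 4 * 10.0)

# Ratios multiply along the path from a server to the core.
end_to_end = tor_ratio * agg_ratio

# Worst-case share of a 1 Gbps NIC when every server sends across the core at once.
per_server_cross_core_gbps = 1.0 / end_to_end

print(f"ToR {tor_ratio:.0f}:1, aggregation {agg_ratio:.0f}:1, end to end {end_to_end:.0f}:1")
print(f"Worst-case cross-core bandwidth per server: {per_server_cross_core_gbps * 1000:.0f} Mbps")
```

Under these assumed numbers, a server that can talk to its rack neighbors at full line rate gets only a small fraction of that across the core, which is exactly why placement near other instances and near storage starts to matter.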

 

Continuing on the over-subscription problem mentioned above, data-intensive workloads like MapReduce and high performance computing workloads run poorly on over-subscribed networks. It's not at all uncommon for a MapReduce workload to transport the entire data set at least once over the network during a job run. The cost of providing a flat, all-points-equidistant network is so high that most just accept the constraint and either run MapReduce poorly or only run it in narrow parts of the network (accepting workload placement constraints).
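A rough back-of-envelope sketch of what this costs a MapReduce job: the time to move the whole data set once over the network, on a flat network versus an over-subscribed one. The data set size, server count, NIC speed, and the 20:1 ratio are all hypothetical, and the model assumes traffic is spread evenly and all of it crosses the core, so treat the output as a lower bound for illustration only.

```python
def transfer_seconds(dataset_bytes, servers, nic_bps, oversubscription_ratio):
    """Lower bound on the time to move the data set once across the core."""
    effective_bps_per_server = nic_bps / oversubscription_ratio
    aggregate_bps = servers * effective_bps_per_server
    return dataset_bytes * 8 / aggregate_bps

DATASET_BYTES = 100e12   # hypothetical 100 TB data set
SERVERS = 1000           # hypothetical cluster size
NIC_BPS = 1e9            # hypothetical 1 Gbps NIC per server

flat_seconds = transfer_seconds(DATASET_BYTES, SERVERS, NIC_BPS, 1.0)
oversubscribed_seconds = transfer_seconds(DATASET_BYTES, SERVERS, NIC_BPS, 20.0)

print(f"Flat network:         {flat_seconds / 60:.0f} minutes")
print(f"20:1 over-subscribed: {oversubscribed_seconds / 3600:.1f} hours")
```

In this simple model the transfer time scales linearly with the over-subscription ratio, so the choice is between paying for the flat network or confining the job to a part of the network where the ratio is low.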

 

Net gear doesn’t consume much power relative to total datacenter power consumption – other gear in the datacenter is, in aggregate, much worse. However, network equipment power is absolutely massive today and it is trending up fast. A fully configured Cisco Nexus 7000 requires 8 circuits of 5 kW each. Admittedly some of that power is for redundancy, but how can 120 ports possibly require as much provisioned power as 4 average-sized full racks of servers? Net gear is the SUV of the datacenter.
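The arithmetic behind that comparison, using only the figures quoted above (8 circuits, 5 kW each, 120 ports, 4 racks). The per-rack figure is what the comparison implies, not a separately measured number, and all of this is provisioned rather than consumed power.

```python
# Figures as quoted in the paragraph above.
CIRCUITS = 8
KW_PER_CIRCUIT = 5.0
PORTS = 120
EQUIVALENT_RACKS = 4

# Total provisioned power, including whatever share is there for redundancy.
total_kw = CIRCUITS * KW_PER_CIRCUIT

# Provisioned power per switch port.
watts_per_port = total_kw * 1000 / PORTS

# Rack power implied by "4 average-sized full racks of servers".
kw_per_rack = total_kw / EQUIVALENT_RACKS

print(f"Provisioned: {total_kw:.0f} kW total, {watts_per_port:.0f} W per port")
print(f"Implied average rack: {kw_per_rack:.0f} kW")
```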

 

The network equipment business model is broken. We love the server business model where we have competition at the CPU level, more competition at the server level, and an open source solution for control software.  In the networking world, it’s a vertically integrated stack and this slows innovation and artificially holds margins high. It’s a mainframe business model.

New solutions are now possible with competing merchant silicon from Broadcom, Marvell, and Fulcrum and competing switch designs built on all three. We don’t yet have the open source software stack but there are some interesting possibilities on the near-term horizon, with OpenFlow being perhaps the most interesting enabler. More on the business model and why I’m interested in OpenFlow at: Networking: The Last Bastion of Mainframe Computing.

 

Talk slides: http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_POA20101026_External.pdf

 

                                                                --jrh

 

James Hamilton

e: jrh@mvdirona.com

w: http://www.mvdirona.com

b: http://blog.mvdirona.com / http://perspectives.mvdirona.com

 

Sunday, October 31, 2010 11:30:28 AM (Pacific Standard Time, UTC-08:00)
Tuesday, November 02, 2010 8:47:17 AM (Pacific Standard Time, UTC-08:00)
Good perspective, I agree overall. But couple of red herrings about open source and new business models. My blog at http://doubleclix.wordpress.com/2010/11/02/dragnet-in-south-lake-union/
Tuesday, November 09, 2010 5:56:21 AM (Pacific Standard Time, UTC-08:00)
You aren't specific as to why you think open source and new business models aren't important. My current feeling is the business model problem is the most important of all the issues I brought up.

Your blog entry was great Krishna.

James Hamilton, jrh@mvdirona.com

Disclaimer: The opinions expressed here are my own and do not necessarily represent those of current or past employers.

All Content © 2014, James Hamilton