Saturday, September 18, 2010

A couple of years ago, I did a detailed look at where the costs are in a modern, high-scale data center. The primary motivation behind bringing all the costs together was to understand where the problems are and find those easiest to address. Predictably, when I first brought these numbers together, a few data points just leapt off the page: 1) at scale, servers dominate overall costs, and 2) mechanical system cost and power consumption seem unreasonably high. Both have proven to be important technology areas to focus upon, and there has been considerable industry-wide innovation, particularly in cooling efficiency, over the last couple of years.

 

I posted the original model at the Cost of Power in Large-Scale Data Centers. One of the reasons I posted it was to debunk the often-repeated phrase "power is the dominant cost in a large-scale data center". Servers dominate, with mechanical systems and power distribution close behind. It turns out that power is incredibly important, but it's not the utility kWh charge that makes power important. It's the cost of the power distribution equipment required to deliver power and the cost of the mechanical systems that take the heat away once the power is consumed. I referred to this as fully burdened power.
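To make "fully burdened" concrete, here is a minimal back-of-the-envelope sketch in Python. It spreads the power distribution and mechanical capital straight-line across the energy the facility can deliver over its lifetime, ignoring cost of money for simplicity, and every dollar figure is a placeholder rather than an actual input of the model.

# Illustrative only: why "fully burdened" power costs far more than the utility
# kWh rate. All figures below are placeholders, not the model's actual inputs.
utility_rate = 0.07           # assumed $/kWh paid to the utility
infra_capital_per_w = 4.00    # assumed $/W of critical load for power distribution
                              #   plus mechanical (cooling) systems combined
lifetime_years = 10           # infrastructure amortization period
lifetime_hours = lifetime_years * 365 * 24

# Straight-line spread of infrastructure capital across every kWh of critical
# load the facility could deliver over its lifetime (cost of money ignored).
infra_burden_per_kwh = infra_capital_per_w * 1000 / lifetime_hours

print(f"utility energy:        ${utility_rate:.3f}/kWh")
print(f"infrastructure burden: ${infra_burden_per_kwh:.3f}/kWh")  # about $0.046/kWh

Even with these placeholder numbers, the amortized distribution and cooling capital is on the same order as the energy charge itself, which is why the utility kWh rate alone understates the cost of power.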

 

Measured this way, power is the second most important cost. Power efficiency is highly leveraged when looking at overall data center costs, it plays an important role in environmental stewardship, and it is one of the areas where substantial gains continue to look quite attainable. As a consequence, this is where I spend a considerable amount of my time – perhaps the majority – but we have to remember that servers still dominate the overall capital cost.

 

This last point is a frequent source of confusion. When server and other IT equipment capital costs are directly compared with data center capital costs, the data center portion actually is larger. I've frequently heard "how can the facility cost more than the servers in the facility – it just doesn't make sense." I don't know whether or not it makes sense, but on a direct capital comparison it actually is true at this point. In the properly amortized view, I could imagine infrastructure costs one day eclipsing those of servers as server costs continue to decrease, but we're not there yet. The key point to keep in mind is that the amortization periods are completely different. Data center amortization periods run from 10 to 15 years while server amortizations are typically in the three-year range. Servers are purchased 3 to 5 times during the life of a datacenter so, when amortized properly, they continue to dominate the cost equation.

 

In the model below, I normalize all costs to a monthly bill by taking consumables like power and billing them monthly by consumption, and by taking capital expenses like servers, networking, or datacenter infrastructure and amortizing them over their useful lifetimes using a 5% cost of money, again billed monthly. This approach allows us to compare otherwise non-comparable costs such as data center infrastructure, servers, and networking gear, each with different lifetimes. The model includes all costs "below the operating system" but doesn't include software licensing costs, mostly because open source is dominant in high-scale centers and partly because licensing costs can vary so widely. Administrative costs are excluded for the same reason: at scale, hardware administration, security, and other infrastructure-related people costs disappear into the single digits, with the very best services down in the 3% range, but they vary so greatly across environments that I don't include them here. On projects with which I've been involved, they are insignificantly small so don't influence my thinking much. I've attached the spreadsheet in source form below so you can add in factors such as these if they are more relevant in your environment.
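For those who want to see the normalization mechanically, here is a minimal sketch of the monthly-charge calculation, assuming a standard annuity at a 5% annual cost of money. The server count and per-server price follow the figures discussed in the post and comments; the facility and networking capital amounts and the networking lifetime are placeholders, not the spreadsheet's actual inputs.

def monthly_charge(capital, lifetime_years, annual_rate=0.05):
    # Standard annuity payment: monthly cost of capital amortized over its lifetime.
    r = annual_rate / 12.0
    n = lifetime_years * 12
    return capital * r / (1.0 - (1.0 + r) ** -n)

servers    = monthly_charge(46_000 * 1_450, lifetime_years=3)  # ~46k servers at ~$1,450 each
facility   = monthly_charge(88_000_000, lifetime_years=10)     # assumed facility capital
networking = monthly_charge(12_000_000, lifetime_years=4)      # assumed networking capital and lifetime

print(f"servers:    ${servers:,.0f}/month")     # roughly $2.0M/month
print(f"facility:   ${facility:,.0f}/month")    # roughly $0.9M/month
print(f"networking: ${networking:,.0f}/month")  # roughly $0.3M/month

Even with placeholder capital figures, the pattern described above shows up: because servers are repurchased every three years while the facility is amortized over a decade, the server line dominates the monthly bill.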

 

Late last year I updated the model for two reasons: 1) there has been considerable infrastructure innovation over the last couple of years and costs have changed dramatically during that period, and 2) because of the importance of networking gear to the cost model, I factored networking out of overall IT costs. We now have IT costs, with servers and storage, modeled separately from networking. This helps us understand the impact of networking on overall capital cost and on IT power.

 

In redoing these numbers, I keep the facility server count in the 45,000 to 50,000 server range. This makes it a reasonable-scale facility – big enough to enjoy the benefits of scale but nowhere close to the biggest data centers. Two years ago, 50,000 servers required a 15MW facility (25MW total load). Today, due to increased infrastructure efficiency and reduced individual server power draw, we can support 46,000 servers in an 8MW facility (12MW total load). The current rate of innovation in our industry is substantially higher than it has been at any time in the past, with much of this innovation driven by the mega service operators.

 

Keep in mind, I'm only modeling techniques that are well understood and reasonably broadly accepted as good-quality data center design practices. Most of the big operators will be operating at efficiency levels far beyond those used here. For example, this model uses a Power Usage Effectiveness (PUE) of 1.45, but Google reports fleet-wide PUE of under 1.2: Data Center Efficiency Measurements. Again, the spreadsheet source is attached below, so feel free to change the PUE used by the model as appropriate.
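As a quick sanity check on the facility sizing above, here is a minimal sketch relating critical (IT) load to total facility load through PUE. The per-server draw comes from the discussion in the comments below and should be treated as an assumption of this sketch.

servers = 46_000
watts_per_server = 165   # assumed per-server draw (see the comments below)
pue = 1.45               # total facility power / IT equipment power

it_load_mw = servers * watts_per_server / 1e6
total_load_mw = it_load_mw * pue

print(f"IT (critical) load:  {it_load_mw:.1f} MW")    # ~7.6 MW, close to the 8MW above
print(f"total facility load: {total_load_mw:.1f} MW") # ~11 MW, close to the 12MW above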

 

These are the assumptions used by this year's model:

[Assumptions table from the original post. Per the text and comments: roughly 46,000 servers at about $1,450 and 165W each, a PUE of 1.45, 8MW of critical load, 10-year infrastructure and 3-year server amortizations, and a 5% cost of money.]

Using these assumptions we get the following cost structure:

[Cost breakdown chart from the original post. Per the 10-year amortization figures quoted in the comments below, the breakdown was roughly: servers & storage gear 57%, net gear 8%, power distribution 18%, power 13%, other 4%.]

For those of you interested in playing with different assumptions, the spreadsheet source is here: http://mvdirona.com/jrh/TalksAndPapers/PerspectivesDataCenterCostAndPower.xls.

 

If you choose to use this spreadsheet directly or the data above, please reference the source and include the URL to this posting.

 

                                                --jrh

 

James Hamilton

e: jrh@mvdirona.com

w: http://www.mvdirona.com

b: http://blog.mvdirona.com / http://perspectives.mvdirona.com

 

Comments [16]
Saturday, September 18, 2010 7:10:56 PM (Pacific Standard Time, UTC-08:00)
What spec server are you using that comes out to $1450/165 watts?
James Case
Sunday, September 19, 2010 5:57:08 AM (Pacific Standard Time, UTC-08:00)
I hear you, James, and you're right that those numbers are quite good. The short answer is: select servers carefully and buy in very large quantity.

Several server manufacturers design and build servers specifically for the mega-facility operators, often in response to a specific request for proposals (semi-custom builds). The Dell Datacenter Solutions team, Rackable Systems (now SGI), HP, and ZT Systems all build special products for very large operators. Some of the largest operators cut out the middleman, do private designs, and have them built directly by contract manufacturers (CMs). Others take a middle ground and work directly with original design manufacturers (ODMs) to have custom designs built and delivered.

At scale, all of the above approaches work and will deliver products at these price and power points. The big operators tend not to discuss their equipment in much detail, but this paper has a fair amount of detail: http://research.microsoft.com/pubs/131487/kansal_ServerEngineering.pdf

Try using higher numbers like $2,500/server and 250W/server in the spreadsheet. The server cost fraction grows to 61% under these assumptions. Generally, the message stays the same: server cost dominates and power drives the bulk of the remainder.

--jrh
jrh@mvdirona.com
Sunday, September 19, 2010 10:45:04 AM (Pacific Standard Time, UTC-08:00)
Jonathan, I would equally like to see a fully-burdened 'infrastructure' cost breakdown as well, beyond one that takes into account only the fully-burdened power and the other areas in your analysis. Here I'm referring to the amortized costs, on an application-specific allocation basis, of the base building shell, i.e., the actual floor space built of concrete and steel. This can become especially meaningful over a long stretch of time in lease-rental situations where the dependency on power and air cooling, chillers, etc. demands that specially-designed space enclosures be used.

The subject of real estate is usually treated as a 'facilities' based problem, and as such it is not one that IT generally gets involved in justifying, but so too was power treated in this manner at one time. It becomes especially germane in smaller data centers and other small venues, such as LAN equipment closets, comms centers and the smaller server farms, where I devote a great deal of my time and focus on bringing new efficiencies.

The larger subject, that of real estate costs in general, among other things (including how "all"-copper-based LANs in work areas today are taken for granted and considered as being ossified in place), unfortunately exists directly in the middle of many operators' and IT managers' blind spots. Yet, when viewed as a function of total "actual" costs to the enterprise, real estate (and power) comprise a large share of 'fully burdened infrastructure costs' on a par with big-box data centers, although it's far more difficult to get an accurate picture and characterize quantitatively all of the millions of smaller enclosures that exist today.

Great post, btw. Thanks.

frank@fttx.org

------
Sunday, September 19, 2010 11:25:28 AM (Pacific Standard Time, UTC-08:00)
My apologies, James, for my earlier misrepresentation of your name. Since that was the second time I've done this now, I think it's time I wrote a script ;)

Frank

------
Frank A. Coluccio
Monday, September 20, 2010 3:57:28 AM (Pacific Standard Time, UTC-08:00)
Frank, I agree. The infrastructure outside the data center, including WAN infrastructure, net gear, wiring closets, and client devices, consumes more power than the data centers on which they depend. And small-scale infrastructure like wiring closets is often poorly designed and inefficient.

--jrh
James Hamilton, jrh@mvdirona.com
Monday, September 20, 2010 4:40:28 AM (Pacific Standard Time, UTC-08:00)
One item that might change these calculations quite a lot (and make IT and overall power consumption a much larger portion) is that the bricks and mortar actually have to be amortized (as required by tax laws) over between 27 and 30 years (not 10 to 15).
Wayne Barker
Monday, September 20, 2010 5:50:26 AM (Pacific Standard Time, UTC-08:00)
Wayne, in your comment you mentioned that brick-and-mortar facilities are normally amortized over 27 to 30 years and that this could lead to "IT and overall power consumption being a larger portion". That's considerably longer than I'm used to, but let's take a look at it. As I suspect you know, the challenge is the current rate of change. Trying to use 30-year-old datacenter technology is tough, and the rate of change is now much faster than it was even 15 years ago. A 1980 DC isn't worth much.

If we were to use 30 years, what we would get is along the lines of what you predicted, with IT costs going up and datacenter infrastructure going down, though power itself moves up only slightly as a fraction of the total. The 30-year infrastructure amortization is shown with the 10-year amortization in parentheses:

1) Servers & storage gear: 65% (57%)
2) Net gear: 9% (8%)
3) Power distribution: 10% (18%)
4) Power: 15% (13%)
5) Other: 2% (4%)

Using a 30-year amortization has a dramatic impact on overall costs, but the change in the percentages is less dramatic. The reason I've moved from a 15-year amortization for infrastructure to a 10-year one is the current rate of change. Things are improving so fast that I suspect the residual value of a 10+ year old facility won't be all that high. Thanks.
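For a rough sense of the sensitivity being discussed here, a minimal sketch of how the amortization period changes the monthly infrastructure charge at a 5% cost of money (the capital figure is a placeholder, not the model's actual input):

def monthly_charge(capital, lifetime_years, annual_rate=0.05):
    # Standard annuity payment: monthly cost of capital amortized over its lifetime.
    r = annual_rate / 12.0
    n = lifetime_years * 12
    return capital * r / (1.0 - (1.0 + r) ** -n)

facility_capital = 88_000_000  # assumed placeholder

for years in (10, 15, 30):
    print(f"{years:2d}-year amortization: ${monthly_charge(facility_capital, years):,.0f}/month")
# roughly $0.93M, $0.70M, and $0.47M per month respectively

Stretching the amortization from 10 to 30 years roughly halves the monthly infrastructure charge, which lowers the infrastructure share of the bill and raises the server, networking, and power percentages, consistent with the numbers above.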

--jrh
James Hamilton, jrh@mvdirona.com
Monday, September 27, 2010 10:15:34 AM (Pacific Standard Time, UTC-08:00)
James,

When they say that "power is the dominant cost", could it be that folks are somewhat confused and are really pointing at the fact that power is now the bounding resource in the data center (as opposed to, say, space)?
Chad Walters
Monday, September 27, 2010 1:00:54 PM (Pacific Standard Time, UTC-08:00)
Yeah, I think you're likely correct in that guess, Chad. Although some people who say "power is the dominant cost" are still optimizing for space consumption. Go figure :-)

--jrh
Wednesday, September 29, 2010 6:36:23 PM (Pacific Standard Time, UTC-08:00)
Hi James,

First off, excellent article; I found it very insightful. I'm a little curious whether you can give me some insight into the kind of revenue that the data center in your model could potentially generate, say per square foot of area or on a per-server basis. I know a lot of factors come into play when talking about revenue, but I haven't come across a generalized model, which I think could be quite useful. Any insight will be greatly appreciated!
David
Saturday, October 02, 2010 10:37:31 AM (Pacific Standard Time, UTC-08:00)
It varies so greatly depending upon whether it's used as a colo, platform provider, private data center, etc. It's just really hard to offer general rules on revenue. The reason you haven't found a revenue model thus far is that it's really very application- and use-specific.

--jrh
jrh@mvdirona.com
Sunday, October 03, 2010 8:23:15 PM (Pacific Standard Time, UTC-08:00)
Hi James,

Excellent post, thanks. I am just curious (apologies if you have answered this before; I am new to your blog), but what would be a healthy rule of thumb for network admin salaries and/or "people" compensation, etc., versus the total cost of running a data center, as a percentage (are you suggesting 2%, as mentioned to Wayne above?). Also, what about bandwidth costs/security/hardware replacement (or is that bundled into your assumptions above)?

Pardon me if this is off topic (I do mean that), I am also curious to know what you personally would do with a vacant building in a downtown core location that has ~3MW of utility power and an abundance of diverse bandwidth...?


David
Monday, October 04, 2010 5:09:51 AM (Pacific Standard Time, UTC-08:00)
This doesn't include admin staff costs. The challenge here is that highly heterogeneous enterprise datacenters with a large number of different applications can have very high admin costs. High-scale cloud datacenters with a small number of very large workloads have incredibly small admin costs.

Generally, at scale, admin costs round to zero. Even well run medium-sized deployments get under 10% of overall costs and I've seen them as low as 3%.

The data here covers everything below the O/S, so it doesn't include people and doesn't include software licensing.

What would I do with an incredibly well-connected 3MW building downtown? The key question is: downtown where? But generally, that's big enough to be broadly interesting. Roughly 300 racks at 10kW a rack. I would likely lease it out. If you have good networking and power, it has value as a datacenter. If it's in downtown NY, it's incredibly valuable. If it's in the San Francisco/Silicon Valley area, it's less valuable, but it'll still move very quickly.

As always with real estate, location matters. Having good networking and power makes it an ideal datacenter candidate.

--jrh
Monday, October 04, 2010 12:05:43 PM (Pacific Standard Time, UTC-08:00)
Thanks for your feedback James.

Our building is located in Toronto, Canada.
David
Thursday, October 07, 2010 4:50:54 AM (Pacific Standard Time, UTC-08:00)
Hello James, very interesting article. I have a few questions regarding an appropriate pricing model and some technical details that I would like to ask.

We can observe that many companies running data centers generate around 20% of their profits by re-invoicing peak electricity consumption to their clients. Therefore, datacenter companies aren't interested in creating green datacenters and optimizing factors such as energy-efficient servers, PDUs, cooling systems, and so on, because doing so would reduce electricity consumption and, as a result, reduce the datacenter company's revenue.
Can you explain what kind of pricing model should be applied for green datacenters?
Should a datacenter company charge a higher percentage on the re-invoiced electricity consumption?
Should datacenters sell not the entire rack to clients but instead charge per node, or per user and space, since this would make it easier to calculate the cost of investment and maintenance?

What is your experience with the Intel Intelligent Power Node Manager?
http://software.intel.com/sites/datacentermanager/faqs.php

Dell and HP have cloud computing solutions in their product portfolios. What is your experience with the Dell C6100 and the HP Z6000 with 2x170h 6G servers?

Thank you in advance for your help.
RP
Rafael P
Saturday, October 09, 2010 5:49:27 AM (Pacific Standard Time, UTC-08:00)
The billing model I like is, predictably given where I work, the Amazon pay-for-what-you-use model. When billing for compute usage by the hour, we are very motivated to reduce energy consumption since it directly reduces our costs. The Amazon EC2 charge model is at: http://aws.amazon.com/ec2/

James Hamilton
jrh@mvdirona.com