Tuesday, May 05, 2009

High data center temperatures are the next frontier for server competition (see pages 16 through 22 of my Data Center Efficiency Best Practices talk: http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_Google2009.pdf and 32C (90F) in the Data Center). At higher temperatures the difference between good and sloppy mechanical designs is much more pronounced and needs to be a purchasing criterion.


The infrastructure efficiency gains of running at higher temperatures are obvious. In a typical data center, 1/3 of the power arriving at the property line is consumed by cooling systems. Large operational expenses can be avoided by raising the temperature set point. In most climates, raising data center set points to the 95F range will allow a facility to move to a pure air-side economizer configuration, eliminating 10% to 15% of the overall capital expense, with the latter number being the more typical.
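
To put rough numbers on that, here is a minimal sketch of the arithmetic. The 1/3 cooling share and the 10% to 15% capital range come from the paragraph above; the facility size, electricity price, and build cost are hypothetical values chosen only for illustration.

```python
HOURS_PER_YEAR = 8760

def annual_cooling_cost_usd(facility_kw, cooling_fraction, usd_per_kwh):
    """Yearly energy cost of the cooling share of facility power."""
    cooling_kw = facility_kw * cooling_fraction
    return cooling_kw * HOURS_PER_YEAR * usd_per_kwh

def capital_avoided_usd(build_cost_usd, fraction):
    """Capital expense avoided if a given fraction of the build is eliminated."""
    return build_cost_usd * fraction

# Hypothetical facility: 15 MW at the property line, $0.07/kWh, $200M build cost.
cooling_cost = annual_cooling_cost_usd(15_000, 1 / 3, 0.07)
capex_low = capital_avoided_usd(200e6, 0.10)
capex_high = capital_avoided_usd(200e6, 0.15)
```

The cooling figure is the energy spend at stake, not the savings: an air-side economizer still needs fan power, so the real reduction is smaller than the full amount.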


These savings are substantial and exciting. But there are potential downsides: 1) increased server mortality, 2) higher semiconductor leakage current at higher temperatures, 3) increased air movement costs driven by higher fan speeds at higher temperatures. The first, increased server mortality, has very little data behind it. I’ve seen some studies that confirm higher failure rates at higher temperature and I’ve seen some that actually show the opposite. For all servers there clearly is some maximum temperature beyond which failure rates will increase rapidly. What’s unclear is what that temperature point actually is.


We also know that the knee of the curve where failures start to get more common is heavily influenced by the server components chosen and the mechanical design. Designs that cool more effectively will operate without negative impact at higher temperatures. We could try to understand all details of each server and try to build a failure prediction model for different temperatures, but this task is complicated by the diversity of servers and components and the near complete lack of data at higher temperatures.


So, not being able to build a model, I chose to lean on a different technique that I’ve come to prefer: incent the server OEMs to produce the models themselves. If we ask the server OEMs to warrant the equipment at the planned operating temperature, we’re giving the modeling problem to the folks that have both the knowledge and the skills to model the problem faithfully and, much more importantly, they have the ability to change designs if they aren’t faring well in the field. The technique of transferring the problem to the party most capable of solving it and financially incenting them to solve it will bring success.


My belief is that this approach of transferring the risk, failure modeling, and field result tracking to the server vendor will control point 1 above (increased server mortality rate). We also know that the Telecom world has been operating at 40C (104F) for years (see NEBS), so clearly equipment can be designed to operate correctly at these temperatures and to last longer than current servers are kept in service. This issue looks manageable.


The second issue raised above was increased semiconductor leakage current at higher temperatures. This principle is well understood and certainly measurable. However, in the crude measurements I’ve seen, the increased leakage is lost in the noise of higher fan power losses. And the semiconductor leakage costs are dependent upon semiconductor temperature rather than air inlet temperature. Better cooling designs or higher air volumes can help prevent substantial increases in actual semiconductor temperatures. Early measurements with current servers suggest that this issue is minor, so I’ll set it aside as well.


The final issue is hugely important and certainly not lost in the noise. As server temperatures go up, the required cooling air flow will increase. Moving more air consumes more power and, as it turns out, air is an incredibly inefficient fluid to move. More fan speed is a substantial and very noticeable cost. What this tells us is that the savings of higher temperature will get eaten up, slowly at first and more quickly as the temperature increases, until some crossover point where fan power increases dominate conventional cooling system operational costs.
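
A rough sketch of why fan cost climbs so quickly, using two standard approximations that are not from the measurements discussed here: the sensible heat equation (airflow = heat / (density × specific heat × temperature rise)) and the fan affinity law that, for a given fan and airflow path, power scales with the cube of flow. The 300 W server, the 10 W baseline fan draw, and the assumption of a fixed 50C exhaust limit are all hypothetical simplifications.

```python
AIR_DENSITY = 1.2           # kg/m^3, approximate value for air near room temperature
AIR_SPECIFIC_HEAT = 1005.0  # J/(kg*K), approximate

def required_airflow_m3s(heat_w, delta_t_k):
    """Volumetric airflow needed to carry heat_w away at a given air temperature rise."""
    return heat_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_k)

def fan_power_w(base_power_w, base_flow, new_flow):
    """Fan affinity approximation: power scales with the cube of the flow ratio."""
    return base_power_w * (new_flow / base_flow) ** 3

# Hypothetical 300 W server with exhaust held at 50C (a simplifying assumption).
flow_at_25c = required_airflow_m3s(300.0, 50.0 - 25.0)  # 25 K rise available
flow_at_35c = required_airflow_m3s(300.0, 50.0 - 35.0)  # only 15 K rise available
fan_w_at_35c = fan_power_w(10.0, flow_at_25c, flow_at_35c)
```

Under these assumptions, moving the inlet from 25C to 35C (95F) raises the required airflow by a factor of 25/15 and the fan power by roughly 4.6 times. Real servers will differ, which is exactly where mechanical design quality shows up.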


Where is the knee of the curve where increased fan power crosses over and dominates the operational savings of running at higher temperatures? Well, like many things in engineering, the answer is “it depends.” But it depends in very interesting ways. Poor mechanical designs built by server manufacturers who think mechanical engineers are a waste of money will be able to run perfectly well at 95F. Even I’m a good enough mechanical engineer to pass this bar. The trick is to put a LARGE fan in the chassis and move lots of air. This approach is very inefficient and wastes much power, but it’ll work perfectly well at cooling the server. The obvious conclusion is that points 1 and 2 above really don’t matter. We clearly CAN use 95F approach air to cool servers and maintain them at the same temperature they run today, which eliminates server mortality issues and potential semiconductor leakage issues. But eliminating these two issues with a sloppy mechanical design will be expensive and waste much power.


A well designed server with careful part placement, good mechanical design, and careful impeller selection and control will perform incredibly differently from a poor design. The combination of good mechanical engineering and intelligent component selection can allow a server to run at 95F at a nominal increase in power due to higher air movement requirements. A poorly designed system will be expensive to run at elevated temperatures. This is a good thing for the server industry because it’s a chance for them to differentiate and compete on engineering talent rather than all building the same thing and chasing the gray box server cost floor.


In past postings, I’ve said that server purchases should be made on the basis of work done per dollar and work done per joule (see slides at http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_Google2009.pdf). Measure work done using your workload or a kernel of your workload or a benchmark you feel is representative of your workload. When measuring work done per dollar and work done per joule (one watt for one second), do it at your planned data center air supply temperature. Higher temperatures will save you big operational costs and, at the same time, measuring and comparing servers at high temperatures will show much larger differentiation between server designs. Good servers will be very visibly better than poor designs. And, if we all measure work done per joule (or just power consumption under load) at high inlet temperatures, we’ll quickly get efficient servers that run reliably at high temperature.
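
As a minimal sketch of that comparison: the class and field names below are hypothetical, the two servers and their numbers are invented for illustration, and "work per dollar" is interpreted here as throughput per purchase dollar, which is one reasonable reading rather than a fixed definition. The power figure is assumed to be the average draw measured under load at the planned inlet temperature.

```python
from dataclasses import dataclass

@dataclass
class ServerRun:
    name: str
    price_usd: float
    work_units: float    # e.g. requests completed during the run
    duration_s: float
    avg_power_w: float   # average draw under load at the planned inlet temperature

    def work_per_joule(self):
        # joules consumed = watts * seconds
        return self.work_units / (self.avg_power_w * self.duration_s)

    def throughput_per_dollar(self):
        return (self.work_units / self.duration_s) / self.price_usd

def rank_by_work_per_joule(runs):
    """Best work per joule first."""
    return sorted(runs, key=lambda r: r.work_per_joule(), reverse=True)

# Two hypothetical servers doing the same work at a 95F inlet; B's fans draw more power.
runs = [
    ServerRun("B", price_usd=2800.0, work_units=1.8e6, duration_s=600.0, avg_power_w=320.0),
    ServerRun("A", price_usd=3000.0, work_units=1.8e6, duration_s=600.0, avg_power_w=250.0),
]
ranked = rank_by_work_per_joule(runs)
```

In this made-up case the cheaper server wins on work per dollar but loses on work per joule once measured hot, which is the kind of differentiation the measurement is meant to expose.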


Make the server suppliers compete for work done per joule at 95F approach temperatures and the server world will evolve quickly. It’s good for the environment and is perhaps the largest and easiest to obtain cost reduction on the horizon.




James Hamilton, Amazon Web Services

1200, 12th Ave. S., Seattle, WA, 98144
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 |

H:mvdirona.com | W:mvdirona.com/jrh/work  | blog:http://perspectives.mvdirona.com


Tuesday, May 05, 2009 6:53:55 AM (Pacific Standard Time, UTC-08:00)
Tuesday, May 05, 2009 7:09:34 AM (Pacific Standard Time, UTC-08:00)
One more thing:

Unless a wholesale DC distribution approach is taken, eliminating as many on-board DC-DC conversions as possible, these high peak current, high frequency regulators and filters will be affected negatively by an increase in the ambient temperature. I think this is often overlooked, but the ESR and hysteresis losses do climb precipitously in higher ambient temperatures, and are prone to contributing to sudden and often unpredictable failures. It's not all about the semiconductors.

This used to be my bread and butter before I got into the stupid internet applications sector. I should have stayed in ATE test of power distribution and monitoring. Anyone need a guy?
Tuesday, May 05, 2009 8:30:17 AM (Pacific Standard Time, UTC-08:00)
Agree. Good servers have one conversion after the PSU these days. These can be efficient but, since folks don't make it a buying criterion, they often aren't. See this talk for more along these lines: http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_Google2009.pdf.

Thanks for the comment Alan.
Tuesday, May 05, 2009 8:54:31 AM (Pacific Standard Time, UTC-08:00)

What about the operating temperature for humans? With inlet temps at 95F (never mind the exhaust side), I can't imagine that would be too comfortable a temperature for anyone in the data center maintaining equipment. Also, if the fan speeds were to increase, that could create noise issues in the environment as well.
Tuesday, May 05, 2009 9:35:15 AM (Pacific Standard Time, UTC-08:00)
The data center is today an unpleasant workplace and, as you point out, it's getting worse. What I would recommend: classify the hot aisle as non-occupied and move to a remove-and-replace server service model. Pull the server from the rack, replace it with a blanking panel, and do the service work outside the raised floor area. Then replace it. We need to find ways to minimize the time techs need to spend on the floor.

James Hamilton
Tuesday, May 05, 2009 9:48:10 AM (Pacific Standard Time, UTC-08:00)
I applied for a job at an R&D org that was proposing the addition of predictive on-board DC-DC PF waveshape failure detection. Very advanced stuff that actually picked up the waveforms of the converters and monitored for inconsistencies in PF over time. I didn't get the job :(. Failure modes that may affect the main board, blade, or rack can be predicted by monitoring waveshapes and thus the implied load of the pre-filtered, pre-rectified, and regulated commutation waveform.

I hear that some of the blade configurations actually allow for multiple failures to accumulate over several server cards and racks, as the VMs are just dynamically reallocated, and then the rack is pulled after an accumulation of failed server cards and racks. No?
Tuesday, May 05, 2009 12:12:01 PM (Pacific Standard Time, UTC-08:00)
Alan, I suspect you are right that we might be able to predict some server failures using the techniques you outline, but some failures won't be predictable. The technique still could be interesting though.

You are correct that most high-scale services allow failures to accumulate and service in batches. Typically live migration isn't used since some failures are hard to predict and, once a failure has occurred, the server is down and there is nothing "live" left to migrate. But, as you suggest, batch repair is the common case.

James Hamilton
Wednesday, May 06, 2009 6:48:27 AM (Pacific Standard Time, UTC-08:00)
By higher fan speeds, I presume you mean both in the air plant, and in the servers themselves. I am guessing that a higher ambient temperature also reduces the temperature differentials (as various internal fans are likely to be temperature dependent); this will make convection cooling less efficient, which will add to the fan speed requirements.

Another guess: the failure rate of mechanical components degrades more with temperature than the failure rate of non-mechanical components. If this is true, and the server architecture is for centralized storage (not small disks in each server), the storage could presumably be put in an aisle that is cooled more or put in a separate room.
Alex Bligh
Wednesday, May 06, 2009 8:16:05 AM (Pacific Standard Time, UTC-08:00)
This post is focused on server efficiency at higher temperature but, yes, increased datacenter air flow is also required.

You're guessing that disks need to be stored remotely. That's not the case: disk case temps are typically specified in the 50C to 60C (122F to 140F) range.

James Hamilton
Thursday, May 07, 2009 11:32:44 PM (Pacific Standard Time, UTC-08:00)
"What I would recommend: classify the hot aisle as non-occupied and move to a remove-and-replace server service model."

I would go further if constructing on a greenfield. I would seal the hot aisle and convectively vent it directly upward through a chimney and out of the building. Fresh air would be directed into the building through a sidewall, through a filter, and into the cold aisle itself.

Moving outside air through a big pipe over short distances is reasonably cheap.
Chris Bock
Sunday, May 10, 2009 5:11:45 PM (Pacific Standard Time, UTC-08:00)
Another thing to remember is that you don't need to stay at 95°F or 104°F all the time. Even if you assume that you could hit a high of 104°F (40°C) on a summer afternoon, the outside temperature will generally fall quite a bit lower than that later in the day. Though you would want to avoid wild swings that happen too quickly, swings of 20+°F per day should easily be tolerated. In most data center locations, the high temperature will not exceed 80°F most days.

As mentioned in some blog posts (http://greentechnologyinsights.blogspot.com/2009/04/human-side-of-higher-data-center.html), you may be able to adjust your service practices to avoid equipment replacements during the hottest part of the hottest days. And in most locations, you don't even have to worry about this for 8+ months of the year, as the highest outside temperature is not too hot for extended work (though I agree with James that technician time on the data center floor should be minimized where possible). In fact, many locales will end up needing to mix exhaust air with outside air in the winter to maintain acceptable temperatures.

--Kevin Bross
Sunday, May 10, 2009 5:54:05 PM (Pacific Standard Time, UTC-08:00)
Yeah, if you want to control temperature, mix the outdoor air with exhaust. I largely don't think you need to do this as long as the outdoor air is above freezing, although exhaust mixing to 20C could make the datacenter more comfortable for those working in it. In hotter locales (e.g. Australia, where latency prevents better site placement) you could also throw an evaporator onto the inlet for use during heatwaves to bring the 40C air down a couple of degrees.

This stuff costs pennies compared to a chiller and it largely never breaks or requires maintenance.
Chris Bock
Monday, May 11, 2009 4:29:29 AM (Pacific Standard Time, UTC-08:00)
Thanks for the blog reference Kevin. Your suggestion to service during the cooler parts of the day is quite sensible. Early morning, before the building heat load has built, is likely a good time to service.

Chris, I'm 100% with you that the combination of exhaust recirc to control the low end and evap coolers for the high end can eliminate A/C. Thanks for the comment.

James Hamilton
Sunday, May 17, 2009 5:41:49 AM (Pacific Standard Time, UTC-08:00)
James, I was intrigued by your comment

"Moving more air consumes more power and, as it turns out, air is an incredibly inefficient fluid to move".

Maybe in the future we will have racks immersed in tanks of non-conducting fluids such as distilled water or other liquids. This could help with efficient cooling. The problem seems to be effectively moving the generated heat out of the data center.

I was also wondering whether people have used Stirling engines (energy efficient engines that run due to temperature differences) to cool parts of data centers.
Shiv Shankar
Comments are closed.

Disclaimer: The opinions expressed here are my own and do not necessarily represent those of current or past employers.

All Content © 2014, James Hamilton