Earlier this week I got a thought provoking comment from Rick Cockrell in response to the posting: 32C (90F) in the Data Center. I found the points raised interesting and worthy of more general discussion so I pulled the thread out from the comments into a separate blog entry. Rick posted:
Guys, to be honest I am in the HVAC industry. Now, what the Intel study told us is that yes this way of cooling could cut energy use, but what is also said is that there was more than a 100% increase in server component failure in 8 months (2.45% to 4.46%) over the control study with cooling... Now with that said if anybody has been watching the news lateley or Wall-e, we know that e-waste is overwhlming most third world nations that we ship to and even Arizona. Think?I see all kinds of competitions for energy efficiency, there should be a challenge to create sustainable data center. You see data centers use over 61 billion kWh annually (EPA and DOE), more than 120 billion gallons of water at the power plant (NREL), more than 60 billion gallons of water onsite (BAC) while producing more than 200,000 tons of e-waste annually (EPA). So for this to be a fair game we can't just look at the efficiency. It's SUSTAINABILITY!It would be easy to just remove the mechanical cooling (I.E. Intel) and run the facility hotter, but the e-waste goes up by more than 100% (Intel Report and Fujitsu hard drive testing), It would be easy to not use water cooled equipment, to reduce water onsite use but the water at the power plant level goes up, as well as the energy use. The total solution has to be a solution of providing the perfect environment, the proper temperatures, while reducing e-waste.People really need to do more thinking and less talking. There is a solution out there that can do almost everything that needs to be done for the industry. You just have to look! Or maybe call me I'll show you.
Rick, you commented that “it’s time to do more thinking and less talking” and argued that the additional server failures seen in the Intel report created 100% more ewaste so simply wouldn’t make sense. I’m willing to do some thinking with you on this one.
I see two potential issues with your assumption. The first that the Intel report showed “100% more ewaste”. What they saw in a 8 rack test is server mortality rate of 4.46% whereas their standard data centers were 3.83%. This is far from double and with only 8 racks may not be statistically significant. Further evidence that the difference may not be significant we see that the control experiment where they had 8 racks in the other half of the container running on DX cooling showed failure rates of 2.45%. It may be noise given that the control differed from the standard data center by about as much as test data set. And, it’s a small sample.
Let’s assume for a second that the increase in failure rates actually was significant. Neither the investigators or I are convinced this is the case but let’s make the assumption and see where it takes us. They have 0.63% more than their normal data centers and 2.01% more than the control. Let’s take the 2% number and think it through assuming these are annualized numbers. The most important observation I’ll make is that 85% to 90% of servers are replaced BEFORE they fail which is to say that obsolescence is the leading cause of server replacement. They no longer are power efficient and get replaced after 3 to 5 years. If I could save 10% of the overall data center capital expense and 25%+ of the operating expense at the cost of having an additional 2% in server failures each year. Absolutely yes. Further driving this answer home, Dell, Rackable, and ZT Systems will replace early failures if run under 35C (95F) on warranty.
So, the increased server mortality rate is actually free during the warranty period but let’s ignore that and focus on what’s better for the environment. If 2% of the servers need repair early and I spend the carbon footprint to buy replacement parts but saving 25%+ of my overall data center power consumption, is that a gain for the environment? I’ve not got a great way to estimate true carbon footprint of repair parts but it sure looks like a clear win to me.
On the basis of the small increase in server mortality weighed against the capital and operating expense savings, running hotter looks like a clear win to me. I suspect we’ll see at least a 10F average rise over the next 5 years and I’ll be looking for ways to make that number bigger. I’m arguing it’s a substantial expense reduction and great for the environment.
James Hamilton, Amazon Web Services
1200, 12th Ave. S., Seattle, WA, 98144W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 | firstname.lastname@example.org
H:mvdirona.com | W:mvdirona.com/jrh/work | blog:http://perspectives.mvdirona.com
Disclaimer: The opinions expressed here are my own and do not
necessarily represent those of current or past employers.