I like Power Usage Effectiveness as a course measure of data center infrastructure efficiency.
In what follows, I give an overview of PUE, talk about some the issues I have with it as currently defined, and then propose some improvements in PUE measurement using a metric called tPUE.
What is PUE?
PUE is defined in Christian Belady’s Green Grid Data Center Power Efficiency Metrics: PUE and DCiE. It’s a simple metric and that’s part of why it’s useful and it’s the source of some of the sources of the flaws in the metric. PUE is defined to be
PUE = Total Facility Power / IT Equipment Power
Total Facility Power is defined to be “power as measured at the utility meter”. IT Equipment Power is defined as “the load associated with all of the IT equipment”. Stated simply, PUE is the ratio of the power delivered to the facility divided by the power actually delivered to the servers, storage, and networking gear. It gives us a measure of what percentage of the power actually gets to the servers with the rest being lost in the infrastructure. These infrastructure losses include power distribution (switch gear, uninterruptable power supplies, Power Distribution Units, Remote Power Plugs, etc.) and mechanical systems (Computer Room Air Handlers/Computer Room Air Conditioners, cooling water pumps, air moving equipment outside of the servers, chillers, etc.). The inverse of PUE is called Data Center Infrastructure Efficiency (DCiE):
DCiE = IT Equipment Power / Total Facility Power * 100%
So, if we have a PUE of 1.7 that’s a DCiE of 59%. In this example, the data center infrastructure is dissipating 41% of the power and the IT equipment the remaining 59%.
This is useful to know in that allows us to compare different infrastructure designs and understand their relative value. Unfortunately, where money is spent, we often see metrics games and this is no exception. Let’s look at some of the issues with PUE and then propose a partial solution.
Issues with PUE
Total Facility Power: The first issue is the definition of total facility power. The original Green Grid document defines total facility power as “power as measured at the utility meter”. This sounds fairly complete at first blush but its not nearly tight enough. Many smaller facilities meter at 480VAC but some facilities meter at mid-voltage (around 13.2kVAC in North America). And a few facilities meter at high voltage (~115kVAC in North America). Still others purchase and provided the land for the 115kVAC to 13.2kVAC step down transformer layer but still meter at mid-voltage.
Some UPS are installed at medium voltage whereas others are at low (480VAC). Clearly the UPS has to be part of the infrastructure overhead.
The implication of the above observations is that some PUE numbers include the losses on two voltage conversion layers getting down to 480VAC, some include 1 conversion, and some don’t include any of them. This muddies the water considerably and makes small facilities look somewhat better than they should and it’s an just another opportunity to inflate numbers beyond what the facility can actually produce.
Container Game: Many modular data centers are built upon containers that take 480VAC as input. I’ve seen modular data center suppliers that chose to call the connection to the container “IT equipment” which means the normal conversion from 480VAC to 208VAC (or sometimes even to 110VAC) is not included. This seriously skews the metric but the negative impact is even worse on the mechanical side. The containers often have the CRAH or CRAC units in the container. This means that large parts of the mechanical infrastructure is being included under “IT load” and this makes these containers look artificially good. Ironically, the container designs I’m referring to here actually are pretty good. They really don’t need to play metrics games but it is happening so read the fine print.
Infrastructure/Server Blur: Many rack based modular designs use large rack levels fans rather than multiple inefficient fans in the server. For example, the Rackable CloudRack C2 (SGI is still Rackable to me :)) moves the fans out of the servers and puts them at the rack level. This is a wonderful design that is much more efficient than tiny 1RU fans. Normally the server fans are included as “IT load” but in these modern designs that move fans out of the servers, its considered infrastructure load.
In extreme cases, fan power can be upwards of 100W (please don’t buy these servers). This makes a data center running more efficient servers potentially have to report a lower PUE number. We don’t want to push the industry in the wrong direction. Here’s one more. The IT load normally includes the server Power Supply Unit (PSU) but in many designs such as IBM iDataPlex the individual PSUs are moved out of the server and placed at the rack level. Again, this is a good design and one we’re going to see a lot more of but it takes losses that were previously IT load and makes them infrastructure load. PUE doesn’t measure the right thing in these cases.
PUE less than 1.0: In the Green Grid document, it says that “the PUE can range from 1.0 to infinity” and goes on to say “… a PUE value approaching 1.0 would indicate 100% efficiency (i.e. all power used by IT equipment only). In practice, this is approximately true. But PUEs better than 1.0 is absolutely possible and even a good idea. Let’s use an example to better understand this. I’ll use a 1.2 PUE facility in this case. Some facilities are already exceeding this PUE and there is no controversy on whether its achievable.
Our example 1.2 PUE facility is dissipating 16% of the total facility power in power distribution and cooling. Some of this heat may be in transformers outside the building but we know for sure that all the servers are inside which is to say that at least 83% of the dissipated heat will be inside the shell. Let’s assume that we can recover 30% of this heat and use it for commercial gain. For example, we might use the waste heat to warm crops and allow tomatoes or other high value crops to be grown in climates that would not normally favor them. Or we can use the heat as part of the process to grow algae for bio-diesel. If we can transport this low grade heat and net only 30% of the original value, we can achieve a 0.90 PUE. That is to say if we are only 30% effective at monetizing the low-grade waste heat, we can achieve a better than 1.0 PUE.
Less than 1.0 PUE are possible and I would love to rally the industry around achieving a less than 1.0 PUE. In the database world years ago, we rallied around the achieving 1,000 transactions per second. The High Performance Transactions Systems conference was originally conceived with a goal of achiving these (at the time) incredible result. 1,000 TPS was eclipsed decades ago but HPTS remains a fantastic conference. We need to do the same with PUE and aim to get below 1.0 before 2015. A PUE less than 1.0 is hard but it can and will be done.
Christian Belady, the editor of the Green Grid document, is well aware of the issues I raise above. He proposes that it be replaced long haul by the Data Center Productivity (DCP) index. DCP is defined as:
DCP = Useful Work / Total Facility Power
I love the approach but the challenge is defining “useful work” in a general way. How do we come up with a measure of useful work that spans all interesting workloads over all host operating systems. Some workloads use floating point and some don’t. Some use special purpose ASICs and some run on general purpose hardware. Some software is efficient and some is very poorly written. I think the goal is the right one but there never will be a way to measure it in a fully general way. We might be able to define DCP for a given workload type but I can’t see a way to use it to speak about infrastructure efficiency in a fully general way.
Instead I propose tPUE which is a modification of PUE that mitigates some of the issues above. Admittedly it is more complex than PUE but it has the advantage of equalizing different infrastructure designs and allows comparison across workload types. Using tPUE, HPC facility can compare how they are doing against commercial data processing facilities.
tPUE standardizes where the total facility power is to be measured from and precisely where the IT equipment starts and what portions of the load are infrastructure vs server. With tPUE we attempt to remove some of the negative incentive to the blurring of the lines between IT equipment and infrastructure. Generally, this blurring is very good thing. 1RU fans are incredibly inefficient so replacing them with large rack or container level impellers is a good thing. Multiple central PSUs can be more efficient and so moving the PSU from the server out to the module or rack again is a good thing. We want a metric that measure the efficiency of these changes correctly. PUE, as currently designed, will actually show a negative “gain” in both examples.
We define as:
tPUE =Total Facility Power / Productive IT Equipment Power
This is almost identical to PUE. It’s the next level of definitions that are important. The tPUE definition of “Total Facility Power” is fairly simple. It’s power delivered to the medium voltage (~13.2kVAC) source prior to any UPS or power conditioning. Most big facilities are delivered at this voltage level or higher. Smaller facilities may get 480VAC delivered, in which case, this number is harder to get. We solve the problem by using a transformer manufacturer specified number if measurement is not possible. Fortunately, the efficiency numbers for high voltage transformers are accurately specified by manufacturers.
For tPUE the facility voltage must be actually measured at medium voltage if possible. If not possible, it is permissible to measure at low voltage (480VAC in North America and 400VAC in many other geographies) as long as the efficiency loss of the medium voltage transformer(s) is included. Of course, all measurements must be before UPS or any form of power conditioning. This definition permits using a non-measured, manufacturer-specified efficiency number for the medium voltage to low transformer but it does ensure that all measurements are using medium voltage as the baseline.
The tPUE definition of “Productive IT Equipment Power” is somewhat more complex. PUE measure IT load as the power delivered to the IT equipment. But, high scale data centers IT equipment are breaking the rules. Some have fans inside and some use the infrastructure fans. Some have no PSU and are delivered 12VDC by the infrastructure whereas most still have some form of PSU. tPUE “charges” all fans and all power conversions to the infrastructure component. I define “Productive IT Equipment Power” to be all power delivered to semiconductors (memory, CPU, northbridge, southbridge, NICs), disks, ASIC, FPGAs, etc. Essentially we’re moving the PSU losses, the voltage regulator down (VRD) and/or voltage regulator modules (VRM), and cooling fans from “IT load” to infrastructure. In this definition, infrastructure losses unambiguously includes all power conversions, UPS, switch gear, and other losses in distribution. And it includes all cooling costs whether they be in the server or not.
This hard part is how to measure tPUE. It achieves our goals of being comparable since everyone would be using the same definitions. And doesn’t penalize innovative designs that blur the conventional lines between server and infrastructure. I would argue we have a better metric but the challenge will be how to measure it? Will data center operators be able to measure it and track improvements in their facilities and understand how they compare with others?
We’ve discussed how to measure total facility power. The short summary is it must be measured prior to all UPS and power conditioning at medium voltage. If high voltage is delivered directly to your facility, you should measure after the first step down transformer. If your facility is delivered low voltage, then ask your power supplier whether it be the utility, the colo-facility owner, or your companies infrastructure group, the efficiency of the medium to low step down transformer at your average load. Add this value in mathematically. This is not perfect but it better than where we are right now when we look at a PUE.
At the low voltage end where we are delivering “productive IT equipment power” we’re also forced to use estimate with our measures. What we want to measure is the power delivered to individual components. We want to measure the power delivered to memory, CPU, etc. Our goal is to get power after the last conversion and this is quite difficult since VRDs are often on the board near the component they are supplying. Given that non-destructive power measurement at this level is not easy, we use an inductive ammeter on each conductor delivering power to the board. Then we get the VRD efficiencies from the system manufacturer (you should be asking for these anyway – they are an important factor in server efficiency). In this case, we often can only get efficiency at rated power and the actually efficiency of the VRD will be less in your usage. Nonetheless, we use this single efficiency number since it at least is an approximation and more detailed data is either unavailable or very difficult to obtain. We don’t include fan power (server fans typically run on a 12 volt rail). Essentially what we are doing is taking the definition of IT Equipment load used by the PUE definition and subtracting off VRD, PSU, and fan losses. These measurement needs to be taken at full server load.
The measurements above are not as precise as we might like but I argue the techniques will produce a much more accurate picture of infrastructure efficiency than the current PUE definitions and yet these metrics are both measurable and workload independent.
We have defined tPUE to be:
We defined total facility power to be measured before all UPS and power conditioning at medium voltage. And we defined Productive IT Equipment Power to be server power not including PSU, VRD and other conversion losses nor including fan or cooling power consumption.
Please consider helping to evangelize tPUE and use tPUE. And, for you folks designing and building commercial servers, if you can help by measuring the Productive IT Equipment Power for one or more of your SKUs, I would love to publish your results. If you can supply Productive IT Equipment Power measurement for one of your newer servers, I’ll publish it here with a picture of the server.
Let’s make the new infrastructure rallying cry achieving a tPUE<1.0.
James Hamilton, Amazon Web Services
1200, 12th Ave. S., Seattle, WA, 98144W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 | email@example.com
H:mvdirona.com | W:mvdirona.com/jrh/work | blog:http://perspectives.mvdirona.com
Disclaimer: The opinions expressed here are my own and do not
necessarily represent those of current or past employers.