Last week, Google hosted the Data Center Efficiency Summit. While there, I posted a couple of short blog entries with my rough notes:
· Data Center Efficiency Summit
· Rough Notes: Data Center Efficiency Summit
· Rough Notes: Data Center Efficiency Summit (posting #3)
In what follows, I summarize the session I presented and go into more depth on some of what I saw in sessions over the course of the day.
I presented Data Center Efficiency Best Practices at the 1pm session. My basic point was that PUEs in the 1.35 range are attainable without substantial complexity and without innovation. Good, solid design using current techniques, with careful execution, is sufficient to achieve this level of efficiency.
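For those new to the metric, PUE (Power Usage Effectiveness) is total facility power divided by power delivered to the IT equipment. A quick sketch of what a 1.35 PUE implies; the overhead numbers below are illustrative assumptions, not measured data:

```python
# PUE = total facility power / IT equipment power. A PUE of 1.35 means
# 0.35W of overhead (distribution losses + cooling) per watt of IT load.
def pue(it_power_kw, distribution_loss_kw, cooling_kw):
    total = it_power_kw + distribution_loss_kw + cooling_kw
    return total / it_power_kw

# Illustrative numbers (assumed, not Google's): 1,000kW of IT load,
# 100kW lost in power distribution, 250kW spent on cooling.
print(pue(1000, 100, 250))  # → 1.35
```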
In the talk, I went through power distribution from high voltage at the property line to 1.2V at the CPU and showed cooling from the component level to release into the atmosphere. For electrical systems, the talk covered an ordered list of rules to increase power distribution efficiency:
1. Avoid conversions (fewer transformer steps & efficient or no UPS)
2. Increase efficiency of conversions
3. High voltage as close to load as possible
4. Size voltage regulators (VRM/VRDs) to load & use efficient parts
5. DC distribution potentially a small win (regulatory issues)
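Rules 1 and 2 fall directly out of the arithmetic: overall distribution efficiency is the product of the per-stage conversion efficiencies, so every avoided or improved conversion compounds. A small sketch with assumed, illustrative stage efficiencies:

```python
# Overall distribution efficiency is the product of per-stage conversion
# efficiencies, so fewer and better conversions compound multiplicatively.
from functools import reduce

def chain_efficiency(stages):
    """Multiply the per-stage efficiencies of a power distribution chain."""
    return reduce(lambda acc, e: acc * e, stages, 1.0)

# Illustrative (assumed) stage efficiencies, not measured values:
conventional = [0.994, 0.94, 0.98, 0.98, 0.99]  # xfmr, UPS, PDU xfmr, ...
fewer_steps  = [0.994, 0.98, 0.99]              # fewer steps, better UPS

print(chain_efficiency(conventional))  # noticeably worse
print(chain_efficiency(fewer_steps))   # than the shorter chain
```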
Looking at mechanical systems, the talk pointed out the gains to be had by carefully moving to higher data center temperatures. Many server manufacturers, including Dell and Rackable, will fully stand behind their systems at inlet temperatures as high as 95F, so big gains are possible via elevated data center temperatures. The ordered list of mechanical systems optimizations recommended:
1. Raise data center temperatures
2. Tight airflow control, short paths, & large impellers
3. Cooling towers rather than chillers
4. Air-side economization & evaporative cooling
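Two small calculations sit behind these rules: the 95F inlet limit expressed in Celsius, and the fan affinity law (fan power scales with the cube of speed), which is why rule 2's larger, slower impellers pay off so quickly. A rough sketch:

```python
# The 95F server inlet limit in Celsius, and the fan affinity law:
# fan power scales with the cube of rotational speed, so modest speed
# reductions (bigger impellers moving the same air slower) save a lot.
def f_to_c(f):
    return (f - 32) * 5.0 / 9.0

def fan_power_ratio(speed_ratio):
    # Affinity law: P2/P1 = (N2/N1)**3
    return speed_ratio ** 3

print(f_to_c(95))            # → 35.0
print(fan_power_ratio(0.8))  # fans at 80% speed draw roughly half the power
```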
The slides from the session I presented are posted at: http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_Google2009.pdf.
The overall workshop was excellent. Google showed the details behind 1) the modular data center they built 4 years ago, covering both the container design and that of the building that houses them, 2) the river water cooling system employed in their Belgium data center, and 3) the custom Google-specific server design.
Modular DC: The modular data center was a 45-container design where each container was 222kW (roughly 780W/sq ft). The containers were housed in a fairly conventional two-floor facility. Overall, it was nicely executed, but all Google data centers built since this one have been non-modular, and each subsequent design has been more efficient than this one. The fact that Google has clearly turned away from modular designs is interesting. My read is that the design we were shown missed many opportunities to remove cost and optimize for the use of containers. The design chosen essentially built a well-executed but otherwise conventional data center shell using standard power distribution systems and standard mechanical systems. No part of the building itself was optimized for containers. Even though it was a two-level design, rather than just stacking containers, a two-floor shell was built. A 220-ton gantry crane further drove up costs, but the crane was not fully exploited by packing the containers in tight and stacking them.
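As a quick sanity check on the numbers above (my arithmetic, not Google's), the per-container power and density figures are consistent with a standard shipping-container footprint:

```python
# 222kW per container at roughly 780W/sq ft implies a footprint of about
# 285 sq ft, close to an 8ft x 40ft shipping container (320 sq ft).
container_kw = 222
watts_per_sqft = 780
sqft = container_kw * 1000 / watts_per_sqft
print(round(sqft))  # → 285

total_mw = 45 * container_kw / 1000  # 45 containers → ~10MW of IT load
print(total_mw)
```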
For a containerized model to work economically, the attributes of the container need to be exploited rather than merely installing the containers in a standard data center shell. Rather than building an entire facility with multiple floors, a much cheaper shell, if any at all, would be needed. The ideal would be a design where just enough concrete is poured to mount four container mounting bolts so the containers can be tied down to avoid wind damage. I believe the combination of not building a full shell, the use of free air cooling, and the elimination of the central mechanical system would allow containerized designs to be very cost effective. What we learn from the Google experiment is that the combination of a conventional data center shell and mechanical systems with containers works well (their efficiency data shows it to be very good) but isn't notably better than similar design techniques used with non-containerized designs.
River water cooling: The Belgium river water cooled data center caught my interest when it was first discussed a year ago. The Google team went through the design in detail. Overall, it's beautiful work, but it included a full water treatment plant to treat the water before using it. I like the design in that it's clearly better, both economically and environmentally, to clean and use river water than to take fresh water from the local utility. But the treatment plant itself represents a substantial capital expense and requires energy for operation. It's clearly an innovative way to reduce fresh water consumption. However, I slightly prefer designs that depend more deeply on free air cooling and avoid the capital and operational expense of the water treatment plant.
Custom Server: The server design Google showed was clearly a previous generation. It's a 2005 board, and I strongly suspect there exist subsequent designs at Google that haven't yet been shown publicly. I fully support this and think showing the previous-generation design publicly is a great way to drive innovation inside a company while contributing to the industry as a whole. I think it's a great approach, and the server that was shown last Wednesday was a very nice design.
The board is a 12V-only design. This has become more common of late, with IBM, Rackable, Dell, and others all doing it. However, when the board was first designed, this was considerably less common. 12V-only supplies are simpler, distributing the single voltage on-board is simpler and more efficient, and distribution losses are lower at 12V than at either 3.3V or 5V for a given sized trace. Nice work.
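The lower-loss claim is just Ohm's law: for a fixed power, current scales as 1/V, so the I²R trace loss scales as 1/V². A sketch with an assumed, purely illustrative trace resistance:

```python
# For fixed power through a fixed trace, current = P/V, so the
# I^2 * R loss scales as 1/V^2: 12V loses far less than 5V or 3.3V.
def trace_loss_watts(power_w, volts, resistance_ohms):
    current = power_w / volts
    return current ** 2 * resistance_ohms

R = 0.001  # 1 milliohm of trace resistance (an assumed value)
for v in (12.0, 5.0, 3.3):
    print(v, trace_loss_watts(100.0, v, R))  # loss grows as voltage drops
```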
Perhaps the most innovative aspect of the board design is the use of a distributed UPS. Each board has a 12V VRLA battery that can keep the server running for 2 to 3 minutes during power failures. This is plenty of time to ride through the vast majority of power failures and is long enough to allow the generators to start, come online, and sync. The most important benefit of this design is that it avoids the expensive central UPS system. It also avoids the losses of a central UPS (94% to 96% efficiency is very good for a central UPS, and most are considerably worse). Google reported their distributed UPS was 99.7% efficient. I like the design.
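Using the efficiency figures above, the arithmetic behind skipping the central UPS is striking: at a 1MW IT load, a 95%-efficient central UPS wastes roughly 17x the energy of a 99.7%-efficient distributed one. A rough sketch:

```python
# UPS overhead: to deliver it_load_kw through a UPS of the given
# efficiency, the utility must supply it_load_kw / efficiency, and the
# difference is lost as heat.
def ups_loss_kw(it_load_kw, efficiency):
    return it_load_kw / efficiency - it_load_kw

central = ups_loss_kw(1000, 0.95)       # ~52.6kW lost at 95% efficiency
distributed = ups_loss_kw(1000, 0.997)  # ~3.0kW lost at 99.7% efficiency
print(central, distributed, central / distributed)
```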
The motherboard was otherwise fairly conventional with a small level of depopulation. The second Ethernet port was deleted, as were USB and other components. I like the Google approach to server design.
The server was designed to be rapidly serviced, with the power supply, disk drives, and battery all Velcro-attached and easy to change quickly. The board itself looks difficult to change, but I suspect their newer designs will address that shortcoming.
Hats off to Google for organizing this conference to make high-efficiency data center and server design techniques more broadly available across the industry. Both the board and the data center designs shown in detail were not Google's very newest, but all were excellent and well worth seeing. I like the approach of showing previous-generation technology to the industry while pushing ahead with newer work. This technique allows a company to reap the potential competitive advantages of its R&D investment while at the same time being more open with the previous generation.
It was a fun event and we saw lots of great work. Well done Google.
James Hamilton, Amazon Web Services
1200, 12th Ave. S., Seattle, WA, 98144
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 | firstname.lastname@example.org
H:mvdirona.com | W:mvdirona.com/jrh/work | blog:http://perspectives.mvdirona.com
Disclaimer: The opinions expressed here are my own and do not
necessarily represent those of current or past employers.