The pace of innovation in data center design has been rapidly accelerating over the last 5 years, driven by the mega-service operators. In fact, I believe we have seen more infrastructure innovation in the last 5 years than we did in the previous 15. Most very large service operators have teams of experts focused on server design, data center power distribution and redundancy, mechanical designs, real estate acquisition, and network hardware and protocols. But much of this advanced work is unpublished and practiced at a scale that is hard to duplicate in a research setting.
At low scale, with only a data center or two, it would be crazy to have all these full-time engineers and specialists focused on infrastructural improvements and expansion. But at high scale, with 10s of data centers, it would be crazy not to invest deeply in advancing the state of the art.
Looking specifically at cloud services, the difference between an unsuccessful cloud service and a profitable, self-sustaining business is the cost of the infrastructure. With continued innovation driving down infrastructure costs, there is investment capital available, services can be added and improved, and value can be passed on to customers through price reductions. Amazon Web Services, for example, has had 11 price reductions in 4 years. I don’t recall that happening in my first 20 years working on enterprise software. It really is an exciting time in our industry.
Facebook is a big business operating at high scale, and they too have elected to invest in advanced infrastructure designs. Jonathan Heiliger and the Facebook infrastructure team have hired an excellent group of engineers over the past couple of years and are now bringing these designs to life in their new Prineville, Oregon facility. I had the opportunity to visit this data center 6 weeks back, just before it started taking production load. I had an excellent visit, got to catch up with some old friends, meet some new ones, and tour an impressive facility. I saw an unusually large number of elegant designs, ranging from one of the cleanest mechanical systems I’ve come across, three-phase 480VAC directly to the rack, and a low-voltage direct current distributed uninterruptible power supply system, all the way through to custom server designs. But what made this trip really unusual is that I’m actually able to talk about what I saw.
In fact, more than allowing me to talk about it, Facebook has decided to release most of the technical details surrounding these designs publicly. In the past, I’ve seen some super interesting but top secret facilities and I’ve seen some public but not particularly advanced data centers. To my knowledge, this is the first time an industry-leading design has been documented in detail and released publicly.
The set of specifications Facebook is releasing is worth reading, so I’m posting links to all of them below. I encourage you to go through these in as much detail as you choose. In addition, I’ll post summary notes over the next couple of days explaining the aspects of the design I found most interesting and commenting on the pros and cons of some of the approaches employed.
· Battery Cabinet (Distributed UPS)
My commendations to the specification authors Harry Li, Pierluigi Sarti, Steve Furuta, Jay Park, and to the rest of the Facebook infrastructure team for releasing this work publicly and for doing so in sufficient detail that others can build upon it. Well done.
· Open Compute Web Site: http://opencompute.org/
· Live Blog of the Announcement: http://www.insidefacebook.com/2011/04/07/live-blogging-facebooks-open-compute-project/