Why are there so many data centers in New York, Hong Kong, and Tokyo? These urban centers have some of the most expensive real estate in the world. The cost of labor is high. The tax environment is unfavorable. Power costs are high. Construction is difficult to permit and expensive. Urban datacenters are incredibly expensive facilities and yet a huge percentage of the world’s computing is done in expensive urban centers.
One of my favorite examples is the 111 8th Ave data center in New York. Google bought this datacenter for $1.9B. They already have facilities on the Columbia River where the power and land are cheap. Why go to New York when neither is true? Google is innovating in cooling technologies in their Belgium facility where they are using waste water cooling. Why go to New York where the facility is conventional, the power source predominantly coal-sourced, and the opportunity for energy innovation is restricted by legacy design and the lack of real estate available in the area around the facility? It’s pretty clear that 111 8th Ave isn’t going to be wind farm powered. A solar array could likely be placed on the roof but that wouldn’t have the capacity to run the interior lights in this large facility (See I love Solar but … for more on the space challenges of solar power at data center power densities). There isn’t space to do anything relevant along these dimensions.
Google has some of the most efficient datacenters in the world, running on some of the cleanest power sources in the world, and custom engineered from the ground up to meet their needs. Why would they buy an old facility, in a very expensive metropolitan area, with a legacy design? Are they nuts? Of course not. Google is in New York because many millions of Google customers are in New York or nearby.
Companies site datacenters near the customers of those data centers. Why not serve the planet from Iceland where the power is both cheap and clean? When your latency budget to serve customers is 200 msec, you can’t give up ¾ of that time budget on speed of light delays traveling long distances. Just crossing the continent from California to New York is a 74 msec round trip time (RTT). New York to London is 70 msec RTT. The speed of light is unbending. Actually, it’s even worse than the speed of light in that the speed of light in a fiber is about 2/3 of the speed of light in a vacuum (see Communicating Beyond the Speed of Light).
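To make the arithmetic concrete, here is a minimal sketch of the physical lower bound on round trip time over fiber, using the 2/3 figure above. The function names and the 4,000 km path length are my own illustrative assumptions (a round number, not a measured route), and the 200 msec budget is the one used in this post.

```python
# Lower bound on round trip time over fiber, from distance alone.
# Assumptions (not from any measured network): the path is a straight
# fiber run of the given length, with no routing or equipment delay.

C_VACUUM_KM_PER_S = 299_792.458   # speed of light in a vacuum
FIBER_FRACTION = 2.0 / 3.0        # light in fiber travels at roughly 2/3 of c


def min_fiber_rtt_ms(distance_km: float, fiber_fraction: float = FIBER_FRACTION) -> float:
    """Best-case round trip time in milliseconds for a fiber path of distance_km."""
    one_way_s = distance_km / (C_VACUUM_KM_PER_S * fiber_fraction)
    return 2.0 * one_way_s * 1000.0


def budget_fraction(rtt_ms: float, budget_ms: float = 200.0) -> float:
    """Share of a latency budget consumed by one round trip."""
    return rtt_ms / budget_ms


if __name__ == "__main__":
    # Hypothetical 4,000 km path, roughly continental scale.
    rtt = min_fiber_rtt_ms(4000.0)
    print(f"best-case RTT over 4,000 km of fiber: {rtt:.1f} msec")
    # The 74 msec coast-to-coast RTT quoted above, against a 200 msec budget.
    print(f"share of a 200 msec budget at 74 msec RTT: {budget_fraction(74.0):.0%}")
```

The best case for the assumed 4,000 km path comes out to about 40 msec, well under the 74 msec quoted above. Real routes are presumably longer than a straight line and add equipment delay, so the physics sets a floor rather than a prediction.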
Because of the cruel realities of the speed of light, companies must site data centers where their customers are. That’s why companies selling world-wide often need to have datacenters all over the world. That’s why the Akamai content distribution network has over 1,200 points of presence world-wide. To serve customers competitively, you need to be near those customers. The reason datacenters are located in Tokyo, New York, London, Singapore and other expensive metropolitan locations is they need to be near customers or near data that is in those locations. It costs a considerable amount to maintain datacenters all over the world but there is little alternative.
Many articles recently have been quoting the Greenpeace open letter asking Ballmer, Bezos and Cook to “go to Iceland”. See for example Letter to Ballmer, Bezos, and Cook: Go to Iceland. Having come across many of these articles recently, it seemed worth stopping and reflecting on why this hasn’t already happened. It’s not like companies just love paying more or using less environmentally friendly power sources for their data centers.
Google is in New York because it has millions of customers in New York. If it were physically possible to serve these customers from an already built, hyper efficient datacenter like Google Dalles, they certainly would. But that facility is 70 msec round trip away from New York. What about Iceland? Roughly the same distance. It simply doesn’t work competitively. Companies build near their users because the physics of the speed of light is unbending and uncaring.
So, what can we do? It turns out that many workloads are not latency sensitive. The right strategy is to house latency sensitive workloads near customers or the data needed at low latency and house latency insensitive workloads optimizing on other dimensions. This is exactly what Google does but, to do that, you need to have many datacenters all over the world so the appropriate facility can be selected on a workload-by-workload basis. This isn’t a practical approach for many smaller companies with only 1 or 2 datacenters to choose from.
This is another area where cloud computing can help. Cloud computing can allow mid-sized and even small companies to have many different datacenters optimized for different goals all over the world. Using Amazon Web Services, a company can house workloads near customers in Singapore, Tokyo, Brazil, and Ireland to be close to their international customers. Being close to these customers makes a big difference in the overall quality of customer experience (see: The Cost of Latency for more detail on how much latency really matters). As well as allowing a company to cost effectively have an international presence, cloud computing also allows companies to make careful decisions on where they locate workloads in North America. Again using AWS as the example, customers can place workloads in Virginia to serve the east coast or use Northern California to serve the population dense California region. If the workloads are not latency sensitive or are serving customers near the Pacific Northwest, they can be housed in the AWS Oregon region where the workload can be hosted coal free and less expensively than in Northern California.
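The workload-by-workload selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Site` type, the `place` function, the site names, and the relative cost numbers are all hypothetical, as is the 5 msec in-metro figure. The 70 msec figure is the Dalles to New York round trip quoted earlier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Site:
    name: str
    rtt_ms: float          # round trip time from this site to the workload's users
    relative_cost: float   # hypothetical unitless cost (power, land, and so on)


def place(sites: Sequence[Site], max_rtt_ms: Optional[float]) -> Site:
    """Pick the cheapest site that meets the workload's latency limit.

    max_rtt_ms=None marks a latency insensitive workload, so every site
    is eligible and cost alone decides.
    """
    eligible = [s for s in sites if max_rtt_ms is None or s.rtt_ms <= max_rtt_ms]
    if not eligible:
        raise ValueError("no site meets the latency limit")
    return min(eligible, key=lambda s: s.relative_cost)


# Illustrative inputs for users in New York. Only the 70 msec value comes
# from the post; the rest are made-up placeholders.
SITES = [
    Site("metro-new-york", rtt_ms=5.0, relative_cost=3.0),
    Site("oregon", rtt_ms=70.0, relative_cost=1.0),
]

if __name__ == "__main__":
    print(place(SITES, max_rtt_ms=20.0).name)   # latency sensitive workload
    print(place(SITES, max_rtt_ms=None).name)   # latency insensitive workload
```

With these inputs the latency sensitive workload lands in the expensive metro site and the insensitive one goes to the cheaper site, which is the split the strategy calls for.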
The reality is that physics is uncaring and many workloads do need to be close to users. Cloud computing allows all companies to have access to datacenters all over the world so they can target individual workloads to the facilities that most closely meet their goals and the needs of their customers. Some computing will have to stay in New York even though it is mostly coal powered, expensive, and difficult to expand. But some workload will run very economically in the AWS West (Oregon) region where there is no coal power, expansion is cheap, and power inexpensive.
Workload placement decisions are more complex than “move to Iceland.”
–jrh
James Hamilton
e: jrh@mvdirona.com
w: http://www.mvdirona.com
b: http://blog.mvdirona.com / http://perspectives.mvdirona.com
9.27.2017
Hi
Interesting overview of data center locations!
In your opinion how would your criteria enhance known outer space station development & as we continue to explore the outer reaches of space or the unexplored depths of our oceans??
Would you agree that without energy not much of anything would function??
How do you feel the continued use of satellites would enter into your equation??
There’s a big ball of endless free energy we call the sun whose full potential we have not yet experienced, yet we entertain ourselves with its magnificence & chase our tails with other events!
Why should we continue to enslave ourselves when we can do it better???
Joran, wherever it is possible to run the client asynchronously, I agree we should. But this isn’t possible on all workloads. Internet search is one example. It’s difficult to anticipate what query I might enter next, it’s hard to cache an interesting portion of the query space locally, and once a user starts to enter the search string, it is desirable to be able to start popping results up locally.
I suppose that if a client was at least often connected at very high bandwidth and it were possible to maintain a VERY large local cache, the search could be done locally. But this requires a very substantial client and lots of bandwidth. Not practical on most clients today and would likely negatively impact battery life.
I don’t see an easy solution to proximity being good. Where you need access to users or data at low latency, there really is no substitute for being near. You are right that not all applications need this but many do.
–jrh
Thank you for your insight.
All of that is true to the extent that the user experience is improved by reducing latency.
But what if we focused on making the quality of the user experience orthogonal to latency?
i.e. this is something that could be solved in software, e.g. see the history of networking for multiplayer games like Quake and how they decoupled latency from user experience at the software layer.
John, your first 2 make perfect sense. The 3rd (near datacenter technicians) is doubtless a factor but not one supported by careful cost accounting. I’m not saying it doesn’t happen, only that it doesn’t make sense. The only folks that need to be close to a data center are datatechs doing h/w installs and repairs and security, neither of which is a sufficiently large expense to motivate using expensive real estate and power. The power and real estate dominate by a large margin.
-jrh
James,
I am sure that some of the reasons you cite for data centres being located in the main capital cities of the world are true, however I know of one data centre being located where it was (close to Heathrow in London) because it happened to be mid way between the homes of the IT director, the Head of Data Centre services and the Head of Desktop Services for the company that leased the space. A lot of location choices are made that have nothing to do with technology issues.
One DC owner said to me, of a DC in prime office space that they were moving, that the move was a no brainer: why pay £x per square foot to host servers and communications equipment when the space could house high earning city traders?
Data Centres have three main selection criteria, the first interconnectivity, the second availability and cost of power, the third cost and skill sets of IT personnel.
When the new Icelandic interconnector (http://www.emeraldnetworks.com/about-us/) is complete, with its low latency, we may see a bit of a change in thinking.
Thanks for the data point Anonymous Coward.
Datacenters are in New York and other expensive metro areas because they need to be. Iceland can’t serve many workloads because it is too far from the customers and too far from the data.
–jrh
$2 billion for 3 million square feet of office space is about $666 / square foot. I’ve never looked at the commercial real estate market, but $1000 / square foot is the norm for the residential market in New York City, so it doesn’t seem too far off to me.
Yes, I’m 100% sure that the reason 111 8th is a valuable facility is that it’s a super well connected data center. That’s why it’s worth $1.9B.
Matt, you mentioned that some folks might put their datacenter in downtown NY to avoid needing a full time datacenter person. I agree that it is possible that some people might come to that conclusion. But the cost of a datatech is pretty close to irrelevant when compared to the operating cost of even a fairly small facility. You could save enough by locating where land and power are cheap to pay for 100s of datatechs. People might make that decision but the math doesn’t even come close to working at scale.
The reason to be in New York is to be close to customers and/or close to the data. Buildings like 111 8th are very well connected.
–jrh
Are you sure? I don’t believe that google has any systems beyond their office IT in 111 8th.
*Other* people do, but I don’t think google does. They have multiple floors of the building as offices and cafeteria and such, and it’s not a small place, being both 1/4 mile long and 1/20th of a mile wide.
Internap is in there, as well as Sprint, IIRC. It has a ton of data connections, and peering is cheap there, so connecting to bandwidth is easy for Google, but they needed a campus in NYC. It’s the second largest IIRC.
Now, reasons to be near NY vs upstate? Bandwidth & fiber capacity.
Upstate may be nice, and cheaper, but Internet traffic has to follow the existing backbones.
Also, a bunch of other companies host in NYC because they are located in NYC, which means they don’t need a full time datacenter person. That’s smaller sites, but until you start building your own DCs, it’s a factor.
Paul, you are right that, for end users, upstate NY or Jersey is just about identical to NY city. But Iceland is not a viable option for these users. And, for high frequency trading and other low latency dependent applications, even across town may be too "far" as the photon flies.
–jrh
Rick, you are right that office space in NY is valuable and there are many offices in 111 8th. But what makes it worth nearly $2B is that it’s very well connected at very low latency. Other than that, it’s just a valuable NY office building.
–jrh
I get it and I don’t. Sure, if you locate in the Dalles you pay the speed-of-light-tax. But what if you locate in upstate New York? Latency takes up hardly any of your time budget, and the land price/regulatory overheads plummet. I’d always heard that the reason you got network facilities in big cities is that they are where (for historical reasons as much as anything) the major cross-ocean, and cross-country data lines interconnect. Yes? no? Maybe so?
111 8th Ave may perhaps be a datacenter, but it’s an office building too and probably first. There are like 15+ ENORMOUS floors to house employees. Floors the size of NY city blocks.
I’m not privy to the reasons behind Google’s real estate decisions but the company has thousands of current and potential employees in New York that they need to support, and this is a good location. A/C/E hits west side. L hits Queens and East side. Also 14th st location is close to the PATH.
I know because I work at BN.com which is on the 9th floor of the building, and we don’t work in a datacenter :) My understanding is Google is trying to slowly take over the existing tenant contracts they inherited and refit it for Googlers.
Spotify is in the same building.