Archive For The “Services” Category

Cost of Power in Large-Scale Data Centers

I’m not sure how many times I’ve read or been told that power is the number one cost in a modern mega-data center, but it has been a frequent refrain. And, like many stories that get told and retold, there is an element of truth to it. Power is absolutely the fastest-growing operational…

Read more »

Mesh Services Architecture and Concepts

Abolade Gbadegesin, Windows Live Mesh Architect, gave a great talk at the Microsoft Professional Developers Conference on Windows Live Mesh (talk video, talk slides). Live Mesh is a service that supports p2p file sharing amongst your devices, file storage in the cloud, remote access to all your devices (through firewalls and NATs), and web access…

Read more »

Monitoring at Scale

Service monitoring at scale is incredibly hard. I’ve long argued that you should never learn anything about a problem your service is experiencing from a customer. How could they possibly know first when there is a service outage or issue? And yet it happens frequently. The reason it happens is most sites don’t have close…

Read more »

Hotnets 2008 Paper

Albert Greenberg and I missed Hotnets 2008 last week due to a conflicting meeting down in California, but Ken Church was there to present our On Delivering Embarrassingly Distributed Cloud Services paper. I summarized the paper in a recent blog entry, Embarrassingly Distributed Cloud Services, and the abstract from the paper follows: Very large data…

Read more »

A Small Window into Google’s Data Centers

Google has long enjoyed a reputation for running efficient data centers. I suspect this reputation is largely deserved but, since it has been completely shrouded in secrecy, that’s largely been a guess built upon respect for the folks working on the infrastructure team rather than anything that’s been published. However, some of the shroud of…

Read more »

Embarrassingly Distributed Cloud Services

Ken Church, Albert Greenberg, and I just finished On Delivering Embarrassingly Distributed Cloud Services, which has been accepted for presentation at ACM Hotnets 2008 in Calgary, Alberta, October 6th and 7th. This paper followed from the discussion and debate around a blog entry that Ken and I did some time back: Diseconomies of scale where…

Read more »

Internet-Scale Service Efficiency

Earlier today, I gave a talk at LADIS 2008 (Large Scale Distributed Systems & Middleware) in Yorktown Heights, New York. The program for LADIS is at: http://www.cs.cornell.edu/projects/ladis2008/program.html. The slides presented are posted to: http://mvdirona.com/jrh/TalksAndPapers/JamesRH_Ladis2008.pdf. The quick summary of the talk: Hosted services will be a large part of enterprise information processing and consumer services with…

Read more »

Degraded Operations Mode

In Designing and Deploying Internet Scale Services I’ve argued that all services should expect to be overloaded and all services should expect mass failures. Very few do, and I see related downtime in the news every month or so. The Windows Genuine Advantage failure (WGA Meltdown…) from a year ago is a good example in…

Read more »

Facebook F8 Conference Notes

Facebook’s F8 conference was held last month in San Francisco. During his mid-day keynote, Mark Zuckerberg reported that the Facebook platform now has 400,000 developers and 90 million users, of which 32% are from the United States. The platform’s US user population grew 2.4x last year while the international population grew at an astounding 5.1x….

Read more »

Scaling at Lucasfilm

Kevin Clark, Director of IT Operations at Lucasfilm, was interviewed by On-Demand Enterprise in We’ve Come a Long Way Since Star Wars. His organization owns IT for LucasArts, Lucasfilm, and Industrial Light and Magic. Lucasfilm runs a 4,500-server dedicated rendering farm and they expand this farm with workstations when they are not in use…

Read more »

Geo-Replication at Facebook

Last Friday I arrived back from vacation (Back from the Outside Passage in BC) to 3,600 email messages. I’ve been slogging through them from the weekend until now and I’m actually starting to catch up. Yesterday Tom Kleinpeter pointed me to this excellent posting from Jason Sobel of Facebook: Scaling Out. This excellent post describes…

Read more »

Flickr DB Architecture

I’ve been collecting scaling stories for some time now and last week I came across the following rundown on Flickr scaling: Federation at Flickr: Doing Billions of Queries Per Day by Dathan Vance Pattishall, the Flickr database guy. The Flickr DB Architecture is sharded with a PHP access layer to maintain consistency. Flickr users…

Read more »

Google Megastore

What follows is a guest posting from Phil Bernstein on the Google Megastore presentation by Jonas Karlsson and Philip Zeyliger at SIGMOD 2008: Megastore is a transactional indexed record manager built by Google on top of BigTable. It is rumored to be the store behind Google AppEngine but this was not confirmed (or denied) at the…

Read more »

Facebook: Needle in a Haystack: Efficient Storage of Billions of Photos

Title: Needle in a Haystack: Efficient Storage of Billions of Photos
Speaker: Jason Sobel, Manager of the Facebook Infrastructure Group
Slides: http://beta.flowgram.com/f/p.html#2qi3k8eicrfgkv

An excellent talk that I really enjoyed. I used to lead a much smaller service that also used a lot of NetApp storage and I recognized many of the problems Jason mentioned. Throughout…

Read more »

Structure 2008: Put Cloud Computing to Work

Alex Mallet and Viraj Mody of the Windows Live Mesh team took great notes at the Structure ’08 (Put Cloud Computing to Work) conference (appended below). Some pre-reading information was made available to all attendees as well: Refresh the Net: Why the Internet needs a Makeover? Overall – Interesting mix of attendees from companies in…

Read more »

Google’s Dr. Kai-Fu Lee on Cloud Computing

John Breslin did an excellent job of writing up Kai-Fu Lee’s Keynote at WWW2008. John’s post: Dr. Kai-Fu Lee (Google) – “Cloud Computing”. There are 235m internet users in China and Kai-Fu believes they want:

1. Accessibility
2. Support for sharing
3. Access data from wherever they are
4. Simplicity
5. Security

He argues that…

Read more »