Author Archive

HadoopDB: MapReduce over Relational Data

MapReduce has created some excitement in the relational database community. Dave DeWitt and Michael Stonebraker’s MapReduce: A Major Step Backwards is perhaps the best example. In that posting they argued that MapReduce is a poor structured storage technology, the execution engine doesn’t include many of the advances found in modern, parallel RDBMS execution engines,…

SIGMETRICS/Performance 2009 & USENIX 2009 Keynotes

I presented Where Does the Power Go in High Scale Data Centers, the opening keynote at SIGMETRICS/Performance 2009, last month. The video of the talk was just posted: SIGMETRICS 2009 Keynote. The talk starts after the conference kick-off at 12:20. The video appears to be incompatible with at least some versions of Firefox. I was…

Why I Enjoy Reading about Engineering Accidents, Failures, & Disasters

I’m a boater and I view reading about boating accidents as important. The best source that I’ve come across is the UK’s Marine Accident Investigation Branch (MAIB). I’m an engineer and, again, I view it as important to read about engineering failures and disasters. One of the best sources I know of is Peter G….

Pictures from the Fisher Plaza Data Center Fire

There have been many reports of the Fisher Plaza data center fire. An early one was the Data Center Knowledge article: Major Outage at Seattle Data Center. Data center fires aren’t as rare as any of us would like, but this one is a bit unusual in that fires normally happen in the electrical equipment…

Barbara Liskov 2008 Turing Award Winner

MIT’s Barbara Liskov was awarded the 2008 Association for Computing Machinery Turing Award. The Turing Award is the highest distinction in computer science and is often referred to as the Nobel Prize of computing. Past award winners are listed at: http://en.wikipedia.org/wiki/Turing_Award. The full award citation: Barbara Liskov has led important developments in computing by creating…

Services Change Everything

Our industry has always moved quickly, but the internet and high-scale services have substantially quickened the pace. Search is an amazingly powerful productivity tool and is available effectively for free to all. The internet makes nearly all information available to anyone who can obtain time on an internet connection. Social networks and interest-area specific discussion groups…

Microsoft Bringing 35 Megawatts on-line

Microsoft announced yesterday that it was planning to bring both Chicago and Dublin online next month. Chicago is initially to be a 30MW critical load facility with a plan to build out to a booming 60MW. Two-thirds of the facility is a high-scale containerized facility. It’s great to see the world’s second modular data…

ISCA 2009 Keynote II: Internet-Scale Service Infrastructure Efficiency

I presented the keynote at the International Symposium on Computer Architecture 2009 yesterday. Kathy Yelick kicked off the conference with the other keynote on Monday: How to Waste a Parallel Computer. Thanks to ISCA Program Chair Luiz Barroso for the invitation and for organizing an amazingly successful conference. I’m just sorry I had to leave…

ISCA 2009 Keynote I: How to Waste a Parallel Computer — Kathy Yelick

Title: Ten Ways to Waste a Parallel Computer
Speaker: Katherine Yelick

An excellent keynote talk at ISCA 2009 in Austin this morning. My rough notes follow:

· Moore’s law continues
  o Frequency growth replaced by core count growth
· HPC has been working on this for more than a decade but HPC concerned as well…

PUE and Total Power Usage Efficiency (tPUE)

I like Power Usage Effectiveness as a coarse measure of data center infrastructure efficiency. It gives us a way of speaking about the efficiency of the data center power distribution and mechanical equipment without having to qualify the discussion on the basis of server and storage used or utilization levels, or other issues not directly…
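
As a rough illustration of the commonly used definition of PUE (total facility power divided by the power delivered to IT equipment), here is a minimal Python sketch. The function names and the sample figures are hypothetical and chosen only for the arithmetic; they are not measurements from any facility discussed here.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    if total_facility_kw < it_load_kw:
        raise ValueError("total facility power cannot be below the IT load")
    return total_facility_kw / it_load_kw


def infrastructure_share(pue_value: float) -> float:
    """Fraction of total facility power spent on distribution and cooling."""
    return 1.0 - 1.0 / pue_value


# Hypothetical facility: 1,500 kW at the utility meter, 1,000 kW reaching IT gear.
example_pue = pue(1500.0, 1000.0)              # 1.5
example_share = infrastructure_share(example_pue)  # about one third
```

A PUE of 1.5 in this sketch means that for every watt delivered to servers, another half watt goes to power distribution losses and mechanical systems.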

Erasure Coding and Cold Storage

Erasure coding provides redundancy for more than a single disk failure without 3x or higher redundancy. I still like full mirroring for hot data, but the vast majority of the world’s data is cold and much of it never gets referenced after writing it: Measurement and Analysis of Large-Scale Network File System Workloads. For less-than-hot workloads,…
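
To make the redundancy comparison concrete, here is a small Python sketch of the storage arithmetic. It assumes a maximum-distance-separable code (Reed-Solomon is one example) with k data fragments and m parity fragments, which can survive the loss of any m fragments. The 10+4 layout and the function names are illustrative assumptions, not a configuration taken from the post.

```python
def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per logical byte for a k+m erasure code."""
    if data_fragments <= 0 or parity_fragments < 0:
        raise ValueError("need k > 0 and m >= 0")
    return (data_fragments + parity_fragments) / data_fragments


def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte when keeping full copies."""
    if copies <= 0:
        raise ValueError("need at least one copy")
    return float(copies)


# Hypothetical 10+4 layout: tolerates any 4 lost fragments at 1.4x raw storage.
ec = erasure_overhead(10, 4)
# Three full copies: tolerates 2 lost copies at 3.0x raw storage.
rep = replication_overhead(3)
```

The trade-off is that reading or rebuilding erasure-coded data touches several fragments, which is one reason full mirroring can still make sense for hot data.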

The SmugMug Tale

Don MacAskill did one of his usual excellent talks at MySQL Conf 09. My rough notes follow.

Speaker: Don MacAskill
Video at: http://mysqlconf.blip.tv/file/2037101

· SmugMug:
  o Bootstrapped in ’02 and still operating without external funding
  o Profitable and without debt
  o Top 400 website
  o Doubling yearly
· SmugMug Challenge:
  o Users get unlimited…

Select Past Perspectives Postings

I’ve brought together links to select past postings and posted them to: http://mvdirona.com/jrh/AboutPerspectives/. It’s linked to the blog front page off the “about” link. I’ll add to this list over time. If there is a Perspectives article not included that you think should be, add a comment or send me email. Talks and Presentations Best…

Server under 30W

Two years ago I met with the leaders of the newly formed Dell Data Center Solutions team and they explained they were going to invest deeply in R&D to meet the needs of very high scale data center solutions. Essentially Dell was going to invest in R&D for a fairly narrow market segment. “Yeah, right”…

Amazon Web Services Import/Export

Cloud services provide excellent value, but it’s easy to underestimate the challenge of getting large quantities of data to the cloud. When moving very large quantities of data, even the fastest networks are surprisingly slow. And many companies have incredibly slow internet connections. Back in 1996, Minix author and networking expert Andrew Tanenbaum said “Never…
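
A quick back-of-the-envelope calculation shows why bulk uploads take so long. The Python sketch below estimates transfer time from data size and link speed. The function name, the decimal-terabyte convention, the utilization factor, and the sample numbers are all assumptions made for illustration.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_terabytes: float, link_mbps: float,
                  utilization: float = 1.0) -> float:
    """Days needed to move data over a link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Mbps = 1e6 bits/s) and that
    only `utilization` of the nominal link rate is achieved in practice.
    """
    if link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link speed must be positive and utilization in (0, 1]")
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / SECONDS_PER_DAY


# Hypothetical: 10 TB over a fully used 100 Mbps link takes a bit over 9 days.
ideal = transfer_days(10, 100)
# At 80% effective utilization the same transfer takes longer still.
realistic = transfer_days(10, 100, utilization=0.8)
```

Even under the ideal assumption the transfer runs for more than a week, which is the gap that shipping physical storage devices is meant to close.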

High-Scale Service Server Counts

From an interesting article in Data Center Knowledge, Who has the Most Web Servers:

· 1&1 Internet: 55,000 servers (company)
· OVH: 55,000 servers (company)
· Rackspace: 50,038 servers (company)
· The Planet: 48,500 servers (company)
· Akamai Technologies: 48,000 servers (company)
· SBC Communications: 29,193 servers (Netcraft)
· Verizon: 25,788 servers (Netcraft)
· Time Warner Cable: 24,817 servers (Netcraft)
· SoftLayer: 21,000 servers…
