Author Archive

Facebook Cassandra Architecture and Design

Last July, Facebook released Cassandra to open source under the Apache license: Facebook Releases Cassandra as Open Source. Facebook uses Cassandra as an email search system where, as of last summer, they had 25TB and over 100M mailboxes. This video gets into more detail on the architecture and design: http://www.new.facebook.com/video/video.php?v=540974400803#/video/video.php?v=540974400803. My notes are below if you…

Read more »

Slides from Conference on Innovative Data Systems Research

I did the final day keynote at the Conference on Innovative Data Systems Research earlier this month. The slide deck is based upon the CEMS paper: The Case for Low-Cost, Low-Power Servers but it also included a couple of techniques I’ve talked about before that I think are super useful: · Power Load Management: The…

Read more »

Storage at 2TB for $250

Wow, 2TB for $250 from Western Digital: http://www.engadget.com/2009/01/26/western-digitals-2tb-caviar-green-hdd-on-sale-in-australia/. Once it’s shipping in North America, I’ll have to update The Cost of Bulk Cold Storage. Update: Released in the US at $299: Western Digital’s 2TB Caviar Green hard drive launches, gets previewed. Sent my way by Savas Parastatidis. –jrh James Hamilton, Amazon Web Services 1200, 12th…

Read more »

Low Power Amdahl Blades for Data Intensive Computing

In Microslice Servers and the Case for Low-Cost, Low-Power Servers, I observed that CPU bandwidth is outstripping memory bandwidth. Server designers can address this by: 1) designing better memory subsystems or 2) reducing the CPU per-server. Optimizing for work done per dollar and work done per joule argues strongly for the second approach for many…

Read more »

Microslice Servers

In The Case For Low-Power Servers I reviewed the Cooperative, Expendable, Micro-slice Servers project. CEMS is a project I had been doing in my spare time, investigating the use of low-power, low-cost servers to run internet-scale workloads. The core premise of the CEMS project: 1) servers are out-of-balance, 2) client and embedded volumes, and 3) performance…

Read more »

Recardo Hermann’s Snippets on Software

I recently stumbled across: Snippets on Software. It’s a collection of mini-notes on software with links to more if you are interested in more detail. Some snippets are wonderful, some clearly aren’t exclusive to software and some I would argue are just plain wrong. Nonetheless, it’s a great list. It’s too long to read from…

Read more »

The Case For Low-Cost, Low-Power Servers

The Conference on Innovative Data Systems Research was held last week at Asilomar, California. It’s a biennial systems conference. At the last CIDR, two years ago, I wrote up Architecture for Modular Data Centers where I argued that containerized data centers are an excellent way to increase the pace of innovation in data center power…

Read more »

Amazon Web Services & Windows Live Mesh at Crunchies

Last night, TechCrunch hosted The Crunchies and two of my favorite services got awards. Ray Ozzie and David Treadwell accepted Best Technology Innovation/Achievement for Windows Live Mesh. Amazon CTO Werner Vogels accepted Best Enterprise Startup for Amazon Web Services. Also awarded (from http://crunchies2008.techcrunch.com/): Best Application Or Service: Get Satisfaction, Google Reader (winner), Minted, Meebo, MySpace Music (runner-up), Yelp. Best Technology…

Read more »

Joel Spolsky: 12 Steps to Better Code

Back in 2000, Joel Spolsky published a set of 12 best practices for a software development team. It’s been around for a long while now and there are only 12 points but it’s very good. Simple, elegant, and worth reading: The Joel Test: 12 Steps to Better Code. Thanks to Patrick Niemeyer for sending this…

Read more »

Google’s Will Power and Data Center Efficiency

Earlier in the week, there was an EE Times posting, Server Makers get Googled, and a follow-up post from Gigaom, How Google Is Influencing Server Design. I’ve long been an advocate of making industry-leading server designs more available to smaller data center operators since, in aggregate, they are bigger power consumers and have more…

Read more »

CACM Interview with Pat Selinger

In a previous posting, Pat Selinger IBM Ph.D. Fellowships, I mentioned Pat Selinger as one of the greats of the relational database world. Working with Pat was one of the reasons why leaving IBM back in the mid-90s was a tough decision for me. In the December 2008 edition of the Communications of the ACM,…

Read more »

Wikipedia Architecture

I’ve long argued that tough constraints often make for a better service and few services are more constrained than Wikipedia where the only source of revenue is user donations. I came across this talk by Domas Mituzas of Wikipedia while reading old posts on Data Center Knowledge. The posting A Look Inside Wikipedia’s Infrastructure includes…

Read more »

MySpace Architecture and .Net

Viraj Mody of the Microsoft Live Mesh team sent this my way: Dan Farino About MySpace Architecture. MySpace, like Facebook, uses relational DBs extensively front-ended by a layer of Memcached servers. Less open source at MySpace but otherwise unsurprising – a nice scalable design with 3000 front end servers with well over 100 database…

Read more »

Bill Gates

Five or six years ago Bill Gates did a presentation to a small group at Microsoft on his philanthropic work at the Bill and Melinda Gates Foundation. It was by far and away the most compelling talk I had seen in that it was Bill applying his talent to solving world health problems with the…

Read more »

High Efficiency SATA Storage

Related to The Cost of Bulk Cold Storage posting, Mike Neil dropped me a note. He’s built an array based upon this Western Digital part: http://www.wdc.com/en/products/Products.asp?DriveID=336. It’s unusually power efficient. Power Dissipation: Read/Write 5.4 Watts; Idle 2.8 Watts; Standby 0.40 Watts; Sleep 0.40 Watts. And it’s currently only $105: http://www.newegg.com/Product/Product.aspx?Item=N82E16822136151. It’s always been the case that…

Read more »

The Cost of Bulk Cold Storage

I wrote this blog entry a few weeks ago before my recent job change. It’s a look at the cost of high-scale storage and how it has fallen over the last two years based upon the annual fully burdened cost of power in a data center and industry disk cost trends. The observations made in…

Read more »