MySpace makes the odd technology choice that I don’t fully understand and, from a distance, there are times when I think I see opportunities to drop costs substantially. But let’s ignore that and tip our hat to MySpace for the incredible scale they are driving. It’s a great social networking site and you just can’t argue with their scale. Their traffic is monstrous and, consequently, it’s a very interesting site to understand in more detail.
Lubor Kollar of SQL Server just sent me this super interesting overview of the MySpace service. My notes follow and the original article is at: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?casestudyid=4000004532.
I particularly like social networking sites like Facebook and MySpace because they are so difficult to implement. Unlike highly partitionable workloads like email, social networking sites work hard to find as many relationships, across as many dimensions, amongst as many users as possible. I refer to this as the hairball problem. There are no nice, clean data partitions, which makes social networking sites amongst the most interesting of the high-scale internet properties. More articles on the hairball problem:
· FriendFeed use of MySQL
· Geo-Replication at Facebook
· Scaling LinkedIn
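To see why the hairball problem defeats clean partitioning, here’s a toy sketch (all shard counts and IDs below are made up for illustration, not MySpace’s actual scheme): even when user profiles hash-partition cleanly, a single user’s friend list fans out across many partitions.

```python
# Toy illustration of the "hairball" problem: user data partitions
# cleanly by user ID, but friendship edges cross partitions freely.
# All numbers and names here are fabricated for illustration.

NUM_PARTITIONS = 8

def partition_for(user_id: int) -> int:
    """Simple hash partitioning: each user's profile lives on one shard."""
    return user_id % NUM_PARTITIONS

# A small, fabricated friend list for user 1: (user, friend) pairs.
friendships = [(1, f) for f in (2, 10, 27, 33, 41, 58, 64, 75)]

# To render user 1's friend list, we must touch every shard that
# holds at least one of those friends' profiles.
shards_touched = {partition_for(friend) for _, friend in friendships}
print(f"One friend-list render touches {len(shards_touched)} of {NUM_PARTITIONS} shards")
```

Email, by contrast, partitions naturally by mailbox owner and rarely needs to cross shards; the social graph has no such clean cut, so nearly every page render becomes a scatter-gather across the fleet.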
The combination of the hairball problem and extreme scale makes the largest social networking sites like MySpace some of the toughest on the planet to scale. Focusing on MySpace scale, it is prodigious:
· 130M unique monthly users
· 40% of the US population has MySpace accounts
· 300k new users each day
The MySpace Infrastructure:
· 3,000 Web Servers
· 800 cache servers
· 440 SQL Servers
Looking at the database tier in more detail:
· 440 SQL Server Systems hosting over 1,000 databases
· Each running on an HP ProLiant DL585
o 4 dual-core AMD processors
o 64 GB RAM
· Storage tier: 1,100 disks on a distributed SAN (really!)
· 1PB of SQL Server hosted data
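The case study doesn’t describe how MySpace routes requests across those machines, but a common way to spread roughly 1,000 logical databases over 440 servers is a two-level lookup: hash a user to a logical database, then map that database to its current host through a directory, so databases can migrate between servers without rehashing users. A hypothetical sketch (the counts mirror the article; the scheme itself is my assumption, not MySpace’s documented design):

```python
# Hypothetical two-level shard routing: user -> logical database -> server.
# 1,000 logical DBs and 440 servers match the article's numbers; the
# routing scheme is an illustrative assumption, not MySpace's design.

NUM_LOGICAL_DBS = 1000
NUM_SERVERS = 440

def logical_db(user_id: int) -> int:
    """Level 1: stable hash of a user to a logical database."""
    return user_id % NUM_LOGICAL_DBS

# Level 2: a directory mapping logical DB -> physical server. Here it is
# a static round-robin assignment; in practice it would live in a lookup
# service so individual databases can be moved without changing level 1.
db_to_server = {db: db % NUM_SERVERS for db in range(NUM_LOGICAL_DBS)}

def server_for(user_id: int) -> int:
    return db_to_server[logical_db(user_id)]

print(server_for(123456789))  # deterministically routes to one server
```

The appeal of the indirection is operational: when one of the 440 servers runs hot, a logical database can be rehomed by updating one directory entry rather than reshuffling user IDs.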
As an ex-member of the SQL Server development team, and perhaps less than completely unbiased, I’ve got to say that 440 database servers across a single cluster is a thing of beauty.
More scaling stories: http://perspectives.mvdirona.com/2010/02/07/ScalingSecondLife.aspx.
Hats off to MySpace for delivering a reliable service, in high demand, with high availability. Very impressive.
b: http://blog.mvdirona.com / http://perspectives.mvdirona.com
Disclaimer: The opinions expressed here are my own and do not
necessarily represent those of current or past employers.