Wikipedia Architecture

I’ve long argued that tough constraints often make for a better service, and few services are more constrained than Wikipedia, where the only source of revenue is user donations. I came across this talk by Domas Mituzas of Wikipedia while reading old posts on Data Center Knowledge. The posting A Look Inside Wikipedia’s Infrastructure includes a summary of the talk Domas gave at Velocity last summer.

Interesting points from the Data Center Knowledge posting and the longer document, referenced below, from the 2007 MySQL Conference:

- Wikipedia serves the world from roughly 300 servers:
  - 200 application servers
  - 70 Squid servers
  - 30 memcached servers (2GB each)
  - 20 MySQL servers using InnoDB, each with 16GB of memory (200 to 300GB each)
  - They also use Squid, Nagios, dsh, NFS, Ganglia, Linux Virtual Server, Lucene.Net on Mono, PowerDNS, lighttpd, Apache, PHP, and MediaWiki (which originated at Wikipedia)
- 50,000 HTTP requests per second
- 80,000 MySQL requests per second
- 7 million registered users
- 18 million objects in the English version
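The memcached tier sits between the application servers and MySQL as a read cache. A minimal cache-aside sketch of that arrangement, in Python with a plain dict standing in for a memcached client and a stub function standing in for a MySQL lookup (both are hypothetical stand-ins, not Wikipedia's actual code):

```python
# Cache-aside read path: check the cache first, fall back to the
# database on a miss, and populate the cache for later readers.
# A dict stands in for memcached; a stub function stands in for MySQL.

cache = {}  # stand-in for a memcached client

def db_fetch(page_id):
    """Stub for an indexed MySQL primary-key lookup."""
    return {"id": page_id, "text": f"wikitext for page {page_id}"}

def get_page(page_id):
    key = f"page:{page_id}"
    hit = cache.get(key)
    if hit is not None:
        return hit             # served from the cache tier
    value = db_fetch(page_id)  # cache miss: read the database
    cache[key] = value         # populate for subsequent readers
    return value

first = get_page(42)   # miss: hits the database, fills the cache
second = get_page(42)  # hit: served from the cache
```

With 30 memcached servers at 2GB each, the key point of the pattern is that the database stays authoritative while the cache tier absorbs repeat reads.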

For the 2007 MySQL Users Conference, Domas posted great detail on the Wikipedia architecture: Wikipedia: Site internals, configuration, code examples and management issues (30 pages). I’ve posted other big service scaling and architecture talks at:

James Hamilton
Amazon Web Services

Updated: Corrected formatting issue.

2 comments on “Wikipedia Architecture”
  1. I agree, Greg. Many designs "over protect" the DB layer.


  2. Greg Linden says:

    In Domas’ article, this point on hitting your caching layer versus hitting the database is worth emphasizing:

    "The common mistake is to believe that database is too slow and everything in it has to be cached somewhere else. In scaled out environments reads are very efficient, and difference of time between efficient MySQL query and memcached request is negligible – both may execute in less than 1ms usually."
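    To make that point concrete, here is a small self-contained sketch (an illustration, not a Wikipedia benchmark): an indexed primary-key lookup against an in-process SQLite table standing in for an efficient MySQL read, versus a dict lookup standing in for a cache hit. Run locally, both paths complete in well under a millisecond, which is the shape of the observation above; real deployments add a network round trip to both paths.

    ```python
    import sqlite3
    import time

    # Build a small table with a primary-key index, standing in for
    # an efficient MySQL read path (hypothetical schema for illustration).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, text TEXT)")
    conn.executemany("INSERT INTO page VALUES (?, ?)",
                     [(i, f"text {i}") for i in range(10_000)])

    # A dict stands in for a memcached hit.
    cache = {i: f"text {i}" for i in range(10_000)}

    def per_call_seconds(fn, n=1_000):
        """Average wall-clock time per call over n repetitions."""
        start = time.perf_counter()
        for _ in range(n):
            fn()
        return (time.perf_counter() - start) / n

    db_time = per_call_seconds(lambda: conn.execute(
        "SELECT text FROM page WHERE id = ?", (4242,)).fetchone())
    cache_time = per_call_seconds(lambda: cache[4242])

    print(f"indexed DB read: {db_time * 1e6:.1f} microseconds/lookup")
    print(f"cache read:      {cache_time * 1e6:.1f} microseconds/lookup")
    ```

    The cache lookup is faster in absolute terms, but both are far below 1ms, so the argument for skipping the cache on cheap indexed reads holds in this toy setting.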
