Hosting multiple MySQL engines with MySQL Replication between them is a common design pattern for scaling read-heavy MySQL workloads. As with all scaling techniques, there are workloads for which it works very well but there are also potential issues that need to be understood. In this case, all write traffic is directed to the primary server and, consequently is not scaled which is why this technique works best for workloads heavily skewed towards reads. But, for those fairly common read heavy workloads, the techniques works very well and allows scaling the read workload across over a fleet of MySQL instances. Of course, as with any asynchronous replication scheme, the read replicas are not transactionally updated. So any application running on MySQL read replica’s must be tolerant of eventually consistent updates.
Load balancing high read traffic over multiple MySQL instances works very well but this is only one of the possible tools used to scale this type of workload. Another very common technique is to put a scalable caching layer in front of the relational database fleet. By far the most common caching layer used by high-scale services is Memcached.
Another database scaling technique is to simply not use a relational database. For workloads that don’t need schema enforcement and complex query, NoSQL databases offer both a cheaper and a simpler approach to hosting the workload. SimpleDB is the AWS hosted NoSQL database with Netflix being one of the best known users (slide deck from Netflix’s Adrian Cockcroft: http://www.slideshare.net/adrianco/netflix-oncloudteaser). Cassandra is another common RDBMS alternative in heavy use by many high-scale sites including Facebook where it was originally conceived. Cassandra is also frequently run on AWS with the Cassandra Wiki offering scripts to make it easy install and configure on Amazon EC2.
For those workloads where a relational database is the chosen solution, MySQL read replication is a good technique to have in your scaling tool kit. Last week Amazon announced read replica support for the AWS Relational Database Service. The press release is at: Announcing Read Replicas, Lower High Memory DB Instance Price for Amazon AWS.
You can now create one or more replicas of a given “source” DB Instance and serve incoming read traffic from multiple copies of your data. This new database deployment option enables you to elastically scale out beyond the capacity constraints of a single DB Instance for read-heavy database workloads. You can use Read Replicas in conjunction with Multi-AZ replication for scalable, reliable, and highly available production database deployments.
If you are running MySQL and wish you had someone else to manage it for you, check out Amazon RDS. The combination of read replicas to scale read workloads and Multi-AZ support for multi-data center, high availability make it a pretty interesting way to run MySQL.