Erasure coding provides redundancy for greater than single disk failure without 3x or higher redundancy. I still like full mirroring for hot data but the vast majority of the worlds data is cold and much of it never gets referenced after writing it: Measurement and Analysis of Large-Scale Network File System Workloads. For less-than-hot workloads, erasure coding is an excellent solution. Companies such as EMC, Data Domain, Maidsafe, Allmydata, Cleversafe, and Panasas are all building products based upon erasure coding.

At FAST 2009 in late February, A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage will be presented. This paper looks at 5 open source erasure coding systems and compares there relative performance. The open source erasure coding packages implement Read-Solomon, Cauchy Read-Solomon, Even-Odd, Row-Diagonal Parity (RDP), and Minimal Density RAID-6 codes.

The authors found:

· The special-purpose RAID-6 codes vastly outperform their general-purpose counterparts. RDP performs the best of these by a narrow margin.

· Cauchy Reed-Solomon coding outperforms classic Reed-Solomon coding significantly, as long as attention is paid to generating good encoding matrices.

· An optimization called Code-Specific Hybrid Reconstruction is necessary to achieve good decoding speeds in many of the codes.

· Parameter selection can have a huge impact on how well an implementation performs. Not only must the number of computational operations be considered, but also how the code interacts with the memory hierarchy, especially the caches.

· There is a need to achieve the levels of improvement that the RAID-6 codes show for higher numbers of failures.

The paper also provides a good introduction of how erasure coding works. Recommended. I expect erasure codes to spring up in many more application in the near future.


James Hamilton, Amazon Web Services

1200, 12th Ave. S., Seattle, WA, 98144
W:+1(425)703-9972 | C:+1(206)910-4692 | H:+1(206)201-1859 | | | blog: