High scale network research is hard. Running a workload over a couple of hundred servers says little of how it will run over thousands or tens of thousands servers. But, having 10’s of thousands of nodes dedicated to a test cluster is unaffordable. For systems research the answer is easy: use Amazon EC2. It’s an ideal cloud computing application. Huge scale is needed during some parts of the research project but the servers aren’t needed 24 hours/day and certainly won’t be needed for the three year amortization life of the servers.

However, for high-scale network research, finding the solution is considerably more difficult. In some dimensions, it’s no different from systems research in that purchasing a few thousand servers for the research projects makes no sense. But the easy answer of simply using EC2 doesn’t work in that EC2 nodes come fully provisioned with networking. One solution that works well for many networking research problems is to use an overlay and test at scale in EC2. But, when new hardware devices are being investigated, unless they can be emulated with high fidelity using with software implementations running on EC2, this solution breaks down.

For all but a few folks at Cisco and Juniper, running a multi-thousand node physical cluster to test new network gear is impractical. And it’s even less practical in academic settings. I’m lucky enough to work near many thousands of server nodes and a huge networking infrastructure. But, even then, installing a parallel network to do network research is difficult to afford. High-scale network research at credible scale is difficult.

Zhangxi Tan of Cal Berkeley came up to visit a couple of weeks back. I’m interested in Zhangxi’s work for two reasons: 1) its based upon reconfigurable computing — a technology ready for commercial application and 2) the application of FPGA to network simulation might be a solution to the problem of how to test networking gear at credible scale.

Reconfigurable computing maintains the flexibility of reprogrammable software systems with the performance of high performance hardware implementations. Or, worded differently, most of the performance of Application Specific Integrated Circuits (ASIC) with the flexibility of software. Most reconfigurable computing designs are based upon Field Programmable Gate Arrays (FPGA) and some high level instruction set or programming language to allow device reconfiguration. Recently, C and C++ subset compilers have emerged that allow a constrained version C or C++ to be compiled directly to a FPGA and, once the software is stable, directly to an ASIC. See Platform-based Electronic Systems Level (ESL) Synthesis for more on reconfigurable computing and see Heterogeneous Computing using GPGPUs and FPGAs for related discussions on the application of hardware acceleration.

In the work that Zhangxi presented, the Cal Berkeley team is taking the RAMP gold FPGA-based many-core simulator (Research Accelerator for Multiple Processors) and applying it to the problem of high-scale network simulation with a goal of simulating an O(10k) server network. Zhangxi’s slides are here: Using FPGAs to Simulate Novel Datacenter Network Architectures at Scale and my rough notes follow:

· Lots of work going on in data center network research: VL2, Dcell, PortLand,…

· But:

o the test scale is usually WAY smaller than the problem targeted by these systems

o Often synthetic benchmarks are used rather than actual workloads

· RAMP Gold is:

o Full 32-bit SPARC v8 ISA support, including FP, traps and MMU.

o Use abstract models with enough detail, but fast enough to run real apps/OS

o Provide cycle-level accuracy

o Cost-efficient: hundreds of nodes plus switches on a single FPGA

· RAMP Gold implementation:

o Based upon Xilinx XUP V5 board ($750)

o Able to simulate 64 core, 2GB DDR2, FP and run production Linix

· Tested using trace data from Facebook and Yahoo Hadoop runs

· Demonstrating the incast TCP collapse problem and showed simulated results that closely matched actual measured results

James Hamilton

e: jrh@mvdirona.com

w: http://www.mvdirona.com

b: http://blog.mvdirona.com / http://perspectives.mvdirona.com