Linux/Apache on ARM Processors

In The Case for Low-Cost, Low-Power Servers, I made the argument that the right measures of server efficiency was work done per dollar and work done per joule. Purchasing servers on single dimensional metrics like performance or power or even cost alone, makes no sense at all. Single dimensional purchasing leads to micro-optimizations that push one dimension to the detriment of others. Blade servers have been one of my favorite examples of optimizing the wrong metric (Why Blade Servers aren’t the Answer to All Questions). Blades often trade increased cost to achieve server density. But density doesn’t improve work done per dollar nor does it produce better work done per joule. In fact, density often takes work done per joule in the wrong direction by driving higher power consumption due to the challenge of cooling higher power densities.

There is no question that selling in high volume drives price reductions so client and embedded parts have the potential to be the best price/performing components. And, as focused as the server industry has been on power of late, the best work is still in the embedded systems world where a cell phone designer would sell their souls for a few more amp-hours if they could have it without extra size or extra-weight. Nobody focuses on power as much as embedded systems designers and many of the tricks arriving in the server world showed up years ago in embedded devices.

A very common processor used in cell phone applications is the ARM. The ARM business is model is somewhat unusual in that they sells a processor design and then the design is taken and customized by many teams including Texas Instruments, Samsung, and Marvel. These processors find their way into cell phones, printers, networking gear, low-end Storage Area Networks, Network Attached Storage devices, and other embedded applications. The processors produce respectable performance and great price/performance and absolutely amazing power/performance.

Could this processor architecture be used in server applications? The first and most obvious push back is that it’s a different instruction set architecture but servers software stacks really are not that complex. If you can run Linux and Apache some web workloads can be hosted. There are many Linux ports to ARM — the software will run. The next challenge, and this one is the hard one, does the workload partition into sufficiently fine slices to be hosted on servers built using low end processors. Memory size limitations are particularly hard to work around in that ARM designs have the entire system on the chip including the memory controller and none I’ve seen address more than 2GB. But, for those workloads that do scale sufficiently finely, ARM can work.

I’ve been interested in seeing this done for a couple of years and have been watching ARM processors scale up for quite some time. Well, we now have an example. Check out That web site is hosted on 7 servers, each running the following:

· Single 1.2Ghz ARM processor, Marvell MV78100

· 1 disk

· 1.5 GB DDR2 with ECC!

· Debian Linux

· Nginx web proxy/load balancer

· Apache web server

Note that, unlike Intel Atom based servers, this ARM-based solution has the full ECC memory support we want in server applications (actually you really want ECC in all applications from embedded through client to servers).

Clearly this solution won’t run many server workloads but it’s a step in the right direction. The problems I have had when scaling systems down to embedded processors have been dominated by two issues: 1) some workloads don’t scale down to sufficiently small slices (what I like to call bad software but, as someone who spent much of his career working on database engines, I probably should know better), and 2) surrounding component and packaging overhead. Basically, as you scale down the processor expense, other server costs begin to dominate. For example, If you half the processor cost and also ½ the throughput, its potentially a step backwards since all the other components in the server didn’t also half in cost. So, in this example, you would get ½ the throughput with something more than ½ the cost. Generally not good. But, what’s interesting are those cases where it’s non-linear in the other direction. Cut the cost to N% with throughput at M% where M is much more than N. As these system on a chip (SOC) server solutions improve, this is going to be more common.

It’s not always a win based upon the discussion above but it is a win for some workloads today. And, if we can get multi-core versions of ARM, it’ll be a clear win for many more workloads. Actually, the Marvel MV78200 actually is a two core SOC but it’s not cache coherent which isn’t a useful configuration in most server applications.

The ARM is a clear win on work done per dollar and work done per joule for some workloads. If a 4-core, cache coherent version was available with a reasonable memory controller, we would have a very nice server processor with record breaking power consumption numbers. Thanks for the great work ARM and Marvel. I’m looking forward to tracking this work closely and I love the direction its taking. Keep pushing.


James Hamilton



b: /

4 comments on “Linux/Apache on ARM Processors
  1. Greg Fasnacht says:

    Please note that the above link to LinuxArmOrg is no longer viable.

  2. marc says:

    Had an example… “He’s dead Jim”

    You’re right though James, as SOCs get cheaper we can bring microservers into the home – I’m already doing that now (and failing to a large degree) but my limited successes with a per-configured solution like FreedomBox and YuNoHost have spurred me on.

    Some of these little beasties really burn rubber.

    • I agree ARM server processors are respectable today and better is near. Qualcomm, for example, has a very nice part in market with an exciting roadmap. As the TSMC 7 nanometer technology node heads into widespread use, the TSMC technology used by most ARM competitors will roughly parallel what Intel will be using at the same time. This will be the first time the ARM competitors have roughly the same semiconductor technology as Intel who, in the past, has been a full generation and sometimes a bit more ahead.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.