
Hardware is Expensive, Programmers are Cheap

The case that Jeff Atwood attempts to make is basically that hardware is generally cheap enough that code optimization doesn’t pay (or – don’t bother optimizing code until after you’ve tried solving the problem with cheap hardware). I read & re-read the argument. I’m convinced that in the general case, it doesn’t add up.

In my experience, the ‘hardware is cheap, programmers are expensive’ mantra really only applies to small, lightly used systems, where the fully loaded hardware cost actually is trivial compared to the cost of putting a programmer in a cube. Once the application is scaled to the point where adding a server or two no longer solves the scalability problem, or in cases where there are database, middleware or virtualization licenses involved, the cost to add hardware is not trivial. The relative importance of software optimization versus more hardware then shifts toward smart, optimized software, and the cheap hardware argument at Coding Horror quickly falls apart.

The comments at Coding Horror descended into the all too common ‘if I had a better monitor I could write better code’ nonsense pretty quickly, which of course misses the point of the post. Some of the commenters got it right though:
“The initial cost of hardware (servers) is not the only cost, and - yes hardware is cheap, but is a drop in the proverbial bucket compared to the total cost of ownership” – JakeBrake
“Throwing more hardware at problems does not make them go away. It may delay them, but as an application scales either…you may get a combinatoric issue pop up outstripping you ability to add hardware…[or]…you just shift the problem to systems administration who have to pay to maintain the hardware. Someone pays either way.” – PJL
“Throwing hardware at a software problem has its place in smaller, locally hosted data facilities. When you're running in a hardened facility the leasing of space, power, etc. begins to hurt. One could argue the amount of time and labor necessary to design and implement a new server, along w/ the hardware costs, space, power -- and don't forget disk if you're running on a SAN (fibre channel disk isn't cheap!) -- can easily negate the time of a programmer to fix bad code.” – Jonathan Brown
The above comments correctly emphasize that the purchase price of a server is only a fraction of the cost of the server. A fully loaded server cost must include space, power, cooling, a replacement server every 3-4 years, system management, security, hardware and software maintenance costs and software licensing costs. And if the server needs a SAN attach, then fiber channel port costs can equal the server hardware costs. Some estimates (here and here) imply that the loaded power, space and cooling cost of a server can approximately equal the cost of the server.
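To make the "fully loaded" idea concrete, here is a minimal sketch of the arithmetic. Every dollar figure and every field name below is a hypothetical placeholder chosen for illustration, not a number from the linked estimates; the 4-year replacement cycle is taken from the upper end of the 3-4 year range above.

```python
from dataclasses import dataclass


@dataclass
class ServerCosts:
    """Cost inputs for one server. All figures are hypothetical placeholders."""
    purchase_price: float
    san_port_cost: float               # one-time cost, 0 if there is no SAN attach
    annual_power_space_cooling: float
    annual_system_management: float    # administration and security
    annual_maintenance: float          # hardware and software maintenance
    annual_software_licenses: float
    replacement_cycle_years: float = 4.0

    def annual_loaded_cost(self) -> float:
        """One-time costs spread over the replacement cycle, plus recurring costs."""
        one_time = self.purchase_price + self.san_port_cost
        amortized = one_time / self.replacement_cycle_years
        recurring = (
            self.annual_power_space_cooling
            + self.annual_system_management
            + self.annual_maintenance
            + self.annual_software_licenses
        )
        return amortized + recurring

    def loaded_to_purchase_ratio(self) -> float:
        """Loaded cost over one replacement cycle, relative to the sticker price."""
        lifetime = self.annual_loaded_cost() * self.replacement_cycle_years
        return lifetime / self.purchase_price


# Placeholder inputs: SAN ports equal to the server price, and power/space/cooling
# that add up to the server price over four years.
example = ServerCosts(
    purchase_price=4000.0,
    san_port_cost=4000.0,
    annual_power_space_cooling=1000.0,
    annual_system_management=2000.0,
    annual_maintenance=500.0,
    annual_software_licenses=0.0,
)
print(example.annual_loaded_cost())        # 5500.0
print(example.loaded_to_purchase_ratio())  # 5.5
```

With these made-up inputs the server costs several times its purchase price over one replacement cycle, even before any database or middleware license is added.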

Fortunately the hardware-is-cheap argument was promptly rebutted in a well-written post by David Berk:
“To add linear capacity improvements the organization starts to pay exponential costs. Compare this with exponential capacity improvements with linear programming optimization costs.”
In other words, time spent optimizing code pays cost-saving dividends continuously over the life of the application, with little or no additional ongoing cost. Money spent on hardware that only compensates for poorly written code costs money every day, and as the application grows, that cost rises exponentially.
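A rough way to frame that trade-off is a break-even calculation: a one-time optimization cost on one side, an upfront hardware purchase plus a recurring loaded cost on the other. The sketch below is a simplification under stated assumptions. The function name and all figures are hypothetical, and the monthly hardware cost is held flat, which understates the hardware side if costs really do grow as the application scales.

```python
from typing import Optional


def break_even_month(
    optimization_cost: float,
    upfront_hardware_cost: float,
    monthly_hardware_cost: float,
    horizon_months: int = 60,
) -> Optional[int]:
    """Return the first month in which cumulative hardware spending reaches
    the one-time optimization cost, or None if it never does within the horizon.

    Assumes (hypothetically) that optimization has no ongoing cost and that
    the monthly loaded hardware cost stays constant.
    """
    for month in range(horizon_months + 1):
        cumulative_hardware = upfront_hardware_cost + monthly_hardware_cost * month
        if cumulative_hardware >= optimization_cost:
            return month
    return None


# Placeholder figures only: a 60,000 optimization effort versus 20,000 of new
# hardware that then costs 2,500 per month fully loaded.
print(break_even_month(60_000, 20_000, 2_500))  # 16
```

For a small, lightly used system the function returns `None` within any reasonable horizon, which is the case where throwing hardware at the problem still makes sense.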

That’s basically where we are with a couple of our applications. They are at the size/scale where doubling the hardware and the associated maintenance, power, cooling and database licenses will cost more than a small team of developers, and because of the inherent limits of scalability in the design of these applications, the large outlay in capital will at best result in minor capacity/scalability/performance improvements.

Building on David Berk’s response, I’d add that one should also consider greenhouse gases (a ton or two per server per year!) and database licensing costs (the list price for one CPU’s worth of Oracle Enterprise plus Oracle RAC is close to a programmer’s salary).

Another way of looking at this is that well-written, properly optimized software pays for itself in hardware, datacenter, cooling and system management costs in a broad range of scenarios, the exception being small, lightly used applications. For those – throw hardware at the problem and hope it goes away.

