
Sometimes Hardware is Cheaper than Programmers

In Hardware is Expensive, Programmers are Cheap II I promised that I’d give an example of a case where hardware is cheap compared to designing and building a more efficient application. That post pointed out a case where a relatively small investment in program optimization would have paid for itself through dramatic hardware savings across a small number of the software vendor’s customers.

Here’s an example of the opposite.

Circa 2000/2001 we started hosting an ASP application running on x86 app servers with a SQL Server backend. The hardware was roughly 1 GHz/1 GB per app server. Web page response time was a consistent 2000 ms. Each app server could handle no more than a handful of page views per second.

By 2004 or so, application utilization had grown enough that the page response time and the scalability (page views per server per second) were both considered unacceptable. We did a significant amount of investigation into the application, focusing first on the database and then on the app servers. After a week or so of data gathering we determined that the only significant bottleneck was a call to an XSLT/XML transformation function. The details escape me, and aren’t really relevant anyway, but what I remember is that most of the page response time was buried in that library call, and that call used most of the app server CPU. Figuring out how to make the app go faster was pretty straightforward.

  • The app servers were CPU bound on a single library call.
  • The library wasn’t going to get re-written or optimized with any reasonable work effort. (If I remember correctly, it was a Microsoft-provided library; the software developers’ only option would have been a major re-write.)
  • The servers were somewhere around 4 years old and due for a routine replacement.
  • The new servers would clock 3x as fast and have better memory bandwidth and larger caches. The CPU bound library call would likely scale with processor clock speed, and if it fit in the processor cache it might scale better than clock (a rough back-of-envelope estimate follows this list).
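
As a back-of-envelope check on that last point, here’s a minimal sketch of the estimate. The 2000 ms page time and the 3x clock figure come from above; the fraction of page time spent in the XSLT call is an illustrative assumption, not a number I still have.

    # Rough Amdahl-style estimate of what faster CPUs buy when a single
    # CPU bound library call dominates the page time.
    # cpu_bound_fraction is an illustrative assumption.

    old_response_ms = 2000.0     # observed page response time on the old servers
    cpu_bound_fraction = 0.9     # assume ~90% of page time was the XSLT/XML call
    clock_speedup = 3.0          # new CPUs clocked roughly 3x the old ones

    serial_ms = old_response_ms * (1 - cpu_bound_fraction)   # part that doesn't speed up
    cpu_ms = old_response_ms * cpu_bound_fraction             # part that scales with clock
    new_response_ms = serial_ms + cpu_ms / clock_speedup

    print(f"Estimated new page response time: {new_response_ms:.0f} ms")  # ~800 ms

    # Per-server throughput on a CPU bound workload scales roughly with the
    # inverse of CPU time per page, so each new server should handle around
    # 3x the page views, more if the working set now fits in the larger caches.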

Conclusion: Buy hardware. In this case, two new app servers replaced four old app servers, the page response time improved dramatically, and the page views per server per second went up enough to handle normal application growth. It was clear that throwing hardware at the problem was the simplest, cheapest way to make it go away.

In The Quarter Million Dollar Query I outlined how we attached an approximate dollar cost to a specific poorly performing query. “The developers - who are faced with having to balance impossible user requirements, short deadlines, long bug lists, and whiny hosting teams complaining about performance - likely will favor the former over the latter.”

Unless of course they have data comparing hardware, software licenses and hosting costs to their development costs. My preference is to express the operational cost of solving a performance problem in ‘programmer-salaries’ or ‘programmer-months’. Using units like that helps bridge the communication gap.
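
Here’s a minimal sketch of that unit conversion. The hosting and salary figures are purely illustrative assumptions, not numbers from any of these projects.

    # Express an annual operational cost in 'programmer-months'.
    # Both dollar figures are illustrative assumptions.

    annual_operational_cost = 60_000        # extra servers, licenses, and hosting per year
    fully_loaded_programmer_year = 120_000  # salary plus benefits and overhead

    programmer_months = annual_operational_cost / (fully_loaded_programmer_year / 12)
    print(f"This problem costs roughly {programmer_months:.0f} programmer-months per year")
    # -> 6 programmer-months: if the optimization takes less effort than that,
    #    optimize; if it takes more, buying hardware is the cheaper fix.

Framed in those units, the hosting team and the developers are at least arguing about the same trade-off.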

My conclusion in that post: “To properly prioritize the development work effort, some rational measurement must be made of the cost of re-working existing functionality to reduce [server or database] load verses the value of using that same work effort to add user requested features.”


Related:

The Quarter Million Dollar Query
Hardware is Cheap, Programmers are Expensive
Hardware is Expensive, Programmers are Cheap
Hardware is Expensive, Programmers are Cheap II

Comments

  1. Of course there are "this cases" and "that cases". Having old hardware and a single point of performance loss, in code that you did not even write yourselves, is one thing.

    Having an application that dynamically serves some 60 million (complex) page calls, we are optimizing the hell out of our 20 servers (databases and app servers combined). I read about an American games site of comparable complexity that ran 40 million page calls and needed 75 app servers and 20 databases.

    Assume one server costs $1,000 plus $100 per month for hosting and maintenance; with 75 more servers than we run, that comes to $165,000 more than we pay in the first year - and that is only for the hardware. Many hours of optimization ;-) - and they do not even deliver the number of pages we do...

    You see, there are "this cases" and "that cases".

  2. Georgi -

    The best part about focusing on optimization is that the money you save hosting the optimized app vs. the non-optimized app can be applied toward a programmer's salary, and that programmer could do interesting things like add features, perform further optimizations, etc.

    And yes, the 'cases' are only examples of various situations.

    --Mike


