As system managers, we are tasked with building and maintaining systems that are expected to be ‘always on’. Our objective, of course, is to bridge the gap between our users’ definition of ‘always on’ and our accountants’ definition of ‘under budget’. In my experience, the gap tends to be large.
With that goal in mind, I’ve outlined a method for quickly estimating expected availability for simple systems, or ‘application stacks’. For purposes of these posts, a system includes power, cooling, WAN, network, servers and storage.
The introductory post outlines the basics for estimating the availability of simple systems, the assumptions used when estimating availability, and the basics of serial, parallel and coupled dependencies.
Part one covers estimating the availability of simple non-redundant systems.
Part two covers estimating the availability of systems with simple active/passive redundancy.
My hope is that these posts prove useful as rules of thumb for bridging the gap between user expectations and the realities of the systems on which they depend.
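As a rough illustration of the serial and parallel cases mentioned above, here is a minimal sketch in Python. It uses the textbook formulas rather than anything specific to the posts, and it assumes component failures are independent, which is exactly the assumption that coupled dependencies violate. The function names and the availability figures are hypothetical and chosen only for illustration.

```python
from math import prod

HOURS_PER_YEAR = 8760  # 365 days; ignores leap years


def serial_availability(components):
    """All components must be up: multiply the individual availabilities."""
    return prod(components)


def parallel_availability(components):
    """At least one component must be up: 1 minus the product of unavailabilities.

    Assumes independent failures and ignores failover time.
    """
    return 1 - prod(1 - a for a in components)


def downtime_hours_per_year(availability):
    """Convert an availability fraction into expected hours of downtime per year."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures, not measurements from any real system.
    stack = {"power": 0.9999, "network": 0.999, "server": 0.999, "storage": 0.9995}

    single = serial_availability(stack.values())
    print(f"non-redundant stack: {single:.5f} "
          f"({downtime_hours_per_year(single):.1f} h/year)")

    # Two identical servers in parallel, then placed in series with the rest.
    server_pair = parallel_availability([stack["server"], stack["server"]])
    redundant = serial_availability(
        [stack["power"], stack["network"], server_pair, stack["storage"]]
    )
    print(f"redundant servers:   {redundant:.5f} "
          f"({downtime_hours_per_year(redundant):.1f} h/year)")
```

The point of the sketch is only that a serial chain is always less available than its weakest link, while redundancy helps only the component that is actually duplicated.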