Skip to main content

Patch Now - What Does it Mean?

When security researchers/bloggers announce to the world 'patch now', are they are implying that the world should 'patch now without consideration for testing, QA, performance or availability'? Or are they advising an accelerated patch schedule, but in a change managed, tested, QA’d rollout of a patch that considers security and availability? And when they complain about others not patching fast enough, are they assuming that the foot draggers are incompetent? Or are they ignoring the operational realities of making untested changes to critical infrastructure?

Consider that:
  • All patches have a probability of introducing new bugs. That probability is always > 0  and <= 1. The probability is never equal to zero. (And for a certain large database vendor, our experience is that the probability of introducing new bugs is very close to one).
  • There are many, many bugs that are only relevant under high loads.
  • A patch that corrupts data, as in databases or file systems, can be impossible to back out or recover from without irretrievable data loss.
  • Building test cases that can put realistic real world loads on test servers is very difficult, very expensive, and may not uncover the new bugs anyway. 
  • A failed system or application has known, documented consequences. It is not a game of probability or chance. An unpatched security vulnerability is a game of chance where in most cases the odds against you are not known. 
As an operations person with real responsibilities, who is accountable to a very large group of paying customers, and who has to make security versus availability decisions almost every day, I need security researchers to uncover, analyze and communicate risks, threats, vulnerabilities and mitigation techniques. The best of the researchers already do that very well, and for that I am very grateful. To those who are doing that for public service, fame, fortune or personal ego, I sincerely thank you, no matter what your motivation. You are adding value to the Internet community.

But when security people push recommendations out to the world without consideration for availability and/or performance, their recommendations remove value from the Internet community.

Security Researchers add value when

  • Uncovering and analyzing vulnerabilities and active exploits. (Research)
  • Analyzing probable and improbable attack vectors and calculating and communicating probabilities. (Research)
  • Testing and verifying attack vectors. (Research)
  • Communicating to the community the relative and absolute risks of vulnerabilities and consequences of exploitation. (Public Service)
  • Developing and communicating mitigation options. (Research)
Security Researchers do not add value when
  • Making blanket patch advice without consideration for performance or availability. (Operations)
  • Complaining about enterprises that do not follow their advice.  (Carping)
(non-exhaustive lists, of course.)

In that context, when I hear 'patch now' advice, You can bet that I will filter the advice through the prism of availability, performance and operational reality.

  • I'll listen to 'patch now no matter what' advice from a security researcher/blogger who has real time operational responsibility for a large customer base, perhaps 100,000 or more customers, and who, if the patch fails, would be responsible for interruption of service for those hundred thousand customers, and who, if the patch fails, could or would be terminated for non-performance.
  • I'll listen to 'patch now no matter what' advice from a security researcher/blogger who has had a system with a hundred thousand customers down hard, has escalated to the vendors highest support level, and who has been on a tech support conference call for 13 continuous hours or more.
  • I'll listen to 'patch now no matter what' advice from consultants who are putting the reputation and existence of their consultancy on the line every time they give a customer advice.
  • I'll listen to 'patch now no matter what' advice from our own security staff, who I know will not point fingers, duck and hide when the patch goes bad and my systems fail.
As far as I am concerned, if you are in a position like one of the above, you can complain about service providers who do not patch fast enough to suite your preferences. If you are not in that position, you cannot complain when I don't (or your service provider doesn't) patch fast enough for you.

The bottom line is that unless the people who give the world advice to 'patch now no matter what' are also going to write my e-mail's and presentations explaining why my systems failed, unless they will absorb the inevitable backlash from customers, senior management, governing boards and will stand up in front of representatives from my internal business units and get grilled, castigated, chewed up and spit out for my decision, I don't need them to complain that I am not 'patching now'.

I've been in the 7pm vendor conference call with vendor VP and development supervisor, where our CIO came to the meeting with his/her letter of resignation, to be turned in to our CEO should the vendor fail to deliver performance fixes for the business critical application by 7am the next day.
It was not a fun meeting.

'Patch now' advice must be filtered through the prism of availability, performance and operational reality.


  1. So, how does the patching process differ between normal patching and "patch now", assuming you decide it's warranted in a given instance?

  2. I see 'patch now' as a process that leaves the change management, test and QA cycles intact, but greatly shortens the time lines. This is still an increase in risk over a measured deployment. Time finds bugs, sometimes in your own tests, or in the case of public and widely deployed software, the tests that others perform and report.

    The controls that attempt to keep bad code out of production need to remain in place.


Post a Comment

Popular posts from this blog

Cargo Cult System Administration

Cargo Cult: …imitate the superficial exterior of a process or system without having any understanding of the underlying substance --Wikipedia During and after WWII, some native south pacific islanders erroneously associated the presence of war related technology with the delivery of highly desirable cargo. When the war ended and the cargo stopped showing up, they built crude facsimiles of runways, control towers, and airplanes in the belief that the presence of war technology caused the delivery of desirable cargo. From our point of view, it looks pretty amusing to see people build fake airplanes, runways and control towers  and wait for cargo to fall from the sky.
The question is, how amusing are we?We have cargo cult science[1], cargo cult management[2], cargo cult programming[3], how about cargo cult system management?Here’s some common system administration failures that might be ‘cargo cult’:
Failing to understand the difference between necessary and sufficient. A daily backup …

Ad-Hoc Versus Structured System Management

Structured system management is a concept that covers the fundamentals of building, securing, deploying, monitoring, logging, alerting, and documenting networks, servers and applications. Structured system management implies that you have those fundamentals in place, you execute them consistently, and you know all cases where you are inconsistent. The converse of structured system management is what I call ad hoc system management, where every system has it own plan, undocumented and inconsistent, and you don't know how inconsistent they are, because you've never looked.

In previous posts (here and here) I implied that structured system management was an integral part of improving system availability. Having inherited several platforms that had, at best, ad hoc system management, and having moved the platforms to something resembling structured system management, I've concluded that implementing basic structure around system management will be the best and fastest path to…

The Cloud – Provider Failure Modes

In The Cloud - Outsourcing Moved up the Stack[1] I compared the outsourcing that we do routinely (wide area networks) with the outsourcing of the higher layers of the application stack (processor, memory, storage). Conceptually they are similar:In both cases you’ve entrusted your bits to someone else, you’ve shared physical and logical resources with others, you’ve disassociated physical devices (circuits or servers) from logical devices (virtual circuits, virtual severs), and in exchange for what is hopefully better, faster, cheaper service, you give up visibility, manageability and control to a provider. There are differences though. In the case of networking, your cloud provider is only entrusted with your bits for the time it takes for those bits to cross the providers network, and the loss of a few bits is not catastrophic. For providers of higher layer services, the bits are entrusted to the provider for the life of the bits, and the loss of a few bits is a major problem. These …