Skip to main content

Oracle/Sun ZFS Data Loss – Still Vulnerable

Last week I wrote about how we got bit by a bug and ended up with lost/corrupted Oracle archive logs and a major outage. Unfortunately, Oracle/Sun’s recommendation – to patch to MU8 – doesn’t resolve all of the ZFS data loss issues.

There are two distinct bugs, one fsync() related, the other sync() related. Update 8 may fix 6791160 zfs has problems after a panic , but

Bug ID 6880764 “fsync on zfs is broken if writes are greater than 32kb on a hard crash and no log attached”

is apparently not resolved until 142900-09 released on 2010-04-20.

DBA’s pay attention: Any Solaris 10 server kernel earlier than Update 8 + 142900-09 that is running any application that synchronously writes more than 32k chunks is vulnerable to data loss on abnormal shutdown.

As best as I can figure – with no access to any information from Sun other than what’s publically available – these bugs affect synchronous writes large enough to be written directly to the pool instead of indirectly via the ZIL. After an abnormal shutdown, on reboot the ZIL replay looks at the metadata in the ZIL wacks the write (and your Oracle Archive logs).

It appears that you can

  • limit database writes to 32k (and kill database performance)
  • or you can force writes larger than 32k to be written to the ZIL instead of the pool by setting zfs_immediate_write_sz  larger than your largest database write (and kill database performance)
  • or you can use a separate log intent device (slog)
  • or you can update to 142900-09

Ironically, the ZFS Evil Tuning Guide recommends the opposite – set “the zfs_immediate_write_sz parameter to be lower than the database block size” so that all database writes take the broken direct path.

Another bug that is a consideration for an out of order patch cycle and rapid move to 142900-09:

Bug ID 6867095: “User applications that are using Shared Memory extensively or large pages extensively may see data corruption or an unexpected failure or receive a SIGBUS signal and terminate.”

This sounds like an Oracle killer.

I’m not crabby about a killer data loss bug in ZFS. I’m crabby because Oracle/Sun knew about the bug and it’s enormous consequences and didn’t do a dammed thing to warn their customers. Unlike Entrust – who warned us that we had a bad cert even though it was our fault that our SSL certs had no entropy, and unlike Microsoft who warned it’s customers about potential data loss, Sun/Oracle really has their head in the sand on this.

Unfortunately – when your head is in the sand, your ass is in the air.


Popular posts from this blog

Cargo Cult System Administration

“imitate the superficial exterior of a process or system without having any understanding of the underlying substance” --Wikipedia During and after WWII, some native south pacific islanders erroneously associated the presence of war related technology with the delivery of highly desirable cargo. When the war ended and the cargo stopped showing up, they built crude facsimiles of runways, control towers, and airplanes in the belief that the presence of war technology caused the delivery of desirable cargo. From our point of view, it looks pretty amusing to see people build fake airplanes, runways and control towers  and wait for cargo to fall from the sky.The question is, how amusing are we?We have cargo cult science[1], cargo cult management[2], cargo cult programming[3], how about cargo cult system management?Here’s some common system administration failures that might be ‘cargo cult’:Failing to understand the difference between necessary and sufficient. A daily backup is necessary, b…

Ad-Hoc Verses Structured System Management

Structured system management is a concept that covers the fundamentals of building, securing, deploying, monitoring, logging, alerting, and documenting networks, servers and applications. Structured system management implies that you have those fundamentals in place, you execute them consistently, and you know all cases where you are inconsistent. The converse of structured system management is what I call ad hoc system management, where every system has it own plan, undocumented and inconsistent, and you don't know how inconsistent they are, because you've never looked.

In previous posts (here and here) I implied that structured system management was an integral part of improving system availability. Having inherited several platforms that had, at best, ad hoc system management, and having moved the platforms to something resembling structured system management, I've concluded that implementing basic structure around system management will be the best and fastest path to …

The Cloud – Provider Failure Modes

In The Cloud - Outsourcing Moved up the Stack[1] I compared the outsourcing that we do routinely (wide area networks) with the outsourcing of the higher layers of the application stack (processor, memory, storage). Conceptually they are similar:
In both cases you’ve entrusted your bits to someone else, you’ve shared physical and logical resources with others, you’ve disassociated physical devices (circuits or servers) from logical devices (virtual circuits, virtual severs), and in exchange for what is hopefully better, faster, cheaper service, you give up visibility, manageability and control to a provider. There are differences though. In the case of networking, your cloud provider is only entrusted with your bits for the time it takes for those bits to cross the providers network, and the loss of a few bits is not catastrophic. For providers of higher layer services, the bits are entrusted to the provider for the life of the bits, and the loss of a few bits is a major problem. The…