Last In - First Out: February 2009

Regulation E.

Spent the weekend digging into Regulation E, particularly Section 205.11. That’s the part where you try to convince your regional bank that you really didn’t authorize those charges, that you were not ‘card present’ in New York, and you didn’t have homeless people in your house rummaging through your stuff, borrowing your debit card, jetting to the east coast, buying cosmetics and jetting back.

This isn’t unexpected. We’ve kept this debit card attached to a special checking account that we never have more than $400 in at any time, just for this reason. The theory is that transactions will start to fail before the damage gets too expensive. In practice, I’m not sure if the bank will honor the overdraft attempts or not. I’d be un-amused if they had some sort of ‘convenience’ feature that turned the fraud into overdrafts and then into 22% loans. That would be a bad day.

This particular card was only used at a small number of merchants, mostly local and regional grocery chains, so my guess is that either a local/regional merchant or their upstream provider has a leak. The bank had already pulled the card and reissued it a couple days before we saw the bogus transactions.

So now I’m in paranoid mode, or more likely I’m in more-paranoid-than-usual mode. The good news is that I can finally close the loop on what I’ve been saying for years, namely ‘I wouldn’t be paranoid if everyone weren’t out to get me!’.

Unfortunately, the regional bank doesn’t have anything that helps mitigate something like this other than checking your online statement every day and sending a postal letter to ‘Regulation E Department’ when bad things show up. Bank of America, on the other hand, lets me do a few interesting things. First, they let me use my cell phone as a two-factor SMS-based proxy when logging in to their web portal with what they call SafePass® (details here).

Second, they allow me to generate single-merchant, limited value card numbers for online transactions with what they call ShopSafe®. With ShopSafe I can spin up different card numbers with different limits and expiration dates for each online vendor on an ad hoc or as-needed basis. This allows me to approximate single use cards.

Third, they have a reasonably robust SMS alerting system that allows me to set up alerts for routine activity that may or may not be an indicator of irregular activity, such as ‘any charge over $50’ or ‘Transaction outside of US’. They send me the SMS, I decide if it’s irregular. I like the idea of getting an SMS when someone logs into my account, changes my address, charges purchases online, orders checks, etc. Having some information ‘out of band’ can’t hurt. None of this really prevents anything, it just makes detection faster and easier.
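The alert idea is simple enough to sketch. This is a minimal illustration and not how Bank of America implements anything: the `Transaction` and `AlertRule` types, the `alerts_for` function and the sample transactions are all hypothetical, and only the two rule descriptions come from the examples above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Transaction:
    merchant: str
    amount: float   # dollars
    country: str    # country code, e.g. "US"


@dataclass
class AlertRule:
    name: str
    matches: Callable[[Transaction], bool]


# Hypothetical rules modeled on the two examples in the text.
RULES = [
    AlertRule("Charge over $50", lambda t: t.amount > 50),
    AlertRule("Transaction outside of US", lambda t: t.country != "US"),
]


def alerts_for(txn: Transaction, rules: List[AlertRule] = RULES) -> List[str]:
    """Return the name of every rule the transaction trips."""
    return [rule.name for rule in rules if rule.matches(txn)]


# Made-up transactions: a routine grocery run and a large cosmetics purchase.
groceries = Transaction("Local Grocery", 32.10, "US")
cosmetics = Transaction("Cosmetics Shop", 180.00, "US")

print(alerts_for(groceries))
print(alerts_for(cosmetics))
```

The rules only flag; they don't judge. A flagged transaction still needs a human to decide whether it's irregular, which is the point of sending the SMS.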

The images list the various alerts that are configurable.

The only down side to getting an SMS every time you use your card is that some merchants don’t post transactions at the time of purchase. Occasionally I’ll buy something at noon and get woken up at 4am with an SMS from BofA telling me that I bought something 16 hours ago. Overall though, that’s better than any alternative that I know of, and in this case would have alerted us to the fraud much sooner.

For me, the more SMS’s the better.

Your PaaS Provider Failed, What’s Plan B?

Coghead:

SAP has purchased Coghead’s intellectual property assets…SAP did not assume any of Coghead’s customer relationships or obligations and, at this point in time, SAP does not have plans to continue offering the Coghead service commercially…

Infoweek:

"Customers can take the XML out that describes their application, but the reality is that only runs on Coghead, so customers will need to rewrite their app with something different,"

Hoff:

"It's a friendly reminder that "whens you rolls da dice, you takes your chances." Prudent and pragmatic risk assessment and relevant business decisions still have to be made when you decide to place your bets on a startup. Just because you move to the Cloud doesn't mean you stop employing pragmatic common sense. I hope these customers have a Plan B."

Rich Miller:

"Now, what this DOES emphasize is the importance of standards (de facto or de jure) by which interoperability or portability can be assured. Remember... the fear of "cloud vendor lock-in" doesn't only apply to that scenario in which the vendor has captured the customer and is "extorting" unreasonable fees. Lock-in also applies to being hand-cuffed to a boat anchor with no ship to prevent it (and the customer) from sinking into the depths. It argues for minimum sufficient means by which a customer can be assured of a migration path... not necessarily a cost-free, frictionless move from one platform to another, but an assurance of salvagability at a cost that's significantly less than a 100% do-over."

(emphasis is mine)

Janke:

You've bet the farm, your career, your boss's career on a technology, vendor or cloud provider. Assume the technology, vendor or cloud provider fails. What is your exit strategy? There is a clear case here for standards. The closer you are to something that is standardized and/or multi-vendor, the better off you'll be when things go bad.

I've had one really bad experience with a proprietary technology (document management) that went bad. The vendor (DEC) sold off the application as they were gasping for air in the early 90's. The company that they sold it to disappeared. The documents (that now had no paper backups) existed only in the depths of a proprietary format, accessible only from hardware and software that was no longer available, built by a company that no longer existed. And the retirement incomes of thousands of faculty depended on those documents. That sucked. Finding a former employee of DEC who knew enough about the software to write a custom program to convert thousands of proprietary formatted documents into a format readable by something non-proprietary was difficult. Paying that person, who was very aware of the difficulty of our situation and probably rather bitter about the whole layoff/unemployment thing, was expensive.

I'm in a relatively stable organization and I tend to stick around long enough to clean up any messes that I make, so for me, having at least a rough idea of an exit strategy makes jumping in with both feet much easier. If there is a failure, at least we have an idea how to salvage the project or technology at a cost less than '100% do-over'.

My guess is that if I were in a startup, where failure means you shut down the startup and pop up another, or if career-wise I were a jumper (new job every few years), I'd have a different attitude. In that case, “Damn the torpedoes! Four bells. Captain Drayton, go ahead! Jouett, full speed!”

In this particular case, the abandoned customers might have gotten lucky. The competitors to the failed PaaS worked day and night to build a migration tool that lets you convert to their platform.

That’s really cool. Would you bet on that though?

See also: The Cloud - Provider Failure Modes

Performance Benchmarks that Include Energy Efficiency Data

Signs of the times:

Energy Benchmarking: Rich Miller at Datacenter Knowledge is reporting that TPC will update their performance benchmarks to include energy efficiency data. In the future, they’ll measure performance, price and energy in their benchmarks.

Actual datacenter energy costs (rather than power supply nameplate ratings) are hard to generalize. The numbers that I can find are all over the map. Energy use depends on server load, server configuration, server efficiency, power distribution efficiency and cooling efficiency, none of which are easily calculated and all of which are rarely measured. As a rough estimate, it looks like for small servers the cost of power + cooling approaches the cost of purchasing the server hardware and amortizing it over 4 years. Figuring energy use into the price/performance calculations for systems should skew future purchases toward efficiency.

Power Calculators: HP has a rack power calculator tool that provides useful estimates of power use for a given HP server and rack configuration. APC and others provide similar tools. I’m sure they build the tools to help figure per-rack UPS, power and cooling for custom rack configurations, but the tools can easily be used to help estimate energy costs.

Don’t forget cooling: One thing I’ve noticed is that people tend to forget that for every watt of electricity that their systems use, they’ll have more than one watt of cooling that they need to supply to remove the heat from the datacenter (or their house if they have air conditioning). The process of removing the watt of energy from the room is not 100% efficient. For example, if I have a rack that uses 5000 watts, a cooling system that was 100% efficient would use an additional 5000 watts to remove the heat from the room. But cooling systems are not 100% efficient. Worst case, you might spend up to an additional 10,000 watts of energy to cool the 5000 watt server rack.
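To put rough numbers on the rack example, the sketch below turns watts into a yearly electricity bill. The 5000 watt rack and the 5000 and 10,000 watt cooling figures come from the paragraph above. The $0.10 per kWh price, the assumption that the rack runs at that load around the clock, and the function name are illustrative assumptions only.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_energy_cost(it_watts, cooling_watts, dollars_per_kwh):
    """Yearly electricity cost for the equipment plus the cooling that removes its heat."""
    total_kw = (it_watts + cooling_watts) / 1000
    return total_kw * HOURS_PER_YEAR * dollars_per_kwh


PRICE = 0.10  # assumed $/kWh, not a figure from the post

rack_only = annual_energy_cost(5000, 0, PRICE)
matched_cooling = annual_energy_cost(5000, 5000, PRICE)
worst_case = annual_energy_cost(5000, 10000, PRICE)

print(f"rack alone:          ${rack_only:,.0f} per year")
print(f"plus 5000 W cooling: ${matched_cooling:,.0f} per year")
print(f"plus 10000 W cooling: ${worst_case:,.0f} per year")
```

At the assumed price, that works out to roughly $4,380 a year for the rack alone, $8,760 with matched cooling, and $13,140 in the worst case. Swap in your own utility rate and measured load; the shape of the result is the same.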

Failed Backups – Unrecoverable Service

A small but high profile social bookmarking site, ma.gnolia.com, recently suffered catastrophic, unrecoverable data loss. The site’s creator and owner Larry Halff posted a video blog in which he talks about the failure and lessons learned.

[Since deleted]

Highlights from the vlog:

Software RAID volume or database corruption was the original cause.
The site was self hosted.
Complex dependencies made moving the site to professional hosting difficult.
The only backups were a copy to an attached FireWire drive.
There were no integrity checks or test restores.
The site was hosted on Apple Xserves and Mac minis.
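The "no integrity checks" item is the cheapest one to fix. As a minimal sketch, assuming the backup is a plain file-for-file copy of a directory tree, the code below compares SHA-256 checksums between the source and the copy. The function names are hypothetical, and nothing here reflects how ma.gnolia's backups actually worked.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from the backup or differ from the source."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        # A file counts as a problem if the copy is absent or its contents differ.
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(rel.as_posix())
    return problems
```

Calling `verify_backup("/path/to/site", "/path/to/backup")` (both paths are placeholders) returns an empty list when every source file has an identical copy. A check like this only makes sense for files that aren't changing underneath you. A live database needs a proper dump, and the only real proof that a dump is good is a test restore.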

It’s a great ‘lessons learned’ for small startups. My take is that the people who create cool things on the Internet aren’t necessarily the ones that should be hosting those cool things. Those are rather different skill sets. The corollary is probably that people who are good at hosting the cool things on the Internet are likely not capable of creating them.

The big picture? When trusting others with your data, how do you know if they are taking appropriate steps to protect the data?

You don’t.

Not all Data Loss is Security Related

Matt invited me to guest author a post on his Standalone Sysadmin blog. One of the topics that I've had in the To-Blog pile is to dump out some thoughts on system backups. Head over to Matt's blog and read them.

Data loss events that result in data that is deleted, destroyed or corrupted are the DBA's and sysadmin's nightmare. Compare the results of these events:

Swatting – New Use for Internet Phones

This one is new to me. Call 911 from an Internet phone, faking the caller ID for a random address on the other side of the country. Then pretend to be the victim of a killer on a rampage and have the local SWAT team dispatched to an innocent person's house.

…a new kind of telephone fraud that exploits a weakness in the way the 911 system handles calls from Internet-based phone services. The attacks — called "swatting" because armed police SWAT teams usually respond…

Sounds useful. You can annoy your neighbors by having the police bust down their doors with a battering ram - right from the comfort of your local coffee shop. With the help of online maps, you can probably make it pretty realistic - ‘he’s in the back yard behind the big tree…. wait…he’s coming toward the back door…’.

…fake calls about a workplace shooting included realistic gunshot sounds and moaning in the background…

Beats the heck out of using spoofed caller ID to send bogus pizza deliveries to your ex-girlfriends.