Monday, December 27, 2010

Is it a Platform or a Religion?

Blog posts like this annoy me. "Anyone who was ever fool enough to believe that Microsoft software was good enough to be used for a mission-critical operation..."

I’m annoyed enough to keep that link in my ‘ToBlog’ notebook for over a year. That’s annoyed, ‘eh?

Apparently the system failed and the blogger decided that all failed systems that happen to be running on Windows fail because they run on Windows.

A word from the blind.

I've been known as 'anti-Microsoft', having had a strong preference for Netware and Solaris on the server side and OS/2 & Solaris on desktops. At home I went for half a decade without an MS product anywhere in the house. Solaris on SunRays with Star Office made for great low energy, low maintenance home desktops that ran forever.

My anti-Microsoft attitude changed a bit with NT4 SP3, which, even though it had a badly crippled UI, was robust enough to replace my OS/2 desktop at work. My real work still got done on Solaris though. On the server side, I didn't see much to like about Windows, using it only where there were no other choices.

Windows 2003 finally changed my mind. After running W2k3 and SQL Server 2000 'at scale' on a large mission-critical online application, and after having badly abused it by foisting upon it a poorly written turd of an application, and after further compounding the abuse by my own lack of Windows and SQL experience, I had to conclude that one couldn't simply declare that Windows was inferior, or that it didn't scale, or that it wasn't secure. If you wanted to bash Microsoft and still be honest, you'd have to qualify your bashing, hedge it a bit, and perhaps even provide specific details on what you are bashing.

It was after a few months of running a large MS/SQL stack that I was quoted as follows:

'It has displayed an unexpected level of robustness'


'It doesn't suck as bad as I thought it would'.

Nowadays I'm pretty close to being platform neutral. I have preferences, but they are not religious.

For any application that can run on up to 32 cores, SQL Server works. Period. It might work on larger installs, but I don't have experience with them, so I can't comment on them.

Microsoft SQL Server has a cost advantage over Oracle, so for any application that doesn't need Oracle Streams, Partitioning, RAC, or other advanced features, SQL Server tends to be the default. It certainly is far cheaper to meet a typical availability requirement with SQL Server than with Oracle, so for any application with an availability requirement that allows for an occasional 3 minute downtime (the time it takes for a database cluster to fail over), the Microsoft stack is a viable choice. My experience is that unplanned cluster failovers are rare enough that active/passive failover makes our customers happy.
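
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the 3 minute failover mentioned above; the availability targets are ones I picked for illustration, not anything from a real SLA, and the function names are made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct):
    """Minutes of downtime per year allowed by an availability target (in percent)."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def failovers_within_budget(availability_pct, failover_minutes=3.0):
    """How many whole failovers of the given length fit in a year's downtime budget."""
    return int(downtime_budget_minutes(availability_pct) // failover_minutes)


if __name__ == "__main__":
    # Illustrative targets only (assumption, not from the post).
    for target in (99.9, 99.99):
        print(target, failovers_within_budget(target))
```

Under those assumptions, a 99.9% target leaves room for 175 three-minute failovers a year and a 99.99% target leaves room for 17, which is why an occasional unplanned failover is easy to live with unless the requirement is much stricter than that.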

For applications that can be load balanced, the Microsoft stack can be made as reliable as any other. Load balancing also mitigates the monthly patching that Microsoft requires.

For what it’s worth, the vendor of the ‘turd’ has improved the application to the point where it is a very well written, scalable, robust application running on a very, very robust database and operating system (Server 2008, SQL 2008).

It’s a platform, not a religion, so I reserve the right to change my dogma and preach to a different choir as systems evolve and circumstances change.

FWIW – I really, really liked OS/2’s Workplace Shell. I wish that Apple and Microsoft would figure out how to build a desktop like that.

Yeh, I’m still cleaning out my ToBlog queue.

Saturday, December 25, 2010

Feel kind of sorry for the sysadmins at Barnes & Noble right now

It looks like they are having a bad day.

Do you suppose there were more shiny new Nooks brought into service than what their system could handle? Figure that last month's sales were mostly wrapped for xmas and left idle until last night or this morning, and most of those are being booted and registered in a 24-hour window. A shiny new Nook isn't much fun without books...

...adds up to a rough day for them.

I've been there, as have many of you.

Thursday, December 23, 2010

ToBlog Dump – Time to Clean House

Geeze – Even after periodic culling, I still have twenty+ notes in Google Notebook, fifty-odd notes in Ubernote, and a whole bunch of Google Reader starred items, all waiting to be turned into blog posts.

Ain’t gonna happen. Time to clean house. I’ll dump the most interesting ones into a few posts & cull the rest.

Obviously tracking this sort of thing would be better served by a bookmarking service, but I’ve decided that my professional Internet presence will be Google and Google related apps. I use a combination of Yahoo & for things that I don’t want associated with my professional presence, and I try hard not to mix them. The only interesting bookmark service is a Yahoo property (for now, at least) so I don’t have a public bookmarking service. Lame? Yes. I don’t have Twitter or Facebook accounts either. Really lame. Maybe even lame². I still would rather read blog posts than tweets. Is that lame³?

Disclaimer – most of these links are more than a year old, but they’ve survived periodic culling, so maybe they are good links?

Here goes:

A read-once-for-sure and re-read-once-a-year post on the Anatomy of Security Disasters by Marcus Ranum describing …ummmm… the anatomy of security disasters, I guess. Good read.

I saved this DailyWTF post because it shows a bad security device implementation (and what I believe is a bad choice of identifiers). My luck the clowns would store my fingerprint un-hashed. One revocation down, nine to go.

Here’s a good ‘Sysadmin Principles’ list from Steve Stady and Seth Vidal. It’s in plain text, so those of you who surf the web with curl and less can read it too. I like reading what others think are the core system administration principles. To me, doing it ‘right’ has value, and I don’t appreciate people who shortcut just because they are lazy or in a hurry. They get by today, but someone else (probably me) will have to clean up after them later.

It’s possible that we’ll eventually end up migrating from Solaris to Linux. This post by ‘The Unix Blog’ reminded me why I like Solaris. Until Oracle fscks it up anyway.

I read and bookmarked a whole lot of articles about the Heartland breach. The most interesting one is Heartland Sniffer Hid in Unallocated Portion of Disk. Cool, unless you are Heartland or one of its victims. I’m a fan of network segmentation and bi-directional default deny firewall rules. I hope it makes a difference, ‘cause it sure is a lot of work to maintain.

I saved a few snarky anti-windows links, mostly written by the blind for the purpose of feeding the trolls. I don’t think that Unix is superior to Windows in every way. In some ways, like patching, I’d much rather have Windows/SQL than Solaris/Linux + Oracle. Microsoft has a really good patch management suite. And no, I don’t think Open Source is automatically bug free, cheaper, faster, easier to manage. My Firefox is on .13 right now. That’s not impressive. It’s annoying.

Speaking of the Microsoft stack, Todd Hoff’s High Scalability blog had a post on scaling StackOverflow. Buy vs. rent, scale up vs. scale out, all the good stuff.

Here’s a couple more related links:

On a slightly different theme, Michael Nygard’s Why Do Enterprise Applications Suck was a good read. It’s hard to keep up the energy on backend apps.

I’ll close with a quote from the comment section of this InfoQ post on scalability worst practices:

Marcos Santos:

The Great Knuth said:

"Early Optimization is the root of all evil"

But Marcos Eliziario, who is a poor programmer, known of no one, said at 2 AM after two sleepless days:

"Reckless regard for optimization is the cube of an even greater evil"

Don’t worry, there will be more.

Thomas Limoncelli: Ten Software Vendor Do’s and Don’ts

From a panel discussion at a recent CHIMIT (Computer-Human Interaction for Management of Information Technology), summarized and published at the Association for Computing Machinery. A good read, right through the comments.

Thomas covers non-GUI, scripted and unattended installation, administrative interfaces, API’s, config files, monitoring, data restoration, logging, vulnerability notification, disk management, and documentation. The comments cover more.

Comments on the above:

API’s: In our latest RFP’s, we ask ‘What percentage of your application functionality is exposed via API’s?’ These RFP’s can have an $8-digit tail on them, so odds are that they actually read them. I like sending messages to vendors.

Installation layout, location: I really like non-OS software to be completely contained in something like /opt/<application>. I don’t like third party software mucking up /etc, /var, or /usr. When I’m done with the software, I want to be able to pkgrm and rm -rf and have a server that is as clean as the day it was installed. I don’t like having to rummage through /usr, /usr/local, /var and /etc looking for remnants of your old install. Odds are I will not be able to figure it out and your junk will be there five years from now. That point seems to be contended in the comments though.

I’m to the point where if your application needs Perl, Python, or miscellaneous libraries, I am going to install a separate copy of the runtime or library in /opt/<application>/lib or /opt/<application>/perl.
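
For the rummaging part, something like the following Python sketch is about as far as I'd trust automation. It only matches file and directory names against the application name, so it is a first pass at best; the function name and the default list of directories are my own assumptions for illustration, not anything a vendor or package manager provides.

```python
import os


def find_remnants(app_name, roots=("/etc", "/var", "/usr")):
    """Return paths under roots whose file or directory name mentions app_name.

    Name matching is case-insensitive and purely a heuristic: it will miss
    files that do not carry the application name at all.
    """
    needle = app_name.lower()
    hits = []
    for root in roots:
        # os.walk silently skips directories it cannot read.
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                if needle in name.lower():
                    hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # "exampleapp" is a placeholder name, not a real product.
    for path in find_remnants("exampleapp"):
        print(path)
```

If everything lived under /opt/<application> in the first place, this list would come back empty and nobody would need the script.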

More things that I’d like to see on application software:

11. Separately securable administrative interface. I likely will expect to be able to hide the /admin/everyfriggenthing URL behind a different IP address and protect it with a firewall, VPN, load balancer, Apache mod-something, 2-factor, etc. The security of any application interface that has the ability to modify more than one user’s data should not be treated the same as the interface used by the general public.

12. Updated Java, Python, Perl, … runtimes. Honestly – I have really expensive software from each of the world’s largest 2, 3 and 6 character fortune really-small-number corporations in my shop, and each of them has at least one current product that does not run under a current, non-exploitable JVM. Can you please recompile that crap so it works with a recent run time? A few years ago when Sun announced one of their Java exploits, we opened up security incidents with each of our vendors that had embedded JRE’s, asking for a version of the application that runs on a non-exploitable run time. In every case, the vendors bumbled around for days until they finally admitted that they do not update embedded JRE’s on products when the JRE is exploitable.

13. Software that doesn’t roll over and die when scanned with commonly available vulnerability scanners. That’s just dumb. That tells me that you shipped me software that YOU did not scan with a network vulnerability scanner. Tell me again about the security of your internal corporate network?
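
To make item 11 a bit more concrete, here is a minimal Python sketch of the idea: admin URLs are only reachable from a management network, everything else stays public. The path prefix, the subnet, and the function name are all hypothetical, and in practice I'd expect this check to live in the firewall, VPN, or load balancer rather than in application code.

```python
import ipaddress

# Hypothetical values for illustration only.
ADMIN_PREFIX = "/admin/"
ADMIN_NETWORKS = [ipaddress.ip_network("10.20.30.0/24")]  # e.g. a VPN or management subnet


def is_allowed(path, client_ip):
    """Allow public paths from anywhere; allow admin paths only from admin networks."""
    if not path.startswith(ADMIN_PREFIX):
        return True
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ADMIN_NETWORKS)


if __name__ == "__main__":
    print(is_allowed("/catalog/search", "203.0.113.7"))   # public page, any client
    print(is_allowed("/admin/users", "203.0.113.7"))      # admin page, outside client
    print(is_allowed("/admin/users", "10.20.30.15"))      # admin page, management subnet
```

The point is that this only works if the vendor keeps the administrative functions under their own URL or listener to begin with. If admin functions are sprinkled through the public interface, there is nothing to put the firewall in front of.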

I’m afraid this could be a long list.

Thanks for the tip Kevin.

Saturday, December 18, 2010

Wireless Bandwidth Management

We know what some people are thinking:


We run a fair sized network (2Gbps inbound during the day). If we didn’t have aggressive bandwidth management, either it wouldn’t function or we wouldn’t be able to afford it.

We don’t charge extra for YouTube though.

Thursday, December 16, 2010

When the weather map looks like this….


Odds are the traffic map will look something like this:


I’m sure there is a parallel between the DOS attacks that Mother Nature periodically foists on us and internet security. I’ll take a stab at describing the parallels.

Predictability: Snow storms and hurricanes are very predictable (compared to tornadoes, where one has 0-10 minutes warning and rarely has accurate predictions). It is possible to prepare for weather that can be predicted. In certain regions, snow storms or hurricanes are a high enough probability event that you will certainly experience them. The probability of a major snow storm hitting my house in a particular winter is close enough to ‘one’ that it might as well be ‘one’. Tornadoes, on the other hand, even though there are dozens per year in my region, are localized enough that I probably will never experience a direct hit on my house.

I might tend to be prepared for a predictable event (snow storm), but rest assured that I have not taken any significant precautions for a tornado. I’m playing the odds on that one.

Preparation: Many people prepare for predictable events, some do not. I’m a lifetime veteran of snowstorms, yet I was replacing shear pins and changing oil on my snow blower in the middle of the DOS attack (snow storm). I could have done that in summer, but man was it hot last summer. Way too hot to be changing oil on a snow blower. On the other hand, 2/3 of my vehicles are true 4wd and my 4wd’s have dedicated winter tires, so I normally can get to where I have to go whether I clean my driveway or not. Local governments spend a fortune on DOS (snow storm) preparation. They have snow removal equipment, snow removal planning, emergency notification, etc. Their preparation allows me to function fairly well even when I am not prepared.

Preparation costs money though, as I can attest when I fill up the gas tanks on my 4wd’s. They cost money every day I drive them; they have twice as many driveline parts and are expected to incur significant driveline maintenance costs, but I only really need them a couple days per year.

Preparation has limits though. Even though I may be able to make it down an unplowed street, my neighbor may not have made it down that street & may be blocking my path, or worse, my neighbor may lose control of his car and whack my car, disabling me in spite of my preparation. In rural areas – the wide open prairie around here – you will be limited by visibility (white out), not traction, so your 4wd vehicle will only serve to get you deep enough into trouble that you can’t dig yourself out.

Don’t ask me how I know. 

As it turned out, 4wd vehicle #1, a Subaru with 7in of ground clearance, was expected to be operable in an event of magnitude ‘n’ (4-5” of snow), or perhaps marginally operable in an event of ‘2n’ (8-10 inches of snow). It was not expected to be operable in a ‘4n’ event (16-20” of snow) and predictably was not operable on unplowed roads last weekend. My 4wd vehicle #2 on the other hand, a robust pickup truck, was expected to be operable in a ‘4n’ event. After a half hour of trying to get the vehicle out to a plowed road so I could take my pharmacist neighbor to work at the 24 hour pharmacy, I concluded that getting 4wd vehicle #2 back into my driveway would be a far more reasonable outcome. Apparently I’ve either configured 4wd vehicle #2 wrong, or I didn’t have an adequate pre-purchase test plan. The current working theory is that even though it was purchased for a ‘4n’ event, it is only configured for a ‘3n’ event.

There is no doubt that one could prepare for a ‘4n’ event like last weekend. I’d like to think that someone made a serious calculation on dollars spent versus level of preparation. Odds are though, that nobody did. It probably went more like ‘here is how much money you can have, be as prepared as possible given that constraint’.

Or – in my case – spend whatever is necessary to prepare for a ‘4n’ event, but then configure it wrong and inadequately test it, and watch as it fails to manage the event.

Incident handling: During a storm of magnitude ‘n’, a prepared person might conduct business as usual, perhaps with reduced capacity or response time. For example, one might still get to work on time, but suffer a longer commute. A storm of magnitude ‘2n’ might cause a prepared person to have degraded operations, cancelling non-essential activities. A storm of magnitude ‘4n’ might cause most activity to come to a halt. Preparation can affect the value of ‘n’. A snow storm that would shut down Washington DC would probably have only a minor effect in Minneapolis or Buffalo.

Last weekend’s storm might have been a ‘4n’ event – something that maybe occurs every 20 years or so. The round red signs in the above image are closed roads. You get a ticket if you try to drive on them. Odds also are pretty good that you’d fail to make it to the other end of that particular road. The MSP airport has probably the best winter capability of any airport, yet they ended up more closed than not. Most of the municipal snow plowing was halted during the worst of the storm, buses were halted, and even fire trucks and ambulances were severely affected. Operations were certainly degraded during the DOS.

Degraded operations: During the DOS attack, most local governments have some ability to operate in a degraded mode. For example, the State Patrol may close roads, shedding load by restricting traffic to emergency vehicles only. Snow plows may stop plowing streets and only venture out to open up streets for emergency vehicles, airports may restrict incoming flights, schools and businesses may close, etc. Degraded operations are an accepted outcome of large scale DOS attacks (snow storms), and most entities have a pretty good idea what services need to be maintained during a DOS attack (snow storm).

In my case, degraded operations mode means avoiding travel, maintaining power, heat, Internet and food in approximately that order. Food is the easiest. I still have my Y2K stash in the basement. Year 2038 is just around the corner & I’d hate to be caught unprepared.

Sunday, December 5, 2010

“There is nothing the governments can do to put the genie back into the bottle”

From Paul Homer’s The Effects of Computers:

The “rich and powerful” are rich and powerful precisely because they have access to information that the rest of us don’t have.

And – once you give people the power to access the information:

“There is nothing the governments can do to put the genie back into the bottle”

Wikileaks related. A good read.