Is anyone else thinking that there might be a connection between this event:
and this event:
“Further evaluation uncovered an extra computer board that had been placed inside the checkout machine, recording customers' financial information.”
So…how did a skimmer get inside the checkout machine?
I’ve never been in a Lucky Supermarket, but the places that I frequent that have self check-out lanes all have employees watching the lanes and presumably have security camera coverage.
These aren’t unattended ATM machines or gas pumps monitored by a harried convenience store cashier who has five other customers waiting to check out, so one would think that adding electronics to the inside of the checkout machine would be a detectable event.
The company CFO doesn’t think it’s an inside job though:
“Although the skimmers at Lucky's stores were apparently installed inside the checkout units, Ackerman said the company doesn't suspect an inside job”
Decades ago I worked for a small college with strong, forward-looking leadership. The president firmly believed that a significant fraction of our interactions with students should be computerized. He believed that if we automated background bureaucracy we could better handle the budget cuts and shift more resources into classrooms. He also believed that if students had a clear, consistent interface into our bureaucracy they’d be better, happier learners.
My job was to make that happen. I networked all staff and student computers, computerized all student grading, registration, transcripts, fees; all college accounting, purchasing, invoicing, inventory, room scheduling and whatever else the president thought was bureaucratic. My toolkit was an ARCnet network, a Novell NetWare 2.0a server, a few dozen 80286 computers and a database called Paradox. I turned off the IBM System/36. There was no WWW.
It took a couple years, but we got to the point where a student could walk up to a kiosk (a netbooted computer running a Paradox client app) see their up-to-the-minute quiz & test scores, how much they owed, what the next assignment was on each course, what assignments were required for course completion, and what courses were required for degree/program completion. A push of a button got them an unofficial transcript on a nearby dot matrix printer.
Instructors entered each test/quiz/assignment into the system. Course grades, program and degree requirements were updated and maintained automatically. Students could look up their grades any time they wanted, with real time refreshes on recently entered grades.
Department heads entered their own purchase orders. Accounting approved them electronically and faxed them out the door. Department heads could view up-to-the-minute account balances any time, and our president had a kiosk in his office that showed real-time account balances on all critical accounts (that turned red if the account was debited since the last refresh).
We were really close to having our electronic testing center automatically upload test/quiz scores to our records system in real time.
No batch jobs, either.
Students who were granted aid or loans often received more in grants/loans than their tuition & fees, so getting that money out to students early in the semester was a critical but very manual process. It’s a huge deal for the students, as many of them need that balance for food, rent, etc. Financial aid was already computerized using a stand-alone commercial package. Manually calculating and applying financial aid disbursements to student accounts was a major work effort, and consumed more than one person’s time for the first few weeks of the semester.
To get that automated, all I needed to do was interface that software with the student records/fees package I wrote, and a major paper based time consuming process would be eliminated.
The president put that on my plate for fall start. I wrote a Paradox program that shelled out to DOS, ran the DOS-based financial aid application in a window, exported the necessary data from the financial aid application and imported it into the finance module. It applied the financial aid to the student’s account and figured out if there was anything left over. If there was, it would print the student a check for the balance.
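The apply-then-refund step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the original Paradox program; the function names and the `owed`/`aid` record fields are hypothetical.

```python
from decimal import Decimal
from typing import Dict, List, Tuple


def apply_aid(balance_owed: Decimal, aid_award: Decimal) -> Tuple[Decimal, Decimal]:
    """Apply an aid award to what the student owes.

    Returns (remaining_balance, refund_check_amount).
    """
    applied = min(balance_owed, aid_award)   # aid can never pay more than is owed
    remaining = balance_owed - applied       # what the student still owes, if anything
    refund = aid_award - applied             # leftover aid goes out as a check
    return remaining, refund


def run_disbursement(students: List[Dict]) -> List[Tuple[str, Decimal]]:
    """Return (student_id, check_amount) for every student due a refund.

    Each record is assumed (hypothetically) to carry 'id', 'owed' and 'aid'.
    """
    checks = []
    for s in students:
        _, refund = apply_aid(s["owed"], s["aid"])
        if refund > 0:
            checks.append((s["id"], refund))
    return checks
```

A student who owes 1,200.00 and is awarded 2,000.00 ends up with a zero balance and an 800.00 check; a student whose aid is smaller than the bill gets no check and still owes the difference.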
We tested early & often, as neither the accountant nor the financial aid director was enthused about what to them was a cloudy process that they didn’t control, and I wasn’t enthused about mucking up a few hundred grand in financial aid disbursements. In the weeks before semester start, I’d run the program in simulation mode, the accountant and financial aid director would compare my simulation to their manual calculations, and we’d correct any differences.
Over and over and over….
The big day finally arrives. Ten days into the semester all bills are due and all aid checks get cut. The total disbursement is somewhere in the middle hundreds of thousands of dollars. Our financial aid director went line by line through each student’s fee statement one last time, compared it to each student’s financial aid disbursement, manually calculated the expected disbursement and compared each one to the simulation that I ran from my new program.
All was good – except that the accountant (fresh from college, new to the job) and I (nasty flu, 103°F fever) were more than a bit nervous at the prospect of disbursing a half million or so in checks using our own software.
On the morning of the big day – the day that financial aid gets disbursed and registration payments are due – the hundreds of students that had an outstanding disbursement were lined up at the registrar’s window not-so-patiently waiting for a check that they could run down to the local bank and cash.
The financial aid director, the accountant and I carefully opened the safe, removed the signature plates and loaded them in the check signing machine. We removed the check stock from the safe and loaded it into the check printer. We (crossed our fingers and) pulled the trigger on the script that extracted each student’s financial aid data, compared it to their chart of accounts, credited any outstanding account balance and printed them a check for the rest.
The workflow was something like:
An hour or so later, a whole bunch of happy students have their fees paid and their financial aid balances paid out, check in hand. A large fraction of them drove straight to the issuing bank & stood in line to cash the check.
Then we get the call from the bank:
Bank President: “Umm are you guys aware that you are $200k overdrawn, and we have a line of a hundred or so students trying to cash checks on that account?”
The way our small-town college managed its banking was to maintain accounts at both of the local banks and use SmallBankA for the seven figure main account and SmallBankB for the petty cash account in year one, then alternate back and forth each year. That way both of the locally owned banks got a shot at managing the large account, and we kept close relationships with the local economy. The main account probably had a $1m balance, and the small account probably had a $20k balance.
Guess which check stock we loaded into the printer?
Fortunately it was a small town and the bank covered the few hundred grand overdraft for the day that it took to transfer the funds to cover it. The bank president was smart enough to know that we were good for the funds, and graciously covered our mistake.
A car goes over the edge of a cliff on a narrow mountain road. The driver survives, but the accident goes undetected for six days. After five days, the family files a missing persons report. Law enforcement tells them that follow up will take days. The family doesn’t wait. They locate the car using their own means with the help of a detective and the phone company, including what appears to be one of the controversial warrantless cell phone locates that law enforcement does millions of times per year.
When the family located their father at the bottom of a ravine, they also found a second car had gone over the edge. That accident was unrelated and also undetected. That driver died.
Number of cars over the edge: Two. Number of cars detected and located by law enforcement/rescue workers: Zero.
No doubt that there will be a call for guard rails. I don’t think it’s practical to put rails on every spot on every road, but I do think that one could devise an inexpensive tell-tale. This could be as simple as a breakable ribbon or a row of hay bales on the outside of the curve. If the ribbon is broken or the hay bales are gone, something happened.
If preventative controls aren’t practical, is there a detective control in place?
In interviews with nearly 100 survivors of the tornado, NOAA officials found that the perceived frequency of warning sirens that night and in previous storms caused people to become "desensitized or complacent to sirens" and to not take shelter.
In other words the emergency sirens are not credible unless combined with other information sources. I don’t doubt that for a second. I do not consider the county emergency sirens to be credible unless I verify them with some other source (radar weather, for example). There simply are too many false alerts.
"Instead, the majority of Joplin residents did not take protective action until processing additional credible confirmation of the threat,"
Comcast is bringing their ‘Internet Essentials’ to our local service area. Under this program, families who qualify for free school lunches are eligible for $10/month internet from Comcast.
Kudos to Comcast.
I see programs like this as an important factor in reducing the number of “have nots” in the already wide disparity between those who have access to broadband and those who do not. Broadband today is as critical to rural and economically poor areas as electricity was in the 1930s and 1940s. Back then, a rural farmer that had electricity could dramatically improve their productivity versus farmers with no electricity.
In the 1930s, my grandfather’s sister moved from an electrified area of Wisconsin to a farm in Minnesota with no electricity. She had to pump water by hand, wash clothes by hand, heat the farmhouse with a wood stove, light kerosene lamps…
Today in Minnesota we have rural areas where there is no wired broadband coverage, and we have both rural and metro areas where people with low incomes can’t afford the $30-50/month broadband entry fee. One of our CIOs made it clear (to me) how important this is when he offered that bandwidth to his college was nowhere near as important as bandwidth to the rural area around his college. Rural students were dissuaded from taking classes because they would be forced to complete much of their class work while at the college rather than at home. For some, that’s a barrier.
FWIW – For the last ten years or so, we’ve been using Comcast’s metro area gigabit Ethernet as the wide area network connection for about a dozen of our metro area colleges. The service is less expensive than any competitor’s and it has been at least as reliable as services from other carriers.
Down for maintenance. Hacked…pwned…rooted…
Can you imagine the holy shitstorm that the Linux fanboys would be flinging out the door if this had happened to Microsoft?
The root cause analysis on these will be interesting reads.
What’s next, single pen flatbed plotters?
BTW- I must be old. I still have an HP 11C…
…and I remember when we upgraded our single pen flatbed plotter to a state of the art 6 pen moving paper plotter complete with automatic pen selection. Instead of the plotter stopping and waiting for you to switch from the black pen to the red pen, the plotter would automagically put the black pen back into the carousel and pick up the red pen.
We were impressed.
The message from kernel.org is consistent with the message from pretty much everyone that gets hacked.
I’ll be looking forward to something resembling ‘full disclosure’. It should be an interesting read.
One database, four SRs at Sev one. The oldest one has been a Sev one for 16 days.
We’re pretty sure that Oracle 11.2.0.wtf doesn’t play anywhere near as nice with our workload as 10.2.0..
FWIW - The ‘SUN box stuck’ SR is open because a diagnostic script that Oracle had us run deadlocked a DB writer on a libaio bug in Solaris 10 (Bug 6994922).
In Service Deprovisioning as a Security Practice, I asserted that using a structured process for shutting down unused applications, servers & firewall rules was good security practice.
On the more traditional employee/contractor deprovisioning process, I often run into managers who view employee deprovisioning as something that protects the organization from the rogue former employee who creates chaos after they leave. If they feel that the former employee is leaving on good terms and unlikely to ‘go rogue’, they treat account deprovisioning as a background, low priority activity.
There is obviously an interest in protecting the organization from the actions of the former employee, but something that is just as important to me is to protect the employee/contractor from events that happen after they leave. I’d really hate to see someone get blamed for an event that happened after they left our employment. That’d be really unfair to them.
For employees who are leaving on good terms, making sure that their accounts are properly disabled is essential to ensure that they don’t get blamed for things that they didn’t do.
Have all big government internet projects pass the approval of a technical panel made of professionals from the tech statup[sic] sector.
This is an interesting idea – and one that I could buy into (under the right conditions…)
All proposals of high budget IT projects should pass through a panel of independent professionals from the private sector who are experienced in running large scale internet start-ups. [emphasis mine]
I’d suggest that there is no reason to think there is a relationship between large scale startups and large scale IT projects involving legacy business processes, government rules & laws, legacy systems, legacy processes, public sector budget cycles, etc. I’d rather see advice from those who are experienced in large scale IT projects, rather than successful startups. I don’t think that’s the same skill set.
I’ll be very interested if Gigabit Ethernet to the home makes a difference to the ordinary home user. I’ll go on record and say that I don’t think it will. The Gig.U experiment might come up with novel and interesting uses that can’t be met by a 10 or 100Mbps home connection, but if the interesting & novel new uses for high bandwidth to the home show up, they will not radically change ordinary home users’ lives.
Once you get above about 6Mbps to the home, what makes a difference to the home user isn’t bandwidth, it’s data caps & quotas. If I have a 6Mbps internet connection with a high data cap (like Comcast’s 250GB cap), I can radically change how I consume information. If I have a higher bandwidth connection but a low data cap (like a 2GB cap on a 3G/4G phone or the 50GB caps imposed by other ISPs), I can’t fundamentally change how I consume information/media. That’s why I don’t care if my phone is 3G or 4G. In either case it’s still a 2GB cap, so it’s still a handicapped phone. Because it’s capped, it’s not capable of changing my lifestyle.
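The cap-versus-bandwidth argument is easy to check with back-of-the-envelope arithmetic. The sketch below is my own illustration and assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits/s) and a link that is saturated the whole time; the function name is hypothetical.

```python
def hours_to_exhaust_cap(cap_gb: float, rate_mbps: float) -> float:
    """Hours of sustained transfer at rate_mbps before cap_gb is used up."""
    cap_bits = cap_gb * 1e9 * 8             # gigabytes -> bits
    seconds = cap_bits / (rate_mbps * 1e6)  # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Cap sizes and the 6 Mbps rate are taken from the text above.
    for cap_gb in (250, 50, 2):
        hours = hours_to_exhaust_cap(cap_gb, 6)
        print(f"{cap_gb} GB cap at 6 Mbps lasts {hours:.1f} hours")
```

At a steady 6Mbps, a 250GB cap lasts roughly 93 hours of continuous transfer, while a 2GB cap is gone in about 44 minutes. A faster link only shortens that second number, which is why the cap matters more than the speed.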
As I’ve written before:
Broadband access is like the railroads on the prairie. When the railroads got built, you either made sure they went through your town or your town died. That’ll happen with broadband too: communities that have incumbent telco/cable providers that do not deliver low cost, ubiquitous broadband will shrivel up and die, much like the communities that got bypassed by the railroads.
FWIW – At work I have GigE to the desktop connected to a GigE LAN that uplinks to a 10Gig backbone that is connected to multiple Tier 1 ISPs at multi-gigabit or 10Gigabit speeds, but none of that has made any difference in how I work, how much work I get done or what I do at work.
What has made a difference?
. JOHN WALKER JANUARY 1975
.
.
. THIS PROGRAM IS A TOTALLY NEW WAY OF DISTRIBUTING VERSIONS OF
. SOFTWARE THROUGHOUT THE 1100 SERIES USER COMMUNITY. PREVIOUS
. METHODS REQUIRED THE DELIBERATE AND PLANNED INTERCHANGE OF
. TAPES, CARD DECKS, OR OTHER TRANSFER MEDIA. THE ADVENT OF
. 'PERVADE' PERMITS SOFTWARE TO BE RELEASED IN SUCH A MANNER THAT
. IF SOMEONE CALLS YOU UP AND ASKS FOR A VERSION OF A PROCESSOR,
. VERY LIKELY YOU CAN TELL THEM THAT THEY ALREADY HAVE IT, MUCH
. TO THEIR OWN SURPRISE.
The FBI remotely disabled software installed on privately owned personal computers located in the United States.
If this isn’t controversial, it should be.
The software is presumed to be malicious, having been accused of stealing account information and passwords from hundreds of thousands of people.
Does that make it less controversial?
Hundreds of thousands of computers have one less bot on them. That’s certainly a good thing. Hundreds of thousands of computer owners had their computers remotely manipulated by law enforcement. Is that a good thing? A dangerous precedent?
Interesting, for sure.
Update: Gary Warner has an excellent write-up.
From: UPS Shipments <firstname.lastname@example.org>
Subject: Your package has arrived!
Date: Thu, 2 Dec 2010 14:31:34 +0000
To: Undisclosed recipients:;
Dear client<br />
Your package has arrived.<br />
The tracking# is : 1Z45AR990*****749 and can be used at : <br />
<a href="http://www.ups.com/tracking/tracking.html">http://www.ups.com/tracking/tracking.html</a><br />
The shipping invoice can be downloaded from :<br />
<a href="http://thpguild.net84.net/e107_files/cache/invoice.scr">http://www.ups.com/tracking/invoices/download.aspx?invoice_id=3483273</a> <br />
Thank you,<br />
United Parcel Service<br />
<p>*** This is an automatically generated email, please do not reply ***</p>
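The tell in the message above is the second link: the text the reader sees is a ups.com URL, but the `href` points somewhere else entirely. A mail filter could flag that kind of mismatch mechanically. The following is a minimal sketch using only the Python standard library; the class and function names are my own, and it only catches the simple case where the visible link text is itself a URL.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAuditor(HTMLParser):
    """Collect links whose visible text names a different host than the href."""

    def __init__(self):
        super().__init__()
        self._href = None      # href of the <a> we are currently inside, if any
        self._text = []        # visible text collected inside that <a>
        self.mismatches = []   # (displayed_host, actual_host) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            shown = "".join(self._text).strip()
            # Only compare when the visible text itself looks like a URL.
            if shown.lower().startswith(("http://", "https://")):
                shown_host = urlparse(shown).hostname
                real_host = urlparse(self._href).hostname
                if shown_host != real_host:
                    self.mismatches.append((shown_host, real_host))
            self._href = None


def find_mismatched_links(html: str):
    """Return a list of (displayed_host, actual_host) for suspicious links."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.mismatches
```

Fed the two links from the message above, this passes the tracking link (both sides are www.ups.com) and flags the invoice link, where the displayed host is www.ups.com and the real host is not. It does nothing for links whose text is plain words such as "click here".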
Today we were informed by Epsilon Interactive, our national email service provider, that your email address was exposed due to unauthorized access of their system. Robert Half uses Epsilon to send marketing and service emails on our behalf.
We deeply regret this has taken place and any inconvenience this may have caused you. We take your privacy very seriously, and we will continue to work diligently to protect your personal information. We were advised by Epsilon that the information that was obtained was limited to email addresses only.
Please note, it is possible you may receive spam email messages as a result. We want to urge you to be cautious when opening links or attachments from unknown third parties. We ask that you remain alert to any unusual or suspicious emails.
As always, if you have any questions, or need any additional information, please do not hesitate to contact us email@example.com.
Sincerely,
Robert Half Customer Care
Robert Half Finance & Accounting
Robert Half Management Recourses
Robert Half Legal
Robert Half Technology
The Creative Group
Adaptive Firewall
Apparently my Mac Air is doing something to annoy the Adaptive Firewall on my mini. After a day of running ipfw, my Air loses the ability to connect to the Mini Server and 'ipfw show' shows a deny any for the IP address of my Mac Air. I have no clue why it's blacklisting me - I'm connecting via AFP, Samba and Time Machine, all of which work fine until they don't.
Mac OS X v10.6 uses an adaptive firewall that dynamically generates a firewall rule if a user has 10 consecutive failed login attempts. The generated rule blocks the user’s computer for 15 minutes, preventing the user from attempting to log in.
The adaptive firewall helps to prevent your computer from being attacked by unauthorized users. The adaptive firewall does not require configuration and is active when you turn on your firewall.
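To make the documented behavior concrete, here is a small Python model of the rule as described: 10 consecutive failed logins from one address produce a 15 minute block. This is not Apple's implementation, just a sketch of the stated policy; the class name, method names and the injectable clock are all my own assumptions.

```python
import time


class AdaptiveBlocker:
    """Toy model: N consecutive failed logins from an IP -> temporary block."""

    def __init__(self, max_failures=10, block_seconds=15 * 60, clock=time.monotonic):
        self.max_failures = max_failures
        self.block_seconds = block_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._failures = {}          # ip -> consecutive failure count
        self._blocked_until = {}     # ip -> time at which the block expires

    def record_login(self, ip, success):
        """Record one login attempt from ip."""
        if success:
            # A successful login breaks the run of consecutive failures.
            self._failures.pop(ip, None)
            return
        count = self._failures.get(ip, 0) + 1
        if count >= self.max_failures:
            # Equivalent of generating a temporary deny rule for this address.
            self._blocked_until[ip] = self._clock() + self.block_seconds
            self._failures.pop(ip, None)
        else:
            self._failures[ip] = count

    def is_blocked(self, ip):
        """True while a block for ip is still in force."""
        until = self._blocked_until.get(ip)
        if until is None:
            return False
        if self._clock() >= until:
            del self._blocked_until[ip]   # block expired, drop the rule
            return False
        return True
```

Under this model a client that is getting blocked must be producing ten failures in a row with no success in between, so one place to look is anything on the client that retries a stale or wrong credential in the background.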
Square allows you to turn your phone into a payment card terminal.
Cool. For a mere 2.75% overhead, you can accept credit cards using a free magnetic card reader attached to your phone’s headset jack. Your customers swipe their card and scribble their signature on your iSplat’s screen, your bank account gets a credit.
The obvious questions: How do you secure a mobile application such that it can safely handle payments? Is your Square enabled phone now covered under some sort of compliance regime?
Square says they are secure, but they’ve loaded lots of weasel language into their User Agreement and Commercial Entity Agreement. (I don’t make a habit of reading merchant agreements though, so their language may be typical for the trade, but the part where they exempt themselves from any liability or damages caused by 3rd party trojans would concern me.)
VeriFone disagrees, claiming that the Square system is vulnerable to rogue mobile apps, and claiming to have (in an hour) written an app to exploit Square. But VeriFone is a competitor and FUD works, so we have to ask – is VeriFone any different? From what I read on their FAQ they encrypt in hardware and only use the phone for transmission of encrypted data, so they might be different. As expected, Square disagrees with VeriFone, but in their CEO’s carefully worded letter, makes no assertion as to the security of their application.
I’d compare Square’s solution to running a card swipe terminal on the USB port of an ordinary desktop operating system & reading the card data with an Internet-downloadable application. The operating system must be presumed insecure (we have no evidence that any general purpose operating system has ever been invulnerable to exploitation), the payment card application hosted on the operating system can not (by definition) be more secure than the operating system, and unless the terminal performed some sort of encryption prior to sending the stream to the application, any compromise of the host OS would result in compromise of the card data.
But it is so convenient.
Update 08/05/2011: And insecure.
E-mail from a colleague:
So, within minutes of one another:
"Mobile and banking fit together like chocolate and peanut butter," says Jim Pitts, project manager of the Financial Services Technology Consortium, the technology solutions division within The Financial Services Roundtable.
[ ... ]
"[ ... ] Before their removal, the apps garnered between 50,000 and 200,000 downloads. The apps caused the phone to perform functions without the owner's consent. The Trojan embedded in them used a root exploit to access all of the phone's data and download malicious code.
The publisher has been removed from the Android Market completely, and its apps reportedly have been deleted from phones, but this won't remove code that has been back-doored into a phone's program. Google reportedly is working on that problem.
[ ... ]
Awesome. We are going to bet our financial future on a rootable platform. I wonder how that will turn out.
I’m feeling déjà vu.
It’s Tuesday. My pre-OraBorg Google reader subscription shows a stream of security updates. Looks pretty bad:
Wow – there are security vulnerabilities in Mozilla 1.4, ImageMagick, a2ps, umount & a slew of other apps. I’d better kick our patch management process into high gear. It’s time to dig into these and see which ones need escalation.
Clicking on the links leads to sunsolve, the go-to place for all things Solaris. Sunsolve redirects to support.oracle.com. support.oracle.com has no clue what to do with the re-direct.
Bummer… I’d better do some serious research. GoogleResearch, of course:
2004, 2005, 2006…WTF???
Conclusion: Oracle is asking us sysadmins to patch five year old vulnerabilities. They must think that this will keep us from whining about their current pile of sh!t.
Diversion. Good plan. The borg would be proud.
One last (amusing) remnant of the absorption of Sun into the OraBorg.
“There is not a guaranteed 1:1 mapping between backup and recovery performance…” Preston de Guise, “The Networker Blog”
Preston’s post reminded me of one of our attempts to build a sane disaster recovery plan. The attempt went something like this:
In the general case, consultants may or may not add value to a process like this. Consultants are in it for the money. The distinguishing factor (in my eyes) is whether consultants are attempting to perform good, cost-effective work such that they maintain a long term relationship with the organization, or whether the consultants are attempting to extract maximum income from a particular engagement. There is a difference.
On this particular attempt, the consultants did a reasonably good job of building a process and documentation for declaring an event, notifying executives, decision makers and technical staff; and managing communication. The first fifty pages of the fifty thousand dollar document were generally useful. They fell down badly on page 51, where they described how we would recover our data center.
Their plan was:
To emphasize to the executives how firm they were on the fifty seven hour recovery, they pasted Veritas specific server recovery documentation as an addendum to the fifty thousand dollar plan.
Unfortunately, their recovery plan bore no relationship to how we backed up our servers. That made it unusable.
Reality: at the time of the engagement:
Unfortunately, the executive layer heard ‘fifty seven hours’, declared victory and moved on.
I tried to feed the consultants useful information, such as the necessity of having the SAN up first, the architecture of our Legato Networker system, the number of groups and pools, the single threaded nature of our server restores (vs the multi-threaded backups), the improbability of being able to purchase servers that exactly match our hardware (hence the unlikelihood of a successful bare metal recovery on new hardware), not having a recovery site pre-planned, not having power and network at the recovery site, and various other failures of their plan.
You get the idea.
The consultants objected to my objections. They basically told me that their plan was perfect, and that it was proven so by its adoption by a very large nationwide electronics retailer headquartered nearby. I suggested that we prepare a realistic recovery plan, accounting for the above deficiencies, and that plan be substituted for the ‘fifty seven hours’ part of the consultants’ plan. They declared me to be a crackpot and ignored my objections.
Using what I thought were optimistic estimates for an actual recovery I built a marginally realistic Gantt chart. It looked something like this:
Then (roughly a week into the recovery) we’d be able to start recovering individual servers. When estimating the server recovery times, I assumed:
Throw in a few more assumptions, add a bit of friction, temper my optimism, and my Gantt chart showed three weeks as the best possible outcome. That’s quite a stretch from fifty seven hours.
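The gap between a restore estimate and backup speed comes down to simple arithmetic once restores run one at a time. The sketch below is only an illustration of that effect: the server count, sizes and throughput are hypothetical numbers I picked for the example, not the figures from the real Gantt chart, and the model ignores everything except raw data transfer.

```python
def restore_hours(server_sizes_gb, throughput_mb_per_s, streams=1):
    """Rough wall-clock hours to move all server data back from backup.

    Assumes every stream sustains throughput_mb_per_s and that work divides
    evenly across streams (optimistic). streams=1 models single-threaded
    restores; a larger value models the multi-threaded case.
    """
    total_mb = sum(server_sizes_gb) * 1000          # decimal GB -> MB
    seconds = total_mb / (throughput_mb_per_s * streams)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical environment: 100 servers of 200 GB each at 30 MB/s per stream.
    servers = [200] * 100
    print(f"8 parallel streams: {restore_hours(servers, 30, streams=8):.0f} hours")
    print(f"1 serial stream:    {restore_hours(servers, 30, streams=1):.0f} hours")
```

With those made-up numbers, eight parallel streams finish in about a day while a single serial stream needs more than a week, and that is before buying hardware, bringing up the SAN or rebuilding the backup server.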
The outcome of the consulting gig was generally a failure. Their plan was only partially useful. If we had followed the plan, we would have known whom to call in a disaster, decision makers, communication plans, etc., but we would not have had a usable plan for recovering a data center.
It wasn’t a total loss though. I used that analysis internally to convince management that given organizational expectations for recovery vs. the complexity of our applications, a pre-built fully redundant recovery site was the only valid option.
That’s the path we are taking.
Last August, Tipping Point decided to publicly disclose vulnerabilities six months after vendor notification. The six months is up.
Take a look at IBM’s vulnerability list and actions taken to resolve the vulnerabilities. If you don’t feel like reading the whole list, the snip below pretty much sums it up:
[08/26/2008] ZDI reports vulnerability to IBM
[08/26/2008] IBM acknowledges receipt
[08/27/2008] IBM requests proof of concept
[08/27/2008] ZDI provides proof of concept .c files
[07/12/2010] IBM requests proof of concept again and inquires as to version affected
[07/13/2010] ZDI acknowledges request
[07/14/2010] ZDI re-sends proof of concept .c files
[07/14/2010] IBM inquires regarding version affected
[07/19/2010] IBM states they are unable to reproduce and asks how to compile the proof of concept
[07/19/2010] ZDI replies with instructions for compiling C and command line usage
[01/10/2011] IBM states they are unable to reproduce and requests proprietary crash dump logs
Tipping Point: Two Thumbs Up.
IBM: Two and a half years. Still no clue.
What IBM’s executive layer needs to know is that people like me read about the failure of their software development methodology in one division and assume that the incompetence spans their entire organization. That may not be fair – IBM is a big company and it’s highly likely that some/most software development groups within IBM are capable/competent. However – if one group is allowed to flounder/fail, then it’s clear to me that software quality is not receiving sufficient attention high enough within IBM to ensure that all software development groups are capable/competent. If some/most software developed within IBM is of high quality, it’s because some/most software development groups believe in doing the right thing. It’s not because IBM believes in doing the right thing.
In other news, IBM’s local sales team is aggressively pushing for me to switch our entire Unix/Database infrastructure from Solaris/Oracle on SPARC to AIX/DB2 on Power.
Guess what my next e-mail to their sales team is going to reference?
A well formed e-mail:
No obvious spelling errors, reasonably good grammar, etc. One red flag is the URL to the Comcast logo, but I wouldn’t bet on users catching that. The embedded link is another red flag:
But one that would fool many. Users will not see that URL unless their e-mail client has the ability to ‘hover’ a link destination.
The ‘login page’ is well formed & indistinguishable from Comcast’s Xfinity login page:
All the links in the bogus login page (except the form submit) go to real Comcast URLs, the images are real, and the page layout is nearly identical. The only hint is that the form submit doesn’t post to Comcast, but rather to [snip].bulkemail4sale.com/Zola.php:
Filling out the bogus login page with a random user and password leads to a “Comcast Billing Verification” form requesting last, middle & first names, billing address, credit card details including PIN number, card issuing bank, bank routing number, SSN, date of birth, mother’s maiden name, driver’s license number, etc…
The “Comcast Billing Verification” form is very well constructed, generally indistinguishable from normal Comcast/Xfinity web pages. The submit action for the “Comcast Billing Verification” form is:
Hacker.php? This is not going to end well.
This is a very well constructed phishing attempt. Impressive, eh?
It took me a bit of detective work to determine the non-validity of this phish. Ordinary users don’t have a chance.
Where is anonymous when you need them?
…that you are not qualified to decide what content you read on the device you’ve purchased.
If the New York Times story is true, Apple is rejecting an application because the application allows access to purchased documents outside the walled garden of the iTunes app store.
“Apple told Sony that from now on, all in-app purchases would have to go through Apple, said Steve Haber, president of Sony’s digital reading division.”
I keep thinking that there’d have been an outcry if Microsoft, at the height of their monopoly, had exercised complete control over the documents that you were allowed to purchase and read on your Windows PCs.
“Light-rail service throughout downtown Minneapolis was halted Thursday for about four hours because of a downed wire that powers the trains from overhead…”
Apparently there is no redundancy.
I’m not thinking about this because I care about the commuters who were stranded, but rather because of how it relates to network and server redundancy and availability.
My group delivers state wide networking, firewalling, ERP and eLearning applications to a couple hundred thousand students and tens of thousands of employees.
In that environment, how do you make a cost vs. availability decision?
Years ago (circa 2001) we found a carrier that would offer us OC-3 (150Mbps) for what was essentially the same price as the incumbent telco (Qwest) would charge us for two T1s (3Mbps). The new carrier had less experience with data. They were a cable TV provider.
What decision would you make?
I hesitated, figuring that it’d be best to get feedback from the campuses that were currently starved for bandwidth, so I posed the question at our quarterly CIO meeting.
Me: “If I have to make a choice, would you trade availability for bandwidth?”
(At least that’s how I remember it…)
The carrier delivered excellent service and availability. Service was awesome, availability was at least as good as the expensive incumbent. We gained confidence in the low cost carrier’s ability to deliver, and year by year we migrated more campuses to the low cost carrier.
Life was good. We traded 3Mbps for 150Mbps without increasing the budget. Hail to the hero (that’d be me).
Fast forward 4 years. The small town in the path of the low cost carrier’s non-redundant leg decided to build a road. The town cut the fiber, a dozen campuses disappeared from our network map for 8 hours, and to put it mildly, the campuses were not happy.
It gets worse though. The carrier patched up the busted fiber late that day then scheduled a plow crew to come out a week later, bury a new fiber parallel to the old fiber and move us to the new fiber.
Yep, the carrier’s crew cut their own fiber. Another half day outage.
It gets even worse. The carrier didn’t have facilities all the way to our core where we needed them so they leased a big carrier’s circuit to the big city. A few weeks after the two fiber outages, the big carrier in the big city smoked a network interface.
Outage number three. Let’s all throw rocks at the goat (that’d be me).
The next quarterly meeting wasn’t fun.
Suffice to say that we now make a different calculation on the relative value of bandwidth vs. availability.
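For what it's worth, the textbook version of that calculation is short. The sketch below uses the standard series/parallel availability formulas and assumes failures are independent (which a shared trench or a shared plow crew will happily violate). The per-segment availability figures in the example are made up for illustration, not measurements from our network.

```python
from math import prod

HOURS_PER_YEAR = 8760


def series(*availabilities):
    """Availability of a path where every segment must be up."""
    return prod(availabilities)


def parallel(*availabilities):
    """Availability of redundant paths where any one being up is enough."""
    return 1 - prod(1 - a for a in availabilities)


def downtime_hours_per_year(availability):
    """Expected hours of outage per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical: three non-redundant segments in a row, each 99.9% available
    # (think: local fiber leg, carrier backbone, leased circuit to the city).
    single_path = series(0.999, 0.999, 0.999)
    # Same path duplicated over a fully independent second route.
    dual_path = parallel(single_path, single_path)
    print(f"single path: {downtime_hours_per_year(single_path):.1f} h/year down")
    print(f"dual path:   {downtime_hours_per_year(dual_path):.2f} h/year down")
```

Chaining non-redundant segments multiplies their availabilities, so every extra leg makes the path worse, while a second independent route multiplies the unavailabilities instead. That difference is what you are paying for when you buy redundancy.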
Back to the broken train. A four hour outage, construction cost of a hundred million dollars per mile and no redundancy?