Gmail Outage? What Gmail outage?
September 5th, 2009 | Published in Infrastructure, The Cloud
I’m giving talks on cloud computing more and more often these days, and the topics run the gamut from the basic educational level to the strategy and architecture of how a company can best leverage a piece of the cloud. For the introductory discussion, I include a whole section dedicated to benefits and risks. Some of the benefits double as risks, and vice versa, and one of those double-edged swords is Availability. Usually, Availability cuts away from the user and opens up access to your resources from anywhere in the world, but sometimes when the backswing returns, you can get nicked. That’s what happened to Google’s Gmail and its users on Tuesday. But if you mitigated your risk correctly, you weren’t one of the ones who got cut.
What? Really?
I’ve been using Gmail for about four years now. I use it for personal email. The company that pays my salary uses Google Apps, so I have all my employer mail funneled to me. All the mail from my side ventures is directed to Gmail. Lastly, appointment notices from my clients are also forwarded through the Google infrastructure and into my Gmail inbox. Filters and labels have everything fine-tuned into a system with which I’m quite happy. Some of my pertinent email is forwarded to my phone and then marked as read and archived, or it is checked via IMAP and synchronized with the server so I don’t have to read email twice. If Gmail went down, I would know it pretty quickly. Well, apparently, Gmail went down. But I never knew.
I spent most of Tuesday in meetings, dealing with tweets, the occasional SMS, and email during lulls or when something was brought to my attention. I had messages via all the avenues mentioned above. I responded to those that needed a response, and saved for later those that didn’t. When I got back to the hotel that evening and started reading through my RSS feeds, that’s when I saw the headlines: “Gmail Outage!”. Really?
My first thought was maybe it was contained to certain geographies or was a “down ‘n up” where it was just a small hiccup. Nope, global outage. 100 minutes in length. Strange. Then why didn’t it affect me? Oh well. But I could feel the cloud nay-sayer conversations starting to spin up: “Cloud is not ready for prime-time!”; “Real companies can’t afford outages like this!”. But these same people don’t make the connection that their Exchange server or desktop-to-email connectivity is probably even flakier. Outlook is down again? No? Oh, it was the VPN that was down. In either case, the outcome is the same. Your email is inaccessible.
Plan B
To be honest, Gmail was not down. Its interface was. Mail was still being collected and delivered, but if you were trying to use the standard web interface to get to Gmail, then yes, for all intents and purposes, your email was “down”. My personal saving grace was my setup. I’m not allowed to access Gmail from work, so my phone becomes my mail client of choice (when your clients sometimes block access to the data you need, you become resourceful so as not to lose productivity). Because the Gmail web interface was down, but not the Gmail service behind it (POP and IMAP kept working), if you were reading your mail through some means other than going to gmail.com, then you were happily getting your mail as you normally would. My extended accessibility strategy was also a risk mitigation strategy – not by design, but since the effect is the same, I’ll just claim it as such.
The lesson with which I’ll augment my next set of talks will be “planning”. The cloud does not absolve you from the things you normally need to do to maintain your personal (or corporate) responsibilities for keeping the flow of information going. You still need to do backups. You still need to do error checking. You still need to make sure you have an alternate plan. If you absolutely, positively cannot have Gmail go down, then maybe you should be forwarding all your mail to a backup provider so that when Gmail goes down (and it will again) – you can still get your mail.
That which is infinitely available is also infinitely expensive.
Google claims 99.9% uptime, which means they experience less than 9 hours of *unplanned* downtime per year. To my knowledge, Google doesn’t have planned downtime that affects its users (whereas most companies have at least weekly or even nightly planned outages). So Google’s 99.9% equates to a lot of companies’ 99.99%. And to all those who say: “Google’s uptime needs to be better.” Better than what? You say your Exchange server has been running for three years straight. Great, what about the infrastructure around it? Power, networking, Internet, VPN, the SAN on which all the email resides. How’s the uptime there? All of those links factor into the 3/4/5 9s calculation. How many millions of users is your “3 year uptime” Exchange server supporting? Ah, that’s what I thought.
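To put rough numbers on that, here is a small Python sketch of the arithmetic. The 99.9% figure is the one quoted above; every per-component availability in the `chain` dictionary is a made-up value for illustration, not a measurement of anyone's real infrastructure, and the calculation assumes the components fail independently and that all of them must be up for mail to be reachable.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability):
    """Hours of downtime per year implied by an availability fraction (e.g. 0.999)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def chain_availability(availabilities):
    """Availability of a chain where every link must be up at once.

    Assumes the links fail independently, so the fractions simply multiply.
    """
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Hypothetical figures for illustration only.
chain = {
    "exchange server": 0.9999,
    "power": 0.999,
    "network": 0.999,
    "vpn": 0.995,
    "san": 0.9995,
}

if __name__ == "__main__":
    # The 99.9% claim: 0.001 * 8760 = 8.76 hours per year.
    print(f"99.9% uptime -> {downtime_hours_per_year(0.999):.2f} hours down per year")

    combined = chain_availability(chain.values())
    print(f"whole chain  -> {combined:.4%} available, "
          f"{downtime_hours_per_year(combined):.1f} hours down per year")
```

The point of the sketch is only that the combined figure can never be better than the weakest link in the chain, however long the mail server itself has stayed up.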
In any event, Gmail will improve because of this event. And whether you buy into the cloud or not, adequate planning on your part will keep you accessible and productive.
