Data Center Agility – Execution

October 13th, 2009  |  Published in Agile, Conceptual, Idea, Methodology

The Execution phase of our Agile Data Center move is where all the fun stuff happens. As a reminder from the previous entries about the project setup and preparation, we’re doing a data center move where the schedule is flexible (the client is more tolerant of time, less tolerant of risk) and the requirements are somewhat nebulous. Our current migration strategy has us virtualizing as many physical servers as possible and using SAN replication as our data pusher whenever we can. Our basic currency is an application (Migration Element) to be moved, rather than the feature or requirement you would usually find in a software project. Our iterations (Migration Events) are a week long and occur during the weekly six-hour maintenance window on Sundays. Essentially, our goal is to make the migration as a whole as non-disruptive as it can possibly be.

[Figure: Migration Event Wheel]

If you refer to the classic diagram, aka the Agile Circle of Everything, you’ll see we have five main work areas that get performed during a Migration Event (I’m purposefully leaving out the “Perform Warranty” piece for the moment). Let’s look at each of these work areas in more detail:

  • Plan Migration Event
  • Begin Staging Activities
  • Perform Event Staging
  • Perform Event Cutover
  • Conduct Lessons Learned

The first task for the Migration Event is to set up a plan of the work that needs to get done over the next period (whatever amount you selected as the duration of each event: a week, two weeks, a month, etc.). This is essentially a list of all the Migration Elements that are next in the priority list, are left over from previous Migration Events, or have been pulled forward from later Migration Events. If this is the first Event of your overall Migration Plan, you will need to make a somewhat educated guess as to how many points you can complete during the period (your initial velocity). After you complete a few Migration Events you’ll have a better feel for what your actual velocity is, but until then, it’s an estimate.
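To make the velocity idea concrete, here is a minimal sketch of one way to estimate it. The post doesn't say how velocity is calculated, so the rolling average, the window of three Events, and the function name are all assumptions for illustration.

```python
def estimate_velocity(completed_points, initial_guess, window=3):
    """Estimate how many points fit in the next Migration Event.

    completed_points: points actually finished in each past Event, oldest first.
    initial_guess:    the educated guess used before any Event has been run.
    window:           how many recent Events to average (an assumed value).
    """
    if not completed_points:
        # First Event of the plan: nothing has been measured yet.
        return initial_guess
    recent = completed_points[-window:]
    return sum(recent) / len(recent)
```

With no history the function simply returns the guess; once a few Events are done it leans on what was actually completed.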

Select the Migration Elements from your prioritized list, paying close attention to the Migration Groups (this helps ensure that you’re picking up applications that need to travel together). Based on your velocity (either experienced or estimated), select as many Elements as can fit in the Event. If you have any holdover Elements from the previous Event, those usually go at the top of your list (unless of course they’ve been deleted from the project due to a business change. Hey, it happens.) Once compiled, you have your marching orders for this Event. It’s time to get down to the real work.
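The selection step above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Element` fields, the greedy "whole groups only" rule, and the convention that a lower priority number means more urgent are mine, not something the real spreadsheets necessarily do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    element_id: str
    group: str              # Migration Group: Elements that must travel together
    points: int             # sizing estimate
    priority: int           # assumed convention: lower number = higher priority
    holdover: bool = False  # left over from the previous Migration Event


def plan_event(backlog, velocity):
    """Greedily pick whole Migration Groups until the velocity budget is spent."""
    # Bucket Elements by group so a group is either fully in or fully out.
    groups = {}
    for element in backlog:
        groups.setdefault(element.group, []).append(element)

    # Groups containing a holdover come first, then groups by their best priority.
    def rank(members):
        has_holdover = any(m.holdover for m in members)
        return (not has_holdover, min(m.priority for m in members))

    selected, remaining = [], velocity
    for members in sorted(groups.values(), key=rank):
        cost = sum(m.points for m in members)
        if cost <= remaining:
            selected.extend(members)
            remaining -= cost
    return selected
```

A group that doesn't fit is skipped rather than split, so a smaller group further down the list can still use the leftover points.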

Now that you know what it is you’re trying to accomplish, it’s time to start setting it up. In our data center move, we’re anticipating a setup stage where we stand up a temporary environment at the current data center and get the application configured and ready for how it will exist at its destination. Then, in most cases, we ship it across the wire through SAN replication or some other electronic means, and it’s stood up in an identical configuration on the other side. There may be some forklifting occasionally, but we’re hoping to minimize it. To manage this process and the following two (Perform Event Staging and Perform Event Cutover), I’ve created a Migration Element Worksheet that tracks: Element ID, Assigned Migration Event, Assigned Migration Group, Migration Element Sizing, Migration Element Priority, Migration Element Value, Environment, and a progress checklist. This worksheet rolls up to another worksheet which tracks my overall progress within the Migration Event, which in turn rolls up to a Burndown Chart.
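For readers who think in code rather than spreadsheets, the worksheet and its roll-up might look something like this. The field names follow the columns listed above; the three checklist steps and the `remaining_points` roll-up are assumptions, since the post doesn't spell out what the checklist contains or how the Burndown Chart is computed.

```python
from dataclasses import dataclass, field

# Assumed checklist steps, loosely named after the work areas of a Migration Event.
CHECKLIST_STEPS = ("staging_started", "staged", "cut_over")


@dataclass
class ElementWorksheet:
    element_id: str
    migration_event: int   # Assigned Migration Event
    migration_group: str   # Assigned Migration Group
    sizing: int            # Migration Element Sizing, in points
    priority: int
    value: int
    environment: str
    checklist: dict = field(
        default_factory=lambda: {step: False for step in CHECKLIST_STEPS}
    )

    def is_done(self):
        """An Element counts as finished once every checklist step is ticked."""
        return all(self.checklist.values())


def remaining_points(worksheets, event):
    """Roll-up for one Event: points still open, i.e. one burndown data point."""
    return sum(
        w.sizing
        for w in worksheets
        if w.migration_event == event and not w.is_done()
    )
```

Recording `remaining_points` after each work session gives the series of values you would plot as the burndown for that Event.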

[Figure: Migration Plan Execution Tree]

The metaphor I’ve been using for the migration is a moving truck. Being the Virtualization Guy means I virtualize whatever application I need to move, throw it on the truck (aka the SAN), and then wait for it to get to the new data center. Once there, I unpack it (attach it to the new host) and fire it up! Obviously I’m glossing over the details of configuring the destination environment and testing the connectivity and the application itself – in part because I’m more concerned with the process of moving at the moment, and in part because we haven’t really worked out all of the details yet. Nevertheless, what I described, while definitely the “happy path”, is how we hope to get the bulk of the machines to the new data center.

For virtualization activities that have to deal with high-availability, SLAs, replication, high-utilization, etc. – I’ll make sure my sizing exercise reflects the extra effort required to move these types of machines, but the overall process will stay the same. If during this process I find I can’t get all of the work done as specified in the Migration Event Plan, then I’ll need to defer those Elements (with consent of the business) and adjust my spreadsheets accordingly (although I’ll probably use Pivotal Tracker when it comes time to do it for real).

The last step in the process, “Conduct Lessons Learned”, is one that should never be overlooked, even if you have a large number of Migration Events. This is one of the areas where an agile process excels, because it takes the tight feedback loop generated by a short Event cycle and turns it into process improvement. Use this time to tweak your Fibonacci scale and to determine what’s working and what desperately needs to be improved (or even improved just a little bit).

What about the Perform Warranty step in the iterative process? Once the work is completed and accepted by the business, Perform Warranty is actually a separate thread of execution that follows more of a “help desk” set of processes. In the diagram, I just wanted to show the hand-off from the main line of work to that sub-process. Essentially, as we complete a migration, the warranty clock starts running. If that group uncovers anything that needs to be addressed by the migration group, we would fold it into the next Migration Event at the top of the queue, along with any deferred items from previous Migration Events.
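As a rough sketch of that hand-off, the queue for the next Migration Event could be assembled like this. Putting warranty items ahead of deferred ones is my assumption (the text only says both go to the top), and the function name and string IDs are hypothetical.

```python
def build_next_queue(warranty_items, deferred_items, backlog):
    """Order work for the next Event: warranty fixes, then deferrals, then backlog.

    Each argument is a list of Element IDs already in priority order.
    An ID that appears in more than one list is kept only at its first position.
    """
    queue, seen = [], set()
    for element_id in [*warranty_items, *deferred_items, *backlog]:
        if element_id not in seen:
            seen.add(element_id)
            queue.append(element_id)
    return queue
```

The resulting list is what you would feed into planning for the next Event, trimmed to whatever your velocity allows.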

Summary
Over these three posts, we discussed a situation where it is possible to use an agile methodology to perform a data center move. The constraints that seem to work include:

  • A protracted execution timeframe where the client cares more about avoiding risk than about saving time or money.
  • A strict schedule for performing the migration (in this case during weekly system maintenance windows).
  • A willingness to allow elements to “slip” from one event to the next.
  • A nebulous set of requirements and constraints.

These are not hard requirements, although a systematic, repeating schedule comes close to being one. If the client were to do a “big bang” or a more sporadic implementation, then getting the rhythm and predictability required for an agile process would prove problematic. If you’re interested in looking at my spreadsheets for how all of the things I described above fit together, just let me know via Twitter @nexidecimal.
