Data Center Agility – Preparation
October 6th, 2009 | Published in Agile, Methodology
Recently, I talked at a high level about a data center migration on which I’m currently working, and how it would be possible to use an agile approach for it (or at least parts of it). The high-level process seemed to make sense, so I wanted to dig a little deeper to see if it was still a workable solution once more of the details were considered. Because I’m also using this concept to try to sell my co-project managers on the beauty of agile, I’m integrating standard agile-speak with the project-speak we’re using internally. After all, finding a way to translate concepts is part of the battle when promoting agile to a waterfall crowd. Here is the translation dictionary:
- Migration Element = Feature or Story
- Migration Event = Iteration or Sprint
Some of the characteristics of Migration Elements are:
- The base-level unit of a migration is a Migration Element
- Migration Elements are assigned relative priorities
- Migration Elements are assigned relative effort (e.g. 0,1,2,3,5,8,13,21)
- Migration Elements are associated with like elements into Migration Groups
For those of you used to agile methodologies or projects, these notions should be familiar even though the names have a more Data Center-ish flavor to them. For the purposes of our detailed discussion, I’m going to use the same major segments of work that we’re using in our project: Preparation, Execution, and Validation. Today, I’ll go through Preparation. It’s definitely not the fun part of the process, but we gotta do it.

The Preparation
Preparing for a data center move can be a huge endeavor. Data volumes range anywhere from dozens of terabytes to petabytes. Mission-critical applications, stringent disaster-recovery requirements, and inconsistent backups all make thorough, detailed preparation absolutely key to success. In my current situation, we’re handling the preparation with a traditional waterfall approach, trying to know everything that’s knowable before actually getting to work on migrating systems from the old data center to the new one. Doing the prep work in a more agile way may be more difficult than it’s worth, given the interconnectedness of the applications. We can’t really take them a company or department at a time, and going in we don’t have enough visibility into the systems to know which “wire hangers” we can pull out alone and which ones are hooked together in a big wire blob. In addition to that, the facility won’t be ready for us until mid-November, so the only thing we can do for now anyway is plan.
Discovery
Ah, discovery. This is the part where you know next to nothing at the beginning of the stage, and think you know everything at the end of it. Where Planning will be the phase to create my agile flow, Discovery will provide my raw material. Now, we’ve all done discoveries before, and since the preparation portion of our data center migration project is not really agile, I’m going to gloss over the act of discovery a bit.
For our project, the big things we need to know during this phase are:
- What are we moving/not moving?
- How do systems & applications interrelate?
- What BC/DR and security rules do we need to follow?
- What are the time/budget constraints?
There are other things too, like strategic directions and all the paths you can possibly take to get there. But for the purposes of what we’re talking about, we’ll stop the list here. Hopefully, we are able to put together a lucid set of documents that organizes all of this collected data into information that is usable for our Planning phase because that’s where it all starts to take shape.
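As a rough illustration of turning the "how do systems interrelate?" answers into something usable for planning, the sketch below records dependencies as pairs and works out which systems are tied together, directly or indirectly. The system names and the `migration_groups` helper are hypothetical, invented for this example; a real discovery would feed in whatever inventory it actually produced.

```python
from collections import defaultdict


def migration_groups(systems, dependencies):
    """Return sets of systems that are linked, directly or indirectly.

    `dependencies` is a list of (system_a, system_b) pairs. Direction is
    ignored here: if two systems touch each other, they move together.
    """
    neighbors = defaultdict(set)
    for a, b in dependencies:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    groups = []
    for system in systems:
        if system in seen:
            continue
        # Walk everything reachable from this system.
        group = set()
        stack = [system]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbors[current] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical inventory, for illustration only.
systems = ["payroll", "hr-db", "intranet", "mail", "reporting"]
dependencies = [("payroll", "hr-db"), ("reporting", "hr-db")]

groups = migration_groups(systems, dependencies)
for group in groups:
    print(sorted(group))
```

Systems that come out in a group of one can be pulled on their own; anything in a larger group has to be treated as a unit, or at least examined more closely before it is split up.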
Planning
Now that we know everything (at least mostly), we can start figuring out how we need to fit it all together. While we’re producing many deliverables as part of this effort, in order to support the agile stuff I want to do during the execution phase, these are the things I need to know coming out of the Preparation phase:
- Assess Migration Grouping – Which applications and servers go together? What are the upstream and downstream dependencies? Are there any service-level agreements in place dictating which applications have performance requirements, cannot be virtualized, or need their own box? Most of this info would have been uncovered during Discovery. Here, we need to coalesce it into something useful so I can better plan my Migration Events.
- Determine Grouping Priority – Which groups are high, medium, and low priority? Essentially, who does the business say should go first? This will determine the order of the migration events, and if I need to defer an application (or promote one) based on time constraints, I’ll have a good idea of the best candidates for that.
- Determine Grouping Effort – How difficult will each migration be relative to other elements? As an example, if an application already sits on a virtual server, it should be easier to migrate than one that sits on a physical server. Also, applications that touch many other applications will require additional steps and testing, so they would have a higher effort number associated with them. The end goal is to be able to gauge my migration velocity for each migration event, which lets me track performance and improve sizing accuracy over time. It also lets me “horse trade” migration elements if the business were to ask me to re-prioritize an element from a later event into my current one. Just like in an agile project, we’re not estimating hours here. Whether you call them units, points, or rocks, the intent is for them not to be time-based. You need to know how complex a Migration Element is relative to all of the other Migration Elements or to some sort of standard baseline. Personally, I use a Fibonacci scale because it gives me more separation in sizing the less I know about an element, but feel free to use whatever you wish.
- Create Migration Event Plan (iteration or sprint plan) – With the priority and sizings now known, I can lay out an overall schedule of migration events. This would include the number of total events (iterations), the date each migration element would be delivered, as well as the estimated cost. This plan would be modified over the course of the migrations, being refined by the actualities of the execution, adding or subtracting migration events as necessary (and adjusting the cost appropriately). Since we’ll be using the weekly maintenance window to do our migrations, I’m going to assume that a Migration Event is 7 days. I could easily have picked 14 days for two-week Migration Events, but given that my maintenance window is only six hours per week, I’ll leave it at 7 days until the actual migration velocity proves me wrong.
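To make the planning steps above a little more concrete, here is a minimal sketch of laying out Migration Events from priorities and relative effort. Everything in it is an assumption made for illustration: the group names, the point values, the velocity of 13 points per event, and the simple greedy rule (take groups strictly in priority order and start a new event when the next group doesn't fit). A real plan would also have to respect dependencies and the maintenance window.

```python
from dataclasses import dataclass


@dataclass
class MigrationGroup:
    name: str
    priority: int  # 1 = highest priority, as ranked by the business
    effort: int    # relative points (e.g. Fibonacci), not hours


def plan_events(groups, velocity):
    """Lay out groups into events in priority order.

    `velocity` is the number of effort points assumed to fit in one event.
    A new event starts whenever the next group would not fit. A group
    larger than the velocity still gets an event to itself.
    """
    events = [[]]
    remaining = velocity
    for group in sorted(groups, key=lambda g: (g.priority, g.effort)):
        if group.effort > remaining and events[-1]:
            events.append([])
            remaining = velocity
        events[-1].append(group.name)
        remaining -= group.effort
    return [event for event in events if event]


# Hypothetical groups and sizes, for illustration only.
groups = [
    MigrationGroup("email", priority=1, effort=8),
    MigrationGroup("payroll", priority=1, effort=5),
    MigrationGroup("intranet", priority=2, effort=3),
    MigrationGroup("reporting", priority=2, effort=5),
    MigrationGroup("archive", priority=3, effort=2),
]

plan = plan_events(groups, velocity=13)
for number, event in enumerate(plan, start=1):
    print(f"Migration Event {number}: {', '.join(event)}")
```

Because the sizes are relative points, "horse trading" is just a matter of swapping groups of similar effort between events and re-running the plan; once real velocity numbers come in, the `velocity` argument is the only thing that needs to change.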
Assuming a one-week iteration where my maintenance window falls on a Sunday, my constraints somewhat force me into a week where an iteration starts on a Wednesday and ends on a Monday. Whether or not this would be workable depends upon how much migration I can get done in my window. If I can only get 10% done per Migration Event, then I’ve got 10 weeks of Sundays to look forward to, unless I want to go to an every-other-week schedule, open up the maintenance window for a longer period of time, or even accept mid-week planned outages. However, if I get 25% done each window, then maybe we can all “suck it up” for the four weeks that would take. In the end, the business and the IT folks will decide how to distribute the pain.
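The arithmetic behind those numbers is simple enough to sketch. The figures below are assumptions chosen to match the percentages above: a total of 100 effort points, with a velocity of 10 or 25 points per maintenance window.

```python
def events_needed(total_effort, velocity):
    """Number of Migration Events needed, rounding up any partial event."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_effort // velocity)


total_effort = 100  # assumed total points across all Migration Elements

for velocity in (10, 25):
    weeks = events_needed(total_effort, velocity)
    print(f"{velocity} points per window -> {weeks} weekly events")
```

With one event per week, 10 points per window means 10 Sundays and 25 points per window means 4. Any leftover work rounds up to one more event.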
We should have everything we need now to start iterating through the work. I know how long the project should take, how much effort is going to be required, and how much it will cost. However, the most important thing I know is that I don’t know everything. My knowledge is still based on estimates, and these estimates won’t be tested until I actually start doing the work and measuring the results. Once I start getting real data, I can refine these estimates into something a bit more bankable. I’ll cover that in the next segment.
