Sunday, February 27, 2011

Managing work that doesn't fit

I just finished reading David Anderson's Kanban book, and I've been reflecting on my own experiences with applying Agile on legacy applications, and dealing with work that didn't fit nicely into predictive time boxes.

The assumption when fixing the timeline is that scope should be variable. But dropping scope once the work has already started often isn't really an option. Using feature branches to hold work back until it's ready flies in the face of continuous integration: it delays defect discovery and increases its cost, adds merging work, and prevents refactoring. Feature toggles can be prohibitively expensive, infeasible, or too risky to implement. Sometimes changing even one line of code is so risky that it requires a month of regression testing or test repair, or sets off a discovery avalanche of unplanned work. You can't always break work down or drop scope.

In order to release the software, the set of work planned for the release has to be in a finished, release-ready state. David suggests the delivery timeline should be fixed and decoupled from the work pipeline. Although I think efforts can be made toward that goal, a release requires a synchronized 'done' event across the software system, so the two are coupled by nature. Since new work can affect existing work, something that was done yesterday may not be done anymore after even a small change in code. Work items are naturally entangled. Regression testing on legacy applications is usually expensive, and thus commonly batched to cover multiple work items. And although the cost of 'release-readiness' can be greatly reduced through automation and defect prevention mechanisms, most existing projects are still a long way from it being cheap. Is it really practical to expect a legacy application to be release-ready by a fixed date without coupled control over the flow of work itself?

If you limit your WIP, then at any given moment you can finish up all work reasonably soon, because you never start too much. I would think the practical way to deal with a fixed delivery date is to start a set of work and, at some point later (possibly immediately), let all WIP drain from the system and reach completion. Batch any expensive 'release readiness' tasks such as regression testing as makes sense given their current cost, and prepare a release-ready product. If it's important to hit a fixed timeline in order to build trust and predictability with users, sit on the release until the release date, and keep going with the next release. Otherwise, you can ship releases whenever they are done, and let the dates vary. Your fixed release schedule could be buffered enough to accommodate the variability in the size of work items.

But there is something to be said for work iterations. They create cadence and routine, a time for reflection and adjustment, a trial period for experimenting, a pressure to minimize, and a time to finish a team's work - a synchronization point. But big pieces of work that don't fit in a sprint shouldn't be skewed into something other than what they are. Work that is partially done should be visible as exactly that. And since this is where I think Kanban shines, I think adding a signal component to the process makes a lot of sense.

Represent large work (that can't be broken down into independently useful pieces) as an epic whose parts aren't required to be useful, integrated with one another, or finalized. Break the work down into parts that can be added to the system without breaking it, and ideally without breaking shippability. At the least, aim for a committable set of work that won't interfere with other development, and that is as near as possible to what its final form will be. Strive to order work items so that important discoveries are made as early as possible. And the signal part... limit the number of these epic items allowed to be in progress. If you need to release, the limit will have kept you from going too far off the tracks. Worst case, you'll need to finish up the started epics in order to ship. Best case, you can ship with partially done work. You can stay continuously integrated, and can schedule the work parts in sprints relative to the priority of the epic when it first starts.
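
To make the signal idea a bit more concrete, here is a minimal sketch of an epic WIP limit. Everything in it is hypothetical: the `EpicBoard` class, the limit of 2, and the epic and part names are made up for illustration and don't come from any Kanban tool. The only point is that refusing to start another epic keeps the worst-case 'finish what we started' work bounded.

```python
class EpicBoard:
    """Hypothetical board that caps how many epics may be in progress."""

    def __init__(self, epic_wip_limit):
        self.epic_wip_limit = epic_wip_limit
        self.in_progress = {}  # epic name -> list of parts still to do

    def can_start(self):
        # The signal: a free slot exists only below the WIP limit.
        return len(self.in_progress) < self.epic_wip_limit

    def start_epic(self, name, parts):
        # Refuse to start new large work when the limit is reached.
        if not self.can_start():
            return False
        self.in_progress[name] = list(parts)
        return True

    def finish_part(self, name):
        # Complete the next part; an epic with no parts left frees its slot.
        parts = self.in_progress[name]
        parts.pop(0)
        if not parts:
            del self.in_progress[name]

    def work_to_drain(self):
        # Worst case before shipping: every remaining part of every started epic.
        return sum(len(parts) for parts in self.in_progress.values())


board = EpicBoard(epic_wip_limit=2)
started = [
    board.start_epic("billing rewrite", ["schema", "service", "ui"]),
    board.start_epic("report engine", ["parser", "renderer"]),
    board.start_epic("search", ["index"]),  # refused: limit reached
]
worst_case = board.work_to_drain()

board.finish_part("report engine")
board.finish_part("report engine")  # epic done, slot freed
slot_free = board.can_start()
```

With a limit of 2, the third epic waits until one of the first two drains, so the amount of unfinished epic work standing between you and a release never grows past what two epics can hold.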

Thursday, February 24, 2011

Effects of a Time Box

When stories are big relative to the size of an iteration, the time box itself has a huge effect on skewing priorities of the work being done, or hiding work in progress.  The bigger the stories (relatively), the more pronounced the effect.

Suppose I have a large story that consumes 60% of the capacity of my sprint. Assuming that the work is divisible enough across multiple developers, this work will 'fit' in the sprint. But now, with 40% of my capacity remaining, if the 2nd priority story is also large, it will not fit in the sprint, so I proceed down the backlog looking for smaller work, even though it's lower priority. If we extend the time box so that it is larger relative to the size of the work items, we can more easily work the backlog in priority order. Depending on the relative business value of items in the backlog, not working on the most important items could be a really bad side effect.
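
The skewing effect can be sketched in a few lines. This is only an illustration under made-up assumptions: the story names, the sizes (as a percentage of one sprint's capacity), and the `fill_time_box` helper are all hypothetical, and real sprint planning is obviously less mechanical than this.

```python
from dataclasses import dataclass


@dataclass
class Story:
    name: str
    priority: int  # 1 = most important
    size: int      # percent of one sprint's capacity (illustrative unit)


def fill_time_box(backlog, capacity):
    """Walk the backlog in priority order, taking every story that still fits."""
    remaining = capacity
    picked = []
    for story in sorted(backlog, key=lambda s: s.priority):
        if story.size <= remaining:
            picked.append(story.name)
            remaining -= story.size
    return picked


backlog = [
    Story("A", priority=1, size=60),
    Story("B", priority=2, size=50),  # too big for the leftover 40%
    Story("C", priority=3, size=25),
    Story("D", priority=4, size=15),
]

one_sprint = fill_time_box(backlog, capacity=100)  # ['A', 'C', 'D']
double_box = fill_time_box(backlog, capacity=200)  # ['A', 'B', 'C', 'D']
```

In the single-sprint case the 2nd priority story is skipped in favor of the 3rd and 4th, purely because of where the time box boundary falls. With a box twice as long, the same backlog is worked in priority order.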

The other common 'fix' I have seen is in breaking down the work into smaller pieces.  There may be ways to break down work into smaller useful deliverables, and these should be explored first.  But often, the work is either just not divisible while still being useful, or the team just can't come up with a vertical breakdown of the work.  On a current project, the work is 'finished', but reliant on temporary scaffolding.

Now, after having worked several sprints, several features are in this scaffolded state, and the application as a whole isn't shippable. The remaining integration work, and the discovery of functionality that doesn't actually work, are delayed and invisible. Until all of the partially done work is either stripped out or completed, nothing can be released. Basically, the team is now bound by this remaining scope that must be dealt with. The timeline can't be fixed anymore, since the scope is no longer variable.

Having gone off the 'staying releasable' path before... my way back to sprints was to just do a scope-bound iteration to get everything integrated and working, and then go back to sprinting. The team not being fully utilized during this effort can be a scary thing, but adding more work just makes closing the gaps take longer. And if the new work depends on an assumption that the old work is functioning properly or won't require drastic changes, it could just be one more thing that gets in the way. My Scrum Master suggested that anyone not working on integration go play basketball... or just go home. Playing ball at work would cost us less.

Getting back to being deliverable has to be the priority.  That doesn't mean you stop planning, but you just plan on demand instead of in iterations... pulling the next pieces of work when you finish up your current work.  Kanban mode.  At least until you can put the pieces back together again.  Depending on the pain you've built up, getting back on track can take a while.  But no matter how long it takes... putting the train back on the rails has to be the priority.  After we worked through the pain, our releases were always short and consistent.  But at that point, we were iterating on working software.