One of the beautiful things about agile planning is the way it manages to translate the informal notion of how much work is being done into a measurable metric. A vague one that only works when averaged over a period of time, but a metric nonetheless. That metric is called velocity.
A quick recap on how it works. You divide the work to be done into stories, pieces of functionality that deliver value to users. Each story is assigned a number of points that is an abstract estimation of how hard the functionality is to implement, either in complexity, risk or plain manual labor. The points do not have a direct relation to hours. The only thing you can say about a one point story and a two point story is that the latter is about twice as hard.
One of the reasons for using an abstract metric for estimation is the assumption that developers are bad at estimating how long the work will take, but in a systematic way. In other words, they are always a factor X off. (A surprising fact is that the environment has a big influence on X.) So the number of story points finished in one iteration is a reasonable estimate of how many will be finished in the next. This metric is called velocity.
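Since velocity only works when averaged over a period of time, a forecast for the next iteration might be computed along these lines. This is a minimal sketch: the function name, the sample history and the three-iteration window are all assumptions made for illustration, not part of any particular planning tool.

```python
from statistics import mean


def forecast_velocity(completed_points, window=3):
    """Forecast the next iteration as the mean of the last `window` iterations.

    `completed_points` holds the story points finished in each past
    iteration, oldest first. The window size is an arbitrary choice.
    """
    if not completed_points:
        raise ValueError("need at least one finished iteration")
    recent = completed_points[-window:]
    return mean(recent)


# Hypothetical history: points finished in the last four iterations.
history = [9, 12, 10, 11]
print(forecast_velocity(history))
```

A longer window smooths out noisy iterations, while a shorter one reacts faster to a changing team or environment.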
Why is velocity such a beautiful thing? Because of the way it incorporates quality.
Velocity and quality
Quality is a nebulous concept. We know it when we see it, but we cannot directly measure it. The best we can do is to use a derivative metric. The Perl testing community, for example, uses a measure called Kwalitee. It is, to quote from the CPANTS testing service website, “something that looks like quality, sounds like quality, but is not quite quality”. A set of metrics that “are based on the past toolchain/QA issues you may or may not remember.”
But velocity does manage to incorporate quality. How? By only considering stories that were fully completed in the iteration, “done done”. That means they have been accepted by the product owner as functionally correct and by the other developers as technically sound. In this way, the quality question is answered before we take the measurement.
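The "done done" rule can be sketched as a filter that is applied before any points are summed. The `Story` class, its field names and the sample stories below are hypothetical; they only illustrate that a story failing either acceptance check contributes nothing.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    points: int
    accepted_by_product_owner: bool  # functionally correct
    accepted_by_developers: bool     # technically sound


def velocity(stories):
    """Sum the points of "done done" stories; partial work counts for nothing."""
    return sum(
        s.points
        for s in stories
        if s.accepted_by_product_owner and s.accepted_by_developers
    )


# Hypothetical iteration with one fully accepted story.
iteration = [
    Story("Export report as PDF", 3, True, True),
    Story("Password reset", 2, True, False),   # not yet accepted by developers
    Story("Search filters", 5, False, False),  # still in progress
]
print(velocity(iteration))
```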
So velocity is a good, although abstract, metric for team performance. Because it is dependent on many factors, related to both the team and the environment, velocity cannot be used to compare performance across teams. It is good practice, though, for a team to try and increase its own velocity. And this is where problems can arise, and we come to the title of this article.
Because story points are an estimation of effort, it seems logical to use them for stories about fixing bugs. After all, we can reason about how hard it is to fix them. A two point story about fixing a bug will probably take about as much time as a two point story introducing new functionality. If we are thinking about which stories we will work on in the next iteration, we need to weigh those two stories against each other, both in cost and benefits. Assign story points to bugs and you can.
The problem is that by measuring and trying to increase velocity you become incentivized. Not because it influences your salary, but merely by the fact that it is the only metric that captures the notion of value delivered to users. It has become a proxy for everything that is involved with delivering functionality. Every aspect – technical, functional, organizational – reduced to this one number. Increasing velocity means performing better as a team. Quite an incentive.
And here is the rub: incentives work. They work too well. Not just for bankers or salespeople, but also for developers looking to increase their output. The beauty of velocity is that it encompasses quality, but introducing bugs is not delivering quality work. In fact, fixing a bug is doing work that should have been done in the story that introduced it. So really, if you assign story points to bugs you are rewarding low quality work. The effect may not be as dramatic as in a classic DailyWTF article by Mark Bowytz, but assigning points to bug stories is like saving money by buying extra groceries on discount: it feels like you are saving money, but you are really spending it.
(A similar point can be made for chores: stories that do not introduce new functionality, but fix architectural problems, add debugging tools, automate manual tasks, etc. These stories represent either refactorings that should have happened earlier, or groundwork for future functionality. Either way, the value of the work should be attributed to stories that deliver functionality and chore stories should not count towards velocity themselves.)
Let me give an example. Say a team delivers stories totalling 10 story points in an iteration, but introduces bugs. The bugs are estimated at a total complexity of 4 story points. In the next iteration, they need to address the bugs, which will take time that is not used for new functionality. Suppose that by going a little slower they would deliver only 8 points worth of functionality, but avoid introducing bugs. Would that have been better? Probably, but if they assign story points to bugs, the careful approach would show the lower velocity.
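The arithmetic of this example can be worked through in a few lines. The sketch adds one assumption that the example itself does not state: the team can spend about 10 points of effort per iteration, so 4 points of bug fixing leave room for 6 points of new functionality. All names are illustrative.

```python
# Assumption (not from the example itself): the team can spend about 10
# points of effort per iteration, and bug fixes use up that same capacity.
CAPACITY = 10


def reported_velocity(iterations, count_bugs):
    """Average points per iteration, optionally counting bug-fix points."""
    totals = [
        it["features"] + (it["bugs"] if count_bugs else 0)
        for it in iterations
    ]
    return sum(totals) / len(totals)


# Fast team: 10 feature points, then 4 points of bug fixing leave room for 6.
fast = [{"features": 10, "bugs": 0}, {"features": CAPACITY - 4, "bugs": 4}]
# Careful team: 8 feature points in each iteration, no bugs to fix.
careful = [{"features": 8, "bugs": 0}, {"features": 8, "bugs": 0}]

for name, team in (("fast", fast), ("careful", careful)):
    print(
        name,
        reported_velocity(team, count_bugs=True),
        reported_velocity(team, count_bugs=False),
    )
```

Under that assumption, both teams deliver 16 points of functionality over the two iterations. With points on bugs, the fast team reports an average of 10 against 8 for the careful team; without them, both report 8, and the difference in quality is no longer rewarded.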
A way out
So if bug stories get no points, how do I express the work that needs to be done on them in the coming iteration? I like James Shore’s approach of breaking stories into tasks at the start of the iteration and estimating those in ideal hours. It may feel redundant, but it is not. First, ideal hours are a much finer metric than story points. During the iteration, progress on tasks can be tracked, leading to much faster feedback when things go wrong. Second, after having broken up the stories into tasks, you may discover that the implementation will take extra work because of technical debt. In other words, you may be able to measure and take into account one important factor in velocity here. If you find that the average number of ideal hours per story is increasing, you may need to look at code quality. And when iterations start to include more bugs, this is exactly what should happen.
Velocity is our best measure of software development performance. Resist assigning story points to bugs; separate feature complexity from development time instead.
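One way to watch for that trend is sketched below. Note that it swaps in a slightly different ratio: it divides the ideal hours of all tasks by the story points of the iteration, rather than by the number of stories, so that stories of different sizes stay comparable. That choice, the data layout and the sample numbers are assumptions of this sketch. Bug stories carry task hours but zero points, so they push the ratio up.

```python
def hours_per_point(iteration):
    """Ideal hours of all tasks divided by the story points in the iteration."""
    hours = sum(sum(story["task_hours"]) for story in iteration)
    points = sum(story["points"] for story in iteration)
    if points == 0:
        raise ValueError("iteration contains no pointed stories")
    return hours / points


def is_rising(values):
    """True when each value is strictly higher than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Hypothetical data: three iterations, task estimates in ideal hours.
iterations = [
    [{"points": 3, "task_hours": [4, 6]}, {"points": 5, "task_hours": [8, 8]}],
    [{"points": 3, "task_hours": [5, 7]}, {"points": 5, "task_hours": [9, 9]},
     {"points": 0, "task_hours": [4]}],      # bug story, no points
    [{"points": 2, "task_hours": [6, 4]}, {"points": 5, "task_hours": [10, 10]},
     {"points": 0, "task_hours": [6, 3]}],   # bug story, no points
]

trend = [hours_per_point(it) for it in iterations]
print(trend, is_rising(trend))
```

A steadily rising ratio over several iterations would be the signal to look at code quality; a single high iteration is more likely noise.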
|illustration:||Photo “Missed segment of the necklace” by Janith Bandara (CC BY-SA 2.0)|
|photo:||Photo “Agile planning poker in person” by Alex Vorbau (CC BY 2.0)|
|photo:||Photo “Bugatti Type 41 (Royale) Coupé Napoleon” by Bugattist (public domain)|
|icon:||Icon “Carrot and stick” by Luis Prado (CC BY-SA 3.0)|
|diagram:||Diagram “Story point example” by author (CC BY 4.0)|
|photo:||Photo “Task board” by Logan Ingalls (CC BY 2.0)|