Jira tip – patrol your project with Subscriptions

A Product Owner I was coaching recently requested that I use Jira Wizardry to change her Jira project’s Permission Scheme to limit the number of people who can create tickets within it. Apparently there were some rogue ticket creators in the organisation who needed reining in.

Now just because you can do something in software doesn’t mean you should. This seemed to be one of those cases where we should apply the XP principle of Do The Simplest Thing That Could Possibly Work. Specifically, most teams don’t want or need the additional Jira Admin load of maintaining/tweaking custom Permission Schemes for all their projects.

Fortunately I was able to re-use a solution that had worked for another colleague in the same organisation; a Compliance Officer who wanted an easy way to check the work being picked up by a development team so she could scan for potential compliance risks.

The Simplest Thing That Could Work in this case was to use a little-known but very useful bit of functionality within Jira, which is that you can have custom search results emailed to yourself at an interval of your choosing. I’ll show you the steps below, though I won’t try to reproduce Atlassian’s detailed step-by-step documentation – I’ll just skim through the main points.

The first step (as almost always in Jira) is to know what JQL query you need to return the set of tickets you are interested in. In this case, something like the following search seemed correct:

project IN (MEM) AND issuetype NOT IN (Sub-task) AND createdDate >= startOfDay(-3d) ORDER BY reporter

…pulling out all the relevant tickets created at some point in the last 3 days (three days because on a Monday morning you’ll be wanting to see tickets created during the previous Friday, which was 3 days ago), ordered by Reporter, which is the slightly odd Jira term for “the person who created the ticket”.

jira-subscriptions-1-search

After running that search and confirming that the right set of results are being returned, the next step is to save the search. In Jira terminology a ‘saved search’ is called a “filter”, and a good tip for managing filters is to preface the Filter Name with your own initials; this makes it easier to find your own filters later when they are presented in a (potentially long) alphabetically-ordered list that includes many filters from other people.

jira-subscriptions-2-save-as-filter

After saving the filter, you may now want to share it so that your colleagues can also benefit from viewing it and/or subscribing to it.

The final step is to ‘subscribe’ to the filter, which is Jira-speak for “having the results emailed to you at an interval of your choosing”. You enter the Subscription dialog box having found the relevant filter in your Manage Filters list (see under the main ‘Issues’ menu item found on any Jira page) and clicking on ‘Subscribe’:

jira-subscriptions-3-subscribe

…and it should be pretty obvious from there:

jira-subscriptions-4-subscription

And you can test it works straight away. You now have the option to edit that newly-created subscription from your My Filters list:

jira-subscriptions-5-subscribe

…and that gives you the ability to run the Subscription (ie send the email) immediately, rather than waiting for the scheduled time:

jira-subscriptions-6-edit-subscription

Both the Compliance Officer and the Product Owner now have an email arriving in their inboxes every morning at 09:00, which takes about 30 seconds to skim. This tells the Compliance Officer whether there are any new risks she needs to address, and tells the Product Owner exactly who has been creating new tickets in her project. Both can then take appropriate action.

Once you’ve done it a couple of times you can set up / amend / delete filters and subscriptions in a minute or two, a beautifully clear example of applying the Do The Simplest Thing That Could Work policy.

Jira subscriptions are a useful trick, and I’d recommend having a play around with the functionality even if you have no immediate need – you never know when it might come in useful.

How much should you plan?

There is a mass of literature about how, how much, and how often teams should plan. Most people recognise that we should avoid the extremes of overly heavy processes (the Sin of Waterfall) and the anarchy of pseudo-Agile, which often manifests itself as “hey, Agile means you don’t have to write anything down!”.

The most commonly used pattern that is meant to guide teams between these two extremes is that of Scrum, which mandates a single, two-hour Sprint Planning meeting, once per sprint, which produces in one go the correct amount of planned work for the whole team, which would be 100 man-days of work for a 10-person team (all of these figures presume a two-week length sprint).

In practice there is a large amount of unavoidable variability in the execution of a block of 100 man-days of effort, and work is constantly being discovered or reprioritised during the course of the sprint, meaning that the output of a single Sprint Planning meeting at sprint start is often wholly inadequate for the rest of the iteration.

The better, alternative approach is to apply to planning the same principles that almost all software development teams already apply to deploying – that it is best done little and often. You need to be able to decide on any given day whether you have planned enough, or need to plan some more (possibly that very day), and the good news is that a well-functioning Agile Board is constantly giving you this information, if you know where to look.

The key concept is that there is a status – in software development teams usually called Ready for Dev, aka Ready to Start – which represents a buffer state between the end of the planning/preparation process, and before the start of the delivery/execution phase. This column holds your stock of planned work.

agile-board-ready-for-dev

So how much stock should you keep? Well, it is possible to plan too much, and to plan too little, and between those extremes lies the Goldilocks Zone where you have planned the right amount.

The problem with planning too little – ie there is a small number, or even zero tickets in Ready for Dev – is hopefully clear; the delivery part of your pipeline will be starved of fuel, and will grind to a halt. Yes, you can do some emergency planning there and then, but this is an inefficient and stressful way to work.

Less obviously, it is also possible to plan too much. Perhaps it is tempting to get 100 tickets to Ready for Dev, and then you won’t have to do any more planning for six months? The problem here is that planning has a shelf-life; all plans are predicated on assumptions about the state of the overall codebase, the product and the company at the moment delivery starts. If the gap between planning and execution is small, these assumptions are more likely to be correct.

But if that gap is large – say you plan a piece of work now but don’t start it for a year – it’s very likely to need replanning from scratch (eg because the tech landscape has completely changed) or to be thrown away entirely (eg because that feature is no longer a priority). So you invested a large amount of time and effort in that huge planning exercise, and in the end most of it was waste – that is the consequence of planning too much.

So how do we get to the Goldilocks Zone where we constantly have the right amount planned? This amount is contextual – it varies for every team – but the principles are universal.

Begin by picking an approximate period of time for which you want a stock of already-planned tickets. You might for example want “planning” to be 2-3 days ahead of “delivery”, meaning that if you stopped all planning activities immediately, it would be 2-3 days before Ready for Dev runs out of tickets.

How do you know what period of time is appropriate to pick? That depends on the availability of the people who do the planning. If the people required for planning are all highly available (ideally, co-located) people who are dedicated to this one team, they should be able to switch to planning activities at very short notice (ie they can plan on the same day a deficit is noted). This setup requires a smaller buffer of planned work. Different setups need a bigger buffer to make up for the extra time it will take to get new planning scheduled.

This period of time then needs to be translated into a desired number of planned tickets. Here the main factor is team throughput; the quicker your team gets through tickets, the more you need in your Ready for Dev buffer, so if a team gets through 3 tickets a day and you want a 3-day buffer, that’s 9 tickets.

This calculation works fine if your team is very cross-functional, meaning anybody can pick up any ticket, but if you have specialists on the team who can only work on specific kinds of tickets, you will need to factor that in, and ensure that each separate specialism has enough of their kind of tickets in Ready for Dev.

Having thus calculated an approximate target for ‘planned tickets’, the deciding factor – as ever – is experience. If you observe a team’s Agile Board regularly over a small number of weeks, it should become clear what the right number of tickets in Ready for Dev is for that team.

Teams applying this method may end up cancelling all regular planning sessions; instead they decide during each morning’s Stand-up meeting whether or not a planning session is needed for later that day. This is a truly Lean/Agile team, where processes have become highly dynamic and responsive, and ‘rolling wave’ planning has been truly implemented.

Rolling wave planning can apply at the strategic level too. Just as a delivery team can get its planning down to a snappy half-hour meeting every other day or so, a company can get its detailed roadmap planning down from six-monthly or quarterly (which is often regarded as a triumph) down to something closer to a monthly rolling wave. The same preconditions – an accurate portfolio-level Agile Board, and good availability from the planning resources – are of course also needed to achieve this goal, which produces the same rewards of more responsiveness and less waste.

A checklist for a well-organised team

I was coaching a software development team recently, and following the day’s Stand-up I asked the TPM* of the team a series of questions about how the Stand-up had gone, and how her team is currently working. It then struck me that these questions are not a bad checklist for judging the level of organisation of a team, so here they are:

  • Does everyone in the team know & agree the priority order of all its tickets?
  • Does everyone instantly understand what work a ticket describes?
  • Does everyone instantly know where that work has got to?
  • Does everyone give crisp, clear answers to what’s the next step and who does it questions?
  • Does everyone understand the whole work queue so that they know what they should be picking up next?
  • Does the board clearly and simply tell the truth about the team’s work such that anyone, inside or outside the team, can get a pretty good idea of What’s Going On in a 5-second skim?
  • Does the team have a good grip of the incoming work, and continually manage to process that work into tickets smoothly, efficiently & in timely fashion?
  • Are the often complex decisions & assignations around tickets that are made in the Stand-up (& at other times) clearly recorded somewhere for future reference and to avoid endless arguments, confusion and waste over what was actually decided?

The TPM’s answer was: “definitely ‘no’ to all of those, and in no team that I’ve worked on was that the case (well, maybe one, but they were so unhappy and under so much pressure that I wouldn’t call them a successful team)”.

This answer saddened but did not surprise me. My own experience is that it is absolutely possible to get teams to a state of self-organisation where such questions are asked, and easily answered; and moreover that such teams tend to be happy ones, since attaining this improved level of organisation involves reducing levels of waste, streamlining processes down to only those that actually help, and seeing the team’s work makes an actual, real-world difference. These things are enormously motivating.

So how to achieve this nirvana, asked the TPM. My short answer was that the preconditions for achieving this level of organisation can be divided into two categories.

The first category is attitudes within the team, where the following is required:

  • discipline
  • common sense
  • a desire to change
  • a willingness to try out new ideas & properly evaluate their impact
  • a commitment to thinking your way through problems
  • a commitment to continuous improvement

Secondly, there are some non-negotiable tactical tools that must function well in order to get to this good place:

  • an Agile Board that works right
  • a Stand-up that works right

As for the long answer – well, that answer is here 🙂

*TPM=Technical Project Manager. Also known as Agile Project Manager, or (if you’re running Scrum) the Scrum Master.

A simple, generic Agile Board workflow

There’s an art and a science to creating good Agile Boards. As with any truly Agile endeavour, your board design should evolve over time in response to a continual inspect and adapt process, so in theory, whatever your starting point is, you’ll eventually end up in a good place, where the board is a decent model of your team’s work.

But of course the better your start point, the fewer iterations it will take to get into that sweet spot where your board really works, so let’s examine what a good start point might be.

Many teams are somewhat intimidated by more than a few columns on the board (especially to start with), and might not feel comfortable beginning with the kind of detailed workflow that I outlined here and here. And that workflow was for managing software development, which you might not be doing.

So here I offer a simple generic workflow that is a worthy start point for any team; and I really mean any team – there’s nothing in what follows that references software development, or any other technical work, so this could work for a Finance Team, for example.

agile-board-generic-starter-workflow

Here’s how the lifecycle of a generic piece of work maps to these columns:

  • A ticket describing a piece of work is created. This ticket has not yet been properly evaluated, and for all the team knows it could be anywhere on the value scale from “brilliant and/or extremely urgent idea”, to “complete nonsense that deserves immediate deletion”. Tickets in this “they’ve been created and nothing else meaningful has happened to them yet” state are in the Candidate column. Other names that work for this column are Inbox, Idea, the Jira default To Do – and you may have better names still.
  • This next thing that needs to happen to a ticket is to flesh it out such that it is ready to be worked on in earnest. That fleshing out is captured by a column called Defining. At a minimum, an adequate ticket definition might be as simple as a one-bullet-point Definition of Done. It might require more, maybe much more – detailed Business or Technical Analysis, accompanying files of various types such as Spreadsheets, Process Flows, Mockups or Wireframes… it’s all contextual, and in this simple workflow it’s all done under a heading of “Defining”.
  • When a ticket has been properly fleshed out – meaning that representatives of all the downstream functions (the roles that will need to work on the ticket later in its life) have agreed that what’s written in the ticket enables them to efficiently take on their portion of the work when the ticket lands with them – then the ticket is Ready to Start. This, technically, is a “buffer state” – it represents a gap between activities, rather than an activity itself.
  • When the real meat of the work is under way, the ticket is In Progress. I actually don’t much like this term for a column, as every state between the first and last is some kind of “in progress”. For software development teams a better term for this state is In Dev, but for non-tech teams I’ve yet to think up a better name.
  • Any non-trivial work should always have a second pair of eyes on it before it is presented back to the people who requested it; you don’t want to look a fool in front of your customer, after all. This is the Reviewing step. It’s might be performed by a peer of the person who did the work, perhaps by someone in the same role (eg one developer checking another developer’s code), or it might be done by a specialist function (eg QAs/Testers). As ever, such decisions are contextual; but that non-trivial work should be properly reviewed/checked is as close to a universal truth as you get, and that’s what this column represents.
  • Finally, whoever wanted/requested the work (the Customer, or their proxy) needs to agree that they’ve got what they wanted (or near enough, dammit) before you can declare the piece of work to be “Done”. This is the Sign-off step, and in a well-functioning team should be unproblematic. After all, all that’s required for Sign-off is to compare (i) what’s been delivered with (ii) what was agreed would be delivered, and since (ii) is definitely written in the ticket (right?), and since you make every effort to regularly update the Sign-off people on progress (perhaps they come to your Stand-ups), there ought to be little in the way of controversy or surprises in this step. If a team is having problems signing work off, the root cause(s) are usually to be found upstream, often all the way upstream at the Defining stage, and need to be solved there.
  • Hopefully I don’t have to go into great detail about what Done means.

So there you have a simple, generic workflow that’s a decent starting place for any team representing its work on an Agile Board, and may turn out to be fine longer-term too. Remember; this is all empirical, so inspect and adapt, keep what works, and change what doesn’t!

The Perils of Project Parallelisation

In previous articles we have been at pains to stress the key point that a team only provides value to an organisation when it finishes work, which for a software team generally means ‘having working and tested code in the production environment’. To maximise this value we need to focus ruthlessly on getting stuff finished, and preventing unfinished work from piling up.

A common and seemingly logical response to such delivery pressures is to try to bring forward the start date on important work (and there’s never a shortage of that – sometimes almost everything seems important). After all, the sooner you start, the sooner you’ll finish – right? In simple situations this may be true, as this Gantt Chart illustrates:

Gannt Charts 1 - start earlier, finish earlier

What about more complex and realistic situations? If we have three projects, is the optimal approach to bring all the start dates forward as far as possible and run them concurrently?

Gannt Charts 2 - too much parallelisation

It turns out that in these situations, the adage “the sooner you start, the sooner you’ll finish” may NOT be true, however intuitive it seems*. In fact, starting too much too soon can lead to catastrophically non-productive teams, and a huge waste of time and money.

Let’s illustrate this with a fun example. We are building Christmas Trees, and it is a four step process. Only when all four steps are complete do we have a finished product that we can sell, generating value for our organisation – before that point nothing can be sold, and no value at all has been generated.

Xmas Tree 0 - steps to complete

Now imagine we have four projects, to build four differently coloured trees: one yellow, one red, one green and one blue. Across the four products that makes a total of 16 steps. The temptation is to start all four projects as soon as possible, working on all of them in parallel, in the hope that this will bring forward their end dates (the thing that matters) as well. Let’s see if that’s what actually happens.

First let’s visualise doing the projects in series, with a smiling customer greeting each finished tree:Xmas Tree production line 1 - working in series

Now let’s compare what happens if we get all four tree projects started as quickly as possible, and keep all four going in parallel until all are completed. Given a fixed resource base, this means people will have to continually switch between projects as we go:

Xmas Tree production line 2 - working in parallelAll we have done is swapped the order of the same 16 steps, so the total time to complete all four projects is the same. Did we achieve anything by parallelising? Well yes, but nothing good – we have in fact massively delayed the delivery points to our customers of three of the four projects (yellow, red and green), while blue remains unchanged. In other words, we have made 75% of our projects take longer by over-parallelising.

Xmas Tree production line 3 - series and parallel compared

And in fact reality would be much worse than this. Writing code is an intense intellectual activity, where true productivity comes in those precious periods where you manage to ‘get in the zone’, for some people an almost trance-like state where you are juggling multiple variables and possible paths in your short-term memory. It is perhaps comparable to playing chess, or writing a novel. It can take a long time to get into that mental state, and interruptions can be fatal, especially long ones. Post-interruption, you can’t simply hop straight back to where you were in that complex mental process; you may have to start again from scratch, and might never get that inspiration back in quite the same way.

This means that context switching has a big efficiency penalty. Let’s reflect that in our diagrams; the parallel example has a ton of context switching, further delaying all the deliverables. Compared to working in series (where there will be context switching only once a project is complete), we can see the terrible effects of excessive parallelisation on delivery timelines. Now all four projects are seriously delayed by working in parallel:

Xmas Tree production line 4 - series and parallel compared + context-switching

And we are ignoring other factors still. The biggest learning opportunity in product development occurs when you deliver a working product to your customer and get real market feedback. Working in series, we bring these learning opportunities forward, and the learning is compounded each time. In contrast, in a ‘release everything at the end’ big-batch approach, the learning opportunities are deferred, and any mistakes made in manufacturing the first tree will already have been repeated in the other three.

Let’s factor this into our example. A team working in series has, through some mix of customer feedback and introspecting on their experiences, discovered feature improvements (a plant pot, star decorations) and simplifications/savings (a tree that can be manufactured in three steps rather than four):

Xmas Tree 6 - product evolution

Including this into our diagram shows even greater benefits of working in series; not only is the speed-to-market gap bigger than before, as a process improvement is leveraged in subsequent production runs, but now there is a quality gap too thanks to the faster learning cycle. Working efficiently in series, you get stuff sooner, and it is better stuff:

Xmas Tree production line 5 - series and parallel compared + ideas

These examples illustrate the real life penalty of not focusing ruthlessly on getting things done and delivered in a sensible, priority order. All of these gains come through focus and discipline, not working extra hours; they are the epitome of “working smarter, not harder”. They are also a direct realisation of key Agile Manifesto principles: deliver working software frequently, and working software is the primary measure of progress.

Of course, none of this is meant to imply that no parallelisation whatsoever is permitted – as ever, the extremes of the argument are absurd. A measured level of parallelisation, where resources that are no longer needed for a first goal drop naturally down to a second, is often optimal:

Gannt Charts 3 - just right

But of the two extremes, it is a rare thing to find an organisation that works too much in series. It is far more common to find teams obsessed with starting but not finishing, where focus is not maintained, attention is allowed to wander, and precious resources are frittered away for little reward. Try the following cartoon out on your tech team: the chances are that the reaction will be a somewhat rueful smile…

Multi-tasking - new framework side project

In a future article we will see how the concepts discussed here apply not just at the strategic level (roadmaps, programmes, projects), but also at the tactical level (bugs, stories etc – tickets that appear on an Agile Board), where limiting WIP (Work In Progress) is critical to delivery success.

*see Essential Kanban Condensed

An Agile Story – a guide to this website

The number of articles here on Agile Fixer is growing all the time. This article is designed to help you navigate to those that may be of particular interest.

Once upon a time, in a wonderful far-off land called “Agile”, there lived a man, who told a tale…

So I work in a biggish company, and we have quite a few software teams. I know some of the tech guys personally, and it’s my job to know how much they collectively cost – and it is a lot. We’ve had a patchy history of building apps and websites in this organisation. Software development seems very hard to do well, with many teams struggling with their processes, and unable to give convincing answers to even the most basic questions about how much work something involves and when it will be done.

In the last few years a lot of our teams have started doing a thing called ‘Agile’, which apparently is a better way than whatever they did before. But it’s certainly no magic bullet – it seems to introduce a lot of confusing jargon; for example, some of these teams now say ‘when something will be done’ using a made-up unit called ‘Story Points’, which doesn’t seem to be any real improvement if I’m honest. Perhaps part of the problem is all the big, long meetings these expensive techie people spend their time in (many of the developers actually complain bitterly about this) – I sure hope they’re using this time wisely.

Agile heaven - it doesn't fix everything

Actually these techies have one meeting that seems a bit different from the others – it’s a quick one in the morning that is often held in an open space, with a lot of people standing up. You only see the techies doing this, nobody else in the organisation; it’s a bit odd, really.

When you watch them in these morning meets nobody seems to take notes, and they’re almost always gathered around a whiteboard, or a big TV screen. Apparently they put a visualisation of the team’s work up on that screen; some teams reckon this visualisation is really critical and think carefully about how to do it well, while others seem to be much more slapdash about it. One of the senior tech managers told me that the teams that take these visualisations more seriously seem to work much better than the rest, both internally and with the rest of the organisation. I wondered why they don’t just make all the teams follow best practice, but there seems to be an ideological reason behind these very different attitudes.

Curious, I chatted to a guy from one of these ‘better’ teams. He gave me a run-through of what’s on their big screen, which is called an ‘Agile Board’. It’s a 2-dimensional grid, basically. There’s a whole bunch of columns, which show you how work is progressing. I asked how they choose the columns, and was told it varies from team-to-team, but that one way to think about it is you first need the steps to get work ready for the developers to start, and then you need the steps for the coding itself, and the testing, until it’s all delivered to the customer. Makes sense.

I told them that in our business unit we don’t have any of that. But perhaps we should – we’re often in a bit of a mess ourselves, truth be told. Our managers try to prioritise our work, but it just seems to create more noise rather than helping. “That’s why we use these Agile Boards – to make it really clear what everyone should be working on” said my tech friend.

So I went and had a closer look at these boards. They’re not just divided into columns, they have rows (called ‘swimlanes’) as well – this seems to be a key part of how they sort out what their real priorities are. One of their key concerns is to make sure people get stuff properly finished before picking up something new (they have some some techniques to help this) – a bit of that discipline would help in my area for sure.

I’m not saying we’re bad or lazy workers, mind – often it’s our bosses’ fault, since they cancel or change work after we’ve already started on it, wasting a ton of time. We should be a lot stricter about what work gets started and what doesn’t. My tech friend said his team have an excellent way of doing exactly this, which they call a ‘triaging process’.

The more we talked about these boards, the more interesting they seemed. The tech guys reckon that work spends most of its life just sitting around, waiting, rather than being acted upon, and if you highlight and clamp down on these waiting periods, you become a more effective team overnight. This really is starting to sound like “working smarter, not harder”.

I’m gonna have to have a think about how to apply some of this thinking to my own team. Apparently recruitment are using a similar system to track the state of their hires, so it’s definitely not just for techies!

To be continued (as more articles are added)…

Agile Board design – buffer states

You’re a day away from a release deadline, and disaster strikes; two of your four QAs, half the team, are struck down with a nasty stomach bug. We need to sort out immediately what that means for our release; what testing work has already started and now needs to be reassigned, and what hasn’t even been started yet, and will therefore probably get dropped out of this release entirely. Does our Agile Board help with such decisions by making it instantly obvious what has started, and what hasn’t?

This is the kind of common real-world situation that can be helped by the good use of “buffer states” on your board, and that is the topic of this article.

As we know from previous articles, an Agile Board shows the workflow of a team, the series of steps its work goes through on the journey from conception to finished. Along the way the work is passed along a virtual production line, with various different combinations of people engaging in different value-adding activities, until we get the finished product.

The ideal is that the delays between (and within) each of these value-adding activities is minimal or even nil, the appropriate metaphor being a slick sprint relay race:

baton-passing-in-relay-race

Unfortunately the real world is rarely anything like this. A good team might approach this ideal in dealing with an Emergency, where everything is sacrificed for speed whatever the cost in disruption elsewhere, but for normal work there will most certainly be gaps where work sits idle. In fact, when measured, most work spends a shockingly high percentage of its life sitting around unattended, waiting to be picked up. The metrics of work idleness*, ‘touch time’ and ‘flow efficiency’, will be the topic of a future article, but for now it should be clear that reducing such idleness to a minimum is one of the main objectives of a good team, and visual management of idle work is essential to that goal.

This means separating out such idle states and making them visible as columns on an Agile Board, appearing as ‘in-between’ columns or ‘buffer states’ that separate out the columns for the main value-adding activities.

If you’ve ever worked with an Agile Board you have likely come across buffer states, even if you didn’t think of them that way. Even the simplest board of all, “To Do — Doing — Done” is really nothing more than a couple of buffer states around a single activity, and it can be rewritten to a format that makes this explicit:

agile-board-simplest-with-equivalent-expressed-as-buffer-states

Let’s apply this thinking to the example workflow sketched out across two previous articles (here and here), to see what we can learn. After such an exercise we will have a deeper understanding of this workflow, and will have a conceptual framework for further tweaking and improvement. Here it is again, a complete workflow across a dozen columns:

agile-board-complete-workflow

These columns tell us that we started with an idea, and then applied the following list of discrete activities to it:

  1. Triaging
  2. Ticket Prep
  3. Development
  4. QA
  5. Sign-off
  6. Deployment to live

So what happens if we take this list, and just wrap each item in buffer states? Let’s do that for the first two adjacent activities, Triaging and Ticket Prep, and see what results:buffer-states-around-triaging-and-ticket-prep

The first thing to notice here is that we end up with two buffer states next to each other, which makes no practical sense. A ticket which has completed Triage simply is ready for Ticket Prep, so the two columns Triage Complete and Ready for Ticket Prep are duplicates. One of the redundant columns must be removed; it doesn’t matter much which one goes, but it’s nonsense to have both.

With this in mind, let’s see what happens if we apply just a pre-activity buffer, ‘Ready for X’, before each of the six main activities in the workflow. This gives us the 14 column workflow below, with the buffer states highlighted; collectively, they now show all the states where no value-adding work is being done.

buffer-states-added-to-whole-workflow

Comparing this to the original workflow, two useful steps have been lost, Handshaking and Code Review. These are neither buffer states, nor a main activity state. Instead, both of these are reviewing steps – Handshaking is a review of the Ticket Prep activity (to see if it was done to the required standard), and Code Review is the same for the Dev step. Let’s insert these two states back into the workflow, renamed to be consistent with the other steps as “Ticket Prep Review” and “Dev Review”. This gives us the 16-state fully-buffered workflow below:

buffer-states-and-review-states-added-to-whole-workflow

Now let’s compare this to the original 12-state workflow, with its (three) buffer and (two) review states highlighted:

agile-board-complete-workflow-with-buffer-states-marked

So which of the two is better? The differences come down to column naming (which is a matter of taste), and whether every discrete activity is buffered. Is it actually better to buffer every activity?

Arguably it is. In a previous article on triaging, we saw that that buffer states make it easier to describe the precise meaning of each column, as you are no longer munging both waiting and doing into a single column. Is that sufficient to declare we should always buffer every activity on a board? Not quite; there is a countervailing pressure, that Agile Boards with too many columns can become cumbersome and hard to scan and read, whether they are physical or implemented in software.

Physical boards with 16 columns are relatively rare; you need a mighty wall to display it on, and there aren’t too many of those available in most offices. For software boards, the limitation is the legibility of a screen with many columns on it. If everyone in your office has monitors with huge resolution, or (perhaps soon!) Minority Report style VR screens, you won’t be affected by this limitation. But most of us will sadly run out of wall inches or readable pixels, and so there is some reason to limit the number of columns where possible, unless you wish to split the board in two (which can be done – I will cover this in a future article).

Let’s return to our original 12-state workflow, and see what buffer states are and are not included. It turns out that none of Triaging, Ticket Prep or Sign-off are buffered, and Deploying to Live is effectively unbuffered as well, with one column (‘Ready for Live’) representing both buffer and activity itself. Activities that are buffered are Dev and QA. What are the reasons for treating different activities in this way?

In the real teams where this very Agile Board setup worked well, the unbuffered activities of Triaging, Sign-off and Live Deploy were relatively quick, uncontroversial and unlikely to go wrong, and so were able to function with a less detailed board representation. Coding (Dev) and testing (QA) are almost always, by contrast, multi-factorial and highly complex activities, drawing on the bulk of the team’s manpower and budget, and with equivalently high need for transparency, hence the decision to buffer those columns.

Between Triaging and Ticket Prep a buffer might have been helpful, but both activities were owned by the same Product sub-function, and so tended to be run by the same small group of people. The resulting good interpersonal communications and the ease of ‘handing work over to yourself’ meant that extra detail on this part of the board was unnecessary to smooth team operations.

So it turns out that the buffering decisions on this Agile Board were well fitted to its context; there is a rationale for buffering some activities but not others. A fully buffered Agile Board may be a wonderful thing – but if you can reduce column count without compromising the flow of work, that is also fine. And the only test that matters is the empirical one of whether your setup works well in practice.

Finally, I will show a slightly different kind of board setup which you sometimes see in the literature and in the workplace. Many Agile Board software systems do not allow sub-columns (which is why my examples show simple columns that are the height of the whole board), but some software systems do, and of course on a physical board you can have any setup you like within the available space. Allowing sub-columns leads to slightly different board layouts such as the following, taken from this book:

agile-board-with-sub-columns

Whatever you think of the choice of just three main activities, this board design does neatly buffer each activity with a sub-column titled ‘Ready’, and is a common layout that you should be aware of as a variation.

In conclusion, we have drawn attention to important ways that Agile Board columns can be classified into principal activity, buffer and review states. We have seen that where activities are complex and expensive, you may expect a greater benefit from going into more detail on your board by adding buffer and possibly review states. Where activities are relatively quick and straightforward, you may wish to forgo such details and see if a single column for the activity will serve your needs. As ever, it is the context that decides, and it is up to you to find what setup and tweaks work optimally for your teams.

*Note: this is idleness of work, emphatically not the quite different concept of the idleness of people. The former must always be cracked down on, but the latter is a more nuanced thing; indeed, having some slack built in to people’s normal work practices is essential to allow for responsiveness. This will, again, be a topic for a future article. 

Triaging Part 2 – practical details that make the difference

In a previous article we introduced the core concepts of triaging: the origins of the term in emergency medicine, what it means to have triaged a ticket in the world of software development, and the basic metrics around doing it well. In this follow-on article, we will look at a few of the subtleties in this very first part of a team’s workflow, the steps from ticket creation through to finishing triage.

So what are the ways these steps can vary between teams? Let’s consider a few:

Who can create tickets?
Teams can and should formulate (then publicise, and enforce) their own rules around this. At one extreme, it could be “anyone at all”, even people outside the organisation. Or you could be more restrictive: only people inside the organisation, only people in this particular department, only members of this team, only selected members of this team, or only one particular person.

There’s a trade-off here: the smaller the number of people who are allowed to create tickets, the easier it is to standardise the ticket creation process around your optimal set of requirements (eg what data fields must be filled in so a ticket is triageable), but the more you need to work to establish good comms methods for allowing outside people’s work requests to be translated into tickets by these Creators. If, on the other hand, Ticket Creation is open to many people, the team needs to carefully police the quality of created tickets, and publicise guidelines around doing so. There’s no right answer here – each team has to figure out their own arrangements by an empirical process of trial and error.

Who can triage?
Again, this is contextual, but a typical arrangement is that the Triage Team is a combination of internal team management roles, people with such titles as ‘Tech Lead’, ‘QA Lead’ and ‘Product Owner’.

What needs to be recorded?
Agile doesn’t mean chaos or never writing anything down, but there’s no point being wasteful and requiring unnecessary documentation either. With triaging it is often sufficient to take less than a minute to make just a single comment (eg in a Jira ticket) along the lines of “Ticket triaged by Rob and Jenny – bottom of the backlog”. That simple audit trail can save a mountain of pain later.

How do you keep on top of your triaging?
Given that we want Triage Time and Untriaged Count to be as near to zero as possible at all times, we need untriaged tickets to show prominently somewhere, as a reminder that there is triaging to do. Jira, arguably the most powerful work management system commonly used by software teams, tends to dump minimally-filled-out new tickets right at the bottom of the backlog, where they might not get looked at for a very long time. If you are in that situation, you need to have a robust team process to combat Jira’s tendency to hide these new tickets away.

When I run tech teams, I often take on myself the task of scanning the bottom of the Jira backlog several times a day for recently-created, untriaged tickets. I would then move such tickets into the ‘Triaging’ state, stick them (temporarily) high up the backlog so they get noticed by all on the team’s Agile Board, I comment/assign them so my Triage Team are additionally notified by email to make the priority call, and as appropriate use other comms methods (F2F, IM etc) to make sure they know of the incoming wounded. None of this is very onerous if done “little and often”, and it has worked very well in multiple teams.

What options are there for representing triaging on an Agile Board?
In an earlier article we looked at one simple pattern for representing the early part of a team’s workflow, where triaging itself is represented by a single step, with no buffer states on either side:

agile-board-basic-triage-setup

The meaning of the columns in this case would be:

Idea: a ticket has been created in the system, nothing more. It may or may not be fit for the next step in its life. If not, this could be because it contains mostly garbage, or perhaps because it hardly contains anything at all. The creator, with or without help, must rectify this before the ticket can progress.
Triaging: in the absence of any buffer states, this column contains tickets that have been assessed as ‘fit to be triaged’ but where that process has not yet started, and those currently undergoing triaging.
Ticket Prep: in the absence of any buffer states, this column now represents tickets that ‘have been triaged and no more’, and tickets where in addition ‘more detailed ticket prep has already started’.

An alternative Agile Board setup, with buffer states either side of the main Triaging activity, might look like this:

agile-board-advanced-triage-setup

This time the column semantics are as follows:

Idea: a ticket has been created in the system, nothing more. Nobody other than its creator necessarily knows what’s in it.
Ready for Triage: this “pre-state buffer” means that someone has checked the information in the ticket, and verified that it is sufficient for a triage decision. If such a check reveals that a ticket has failed to meet this standard, that ticket stays in the Idea state and is appropriately reassigned and commented.
Triaging: the triage itself is happening now.
Triage Complete: a “post-state buffer”. Triaging has finished, meaning that the ticket has been placed in an appropriate slot in the team’s priority-ordered list of work. Nothing more has yet happened.
Ticket Prep: the next major step in the ticket’s life after triaging has now begun

This extra detail on the board has a cost and a benefit. Having a large number of columns can cause issues both with physical boards (are they wide enough to fit all the columns?) and software boards (does your monitor have enough pixels to display all columns legibly?). The benefit is that you are more accurately able to represent what is really going on – see how much easier the column semantics were to explain in the second example above. Teams must figure out by trial and error what columns work best for them, but I often find that you can pare columns down around the activity of Triaging, but definitely need buffers around some downstream activities such as Dev and QA. Buffer states will be discussed in more depth in a future article.

How exactly do I collect Triage metrics?
The two metrics introduced in the previous articleTriage Time (average time to get new tickets triaged) and Untriaged Count (how many current tickets haven’t finished triage yet), are read off your Agile Board, and so of course depend on how the board is set up. In the two examples we are discussing, the detailed setup contains three columns where tickets would count as untriaged, and the simple setup just two, as illustrated below. The diagram below shows how the columns map to each other across the two variants, and the “Triage Line” indicates the point at which triaging has been completed in each case:

triage-complete-shown-on-different-board-setups

To conclude, if you care about smooth and efficient software delivery, there’s much important work to be done well before the first line of code is written. It’s notable that many teams focus only on modelling and tweaking the delivery part of their workflow, and sorely neglect the upstream steps, such a triaging. This neglect only stores up problems that come to light during delivery itself, when they are much more costly to fix.  Hopefully between this article and the previous one you have the guidance you need to put a robust Triage Process in place in your teams; the rewards for doing so will be felt throughout the rest of your work.

Triaging Part 1 – why it matters, and how to do the basics

Triaging is a critical step in the lifecycle of any piece of work. I touched upon it in a previous article, and here we go into it in more depth.

The term has a medical origin. Deriving from the French triager, meaning ‘to separate out’, it is best to imagine either a military hospital near the front (known to many as a MASH unit), or the Emergency Department of a civilian hospital. New wounded are brought to the unit at unpredictable times, in unpredictable quantities.

The initial processing of these new arrivals is urgent, and must be done quickly and efficiently; there are, after all, limited resources of medical care to go round, so what is available must be used as efficiently as possible, where you will get the most reward (lives saved) for the effort. To achieve this goal this the wounded are ‘triaged’, which means divided into three categories:

  • the worst cases – they will die anyway. These ones should be made as comfortable as possible, but they are actually not the highest priority, as medical intervention will make no difference.
  • the minor cases – they will survive anyway. Obviously these are not the highest priority.
  • the ones where immediate action will make the difference between life and death. It is these that are the highest priority – they must be prepped for surgery (or whatever is appropriate) as fast as is humanly possible. Every second counts.

Thus medical triaging amounts to a quick prioritisation exercise. It is done on the basis of limited information (you can’t spend hours investigating each new wounded case, you have perhaps minutes or seconds), but you obviously can’t do it on the basis of no information. The skilled triager knows exactly how much information is needed to make the triage call; he gathers that amount (no more, no less), makes the decision, then the patient is moved on to the next stage of their care, and the triage team moves on to the next wounded person, until all have been triaged.

Note that the triage team does not administer treatment – that is the next step. In some cases, that next step might follow one second later, but it is still conceptually a next step, possibly performed by a different person in a different place. The triager has to keep going with the rest of the triage as their highest priority, because the next case could be even worse – until you’ve looked, you just don’t know.

Triage is therefore complete when every single new case has been sorted into the correct category, in other words when the untriaged count = 0. Note that even if the helicopter brings in just one single wounded soldier, you still can’t afford any delays in starting triage, since you have no idea what state that person is in. That single guy could be your top general, right on the boundary between life and death, and a lackadaisical triage process means the chance to save him could be lost.

All of this has parallels in the world of software development, except – hopefully – there’s less blood. Instead of wounded soldiers being helicoptered in to base, the raw material for triaging is new tickets. Starting as soon as possible after new tickets are created, the triaging itself means sorting them into their correct position in the priority-ordered list that is the product backlog. That ticket is then triaged and the triage team moves on to the next one, until untriaged count = 0, when triaging is complete (for now), and the next steps can begin.

The one disanalogy between medicine and tech triage is that a new ticket might not have enough information to be triageable – some people write absolute gobbledigook in new tickets, or barely write anything at all. In that case you send it back to source for more information, and try to triage it again when it returns. There’s no medical equivalent of that – you wouldn’t send a soldier back to the Front to get their wound explained more clearly.

What is the tech equivalent of the three medical triage categories above? One way to think about it is that you are choosing between 6 broad slots in the backlog for a new ticket. For each new ticket, you need to decide whether the right place for it is…

  1. …right at the very top of the backlog, ie it is the most important thing of all. It is quite likely to be an emergency to have dropped in this position, so your team needs a way to deal with emergencies.
  2. …a specific slot in the top quarter of the backlog. If so, pick the precise slot in this quarter with some care, as much of the work this high up will have been started already, and is progressing across your Agile Board. This new ticket might need some extra attention to get it to Ready for Dev quickly, and once there you might then put it right at the top of the R4D column, meaning it’s the very next thing the devs pick up whenever one becomes free.
  3. …somewhere in the second quarter of your backlog. If so, don’t bother being so precise about exactly where in this section it drops – it’s likely to be a bit of time before this work starts advancing across your board, and the backlog will probably have been revised several times before that happens.
  4. …somewhere in the third quarter of your backlog. Even less precision is merited here. Some of this stuff will never get built at all, frankly.
  5. …somewhere in the last quarter of your backlog. Most of this stuff will never get built, and you should consider carefully whether this ticket is even worth preserving at all. If not, you are at the next option…
  6. …the rubbish bin, ie the ticket is worthy only of being thrown away. Maybe it’s a dumb idea, maybe it’s not such a bad idea but there’s just no chance of this happening on any timescale that people remotely care about. Depending on what ticket management system you are using, you may actually delete it, or perhaps just resolve as invalid, or won’t fix, or similar.

These 6 choices are shown in the pictorial example below. In this case, a new ticket ends up being slotted into the bottom quarter of the backlog, option 5:

triaging-process

Once triage is complete (well done! untriaged count = 0), your next duty is to make sure all interested parties know of this triage decision, including whoever requested/created it. In some systems (eg Jira) it may be sufficient to add them as Watchers to the ticket so that they automatically receive email notifications of updates; in other contexts, other comms may be necessary. When these parties do find out how you’ve prioritised their work, they may be fine with it, or may vigorously disagree, but it’s best to have those discussions out in the open immediately, rather than to leave potential future conflicts stewing away in the background.

It’s in everyone’s interest – the team, suppliers, customers, stakeholders – to have a good triage process – there may be some horror lying in wait in that unopened ticket! The metrics around this process are a) ‘Triage Time’, the time it takes for newly created tickets to be triaged, and b) ‘Untriaged Count’, the number of tickets in the system that have not been triaged yet, and are therefore of unknown priority. The aim is to get both metrics as low as possible. Oddly, I have never seen the Triage process specifically called out as a key measure of how well a team is running, and yet in my experience there is a very strong relationship between being a good team, and doing the basics well – and triaging is one of the basics for sure.

In the next article we’ll examine the steps from ticket creation to triage completion in more detail. For now we can summarise the best practice lessons for triaging as: do it quickly and efficiently, don’t let untriaged tickets lie around, aim to get your untriaged count back to zero daily if possible, and let the concerned parties know the results of your triage as soon as you have done it. If you do all this, you’ll be one of the relatively rare teams that have a tight Triage Process, and you’ll have taken a big step towards becoming a good team.

Agile Board usage – why tickets shouldn’t move left

So a ticket has failed QA, and the fail’s a bad one, it needs to be fixed for sure. A dev (probably the one who coded it in the first place) needs to change the code, and then it needs re-reviewing, re-testing, and hopefully will pass second time round.

This is one of the most common situations in a software development team (unless you’re the one team in the world who manages, without cheating, to produce no bugs at all). How should these steps in the life of this ticket be represented on an Agile Board?

There are two basic patterns that can be used. We’ll start with the most commonly used pattern, the ‘Looping Method’, which can be summarised as ‘move the ticket left, back to the In Dev column, for the code amends, and then to the right through Code Review, Ready for QA etc, just as it did on the first iteration. In fact, every time a developer needs to work on that ticket, it jumps left to In Dev for the new code changes, then moves across again to the right’.

Note that there’s nothing special about the In QA column we are using in this example. The Looping Method applies equally if the bug had been found in different column, such as Sign-off, or Ready for Live, it’s just that the ticket would just have to jump that much further to the left to land back in ‘In Dev’, so the loop is bigger.

A pictorial representation is below. The ticket reaches In Dev for the first time and is assigned to a developer to write the initial code. It moves rightwards through Code Review and Ready for QA, then a tester picks it up at ‘In QA’, assigns it to themselves and starts testing. The nasty bug is found, and the ticket jumps three columns to the left and is reassigned to a Dev. This ticket will now loop between columns until every nasty bug is fixed, and it can finally move on beyond QA. At that point more bugs might be found, and more looping will start.

tickets-move-left

It’s easy to see why teams might work in this way –  it’s just doing the same thing on the second, third or fourth iteration of coding as happened the first time. Is there any problem here?

In fact there is. Due to this looping, tickets are jumping horizontally whenever a different sub-team (often, but not always, Dev) needs to work on them next. As a result you are now unable to use the triangulation method of reading a board, as this method uses both horizontal (‘Done-ness’) as well as vertical (‘Priority’) planes to work out the correct order of tickets. But there is no alternative to triangulation to reliably order all tickets on a board, so the Looping Method destroys a key property of a good Agile Board, that it broadcasts to everyone what to pick up next.

how-to-read-fail

In fact the triangulation method tells us that moving tickets left is functionally equivalent to de-prioritising them, as moving left means a ticket is ‘less Done’, and therefore less urgent. But a moment’s thought tells us that it is simply not true that finding a bug deprioritises a ticket, so on the Looping Method our board is Not Telling The Truth – a big problem.

Let’s consider which of the following options we would prefer an available Dev to pick, other things being equal: a) fix a bug on a ticket they had previously coded, b) work on other code that hasn’t reached QA a first time, or c) start something entirely new? The answer is clearly a), to fix the just-found bug. There’s a good chance of getting such a ticket finished and off the board with relatively little extra effort – it genuinely is closer to Done. Remember, it is only finishing tickets that provides any value to the organisation, so it would be crazy to deprioritise a ticket just because it needs some dev attention.

In fact the behaviour we generally want to encourage on finding a bug is for developers and testers to work in close collaboration, with tight feedback cycles of testing, recoding, reviewing and retesting, until the bug is fixed. If each step suffers delays, even relatively minor bugs can take an absolute age to fix, with both parties context switching each time, and quite possibly the underlying codebase changing considerably as other tickets flow to the right. Deprioritising tickets will lead to loss of focus and increase in delays, so that’s the very last thing you want to do to a ticket when it fails QA.

Indeed, if you allow or encourage Devs to start on new code in preference to fixing already-written code that just failed QA, the team will quickly end up with a mountain of unfinished, buggy work piled up in the middle of their board. Such a team – great at starting things, but lousy at finishing them – can only be very unproductive.

Of course, not all teams using the Looping Method sink that far. Good practices outside the Agile Board – strong leadership, effective meetings and internal comms – might compensate for a deficiency in how the board works. But good board usage is designed to make all this far, far easier. What is the alternative practice we can use here?

The alternative is the ‘Tickets Don’t Move Left’ method (TDML)*. It says simply that when a ticket requires the help of some other part of the team to move onwards, you leave it where it has reached on the board and reassign. This way the board continues to represent Done-ness accurately, so the triangulation method works and you can always use your board to figure out the Most Important Thing To Do.

Let’s see this method in diagrammatic form.  The first two parts are exactly as before, but when the ticket reaches QA and the bug is found, it stays in the In QA column and is reassigned to the Dev for re-coding:

tickets-dont-move-left

And then when it’s time for a second round of Code Review on the recoding, the ticket again stays where it is, but is reassigned to a different Dev for that review. And then it will go back to the QA for retesting, again, without moving column:

tickets-dont-move-left-2

This is really a trivial change in board usage. Team members just have to get used to relying on the Assignee field to know what to pick up, and understand that they might need to pick up from any column on the board, rather than only ever bothering to check one particular part of the board (for Devs this would typically be the In Dev and Code Review columns).

I have moved teams from the Looping Method to TDML method in a single day. It really isn’t that hard, and the payoff is enormous; we stop working in a way that continually destroys a board’s ability to clearly represent the priority of work. The triangulation method works, so you get the huge benefit of easily reading your board and knowing what to pick up next.

how-to-read-success

On top of this, there is a more subtle attitudinal change when you become accustomed to looking across the whole board to find your work. A large part of the Agile mindset is to have everyone working and thinking as a team, rather than the siloed mentality of a Waterfall process, where in strict order Analysts do X, then Devs do Y, then QAs do Z, and the success of the thing-as-a-whole is always someone else’s problem.

The correct team attitude is that the entire workflow is a shared responsibility. In a truly Agile team it’s not your narrow job title that defines what you do; your job is to use all your skills to help the team get its highest priority work all the way to ‘Done’. And TDML is the perfect visual representation of that approach, where everyone is co-operating fluidly with everyone else, as need dictates, to maximise the value of the whole, with a collective sense of ownership of the flow of work.

In that sense the board usage recommendation of this article is very powerful. Though TDML is on one level no more than a minor tweak to how teams use an Agile Board under some circumstances, it has a strong resonance with how team members need to think and act such that they are genuinely a team, working together constructively towards a common goal.

*a caveat to the TDML rule: as with all rules, there is a superior rule which takes precedence: ‘Common Sense Always Wins’. So if a ticket is ever simply in the wrong place on a board, you should always correct it and Make The Board Tell The Truth, whether that involves the ticket moving North, South, East or West. Even the best rule can be taken too dogmatically; it doesn’t make it a bad rule, it just means you need to use Common Sense at all times.

Note: I am indebted to David Shrimpton for introducing me many years ago to the key concepts in this article.

EDIT: this article from June 2021 makes many very similar points:

Don’t move items backwards on the kanban board: “Let’s imagine a car production line. If a defect was detected at the stage of installing the door but it was related to the earlier phase, the car is not moved back to the source of error, but rather kept in the place where the defect was detected. Instead, people swarm around it and try to fix the problem.”