How to set a WIP limit
A colleague of mine is venturing into the magical world of Kanban, and one of the essential parts of Kanban is setting a "work-in-progress (WIP) limit".
A WIP limit dictates the number of items of work you can have in progress at a time. It dictates your maximum work capacity.
In this example, the team can only have 2 items at play at a time. Even if you have a "spare dev", she cannot start a new piece of work until one of the items in progress moves to Done.
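To make the rule concrete, here is a minimal sketch of a board that refuses new work once the limit is reached. The `Board` class, its method names and the ticket names are all hypothetical, invented for illustration; they don't come from any real Kanban tool.

```python
class WipLimitReached(Exception):
    """Raised when starting an item would exceed the WIP limit."""


class Board:
    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self.in_progress = []
        self.done = []

    def start(self, item):
        # Refuse new work while the in-progress column is full.
        if len(self.in_progress) >= self.wip_limit:
            raise WipLimitReached(
                f"cannot start {item!r}: "
                f"{len(self.in_progress)} items already in progress"
            )
        self.in_progress.append(item)

    def finish(self, item):
        # Moving an item to Done frees up a slot.
        self.in_progress.remove(item)
        self.done.append(item)


board = Board(wip_limit=2)
board.start("checkout page")
board.start("search API")

try:
    # The "spare dev" tries to pick up a third item.
    board.start("email digest")
    third_started = True
except WipLimitReached:
    third_started = False

board.finish("checkout page")
board.start("email digest")  # now there is room
```

The third item is only accepted after something has moved to Done, which is the whole point of the limit.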
You're misinformed if you're not setting WIP limits and claiming to be doing Kanban.
But what should your WIP limit be?
The Goal (highly recommended reading) describes a poorly performing manufacturing plant. Imagine an assembly line where parts are assembled from raw materials and then passed down to the next part of the line to be assembled further.
It could look like this:
Steel -> Bonnet -> Car
Other parts of the plant will deal with building wheels, seats, fuzzy dice, etc.
The performance of these parts of the plant is tightly micromanaged and appraised based on their output, e.g. how many bonnets can we produce per day?
Most of the units could describe how they were performing well, yet the overall output of the plant was expensive and slow, constantly missing targets, and the profits were tumbling.
When you look at the plant, you could see piles of inventory, waiting to be converted into real things that could actually be sold. The plant was producing lots of stuff but not delivering real value to customers.
The moment you buy some raw materials and start converting them into something you intend to sell, you've taken liquid cash and turned it into inventory.
Inventory is waste until it delivers value.
This is where the plant's management had lost sight of The Goal. There's no point efficiently producing thousands of bonnets if they're not for the right car, or there are other bottlenecks in the plant for making wheels or whatever. The bonnets are dead weight, waste.
So what if instead of worrying about local efficiencies, you started focusing on the actual company goal? What if you said you'd only produce the number of bonnets you needed for a particular day, or better still just-in-time?
Surely this is terrible?! Our bonnet-making machine is just sitting idle! Remember, there's no point churning out tons of inventory if there's no immediate need for it. It's actively harmful: you're making more illiquid assets that need storage, insurance, security, etc.
So how do you make more money? You must identify the bottlenecks and improve them. Only once you've improved the capacity of your bottlenecks can you increase the output of the other silos, and only if the market actually wants more of your product. The market is also a bottleneck.
One of the main messages of The Goal is this: there's little point worrying about local efficiencies; doing so can sometimes make things worse! Only The Goal is important; you need to look at the system as a whole and identify where best to improve.
Any improvements made anywhere besides the bottleneck are an illusion.
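A toy model shows why. This sketch assumes a deliberately simplified plant where every car needs one unit from each stage, so the plant can go no faster than its slowest stage. The stage names and daily capacities are made-up numbers, not figures from The Goal.

```python
def plant_output(capacity_per_day):
    """Finished cars per day: limited by the slowest stage."""
    return min(capacity_per_day.values())


def bottleneck(capacity_per_day):
    """Name of the stage with the lowest daily capacity."""
    return min(capacity_per_day, key=capacity_per_day.get)


# Hypothetical daily capacities (units per day, one unit per car).
plant = {"bonnets": 500, "wheel sets": 120, "seat sets": 300}

before = plant_output(plant)

# Local "improvement": double bonnet production.
after_bonnets = plant_output({**plant, "bonnets": 1000})

# Improve the bottleneck instead.
improved = {**plant, "wheel sets": 350}
after_wheels = plant_output(improved)

print(bottleneck(plant), before, after_bonnets, after_wheels)
print(bottleneck(improved))
```

Doubling bonnet output leaves the plant's output unchanged. Raising wheel capacity lifts it, and the bottleneck then moves to the next slowest stage, which is where you look next.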
Lean is all about making work smooth, with flow, minimal bottlenecks and reduced waste.
Setting a WIP limit puts a hard constraint on what work your teams can do at once.
You might be asking yourself:
Seriously Chris? So if a developer is working on a ticket, but she can't finish it because deployments are blocked because another team's API is down, she has to sit there idle?
As a manager, your job is not to ensure everyone looks busy. Your job is to fulfil The Goal. Every time you park a ticket and pick up a new one, you choose to avoid dealing with a bottleneck. This bottleneck will keep harming your team and the company and slow its progress.
People love talking about breaking down silos in meetings; this is what should drive it. What if your engineer climbed out of the artificial play den you constructed and helped the other team relieve the bottleneck? Setting WIP limits gives you clear indications as to what your bottlenecks are.
So much of "the DevOps movement" (which has its roots in lean manufacturing) is centred around automation, shift-left and breaking down silos. By making it so that a team owns not only writing the code but also deploying, testing and running it, you increase flow and reduce inventory.
The Phoenix Project describes how not adopting this approach is harmful. The company invested millions of dollars in the project, writing code for years. Then it was thrown over from the development team to the operations team to release to the customers. The execution was catastrophic, but the subtler point was that the company had piled up millions of dollars' worth of inventory over many years without delivering any value.
Inventory: Something "paid for" not yet put to use (delivering value). If we write something in a backlog, write code, do testing, think about the thing, or do anything, it is something we have paid for. If we have not yet received value for it - it's inventory
~~ Woody Zuill
When you start writing code, it is inventory until it delivers value. It is something that costs money to write, maintain, test, etc. When teams have piles of PRs waiting for approval, all of that work has been paid for but is not delivering value.
It's inventory. What's worse is that in software we typically don't know whether our work is valuable until it is in the user's hands, and we want to collect feedback so that we can improve it and extract more value from the work. The longer your work sits as inventory, the longer you delay that crucial feedback loop.
If your team views work as "done" when a PR is raised, you are not looking at the overall system, just like the plant in The Goal. The back-and-forth review, perhaps manual testing, and all the steps between that and it being used by a customer are all bottlenecks you're choosing to ignore.
Monitoring pull requests raised/closed is akin to measuring how many bonnets you produce.
Mature teams that achieve a good flow of value will review code quickly (ideally on-demand with mobbing or pair-programming), automate all tests and have a reliable deployment pipeline. Hence, inventory never piles up, and they can deliver value quickly.
Definition of done
This is why it's essential to clearly understand what "done" means: a story shouldn't move into the "done" column of a Kanban board until it meets your done criteria.
These criteria should include being deployed to production, used by real people and generating value. They should also include the things that show it has been done to a sound standard of quality, such as testing, observability, etc.
When something is done, it should not require any more planned work from the team. Otherwise, it's still a work in progress.
But Chris, the marketing team want us to hide it behind a feature flag for a few months before they get a campaign ready.
It's still WIP! It is still an unvalidated idea, and work that is not delivering value. Why is marketing a bottleneck? What could you do as an organisation to be more nimble concerning marketing? You're still thinking locally. Think about The Goal!
Too often, software development organisations celebrate the amount of work they've done; it may even be deployed to production, but if no one uses it, it's still inventory and waste.
Be honest about what "done" means; don't be content with being an efficient bonnet-making silo within the plant. Thinking about "done" will make you answer some tough but essential questions about the way your organisation tries to deliver value.
How to keep WIP low
- Reduce toil. A system that is simple to work on will have a smooth flow of work. Flaky tests and slow builds all contribute to work stalling.
- Automate all the things. You will always suffer from bottlenecks if you have to wait for someone to test something or do a security review manually. See: "shift left".
- Reduce scope. Work in small, achievable steps. Think of ways to frequently deliver small amounts of value rather than "big bang".
- Focus on the goal, and work as a team. Encourage a culture within your team of helping each other keep WIP low. Once you finish a task, rather than starting a new one, help your colleagues finish their tasks too. If you find dependencies on other teams are causing bottlenecks, try to help them too.
The perils of high WIP
There is a very real sense of chaos. Many plates are being spun, with lots of context switching. Your team will work hard but feel inadequate regarding the actual value (The Goal) being delivered. This will hurt morale. Having high effort but low outcome/reward is commonly linked with burnout.
A recent, sad tale
I've seen in the past year how a combination of socio-technical and organisational issues has caused subtle but very harmful bottlenecks for the team I've managed.
Writing this stuff is simple, but I won't pretend it's easy to solve! Nonetheless, you must try.
I needed to communicate these issues more effectively and get sufficient buy-in to deal with them appropriately.
Capturing data sooner would've helped build a more coherent case. When planning work, identify potential bottlenecks and find ways to measure them so you can find the problems quickly with concrete data.
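As one possible way of capturing that data, here is a rough sketch that averages how long tickets sit in each column of a board. The ticket IDs, column names and dates are hypothetical, and the tuple layout is an assumption about what you could export from your tracker, not any particular tool's format.

```python
from collections import defaultdict
from datetime import date

# Hypothetical export: (ticket, column, date entered, date left).
history = [
    ("T-1", "In development", date(2024, 1, 1), date(2024, 1, 3)),
    ("T-1", "In review", date(2024, 1, 3), date(2024, 1, 10)),
    ("T-1", "Awaiting deploy", date(2024, 1, 10), date(2024, 1, 11)),
    ("T-2", "In development", date(2024, 1, 2), date(2024, 1, 5)),
    ("T-2", "In review", date(2024, 1, 5), date(2024, 1, 10)),
    ("T-2", "Awaiting deploy", date(2024, 1, 10), date(2024, 1, 13)),
]


def average_days_per_column(rows):
    """Mean number of days tickets spent in each column."""
    days = defaultdict(list)
    for _ticket, column, entered, left in rows:
        days[column].append((left - entered).days)
    return {column: sum(v) / len(v) for column, v in days.items()}


averages = average_days_per_column(history)

# The column where work waits longest is a candidate bottleneck.
slowest_column = max(averages, key=averages.get)
print(averages, slowest_column)
```

With this made-up data, tickets wait far longer in review than anywhere else, which is the kind of concrete evidence that makes the conversation about a bottleneck easier to have.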
So what number should you set?
As with all things process and methodology, you should not cargo-cult and get lost in a particular implementation, but instead, understand why the practice is essential. This will allow you to tailor it to your specific context, and I hope this post has given you an understanding of what setting a WIP limit accomplishes.
The specific number is insignificant; what's important is the flow of work and listening to the signals the limit provides you, mainly what your bottlenecks are.
If I'm starting with a new team, my rule of thumb is roughly:
(Number of devs / 2) + 1, and I take it from there.
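As a quick sketch, the rule of thumb could be written like this. The function name is made up, and rounding the half down for odd-sized teams is my assumption; the rule is only a rough starting point, so rounding either way is defensible.

```python
def starting_wip_limit(number_of_devs):
    """Rule of thumb: (number of devs / 2) + 1, rounding the half down."""
    if number_of_devs < 1:
        raise ValueError("a team needs at least one developer")
    return number_of_devs // 2 + 1


for devs in (1, 4, 5, 6):
    print(devs, "devs ->", starting_wip_limit(devs))
```

A team of four starts with a limit of three, which nudges people towards pairing or helping each other finish rather than everyone holding their own ticket.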
If work is sitting in progress or parked for long periods, your "inventory" is high, and you likely have bottlenecks you're not addressing. Setting a WIP limit lets you "see" these bottlenecks more clearly. If you choose to ignore these bottlenecks, though, your team will suffer.
If work goes from idea to done (and remember, actually done and delivering value) smoothly and quickly, you probably have your WIP limit at the right level.