Demonstrable Progress Checkpoints - More Approachable Kanban SLAs
Service-level agreements (SLAs) are a core Kanban tool that drastically improves delivery confidence; shifting how we think about them can improve adoption on delivery teams
Brief Kanban Primer
If you are already familiar with Kanban, you can skip to the next section. If Kanban is new to you, it can be thought of as a continuous agile delivery method (no sprints) that balances demand against capacity. Kanban is laser-focused on completing project priorities as effectively as possible with the resources available. This post covers techniques for setting expectations around how long a Kanban ticket should take to reach a demonstrable state.
Why Do Lead Times and SLAs Even Matter?
Service-level agreements (SLAs) are a core Kanban practice that helps free team members from micromanagement and allows managers to spend more time focusing on client impact. Before we dive into what SLAs are and how they help projects, we must first define lead time, which represents the average time a Kanban ticket takes to be completed. We typically use an eight-week moving average.
On some teams, all tickets are measured against the average time. That is, the team will expect most of their tickets to be completed within the average time. This approach, however, is flawed. With an average as your goal, you should anticipate roughly 50% of tickets falling below the average and roughly 50% exceeding it. SLAs help us improve this situation and the predictive nature of lead times.
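As a rough illustration of the lead time described above, here is a minimal Python sketch. The ticket dates, the function name lead_time_days, and the choice to measure from the day work started to the day it was completed are all assumptions made for the example, not details of any particular Kanban tool.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical completed tickets: (date started, date completed).
completed_tickets = [
    (date(2023, 12, 11), date(2023, 12, 15)),  # falls outside the window below
    (date(2024, 2, 5), date(2024, 2, 9)),
    (date(2024, 2, 12), date(2024, 2, 20)),
    (date(2024, 2, 19), date(2024, 2, 23)),
]

def lead_time_days(as_of, tickets, window_weeks=8):
    """Average days from start to completion for tickets
    completed in the trailing window (eight weeks by default)."""
    window_start = as_of - timedelta(weeks=window_weeks)
    durations = [
        (done - started).days
        for started, done in tickets
        if window_start < done <= as_of
    ]
    # No completed tickets in the window means there is no lead time to report.
    return mean(durations) if durations else None

print(lead_time_days(date(2024, 3, 1), completed_tickets))
```

Recomputing this each week with a new as_of date gives the moving average; older tickets drop out of the window on their own.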
Dave Anderson’s book, Kanban: Successful Evolutionary Change for Your Technology Business, is an excellent primer on both Kanban in general and SLAs. It has something of a “Kanban Bible” status amongst many practitioners and is at the top of CodifyIQ’s recommended reading list. In his book, Anderson highlights the importance of SLAs with the following:
The service-level agreement allows us to avoid costly activities, such as estimation; low-trust activities, such as making commitments; and to spread risk by aggregating a large collection of requests and promising only aggregate performance in the form of a percentage due-date performance. By avoiding making promises we are unlikely to be able to keep, we avoid the danger of losing the trust of our customers. Therefore, it’s important to communicate that the target lead time is… just that, a target!
Completing a task within SLA, in general, helps enable the core Kanban value proposition: maximizing the impact of a project’s ultimate resource - people. Kanban does this by showing consistent progress against the overall goal of the project rather than measuring against unreliable guesses made early in the effort with minimal realistic information available. The key to this approach is ensuring the team can consistently demonstrate progress against their aggregate goals - and that’s exactly what SLAs accomplish.
How SLAs Enable Consistent Demonstrable Progress
As we noted earlier, lead times are a poor way to measure how often a team delivers as expected because their Due Date Performance (the percentage of tickets completed in an expected duration) against an average will inherently be about 50%. SLAs fix this false expectation by representing lead time plus a statistically derived plus-up. If a standard deviation¹ is used as the plus-up, we go from having a 50% probability of a task being delivered as expected to an 84% chance (assuming a normal distribution).
Compare SLAs with old-school, big-up-front estimation techniques such as Wideband Delphi or Monte Carlo estimation. These approaches would take days to weeks to hopefully achieve a targeted 80% level of accuracy². Even worse, the detailed analysis that fed these approaches often became the basis of Gantt charts or other equally “precise” sets of milestones that clients and project managers used to complain that everything was behind schedule. Managers and clients often fall in love with these big-up-front schedules, in large part due to the amount of work that went into creating them in the first place. Once you have experienced a manager stopping by your desk multiple times a day with a nonsensical schedule in hand asking if you are done yet and, if not, when you will be done, it’s clear there must be a better way.
With the simple approach laid out for SLAs above, we can exceed the delivery expectations of the most robust estimation techniques without all the hassle, time, and negative side effects. Additionally, we can replace the time that would have been spent on those estimation techniques with work that actually moves the needle towards the project’s desired end state. This is the essence of the Kanban approach.
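To make the arithmetic concrete, here is a small Python sketch of the calculation. The ticket durations are made up, the helper due_date_performance is a name invented for the example, and using the sample standard deviation (rather than the population one) is an assumption; the post only says the plus-up is statistically derived.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical completion times, in days, from the same window used for lead time.
durations = [2, 3, 3, 4, 4, 5, 5, 6, 7, 11]

lead_time = mean(durations)
plus_up = stdev(durations)  # sample standard deviation (an assumption)
sla = lead_time + plus_up

def due_date_performance(durations, target):
    """Share of tickets completed within the target number of days."""
    return sum(d <= target for d in durations) / len(durations)

print(f"lead time: {lead_time:.1f} days, SLA: {sla:.1f} days")
print(f"within lead time: {due_date_performance(durations, lead_time):.0%}")
print(f"within SLA: {due_date_performance(durations, sla):.0%}")

# Under a normal distribution, the share of values at or below
# the mean plus one standard deviation is about 84%.
print(f"normal-distribution share within SLA: {NormalDist().cdf(1):.0%}")
```

Real ticket data is rarely perfectly normal (a few long-running tickets skew it), so the observed percentages will drift from the theoretical figure. Tracking the observed Due Date Performance alongside the SLA shows how well the assumption holds for a given team.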
Hold On, SLAs Did Not Work for My Project
While SLAs have been extremely helpful on projects in our experience, they come with some challenges of their own. Our journey started with Dave Anderson’s suggestions and ended up with something conceptually similar, but a bit different. This is normal and probably will be the case on all projects as any useful process must be tailored to the team and client.
Our major focus is to strive to show progress on all our work once a week. This has proven to be a generally approachable timeframe for most technical work on web and AI applications alike. Despite general success with this approach, our projects always seemed to have one ticket that would not just exceed SLA but would blow it out. As we calculated our lead times on a weekly basis, there was generally one ticket at 13, 20, or even 40 days (43 days is the record)!
When we focus on the aggregate, it’s easy to lose sight of outliers. However, these outliers erode trust when they are repeatedly not making progress. Anyone who has told a client “Item X is just running a little behind, but we’ll have an update for you next week” knows how hard that message gets on week three or four of delivering it with little to no update.
To resolve these long-lived tickets, the team started to highlight when a ticket hits “SLA” and then “double SLA” (double SLA being when the ticket exceeds lead time plus two standard deviations). Double SLA tickets become team-centric efforts. The team rallies and team members pause other tasks to help complete the ticket as soon as possible. We also put a “thou shalt not pass” restriction on double SLA tickets such that no other ticket can advance past the double SLA ticket’s current column on our Kanban board. This was an excellent motivator and quickly solved our challenge of long-lived tickets, as team members with tickets blocked by the double SLA ticket now have both the time and the desire to help free the blockage.
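The two thresholds and the blocking rule can be sketched in a few lines of Python. Everything named here (the column list, the Ticket class, status, and can_advance) is hypothetical, and the blocking logic is one reading of the rule: a move is refused only when it would carry a ticket from the double SLA ticket’s column, or a column to its left, to a column on its right.

```python
from dataclasses import dataclass

COLUMNS = ["To Do", "In Progress", "Review", "Done"]  # hypothetical board layout

@dataclass
class Ticket:
    name: str
    column: str
    age_days: float

def status(ticket, lead_time, std_dev):
    """Classify a ticket against SLA and double SLA."""
    if ticket.age_days > lead_time + 2 * std_dev:
        return "double SLA"
    if ticket.age_days > lead_time + std_dev:
        return "SLA"
    return "on track"

def can_advance(ticket, board, lead_time, std_dev):
    """Return False if moving one column right would pass a double SLA ticket."""
    current = COLUMNS.index(ticket.column)
    target = current + 1
    if target >= len(COLUMNS):
        return False  # already in the last column
    for other in board:
        if other is ticket or other.column == COLUMNS[-1]:
            continue  # a ticket never blocks itself; finished tickets block nothing
        blocker = COLUMNS.index(other.column)
        if status(other, lead_time, std_dev) == "double SLA" and current <= blocker < target:
            return False
    return True

board = [
    Ticket("A", "Review", 12),
    Ticket("B", "In Progress", 2),
    Ticket("C", "Review", 3),
]
for t in board:
    print(t.name, status(t, 5, 2.5), can_advance(t, board, 5, 2.5))
```

In this sample, ticket A is past double SLA, so ticket C cannot leave Review until A moves, while B can still advance into Review and A itself is free to move.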
One persistent challenge has remained, varying by project and team. On teams with more senior members and more familiar tech stacks, we almost never see or talk about double SLA tickets - though we continue to track and discuss the limits each week. On teams with more junior members and newer tech stacks, we find a consistent portion of the team sees SLAs as deadlines rather than checkpoints to help ensure we are making demonstrable progress. This creates a stressful situation for these team members and often makes them feel like “cogs in the machine” rather than valuable members of the team. Even if most team members see SLAs for their intent or quickly adjust to that view, we must challenge ourselves to do better to ensure a consistent understanding across the entire team and unlock the team’s full potential.
An SLA with a Different Name
SLA is a term that is focused on how we deliver to external clients. However, many projects end up delivering more to internal customers (including team leadership) than external customers, at least in the short run. Especially on these projects - if not all projects - rebranding SLAs to better reflect our aggregate-level delivery can help reduce the appearance of SLAs as deadlines and help refocus the team on continuously showing our overall progress.
To do this, the term Demonstrable Progress Checkpoint can rebrand SLA to focus on our desired intent. This term more precisely communicates the intent of SLAs and removes the need for the team to interpret the concept. Simply put, it says what it means and means what it says.
Team members can now more naturally approach our Demonstrable Progress Checkpoint (DPC) by focusing on showing progress rather than stressing about meeting an initial, coarse-grained task estimate. As we hit DPC 2 (the equivalent of double SLA), the entire team helps the ticket reach a demonstrable point so that we avoid a never-ending ticket and keep client trust high.
In practice, when a team delivers visible progress weekly on 84% of their activities and ensures that the tickets that don’t meet that initial DPC show progress the following week nearly 98% of the time³, it’s easy to build and maintain trust in how your team executes. When we stop focusing on artificial or unreasonable deadlines and instead focus on maximizing the only truly fixed constraint on a project (money that funds the team’s labor), teams and clients win.
Try DPCs out on your project - outside of board-level Work In Progress (WIP) limits, it is the next most impactful tool you can add to your Kanban process!
1. Standard deviations are calculated using the same data set as lead times.
2. 80% is the traditional gold standard for estimation accuracy. Less than 80% is not considered accurate enough to maintain confidence in the estimate. More than 80% requires so much padding to the schedule that it also erodes confidence. Otherwise put, even traditional approaches were intended to ensure trust in the estimates rather than be perfect predictors.
3. Lead time plus two standard deviations, assuming a normal distribution.