Sunday, 15 June 2014

SAFe and different scaling dimensions

(Since I was asked in English about some of my views on the Scaled Agile Framework, I will try to blog in English, even though it is not my first language.)

I have spent the greater part of my consulting time over the last three years in larger organisations that are trying to achieve more agility in their systems development, so questions of how to handle different scalability issues in lean and agile have been quite important to me. One cannot mention the words "agile" and "scale" without thinking of Dean Leffingwell's Scaled Agile Framework ("SAFe"), and when trying to be agile at scale you have to deal with one or more of the issues that he touches on. So it doesn't matter whether you choose to follow his advice on a particular question or choose to do the opposite: you need to have an informed opinion of what SAFe says and why it is wise to follow it or go against it in your particular case.

First, you need to understand what you mean by "scale". The context of the Scrum definition is a product where a single Product Owner can prioritize among customer values (often in the form of features), and an integrated cross-functional team can implement at least one feature within a couple of weeks. As soon as your reality transcends that, you need to employ some strategies in order to cope with it. Some of those strategies might break the rules of Scrum.

This is usually where the first set of criticisms against attempts to scale comes in: "This smells of component teams!", "You shouldn't have to coordinate among several teams, why can't they self-organize?", "You should be able to show an integrated piece of functionality after each sprint!" To which the thoughtful reply always is: "Well, yes, but what if we can't?" Certain problem and solution domains are so complex that we need to take time, to use many teams, to build up a small testing debt once in a while. It is not by choice: we have inherited the complexity, and now we have to do the best we can with what we have.

Scaling is basically done along the dimensions of either the number of people necessary to create the value, or the time needed, or both. We might need more than one team, and we might need more than one sprint. When you think about it, both the Scrum team and the sprint are in themselves scaling techniques along these dimensions: if you need more than one person, use a cross-functional team; and lock the requirements for 2-4 weeks (and call that period a "sprint") before integrating, since it will probably take more than a day to implement. SAFe is basically an extension of those Scrum strategies.

Even the single-team organisation in fact has a scalability issue: that of time. Even where you can develop single features within a sprint, features are often meaningful only in context together with other features, for instance to support a whole scenario for some user. So even at the small scale you will probably have to coordinate implementation activities over time, aggregating sub-goals towards a more valuable bigger goal. The act of identifying important larger goals and decomposing them into smaller but meaningful slices is called "refinement", and the backlog very much resembles a refinery: crude ideas enter at the bottom (like crude oil) and are clarified and broken down as they move further up.

While cracking and refining crude oil is a continuous process, refining backlog items is best done in distinct levels. Dean Leffingwell's book Agile Software Requirements (which would later be developed into SAFe) proposes three distinct levels of the backlog, expressed as three distinct backlogs: the portfolio level, the program level, and the team level. The levels are typically aligned with the time frames of the different planning horizons, so that the team level coordinates sprints, the program level coordinates releases, and the portfolio level coordinates the investments, whether they are made continuously (as SAFe and Beyond Budgeting suggest) or on a year-by-year basis.

That approach is in my experience very fruitful. In fact, the backlog is a value stream map of the development. Development is the generation, refinement and codification of knowledge; the backlog tracks exactly that; and when implementing lean (and agile is a lean implementation) you need to organise (and thus scale) around the value stream. In my experience, you often need a bit more granularity in the backlog levels. I often propose the following horizons/levels: within a sprint (2 weeks), an iteration (3 sprints), 2-4 iterations (2-6 months), and a "project" level that tracks initiatives that last from half a year up to two years (even if the usefulness of initiatives of that size can be doubted). While the division of the backlog needs to fit the particular situation, this approach of dividing the backlog into sections is fruitful.
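To make the horizons above a bit more concrete, here is a minimal Python sketch of how a backlog item could be routed to a level based on a rough size estimate. The names (Horizon, horizon_for), the level labels and the exact week boundaries are my own assumptions for illustration only; the boundaries simply take the upper ends of the ranges mentioned above (4 iterations of 6 weeks, and two years as 104 weeks).

```python
from dataclasses import dataclass

SPRINT_WEEKS = 2            # from the text: a sprint is 2 weeks
SPRINTS_PER_ITERATION = 3   # from the text: an iteration is 3 sprints
ITERATION_WEEKS = SPRINT_WEEKS * SPRINTS_PER_ITERATION


@dataclass(frozen=True)
class Horizon:
    """One backlog level with an assumed upper bound on item size."""
    name: str
    max_weeks: int


# Ordered from the shortest horizon to the longest.
# The labels and cut-off values are illustrative assumptions.
HORIZONS = [
    Horizon("sprint", SPRINT_WEEKS),
    Horizon("iteration", ITERATION_WEEKS),
    Horizon("multi-iteration", 4 * ITERATION_WEEKS),
    Horizon("project", 104),
]


def horizon_for(estimated_weeks: float) -> str:
    """Return the name of the shortest horizon that can hold the item."""
    for horizon in HORIZONS:
        if estimated_weeks <= horizon.max_weeks:
            return horizon.name
    # Hypothetical label for items larger than any horizon in the list.
    return "beyond project horizon"
```

An item estimated at ten weeks would land on the "multi-iteration" level in this sketch, which is a signal that it needs further refinement before any single team can pull it into a sprint.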

SAFe strategies can, however, be problematic when it comes to how you organise work. The depiction of the backlog levels as if they were separate backlogs owned by separate departments tempts people into organising handoffs instead of using soft handovers, where members from different departments swarm around the items when they need to be refined. While Leffingwell himself always stresses the use of such lean practices instead of wasteful formalism, the presentation of SAFe easily lends itself to a bureaucratic interpretation, as I have encountered many times.

Another problematic area is that SAFe, just like Scrum, doesn't scale well when we need many different cadences (rhythms) in parallel. Scrum talks about only one cadence, the sprint, and SAFe expands on this by adding the concept of an iteration (a period of a couple of sprints in which you can develop several parts and integrate them), and a more elaborate description of how you manage releases. Scrum explicitly states that the sprint should preferably not be replanned, which is a way of saying that the sprint is the shortest cadence you should allow for, and SAFe talks about the need for all involved teams to use the same takt of sprints and iterations. SAFe even mandates that the different teams should employ the same strategy when calculating velocity and making forecasts.

In my experience, this is where both SAFe and Scrum break. First, because the domain of both is limited to just one small part of the product/service lifecycle: the development. When doing development and development only, you might have the luxury of plans that can remain static for two weeks. But I have yet to see even a development team that never has to respond quickly to some external request. Therefore, every Scrum team I have seen has had to be able to handle several cadences. On a side note, it is also often a bad practice to have isolated teams doing only planned development. The more responsibility for and understanding of the lifecycle they have, the better they understand the user's actual needs and the business value they provide. DevOps is good for a reason.

So since almost every Scrum team needs to handle issues on a daily, weekly, and bi-weekly basis, maybe using Scrumban or some other Scrum/Kanban hybrid, the organisation needs to be polyrhythmic in more complex figures than either Scrum or SAFe assumes. A team where 80% of a week's work can't be planned at the beginning of the week will need to employ different strategies for doing development in a predictable way than a team where only 20% of the work is unplanned. They need different methods of forecasting, different jidoka conditions, and thus different ways of working. One or two sizes can't fit all the situations we end up in. We need to bring in more practices and techniques than what is suggested in Scrum and SAFe in order to make it work well.
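A bit of arithmetic shows how far apart the 80% team and the 20% team are. The Python sketch below is a deliberately naive model that I am assuming purely for illustration: it treats the unplanned share as a fixed percentage of weekly capacity, which real teams rarely experience. The function name and the use of "points" are hypothetical and not taken from Scrum or SAFe.

```python
def weeks_to_finish(planned_points: int, weekly_capacity: int,
                    unplanned_percent: int) -> int:
    """Weeks needed to finish planned work when part of each week's
    capacity is consumed by unplanned work.

    Assumes (naively) that the unplanned share is constant every week.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    if not 0 <= unplanned_percent < 100:
        raise ValueError("unplanned_percent must be in the range 0..99")
    if planned_points < 0 or weekly_capacity <= 0:
        raise ValueError("planned_points must be >= 0 and weekly_capacity > 0")

    # Capacity left for planned work per week is
    # weekly_capacity * (100 - unplanned_percent) / 100.
    numerator = planned_points * 100
    denominator = weekly_capacity * (100 - unplanned_percent)
    return -(-numerator // denominator)  # ceiling division
```

With 40 points of planned work and a capacity of 10 points a week, weeks_to_finish(40, 10, 20) gives 5 weeks while weeks_to_finish(40, 10, 80) gives 20 weeks. The same backlog and the same team size yield a forecast four times as long, and in reality the 80% team also has much higher variance, which is why a shared velocity formula does not serve both teams.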

So in short: in my experience, SAFe suggests many good ways of coordinating the enterprise backlog among several teams and disciplines, but it is too limited and can be harmful when it comes to coordinating the actual work.
