June 3, 2011
The previous post we discussed control in complex adaptive systems. We examined the different categories of constraints that shape emergent behaviour, and highlighted the dangers created by failing to recognise the limits of human control. In this post I would now like to examine the extent to which it’s meaningful to think about lean product development as being ‘natural science’. In doing so, I intend to avoid getting drawn into philosophical debate about the scientific method. Rather, taking my cue once more from David Snowden’s LSSC11 keynote, I’d like to start by examining the narrative context of appeals to natural science (i.e. claims that field F ‘is science’). Historically, such appeals have commonly been made for one of two reasons:
A long tradition exists of appeals to science in order to assert the reliability and validity of new or poorly established fields of intellectual inquiry. That tradition has been based in academic circles, and can be traced back through management and education science; political science, economics and anthropology; through to linguistics and famously even history.
Whether this is something lean product development needs concern itself with, I would question. As a discipline based on practical application rather than theoretical speculation, we can rely on natural selection to take care of that validation for us: if the methods we use aren’t effective then we will simply go out of business. Economic recession makes this process all the more reliable. Earlier this year I came across a really great risk management debate between Philippe Jorion and Nassim Taleb from 1997, where Taleb makes the following point:
“We are trained to look into reality’s garbage can, not into the elegant world of models. [We] are rational researchers who deal with the unobstructed Truth for a living and get (but only in the long term) [our] paycheck from the Truth without the judgment or agency of the more human and fallible scientific committee.“
For me this summarises lean practice. Yes, we are performing validated learning – but simply because doing so is rational behaviour, and more effective than taking a blind punt. Beyond that, whether the techniques employed constitute science seems to me an unnecessary diversion.
That would be of arguably little consequence were it not for one issue: namely, risk. Reminding ourselves that risk is a function of value rather than an entity in its own right, the value that science strives for is truth (however that might be defined). However the value that complex systems risk management strives for is survival. This difference creates a diametrically opposing attitude to outlier data points. Risk management in science is concerned with protecting the truth, and so all outlier data points are by default excluded unless they are repeatable and independently verified. On the other hand, risk management in complex systems is all about managing the cost of failure. It doesn’t matter if you are wrong most of the time, as long as in all those cases the cost of failure is marginal. What counts is being effective when the cost of failure is highest, as that creates the greatest chance of survival. As a result, outlier data points are by default included and are only ever discounted once we are highly confident they are not risk indicators.
The financial crisis has demonstrated the consequences of confusing these two different perspectives: of risk management techniques which are right most of the time but least effective when the cost of failure is greatest.
The other common motive for appeals to science has been power. Going back to Descartes and the birth of western science, its narrative has always been one of mastery over and possession of nature – in short, the language of dominion and control. This takes us back to the themes of the previous post.
Such a perspective has become so deeply embedded in our cultural consciousness that it now influences our comprehension of the world in all sorts of ways. A recent example has been the debate in software circles about avoiding local optimisation. As a technique for improving flow through systems, the principle is sound and highly valuable. However it is entirely dependent on the descriptive coverage of the system in question. Some systems, such as manufacturing or software delivery pipelines, are amenable to such complete mapping. Many others however, such as economic markets, are not. A technique commonly used to describe such systems in evolutionary biology is the fitness landscape:
The sectional view through such an environment might be as follows, leading the unwary to highlight the importance of avoiding local optimisations at points A and C and always evolving towards point B.
The problem here is that for landscapes such as economic markets, the above diagram represents the omniscient/God view. For mere mortals, we only have knowledge of where we have been and so the above diagram looks simply like this:
Whilst it is a valuable insight in its own right simply to understand that our product is a replicator competing in a fitness landscape, as much as we might like to avoid local optimisations doing so is impossible because we never know where they are (even at maxima, we can never tell if they are global or local).
It is for these reasons that I think it is unhelpful to think of lean as science. The narrative context of lean should be not one of arrogance but humility. Rather than succumbing to the illusions of mastery and control in the interests of appeasing our desire for certainty, we are performing validated learning simply because it is the most effective course of action once we recognise the actual limits of both our control and understanding.
May 23, 2011
It’s been a couple of weeks since my return from LSSC11 in Long Beach, California. My congratulations to David Anderson and the team for putting on a great conference. I was particularly impressed by the diversity of content and choice of keynote speakers. I’m sure the continued adoption of such an outward-looking perspective will keep the lean community a fertile breeding ground for new ideas. For me, personal highlights of the week included an enthusiastic and illuminating conversation about risk management with Robert Charette, meeting Masa Maeda who is obviously both very highly intelligent and a top bloke (thanks a lot for the blues recommendation btw!), a discussion about risk with Don Reinertsen after his excellent Cost of Delay talk (more on this in another post) and catching up with a number of people I haven’t seen for too long.
I think the talk I found most thought-provoking was given by David Snowden. I’d read a couple of articles about the Cynefin framework following a pointer from Steve Freeman and Suren Samarchyan a couple of years ago, but I’d never heard him speak before and the prospect of complexity science and evolutionary theory making it into the keynote of a major IT conference had me excited to say the least. Overall he presented lots of great content, but by the end – and to my surprise – I was left with a slight niggling ‘code smell’ type feeling, something I’d also experienced in couple of the Systems Engineering talks on the previous day. Reflecting on this during his presentation, I realised the cause was essentially two concerns:
1.) The lack of acknowledgment that often we have little or no control over the constraints in a complex adaptive system
2.) The extent to which it’s meaningful to think about lean product development as being ‘natural science’
The first of these will be the subject of the rest of this post. The second I will discuss in my next post.
Constraints in Complex Adaptive Systems
For any agent acting within a complex system – for example, an organisation developing software products and competing in a marketplace – the constraints of that system can be divided into the following core categories:
a.) Constraints they can control. These include software design, development and release practices, financial reward structures, staff development policies, amongst others.
b.) Constraints they can influence. These include how a customer perceives your product compared to the products of your competitors, and the trends and fashions which drive product consumption.
c.) Constraints they can do nothing about and must accept. These include competitor activities, legal constraints, economic climate, and more typical market risks such as exchange rates, interest rates and commodity prices.
Each type of constraint then requires a specific management approach, as follows.
a.) ‘Control’ constraints: this is the domain of organisational and capability maturity models.
b.) ‘Influence’ constraints: this is the domain of marketing and lobbying (both internal/political and external/advertising): for example, one of the most effective growth strategies is to promote ideas which cast the market-differentiating characteristics of your competitors in a negative rather than positive light. However such techniques are only reliable to the extent that no-one else is using them. They also create risk because in such circumstances they create an illusion of control. Once competitors adopt similar techniques then that illusion is revealed, and influence is lost until more sophisticated strategies are devised.
c.) ‘Accept’ constraints: this is the domain of risk management and resilient program management practices.
If an organisation mis-categorises the constraints within which it operates, it can have consequences which are terminal. Almost always this happens because of the illusion of or desire for control, where constraints are mistakenly identified as the first category. The illusion of control is commonly created when complex adaptive systems are in temporal equilibrium states, and so behave predictably for limited (and unpredictable) periods of time. Applying control management techniques in such situations is worse than doing nothing at all, as it creates the illusion of due diligence and hides the real levels of risk exposure. Thinking about the recent financial crisis in these terms, it can be seen as the misapplication of the mathematics of natural science (e.g. Brownian motion in pricing models) in an attempt to manage capital market systems constraints that were actually outside the domain of full human control.
A key to this is something I have blogged about previously: failure handling. Snowden discussed a strategy employed by a number of large organisations where they preferentially select delivery teams from previous failed projects, as the lessons learnt they bring to the new project increase its chance of success. He likened this approach to treating exception handling as an integral part of software design. This idea was a recurring theme across the conference, to the point where a number of people were tweeting suggestions for a ‘Failure’ track at the next LSSC. However I don’t buy this argument, and in his closing remarks I was pleased to hear David Anderson announce that it won’t be happening. The reason I don’t buy it goes back to the software design metaphor. If your application was brought down by a exception, then clearly failure handling was not treated as a first order concern in the design process. Similarly, if resilient failure handling had been designed into your program management process, then rather than your project failing you should have conducted the necessary risk mitigation or performed a product pivot.
In terms of the categories described above, failure related to the first type of constraint generally indicates an immature delivery capability: it is failure that was avoidable. On the other hand, failure due to the third type of constraint indicates a lack of understanding of resilience and the principles that Eric Ries and others have been promoting for a number of years now. It is failure that was absolutely inevitable. Neither of these are characteristics I would want to bring to a new project. Failures due to the second type of constraint are arguably more interesting, but for me they quickly descend into subject areas that ethically I find questionable. Finally and of most interest, are failures due to constraints unique to the particular business context. The uniqueness of the constraint means that the failure is necessary, and not just an attempt to find a solution to an already solved problem. In this case, the failure becomes part of the organisation’s learning cycle and a source of highly valuable insight going forwards. However, even here it could be argued that such lessons should be possible via effective risk management processes rather than requiring full-blown project failure.
I had a brief chat to David Snowden after his talk, and regarding the extent to which systems constraints can be controlled he had the slightly disappointing answer that ‘C-level people get what I’m talking about, it’s only middle managers who don’t understand it.’ Whilst that may or may not be true, afterwards it put me in mind of Benjamin Mitchell’s excellent presentation about Kanban and management fads. I think a large part of the history of management fads reduces down to the exploitation of CxO denial regarding a.) the actual limitations of their control and b.) the difficulty and cost of high quality product development. I think a key task this community can perform to help prevent Kanban going the same way is to stay faithful to ‘brutal reality’, to borrow Chet Richards fantastic phrase, by remaining transparent and honest about the true limits of achievable control.
April 13, 2010
Although I have been making use of them over the last 18 months in various presentations and the real options tutorial, I recently realised I’d omitted to publish graphs illustrating the optimal exercise point for real options on this blog. As a result, here they are:
The dotted blue line represents risk entailed by the agilist’s Cone of Uncertainty, and covers technical, user experience, operational, market and project risk – in short anything that could jeopardise your return on investment. The way this curve is negotiated is detailed below, by driving out mitigation spikes to address each risk.
The dotted brown line represents risk from late delivery. At a certain point this will start rising at a rate greater than your mitigation spike s are reducing risk, creating a minimum in the Cumulative Risk curve denoted in red. Remembering that
Feature Value = ((1 - Risk) x Generated Value) - Cost
this minimum identifies the Optimal Exercise Point.
One point worth exploring further is why Delayed Delivery is represented as risk rather than cost. The reason is because Cost of Delay is harder to model. For example, let’s say we are developing a new market-differentiating feature for a product. Let’s also say that there are X potential new customers for that feature in our target market, and that it would take Y weeks of marketing to convert 80% of those sales. Providing whenever we do launch, there remains Y weeks until a competitor launches a similar feature then then the cost of delay may be marginal. On the other hand, if we delay too long and a competitor launches their feature before us then there will be a massive spike in cost due to the loss of first mover advantage. However the timings of that cost spike will be unknown (unless we are spying on the competition), and therefore very hard to capture. What we can model though is the increasing risk is of that spike biting us the longer we delay.
I have found it very helpful to think about this using the insightful analysis David Anderson presented during his excellent risk management presentation at Agile 2009 last year. He divides features into four categories:
- Differentiator: drive customer choice and profits
- Spoiler: spoil a competitor’s differentiators
- Cost Reducer: reduce cost to produce, maintain or service and increase margin
- Table Stakes: “must have” commodities
Differentiators (whether on feature or cost) are what drive revenue generation. Spoilers are the features we need to implement to prevent loss of existing customers to a competitor, and are therefore more about revenue protection. And Table Stakes are the commodities we need to have an acceptable product at all. We can see how this maps clearly onto the example above. The cost of delay spike is incurred at the point when the feature we intended to be a differentiator becomes in fact only a spoiler.
This also has a nice symmetry with the meme lifecycle in product S-curve terms.
We can see how features start out as differentiators, become spoilers, then table stakes and are finally irrelevant – which ties closely to the Diversity, Dominance and Zombie phases. There are consequences here in terms of market maturity. If you are launching a product into a new market then everything you do is differentiating (as no-one else is doing anything). Over time, competitors join you, your differentiators become their spoilers (as they play catch-up), and then finally they end up as the table stakes features for anyone wishing to launch a rival product. In other words, the value of table stakes features in mature markets is that they represent barrier to entry.
More recently however, I have come to realise that these categories are as much about your customers as your product. They are a function of how a particular segment of your target market views a given feature. Google Docs is just one example that demonstrates how one person’s table stakes can be another person’s bloatware. Similarly, despite the vast profusion of text editors these days, I still used PFE until fairly recently because it did all the things I needed and did them really well. Its differentiators were the way it implemented my functional table stakes. The same is true of any product that excels primarily in terms of its user experience, most obviously the iPod or iPhone. Marketing clearly also plays a large part in convincing/co-ercing a market into believing what constitutes the table stakes for any new product, as witnessed in the era of Office software prior to Google Docs. The variation in the 20% of features used by 80% of users actually turned out not to be so great after all.
So what does this mean for anyone looking to launch a product into a mature market? Firstly, segment you target audience as accurately as possible. Then select the segment which requires the smallest number of table stakes features. Add the differentiator they most value, get the thing out the door as quickly as possible, and bootstrap from there.
March 26, 2010
The phrase “Agile in the Large” is one I’ve heard used a number of times over the last year in discussions about scaling up agile delivery. I have to say that I’m not a fan, primarily because it entails some pretty significant ambiguity. That ambiguity arises from the implied question: Agile What in the Large? So far I have encountered two flavours of answer:
1.) Agile Practices in the Large
This is the common flavour. It involves the deployment of some kind of overarching programme container, e.g. RUP, which is basically used to facilitate the concurrent rollout of standard (or more often advanced) agile development practices.
2.) Agile Principles in the Large
This is the less common, but I believe much more valuable, flavour. It involves taking the principles for managing complexity that have been proven over the last ten years within the domain of software delivery and re-applying them to manage complexity in wider domains, in particular the generation of return from technology investment. That means:
- No more Big Upfront Design: putting an end to fixed five year plans and big-spend technology programmes, and instead adopting an incremental approach to both budgeting and investment (or even better, inspirationally recognising that budgeting is a form of waste and doing without it altogether – thanks to Dan North for the pointer)
- Incremental Delivery: in order to ensure investment liability (i.e. code that has yet to ship) is continually minimised
- Frequent, Rapid Feedback: treating analytics integration, A/B testing capabilites, instrumentation and alerting as a first order design concern
- Retrospectives and Adaptation: a test-and-learn product management culture aligned with an iterative, evolutionary approach to commercial and technical strategy
When it comes down to it, it seems to me that deploying cutting-edge agile development practices without addressing the associated complexities of the wider business context is really just showing off. It makes me think back to being ten years old and that kid at the swimming pool who was always getting told off by his parents “Yes Johnny I know you can do a double piked backflip, but forget all that for now – all I need you to do is enter the water without belly-flopping and emptying half the pool”.
January 27, 2010
Towards the end of last year I was lucky enough to meet up with Chris Matts for lunch on a fairly regular basis. To my great surprise, he did me the amazing honour of capturing some of what we discussed concerning memes in one of his legendary comics. They have just been published on InfoQ together with a discussion piece here: http://www.infoq.com/articles/meme-lifecycle
September 3, 2009
I am currently taking some time out on a mini mid-West roadtrip area after a most enjoyable and thought-provoking Agile 2009 in Chicago. I will post more on that later, but in the meantime here are the the slides and reading list from my conference session.
June 2, 2009
An interesting article in the Wall Street Journal by the father of modern portfolio theory, Harry Markowitz, discussing the current financial crisis. He emphasises the need for diversification across uncorrelated risk as the key to minimising balanced portfolio risk, and highlights its lack as a major factor behind our present predicament. However to my mind this still begs a question of whether any risk can be truly uncorrelated in complex systems. Power law environments such as financial markets are defined by their interconnectedness, and the presence of positive and negative feedback loops which undermine their predictability. That interconnectedness makes identifying uncorrelated risk exceptionally problematic, especially when such correlations have been hidden inside repackaged derivatives and insurance products.
In systems that are intrinsically unpredictable, no risk management framework can ever tell us which decisions to make: that is essentially unknowable. Instead, good risk strategies should direct us towards minimising our liability, in order to minimise the cost of failure. If we consider “current liability” within the field of software product development as our investment in features that have yet to ship and prove their business value generation capabilities, then this gives us a clear, objective steer towards frequent iterative releases of minimum marketable featuresets and trying to incur as much cost as possible at the point of value generation (i.e. through Cloud platforms, open source software, etc). I think that is the real reason why agile methodologies have been so much more successful that traditional upfront-planning approaches: they allow organisations to be much more efficient at limiting their technology investment liability.
May 21, 2009
A few times recently I have come across discussions where complex systems have been described as self-organising. Whilst being strictly true within the limits of logical possibility, I believe such descriptions to be significantly misrepresentative. I hope you’ll forgive the frivolity, but I find it useful to think about this in relation to the infinite monkey theorem (if you give a bunch of monkeys a typewriter and wait long enough then eventually one of them will produce the works of Shakespeare). I think describing complex systems as self-organising is a bit like describing monkeys as great authors of English literature. Perhaps the underlying issue here is that evolutionary timeframes are of a scale that is essentially meaningless relative to human concepts of time. Natural selection’s great conjuring trick has been to hide the countless millenia and uncountably large numbers of now-dead failures it has taken to stumble upon every evolutionary step forwards. In monkey terms, it is a bit like a mean man with a shotgun hiding out in the zoo and shooting every primate without a freshly typed copy of Twelfth Night under its arm, and then us visiting the zoo millions of years later and thinking that a core attribute of monkeys is that they write great plays!
So, are complex systems self-organising? For the one-in-trillions freak exceptions that have managed to survive until today the answer is yes, but in a deeply degenerative and entropy-saturated way. For everything else – which is basically everything – then the answer is no. An example of this is being played out around us right now in the financial crisis. The laissez-faire economic ideologies of recent decades were primarily an expression of faith in the abilities of complex financial markets to self-organise, as enshrined in the efficient market hypothesis championed by Milton Friedman and the Chicago School. The recent return to more Keynesian ideas of external government intervention via lower taxation and increased public spending together with the obvious need for stronger regulation are indicative of a clear failure in self-organisational capability.
When people in technology talk about self-organisation, what they are normally referring to are the highly desirable personal qualities of professional pride, initiative, courage and a refusal to compromise against one’s better judgement. A couple of years back Jim Highsmith argued that “self-organisation” has outlived its usefulness. I would agree, and for the sake of clarity perhaps we might do better now to talk about such attributes in specific terms?
May 21, 2009
A guide to putting into practice a number of ideas that have been discussed on this blog over the last 18 months, especially with regards to MMF valuation (already incorporating a number of feedback iterations from different projects).
Download: Options Analysis User Guide 1.2
[update: uploaded latest version of PDF]
May 20, 2009
In the last post we argued for a more rigourous, quantitative approach to featureset valuation over the conventional, implicit and overly blunt mechanisms of product backlog prioritisation. We borrowed a simple valuation equation from decision tree analysis to give us a more powerful tool for both managing risk and determining the optimal exercise point for any MMF:
Value = (Estimated Generated Value * Estimated Risk) - Estimated Cost
Estimated Risk =
Estimated Project Risk *
Estimated Market Risk
A few comments are worth noting about this equation:
1.) It contains no time-dependent variable. The equation simply assumes a standard amortisation period to be agreed with stakeholders (typically 12 to 24 months). Market payback functions and similar are ignored as they introduce complexity and hence risk (as described in more detail below). We are not seeking accuracy per se, but simply enough accuracy to enable us to make the correct implementation decisions.
2.) It is very simplistic. Risk management must be reflexive: at the most basic level, project risk can be divided into two fundamental groupings:
- Model-independent risks
- Model-related risk
The former includes typical factors such as new technologies, staff quality and training, external project dependecies, etc. The latter includes two components: the inaccuracy of the risk model and the incomprehensibility of the risk model. We start incurring inaccuracy risk as soon as the simplicity of our model is so great that it leads us to make bad decisions or else provides no guidance. MoSCoW prioritisation is a good example of this. On the other hand we start incurring incomprehensibility risk as soon as the risk model is so complex that it is no longer comprehensible by everyone in the delivery team (which will clearly be relative across different teams). The current financial crisis is a large-scale example of a collapse in incomprehensibility risk management. If financial risk models had been reflexive and taken their own complexity into account as a risk factor, then there is no way we would have ended up with situations where cumulative liabilities were only even vaguely understood by financial maths PhDs: if we take a team of twenty people, it is clear that a sophisticated and accurate model that is only understood by one person entails vastly more risk than a simplistic, less accurate model that everyone can follow. We can generalise this in our estimation process as follows:
Total Risk = Project Risk * Market Risk * Model Incomprehensibility Risk * Model Inaccuracy Risk
or as functions:
Total Risk = Risk(Project) * Risk(Market) * Risk(Incomprehensibility(Model)) * Risk(Inaccuracy(Model))
Furthermore, given our general human tendency towards overcomplexity for most situations this can be approximated to
Total Risk = Risk(Project) * Risk(Market) * Risk(Incomprehensibility(Model))
3.) All risk is assigned as a multiplier against Generated Value, rather than treating delivery risk as an inverse multiplier of Cost. I have had very interesting conversations about this recently with both Chris Matts and some of the product managers I am working with. They have suggested a more accurate valuation might be some variation of:
Value = (Estimated Generated Value * Estimated Realisation Risk) - (Estimated Cost / Estimated Delivery Risk)
In other words, risks affecting technical delivery should result in a greater risk-adjusted cost rather than a lesser risk-adjusted revenue. This is probably more accurate. However is that level of accuracy necessary? In my opinion at least, no. Firstly it creates a degree of confusion as regards how to differentiate revenue realisation risk and delivery risk: is your marketing campaign launch really manifestly different in risk terms from your software release? If either fails it is going to blow the return on investment model, so I would say fundamentally no. Secondly, I might be wrong but I got the feeling that part of the reticence to accept the simpler equation from our product management was a preference against their revenue forecasts being infected by a thing over which they had no control: namely delivery risk (perhaps a reflection of our general psychological tendency to perceive greater risk in situations where we have no control). However that is a major added benefit in my opinion: it helps break down the traditional divides between “the business” and “IT”. As the technology staff of Lehman Brothers will now no doubt attest, the only people who aren’t part of “the business” are the people who work for someone else.
For me, this approach creates the missing link between high-level project business cases and the MMF backlog. We start with a high-level return on investment model in the business case, that then gets factored down into componentised return on investement models as part of the MMF valuation process. These ROI components effectively comprise the business level acceptance tests for the business case. The componentised ROI models then drive out the MMF acceptance tests, from which we define our unit tests and then start development. In this way, we complete the chain of red-gree-refactor cycles from the highest level of commercial strategy down to unit testing a few lines of code. The scale invariance of this approach I find particularly aestheticly pleasing: it is red-green-refactor for complex systems…