On Failure

April 23, 2009

(Firstly, a quick apology for the radio silence of late – regular updates here will now be resuming.)

So, in the last post we examined incentivisation in its broadest sense, with the aim of shedding some light on the way selective pressures are created within organisations. We saw that the alignment of intra-organisational pressures with those of the external market is a fundamentally important factor in the health and efficiency of a business: put simply, it is critical to ensure that a.) employees really act in the long-term interests of the business and that b.) the business really acts in the long-term interests of the employees. We also highlighted the subtle and far-reaching consequences of how behaviour is rewarded (something the investment banks are now coming to terms with), in particular with respect to the conventional benchmarks of “on time, on budget”. We noted that these benchmarks have driven the maturation of the IT project-centric world view, not because projects are fundamentally the best way of delivering business value but because they are the best vehicle for demonstrating benchmarks achieved and charging clients. More insidiously, such benchmarks can actually create selective pressures against innovation and new features (i.e. the very drivers of competitive advantage in the wider marketplace): anything innovative may be perceived as the risky option for a project team, whereas the safest way to deliver on time and on budget is to only ever undertake projects that have essentially been delivered before. Indeed I have seen this happen: project after project shipped on time and on budget whilst senior management sat around scratching their heads at the fact that they kept losing market share. As a wise man once said, “be careful what you ask for”.

The fundamental flaw with such approaches is obviously that they elevate the removal of uncertainty (i.e. perceived risk to timely delivery) to the primary objective, whereas the generation of new business value and innovation is normally all about working _with_ uncertainty. I think it’s enlightening to flip things round for a moment and take a look at project timelines in reverse. So starting at the end point, we have the project being decommissioned at some point hopefully a number of years into the future. Roll back a bit and we might have major release upgrades interspersed with patch releases, users making use of x% of the available functionality, ongoing user training, the initial roll-out, iterations of analysis/test/build, etc. I find this “termination-oriented” perspective highly instructive. I haven’t been able to find stats concerning the lifespan of “successful” IT projects, but I’d be surprised if the average was more than five years. That implies nothing negative about the business value they generate. It simply reflects the fact that as long as a business case implementation is generating more value than it costs it should be continued, and once that is no longer the case it should be terminated.

Where the damage occurs is when projects/business cases are kept alive despite no longer being cost-effective or financially viable. In other words, it is not project failure that causes the worst problems but business cases that are perpetuated beyond their useful lifespan. These become the living-dead “zombie” memes that drain the financial, staffing and morale lifeblood of an organisation. Interestingly, a former Clinton aide has just published The Tyranny of Dead Ideas about exactly this subject. What matters most is identifying such instances as early as possible and terminating them. Unlike their gamekeeping equivalent however, such project culls should also have a life-giving complement, whereby dormant “before their time” business cases are activated in response to favourable changes in market conditions. In this way, both ends of the useful lifespan of a business case are managed effectively. This is the essence of working in acceptance of uncertainty.
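
To make this concrete, here is a minimal sketch (in Python, with entirely hypothetical names and a deliberately naive value-versus-cost model) of a portfolio review that manages both ends of the lifespan: culling cases that now cost more than they generate, and activating dormant ones whose conditions have arrived.

```python
from dataclasses import dataclass


@dataclass
class BusinessCase:
    name: str
    annual_value: float   # estimated value generated per year (hypothetical figure)
    annual_cost: float    # cost of keeping it running per year
    active: bool          # currently funded and running?
    ready: bool           # dormant case whose market conditions now look favourable?


def review_portfolio(cases: list[BusinessCase]) -> None:
    """Cull cases past their useful life; activate dormant ones whose time has come."""
    for case in cases:
        if case.active and case.annual_value <= case.annual_cost:
            case.active = False   # terminate: it now consumes more than it generates
            print(f"Culling {case.name}")
        elif not case.active and case.ready:
            case.active = True    # activate: favourable conditions have arrived
            print(f"Activating {case.name}")


review_portfolio([
    BusinessCase("legacy reporting suite", annual_value=50, annual_cost=120, active=True, ready=False),
    BusinessCase("mobile channel", annual_value=0, annual_cost=0, active=False, ready=True),
])
```

Real value and cost estimates are of course uncertain and contested, which is precisely why the review has to be an explicit, recurring part of the methodology rather than something left to happen by accident.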

There is an obvious parallel here in software design. Unchecked failures/exceptions escalate. Good architecture treats error handling as an integral part of the design, and promotes defensive programming practices that create failure isolation boundaries to contain the consequences of the intrinsic instabilities of complex production environments. Similarly, good program management should treat project failure handling as an integral part of the methodology, and promote defensive program management practices that create failure isolation boundaries to contain the consequences of the intrinsic instabilities of complex market environments. If a project failure escalates and causes significant problems for an organisation then that is not a project problem – it is a fundamental failing of the program management methodology. This is why 37signals are correct in stating that failure amongst start-ups is overrated. Any company that allows failure to escalate unchecked to the point where the higher-level market isolation boundaries are invoked and it collapses signals, to me, a lack of management understanding of the essential nature of business value generation in unpredictable market conditions.
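
On the software side, a minimal sketch of such a boundary (again Python, with hypothetical names; a real system would add timeouts, retries, alerting and so on) might look like this:

```python
import logging


def fetch_recommendations(user_id: str) -> list[str]:
    """Stand-in for a call to a flaky downstream subsystem."""
    raise TimeoutError("recommendation service did not respond")  # simulate instability


def with_isolation_boundary(operation, fallback, name: str):
    """Run an operation inside a failure isolation boundary.

    Failures are logged and contained here; the caller receives a
    degraded-but-safe fallback instead of an escalating exception.
    """
    try:
        return operation()
    except Exception:
        logging.exception("Contained failure in %s; serving fallback", name)
        return fallback


# The page still renders even though the recommendation subsystem has failed.
recs = with_isolation_boundary(lambda: fetch_recommendations("u42"), fallback=[], name="recommendations")
```

The program management equivalent has the same shape: a review gate that catches a failing project and terminates it before the damage spreads to the rest of the portfolio.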

As a final thought, this idea also adds a viewpoint to the debate about regulation and state intervention in free-market economies. An evolutionary virtue of the free-market economy is that it entails a failure isolation boundary/ceiling between organisation and economic sector, such that organisational failure is always culled before it escalates higher (the collapse of Soviet communism arguably being an instance of unchecked failure that bubbled up through economic spheres and into the political). However, this raises the question of whether such culling could be enforced by some other means (e.g. corporate taxation structures?). If so, then regulatory interventionist policies may not intrinsically be a bad thing…