Wednesday, June 13, 2012

Smart Planning & Dumb Luck

Managing software projects requires that one make estimates about the level of effort real people will require to solve abstract problems with concrete software.

There are many boundaries in there: abstraction, concreteness, dates, level of effort, human actors and ideas. Business, and much of academia, abhors honest ignorance. Business requires staffing levels, target dates, deliverables, milestones and budgets. The most common approach I see to resolving the abstract nature of programming with the concrete nature of business is reduction-to-known: how is this project like other (preferably successful) similar projects?

In my experience, this approach is good when there is little novelty involved and is more-or-less a crapshoot when the project is highly novel. Many of our competitors handle novelty by assuming the worst and jacking up the price to cover that worst-case scenario. We try to break down the project into novel and known and to do basic research so that we can turn the novel into the known as quickly and safely as possible.

But rarely does anyone, including us, talk about luck. Dumb luck. Pure chance. The sad truth is that sometimes you get lucky, sometimes you don't and sometimes you get unlucky.


(A related topic is inspiration: sometimes divine inspiration strikes, making the difficult simple and the giant task almost easy. But I prefer to think that I am divinely inspired sometimes because (spiritual version) God loves me more than He loves other programmers or (secular version) my superior neurology works this way, so I am reluctant to lump this in with luck. Although I understand that lots of other people do lump inspiration in with luck. I suspect that they are not often inspired, but that is another topic.)

I am aware of the "Luck is the residue of design" school of thought (nice context-setting here), and I mostly agree with it, but sadly sometimes I get lucky without regard to my planning or ability or anything else. Sometimes stuff just works out.

For example: sometimes I mention to a friend over a beer that I am stuck on a particular technical problem and he casually points me to just the right person to ask. Sometimes I find that I need to accomplish something and quickly and easily find just the right technical documentation on the Internet. Sometimes I find that an orphaned AC adapter fits a crucial piece of equipment. Sometimes the device driver I need only runs on an old kernel that I happen to still be running somewhere. Sometimes you get lucky: you crush the project, you beat the deadline, you come in under budget, the users are happy and the client is happy to write that final check.

Sometimes you don't get lucky: things proceed mostly as one would expect, with easy parts and hard parts and unexpected gains mostly offset by unexpected losses. These projects plod along and are delivered on time and get paid for and contribute to the illusion that luck is not a concern.

Sometimes you get unlucky: critical personnel have family emergencies, supposedly supported devices don't work in *your* server, someone else's piece does not work as advertised or is late, your design is thwarted by the particular deployment environment, the client's server is plagued by gremlins, etc. These projects fray nerves, eat profit margins, sully reputations. We would all avoid them if we could.

(Beware, though, the development team that never has a bad day, or an unlucky project: it is quite possible that they pad their estimates or grossly underpromise. The only way I know to have a perfect track record is to be chronically unambitious.)

How a team or group or organization handles bad luck is very telling. The first hurdle is recognizing when you hit some bad luck and when you didn't. Very few seem able to distinguish between a failure of their team or process and bad luck.

I am not saying that blithely accepting "we were unlucky" is a good way to deal with failure. I am not defending failures of team members or of management or of process: such failures should be evaluated and corrected, either with retraining or retasking people or refining processes.

Neither do I subscribe to the position that good people never have bad luck. It would be nice if that were true, since it would greatly simplify the task of evaluating failure: there would always be blame.

I am saying that realistic appraisal is critical: neither of these simplistic approaches is a good idea.

Having determined that you got unlucky, what then?

Often the response is to avoid repeating this horrible experience by controlling variability. When practical, controlling variability in the software development process is nice, but it is not a good primary goal. This is because controlling variability works best with respect to codified processes expected to produce a consistent result. It is a good way to brew beer or make steel. It is not a good way to solve problems in creative or novel ways.

Similarly, if you turn to scheduling as your protection against bad luck, you may be disappointed: if you value on-time over excellent, then you won't get excellent.

Sometimes you get unlucky. All you can do in that case, in my opinion, is pat your team on the back and let them know you feel their pain--but that you expect a return to excellence the next time.
