Wednesday, April 25, 2012

Programming by Any Other Name

These days it is hard to find even moderately sized systems which do not claim to offer rule-based options for configuration. These systems claim to provide the flexibility of encoding business rules without the hassles or requirements of actual programming.

Ha! say I. Call it what you like, computer programming is computer programming.
"What's in a name? That which we call a rose
By any other name would smell as sweet."
Romeo and Juliet (II, ii, 1-2)

Inspired by the great Shakespeare I say that programming, by any other name, still smells of sweat.

Note that I am not saying that rule-based systems are not useful, or powerful, or neat-o. I am saying that translating real-world policies into absolute statements in some rule-definition language is not a common skill and is so similar to most kinds of computer programming as to be indistinguishable.

Dress it up however you like: if you are not good at programming you likely will not be good at writing rules, let alone debugging rule implementations or grasping the cumulative effect of a large rule set on a large and complex application.

I suppose one could divide up computer programming into "predicate logic" and "computer stuff" (RAM, file I/O, databases, communications, etc). Using this division, I can imagine a person who excels at predicate logic but has no feel for the computer stuff, making this person uninterested in computer programming.

If such a person ended up as a business executive instead of a formal logic academic (or APL programmer), such a person might be quite good at writing rules. That person might even be able to debug her rules, to cope with the often-wide chasm between the abstraction presented by such systems and the concrete reality presented by the working environment. But how many such people can there be in the world? Certainly not several in every moderately sized company in the world.

So when sales people tell you that, at last! you are free of the tedium of dealing with IT professionals, finally free of the tyranny of computer programmers, stop and ask yourself this: are you really ready to eat your own cooking?

As even the most junior programmer knows, predicate logic in action is rife with unintended consequences which are the inevitable-but-unforeseen results of rules as applied to the real world.

My personal favorite example of this involves lateness and snowstorms. Most or all of our clients above moderate size use Kronos Time & Attendance products. Most or all of our clients, in the frenzied enthusiasm of the original roll-out, go overboard with the rules, often with amusing unintended consequences. One client decided to implement a "better never than late" policy: if its employees were more than one hour late for work, they could not swipe in at all. This was intended to enforce punctuality. I do not know how it worked in that regard, but I do know that when I arrived mid-morning after a large snowstorm, I was greeted with a stream of employees leaving work: the snow had delayed them and they could not swipe in; unsure that they would be paid for the day, they were going home. Not much got done that day, but what did get done was mostly done by upper management, who were exempt from that rule.

In this case in particular, I like the "silver bullet" trope because even though there is no magic solution to the problem of translating human policy into machine-readable form, the problem can, werewolf-like, suddenly get very ugly very quickly under the right circumstances.
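As a minimal sketch of how small the "better never than late" policy looks once it is written down, and how easily it misfires, consider the following. The function name, the `exempt` flag and the dates are made up for illustration; this is not how Kronos or any real time-and-attendance product expresses its rules.

```python
from datetime import datetime, timedelta

# Hypothetical threshold taken from the anecdote: more than one hour late
# means the swipe is refused. Not a real product setting.
LATE_CUTOFF = timedelta(hours=1)


def may_swipe_in(shift_start: datetime, swipe_time: datetime,
                 exempt: bool = False) -> bool:
    """Return True if the swipe is accepted under the sketched policy."""
    if exempt:
        # Upper management was exempt from the rule in the story.
        return True
    # Early or on-time swipes give a zero or negative lateness and pass.
    lateness = swipe_time - shift_start
    return lateness <= LATE_CUTOFF


# An invented shift and three arrivals: slightly late, and snow-delayed
# with and without the exemption.
shift = datetime(2012, 1, 9, 8, 0)
slightly_late = shift + timedelta(minutes=20)
snow_delayed = shift + timedelta(hours=2)

print(may_swipe_in(shift, slightly_late))              # accepted
print(may_swipe_in(shift, snow_delayed))               # refused
print(may_swipe_in(shift, snow_delayed, exempt=True))  # accepted
```

The rule itself is one comparison. What it leaves out is the question nobody asked during roll-out: what should happen on the day when almost everyone is more than an hour late for the same reason?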

So see a doctor for medical problems, go to an accountant for accounting and get your rules from a machine-readable logic expert. Or don't, but at least go into your decision with your eyes open.

Wednesday, April 18, 2012

The Utility of Bad Examples

Today I am reflecting on lessons I have learned in my long career and how I learned them. The obvious approach is to remember the exceptional individuals who were great at their jobs while being pleasant and consistent and generally a pleasure to know. But who needs to be reminded that we can learn from paragons of excellence? Instead, I am contemplating what I learned from the deeply incompetent, the arrogant and the people I've met who cannot seem to learn from their negative experiences.

Many of us really love a good story about some fathead and his or her hideous blunders, but that is not what I present here. Like the poor, fatheads will always be with us. Avoiding them is the best course of action, if you have that luxury. Failing that, there are strategies for mitigating the ill effects of fatheadedness, but that is an entirely different story. Besides, we often have no choice but to let fatheadedness run its course. In those all-too-frequent cases, what is the benefit?

The benefit is experience, precious, useful experience and the attendant opportunity to learn from someone else's mistakes. In this post, I contemplate the utility of bad examples: what do the actions of fatheads teach us, to make us better at what we do?

In my experience, the truly spectacular fathead (SFH) is arrogant: so arrogant that he or she (usually a he, so I will use that pronoun for convenience) is above mere conventional wisdom. Sometimes "thinking outside the box" is valuable, but most of the time thinking outside the box is a waste of time or worse: for conventional situations, conventional wisdom usually suffices. In order to be worth it, thinking outside the box must present some clear additional benefit to compensate for the greater effort. I try to keep in mind that in today's IT environment, human attention is the most precious resource. Use it wisely.

(So why the endless praise of out-of-the-box thinking? Because it is the rare employment situation that gives kudos for doing the obvious, even if the obvious is the right choice. And, every once in a great while, thinking outside of the box saves the day. The trick is to pay attention so that you notice when your situation is abnormal, but that is another post.)

Since the SFH is arrogant, he is often an object lesson in understanding conventional wisdom: why do we never do that? Oh, THAT's why. Rather than recount amusing examples of stupid people doing stupid things, I offer this suggestion: keep track of instances of bold, innovative thinking. Follow up months or years later: did the bold innovation do better or worse than conventional wisdom expects? Why? I have learned much about why certain rules of thumb exist this way--mostly as I sat amid the smoking wreckage of some IT disaster, but at least I had something useful to do while I was sitting there.

Often the SFH makes the same kinds of mistake over and over again; after all, it is almost a requirement that the SFH be unable to learn from his mistakes if he is to remain the SFH. Once you see the pattern, the SFH is useful as the embodiment of a certain kind of habitual error: when faced with certain kinds of decision, ask yourself what the SFH would do and then DO SOMETHING ELSE. Ideally, do something that makes sense in light of what your SFH has inadvertently taught you.

Now we come to the most painful methodology: seeking useful feedback. I find that a simple description of an SFH strategy, preferably to someone outside the SFH's organization to avoid accidental embarrassment, often nets useful results. A very good outcome of bouncing your observations off of someone you respect is that they may explain levels to the problem you had not seen. The most useful, but least pleasant outcome of this exercise is the casual observation from your respected sounding board that YOU are guilty of the same bad judgment. If true, this feedback is invaluable in improving yourself.

I say "if true" because some people feel that it is either appropriate or polite to accuse the speaker of any fault the speaker finds in others. But this is not a get-out-of-jail card: not all unpleasant feedback is some kind of tit-for-tat reflex. This means that you really have to think about what you said, what they responded and how their insight improves your understanding of your situation. If your understanding is not improved, then discard the feedback--but discard it discreetly and politely. You might need feedback in the future and remember: most people won't be willing to give it at all, so sometimes reflexively negative feedback is better than no feedback at all.

Ignorance is bliss--until it isn't. Think about why bad decisions turned out to be bad and you are on the path to making better decisions in the future.

Wednesday, April 11, 2012

Reasoning From Worst Principles

I have nothing against reasoning from first principles; in fact, I really enjoy it. Reasoning from first principles is an excellent alternative when experience and expertise fail us. I particularly like reasoning from first principles when helping people debug large, complex systems: I find going back to basics very helpful in avoiding the prejudices and preconceptions that are usually at the heart of really thorny debugging sagas.

However, there is a situation in which I find reasoning from first principles to be very frustrating: when management values their reasoning from first principles over all other input, including experience and expertise.

I understand that modern IT decisions are complex and cover such a vast array of possibilities that making a truly informed decision is often impossible. But we should not let the fact that we can't be perfect provide us with an excuse to be terrible.

I find that in the face of uncertainty only flexibility offers a high probability of success.

To take a recent example from my own consultancy, we are in the midst of adding disk space to our core server. This is a very common situation, although there are a few complicating factors:
  • Our server is a High Availability Linux cluster with a shared RAID
  • Our server is a bit long in the tooth
  • We really prize our server uptime
 In consultation with our sys admin experts, I was presented with the following options:
  1. Get a new server, which would be more current and have more capacity
  2. In-place hard disk upgrades
  3. External disk upgrades
I evaluated the options thusly:
  1.  New server
    1. Pros: effective, easy fall-back, modernized environment
    2. Cons: expensive, housing old & new, possible environment issues, not indicated except for disk space
  2. In-place upgrade
    1. Pros: leaves most of server intact
    2. Cons: always a risk to crack open a case, requires lots of human attention, process is complex
  3.  External disk upgrades
    1. Pros: cheap in hardware cost and human time, minimum disruption to servers
    2. Cons: probably mediocre performance, another point of failure, chance to trip over cable, etc
I chose option 3 as being the cheapest, fastest, safest option.

Given what happened next, I was probably wrong:
  • the first external SATA enclosures did not play well with the kernel
  • the second external SATA + disk combination takes too long to wake up
  • rearranging the enclosures showed up an issue with the outdated internal disk on the primary node
So now it looks like we will have to do option 3 for now and look into option 2 as well. Oh, well: that's life when you don't know everything. At least the server has stayed up and useful, we haven't spent more than we can afford and I see a path to my goal. I had to change course with the first set of enclosures and now I will have to change course again. Responding to real-world consequences of actions is how adults deal with reality.

Sadly, in our information systems consulting, we are running into magical thinking with ever-greater frequency. Specifically, we are running into the following template:

We had a problem, P, which we needed to solve. To solve P, we had to decide on a solution. That decision is decision D. In order to make decision D, we over-simplified P and declared that system S would completely solve P. We paid so much for S that we reason, from first principles, that problem P is solved. Since this is the real world and therefore complicated, there are aspects of system S which are imperfect, but we declare that since system S cost so much, it must do so much and it must do enough.

This is not simply a case of my being nit-picky and snarky about the usual inability to think clearly. In this case, the managers have defined the problem as not existing. Fixing a problem that does not exist is either impossible or unnecessary or both.

When this magical thinking is in place, the tension that our customers live under is huge and is caused by the fact that they live with problems but are declared to be without problems.

There is usually the added tension of the implication that not being able to use the expensive solution to solve their problems makes them incompetent; there is often the complication that someone in their chain of command had to consciously over-promise in order to get the money for the expensive solution, so trying to get help is viewed as disloyal.

As an outsider trying to solve the problem, it is very uncomfortable to be told that whatever else we do, we may not name the actual problem since the actual problem is officially nonexistent.

This gives us, the outside consultants, two options: either look for related solutions which might help the actual problem, or be heretical and name the actual problem. Sadly for us, many professionals view consultants as a kind of Foreign Legion: expendable troops. If we poison our relationship with their bosses by saying what they cannot, so what?

We understand that only we care about how uncomfortable or unpleasant our working conditions are and there is good money to be made as IT Foreign Legionnaires, so we cannot really complain about that. But what I can do is point out that once you enter the realm of magical thinking, of reasoning from worst principles, you make success almost impossible and you make your own working environment tense and unproductive. Who needs that?

Wednesday, April 4, 2012

Benign Neglect & Tech Support

In general, I am not a fan of the "benign neglect" concept, but I make an exception for some tech support questions.

(Psst, I'll tell you a secret if you promise to keep it to yourself: I am not the only one. But more on that below.)

By this, I mean that when I am providing tech support for products, which I do on occasion, I sometimes deliberately take my time responding. I sometimes take a long time because I am really busy, but that is an entirely different kettle of fish.

I should point out that my software company, as opposed to my consulting company, is a no-frills operation: we offer a high-quality product at a rock-bottom price. We do this by skimping on the human interaction: we provide a web site, a tech blog and a free on-line manual. What we don't provide is a call center filled with bodies to talk to you on the phone; we also provide email-based tech support and a cheerful, prompt, full refund if you would rather pay literally ten times as much to one of our competitors, who will happily talk to you on the phone.

So we have a severe market-segmentation device; them that like us love us and them that don't like us REALLY don't like us. Those that love us love the fact that when we get back to you, you get a thorough, thoughtful answer. But before you get that thorough, thoughtful answer you may get a thorough, thoughtful silence.

And not all tech support requests trigger the benign neglect policy: in fact, most do not. Most are reasonable questions from reasonable people which are a pleasure to answer. Some are valid criticisms that I cringe to acknowledge, but look forward to fixing. Then there are...the others.

There are two query categories to which I apply this policy: the breathless and the clueless.

By "breathless" I mean the really excited highly caffeinated poorly punctuated query that runs on and on and interrupts itself because the writer is so darn excited he-or-she cannot contain his-or-herself and just goes on and on with super-detailed but indiscriminatingly detailed accounts of some tale of horror involving my technology behaving in ways that it does not cannot never does behave.

By "clueless" I mean the complaint or question reveals such a profound ignorance of the general topic and specific issue that it is hard to know where to begin to answer.

I give the breathless a chance to catch their breath, which they often do: the initial babbling request for tech support is often followed by the equivalent of "oops, sorry, I figured it out, I was doing something stupid." In other cases, I at least get a follow-up that says "I figured out X and Y, but I still don't get Z." That is a query to which I can respond: I feel that we have a shared basic understanding and a clear context for a reply to issue Z, instead of a headache and a desire to forget.

I give the clueless a chance to get a clue: to reconsider what they are trying to do or how they are trying to do it. Or to decide that they want one of those cheerful, prompt and full refunds. Which we are delighted to provide: spending time arguing with people about how a $250 product can't come with hours of hand-holding is a great way to lose money.

PS It is my experience that every tech support organization does something similar: they may have a polite ignoramus contact you to acknowledge your request for support, but the actual tech firepower is probably waiting for you to catch your breath or get a clue.