Wednesday, February 29, 2012

Prose Before Code

I am going to criticize my fellow software developers in particular and engineers in general: I claim that as a group we write boring, confusing, rambling prose.

I realize that I am inviting criticism of my writing by doing this and I welcome the scrutiny: my claim is that writing adequately is within almost anyone's grasp and that I follow my own advice. Alas, if you find this post boring, confusing or rambling, then I have a problem.

I will confess at the outset that while I make my living writing software, I like to write expository prose. I always have. I like writing speeches, I like writing toasts, I like writing presentations, I like writing white papers. You name an expository prose form and I have probably written at least one of it.

Of course, enjoying pursuing an activity is not a reliable indicator of how competently you pursue that activity. Sad, but true: you can love doing something and still be terrible at it; witness my attempts to play Mario Kart on the Wii with my daughter. How can I be that bad? It is an enduring mystery. As my daughter often asks, "why can I beat you when I can't even drive yet?"

Sadly, unlike a video game, writing often does not provide adequate feedback, mostly because we do such a bad job of soliciting it. (Hint: asking a colleague to review an awesome thing you just wrote awesomely does not provoke honest feedback.)

I am even interested in the teaching of writing; I don't do it myself, but I know many who do and I have had many enjoyable conversations with professionals about the trials and tribulations of teaching writing at various levels.

While I am in a confessional mood, I will admit that I was both an English major and a hardcore computer programming student in college. I am, unsurprisingly, also a big fan of reading. I am interested in critical reading, recreational reading, skimming technical documents, serious reading of literature, referring to references and even reading advertising copy.

What I do not generally enjoy reading is expository prose written by engineers of almost any stripe, computer programmers included. My prejudice is shared by all the teachers of writing I have ever queried, from the ardent high school teacher to the burned out English-as-a-second-language coach who had just retired from helping a major American car manufacturer get more-or-less comprehensible prose from foreign automotive engineers.

I am baffled by the extreme badness of so much engineer prose for the following reasons:
  1. Being a terrible writer takes ambition. You have to go for it, you have to attempt the bold and broad, the complex and the complicated. If you stick to short sentences and use standard vocabulary, you may well be boring, but you won't be cringe-inducingly awful.
  2. Being a terrible writer requires that you flout the rules, that you go your own way, that you find your own special drummer and follow him or her without regard to where he or she takes you.
  3. Being a terrible writer usually requires that you ignore your reader and do whatever you want to do, that you make assumptions about what the reader knows, or finds amusing, or finds clever.

But engineering and its training are all about the rules. No one expects you to rediscover the laws of Physics or the principles of circuit design or best practices in software development. Instead, you are expected to know these already and adhere to them. So why do so many engineers feel that picking up a pen, or sitting at a keyboard, is license to just go for it? After all, the compiler or interpreter or circuit board or laws of motion have no sense of humor whatsoever. We should be used to leaving the flourishes for our hobbies and our social interactions. (I know, I know, we are not a group with whom most people rush to party.)

When I corner a hapless professional writing teacher at a dinner party and run through my standard rant, the most common explanation offered is the student empowerment movement in American education. Be free of the rules! Express yourself! Be original! Let your natural talent soar! Find your voice!

What twaddle. Most of us have precious little natural talent; fewer have a pleasant writing voice that appeals to most people.

Furthermore, the forms and conventions are very effective. They give your reader a sense of orientation, of familiarity and of confidence that they know what is coming. Do I want to be startled by the originality of a white paper on how to make database calls from a given programming language to a given database management system? No, I do not: I want to acquire the desired information quickly and effectively. I have searched for this kind of information many times before and will do so many times again before I retire: excitement, suspense or high style are neither desired nor required. Just the facts, ma'am.

For example, the Unix "man page" format frees the writer from having to figure out how best to document a given function and instead lets the writer concentrate on writing the actual documentation. On the other end of the equation, the format means that the reader knows what is coming and how best to read the document.

I have been guilty of this desire to run amok myself: the first time I wrote a business plan, I was appalled at how outdated and stupid the classic form seemed to be. I knew that I could do better. I yearned to do better. I gritted my teeth and half-heartedly adhered to the form, and the plan was not a big hit with the investors and possible recruits at whom it was aimed.

I asked a friend who is in the business of reading business plans for some advice. Instead of undertaking a critique of my plan in particular, he gave me a sense of the audience in general. He said that he had a stack of a couple of hundred business plans on his desk, which he was supposed to quickly whittle down to a dozen or so. That dozen or so were to be passed along to the next, more senior, reviewer, and so on. In order to review hundreds of plans, the plans had to adhere rather strictly to the format: non-standard plans were usually tossed aside immediately; very very very occasionally, the non-standard plans were so awesome that he read them later, retrieving them from the discard pile. But almost always, the non-standard plans were removed as part of the first pass without really being considered.

This underscores a painful lesson I learned early on in my writing career: very few of us have readers who are obligated or deeply motivated to read whatever we write. Most of us have readers who will plow ahead only so long as they are getting more out of the writing than they are putting into it. One of my secondary school writing teachers used to put a red line in the left margin. I asked him why he did that and I will never forget his answer: "To mark the point at which I stopped reading."

Sadly, many engineers fail even when the deck is stacked in our favor: we often write documents other people feel obliged to read, such as manuals and implementation notes. And still we abuse our readers to the point where they give up partway through, leaving them to flail with whatever piece of technology the documentation was supposed to illuminate.

There are many good books on writing. There are many tips and tricks. I will not attempt a mini-recap here. Instead, I will beg my fellow engineers to seek honest feedback from readers, to consider readers and to find a set of rules to which they can adhere. Just because we can get away with self-indulgent and awful prose does not mean that we should.

Wednesday, February 22, 2012

The "I Don't Know" Factor

In our consulting practice, we often commiserate with each other about our clients' poor internal communication. We spend much of our time navigating our clients' internal expertise trees, running down answers to questions that arise as we try to provide service.

We usually shake our heads in wonder at how often we have to plant ourselves, physically, in people's offices before they will answer our questions. We often note, with some bitterness, that the answer is some form of "I have no idea" and then we have to move on to the next node in the expertise tree.

Last week I was struck by an admittedly obvious-in-hindsight thought: what if these two phenomena, the long chase and the unsatisfying answer, are actually two facets of the same underlying issue? What if many of the various issues I have encountered--and about which I have written--are all related? What if, for various reasons I will recapitulate below, the answer is often "I don't know" and this ignorance is why no one ever wants to get back to us?

It seems to be taboo in our business to say "I don't know" which is a shame: if you don't know and don't feel that you can cop to that, then you don't have any good options. In fact, the only option most people have is to then become a lying weasel, frequently resorting to rudeness as a diversionary tactic.

In my experience, true experts say "I don't know" often and quickly. They usually add "but I can find out" or "here is how you can find out". If you are confident in your competence, then not knowing something is rarely shameful. In fact, as a domain expert who is rarely asked a question to which I do not immediately know the answer, I can attest that being stumped by a question is a mildly exciting change of pace and a chance to learn something.

I should point out that my rarely being stumped has less to do with my innate awesomeness and more to do with the fact that I have almost 30 years of experience in basically the same small field. At this point, I damn well ought to know 99% of what I encounter all day. And I do.

I should also point out that my deep expertise in my own small area leads people to ask me questions wildly out of my area, as though experthood were separate from subject matter, like height or eye color. When this happens, I am no more likely to have a useful answer than anyone else, but depending on how important the client and how deep the boredom, sometimes I go looking for the answer. I like learning things. Sadly, this trait is not universal.

In my experience, posers struggle valiantly to avoid saying "I don't know" and either evade or dissemble. Does this fool anyone? No, it does not. But at least people eventually stop asking you questions, which is a victory of sorts, I suppose.

But why is it not acceptable to say "I don't know" in the workplace, at least the IT workplace? Is this related to the cultural shift away from knowledge and toward opinion? Perhaps in the future, we will all have our own truth and wonder why no technology works properly.

While I am at it, why isn't it ok to say "I have no opinion" either? Must everyone care about everything? I have an iPhone, I really like it, but I have no experience with Android, Droid, Blackberry or Windows Phone and so have no opinion about them. I just don't. Why does that annoy some people?

Lest I appear to be merely an arrogant would-be know-it-all, I offer the following list of reasons an IT person might legitimately simply not know something:

But sometimes we are just insecure jerks who don't like dealing with people when we can deal with nice, quiet, unemotional and unjudgemental technology instead. But you didn't hear it from me.


Wednesday, February 15, 2012

Modern Modularity & Iterative Development

I recently wrote an app in about two hours: a highly tailored, very powerful tool for a clerk. And I did it as a Web CGI app using straight Perl without the benefit of a rapid application development environment or even a framework like Mason.

Hurray for me. Except, of course, I didn't really do that. It is true that I wrote the app in about 45 minutes and then spent a little over an hour in a tight user feedback / iterative development cycle, resulting in the finished product. But this is the movie version, the marketing hype, the storyline that ignores all the boring ground work that went before.

It would be more accurate to say that I spent about a dozen hours over several weeks on various stages of this project: still fast, still cheap, just not spectacular (or utterly unbelievable).

From my perspective, the process went like this:
  1. The client defined a problem: clean up a server's disk area by removing redundant files stored by a system of ours.
  2. I defined the reconciliation algorithm, using md5sum to confirm that files exist in both places. Along the way we encountered some horrific special cases, but that is par for the course in the real world.
  3. I wrote an audit program to put a random sample of these files in a web page for review; our clerk reviewed several batches of these files spread out over the entire repository. We found that the issues were highly concentrated in time windows, so we tailored our approaches to fit the eras that we found.
  4. Based on her feedback, I reused much of the audit program to create a web-based UI to support her review and I wrote a couple of small helper utilities to automate the review process that the clerk had used. She confirmed her earlier era-based findings.
  5. I added a database table to hold the information that the clerk was tracking on her paper pad and which she had figured out was what she needed to know. This allowed us to break down the repository into manageable chunks and to start tracking the clerk's progress more accurately.
  6. I wrote a daemon to do prep work in the background and another one to automate as much of her review as I could. This cut the number of items requiring human review by 90% and allowed us to start automatically correcting that 90% while she worked on the ugly 10%.
  7. Once all that was in place, I wrote the final app, a UI to review the contents of the database as set up by the daemons and to add editing functionality to support saving the results of the review. Then I added a few links to those helper apps to support the reviewing and I was done.

This is how I write good user-intensive software: a design phase which moves quickly to practical applications; we figure out the ideal process and the software trails quickly behind until we have a system that we all trust and that does the job.
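
The reconciliation in step 2 and the random audit sample in step 3 can be sketched roughly as follows. This is a minimal Python illustration, not the Perl that was actually used; the function names (md5_of, index_by_digest, redundant_files, audit_sample) and the two directory roots are hypothetical, and the horrific special cases are left out entirely.

```python
import hashlib
import random
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_digest(root):
    """Map each MD5 digest found under root to the files that have it."""
    index = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            index.setdefault(md5_of(path), []).append(path)
    return index


def redundant_files(server_root, system_root):
    """List files under server_root whose content also exists under system_root.

    Assumption: "exists in both places" means "same MD5 digest", regardless
    of file name or location within either tree.
    """
    kept = index_by_digest(system_root)
    return [
        path
        for digest, paths in index_by_digest(server_root).items()
        if digest in kept
        for path in paths
    ]


def audit_sample(candidates, size, seed=None):
    """Pick a random sample of candidate files for a human to review."""
    rng = random.Random(seed)
    return rng.sample(candidates, min(size, len(candidates)))
```

Nothing here deletes anything: the output is only a list of candidates, which matches the spirit of the process above, where a person reviewed samples before any cleanup was trusted.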

Note that my utilities and previous code are highly reusable in both senses:

  • I can use the code with little or no modification in different apps, which means that I only have to develop and validate once but use many times.
  • The stand-alone web helper apps are easy to string together to enrich the environment provided by the main app. This is an homage to the Unix "string of pearls" philosophy which I find works so well at the O/S level.

In theory I should get both of these benefits from the Object Oriented programming paradigm, but in practice I don't seem to.

In this day and age, web technology allows us to have separate-but-easily-integrated pieces and modern programming languages allow us to reuse chunks of code easily, so there is no excuse for not producing polished and powerful apps rapidly and cheaply.

Wednesday, February 8, 2012

When You Can't Read The Code

As the large enterprise solution becomes more and more common, I find that the keepers of such systems seem less and less interested in putting in the time to become domain experts.

As IT systems consultants, we are often interacting with these enterprise solutions. Some percentage of the time, we encounter behaviors in these systems which we find unexpected or undesirable. These behaviors may or may not be bugs, but we feel that we should understand them before we ignore them or program around them.

Back in the day, when more apps were homegrown and more systems were provided with source code licenses, the people we asked about unexpected or undesired behavior were the client's own technical people. If they did not know the answer, then they could find out by reading the code.

Increasingly, we find that our client contacts are either non-technical people or out-of-date technical people. Only a small minority of contacts are technically competent, helpful providers of deep background and technical information. We often have to bypass the client and go straight to their large system vendors because there is no local technical expertise.

To our dismay, it is our technical brethren, our peeps, who are the hardest to deal with: now that they cannot read the code, they tend to either shrug off our questions or give us shallow guesses off the top of their heads. After all, if you can't be a hard master then why not just give up?

To our delighted surprise, the non-technical people are generally easier to deal with because they know that they don't know and they look for the answer. They will actually contact the vendor for information, or find documentation, or give us the benefit of the user's experience.

While we don't like treating systems as black boxes or as puzzles into which we pour various inputs and monitor the outputs, at least these methods actually turn up results. Getting half-baked guesses from formerly expert techies just wastes our time as we discover just how half-baked the guesses are.

Just recently I spent an afternoon running a series of experiments as we tried various inputs into a large system to see if we could figure out how to accomplish a particular upgrade for the user. We started with a half-baked guess from a supposed expert techie, proved that this techie was wrong about how the large system worked and got down to the business of figuring out how to do what must be done.

It seems to me that technical people have to be able to operate in the absence of hard data. My checklist for answering technical questions goes like this:

  1. Query experts if you can
  2. Read the code if possible
  3. Read the technical documentation if there is any
  4. Query super-users if you can
  5. Read the user documentation if you can
  6. Query experienced normal users if you can
  7. Write code to grok databases, logs, etc
  8. Try to reason from first principles
  9. Guess because there is absolutely no other option

And remember that when talking about technical matters, which are not generally matters of taste or subjective opinion, trust but verify.

Wednesday, February 1, 2012

VMs and O/Ses and Bears, Oh My!

IT Nirvana?

Computer systems offer more bang for the buck than ever before. There are more and better options than ever before. Storage is cheap and plentiful and processors are mighty and don't catch fire when you use them.

In theory, the size of data set or processing problem that is now "easy" should be so big that just about anything I do should be easy. In fact, since there is so much computing resource available to help me do it, just about anything I want to do should be simple as well. But I am finding that this implicit promise of simplicity is not being fulfilled, at least not at the system level.

I Am Not Immune

Consider my humble desktop at work. It is an Ubuntu box which is backed up automatically and offsite by the mighty boxbackup utility. Thanks to VirtualBox, I use virtual machines of various types: a Windows 7 VM for development and to host my iPhones; various Linux VMs for various special purposes.

Recently we lost power while I was out of the office. My trusty desktop shut down gracefully because I run apcupsd. Hurray! apcupsd even told me when the power went off, although in true open source geek fashion, it reported the time in GMT.

However, the VMs did not shut down gracefully when their host did. Apparently, this is something that I have to set up myself using launchd. One of the VMs was trashed when the host shut down; the other two were fine.
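
For what it is worth, the missing piece can be sketched in a few lines. The following Python is only an illustration under stated assumptions: that VBoxManage is on the PATH, that the script runs as the user who owns the VMs, and that something calls it before the host goes down. The function names are hypothetical, and how the script gets hooked into the shutdown sequence is left open.

```python
import re
import subprocess

# Each line of `VBoxManage list runningvms` looks like: "vm name" {uuid}
_VM_LINE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}$')


def parse_running_vms(listing):
    """Return (name, uuid) pairs parsed from `VBoxManage list runningvms` output."""
    vms = []
    for line in listing.splitlines():
        match = _VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def savestate_command(uuid):
    """Build the command that saves a running VM's state to disk."""
    return ["VBoxManage", "controlvm", uuid, "savestate"]


def save_all_running_vms(run=subprocess.run):
    """Save the state of every running VM and return the names handled.

    The `run` parameter exists only so the function can be exercised
    without VirtualBox installed; by default it is subprocess.run.
    """
    listing = run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    saved = []
    for name, uuid in parse_running_vms(listing):
        run(savestate_command(uuid), check=True)
        saved.append(name)
    return saved
```

Calling save_all_running_vms() from whatever runs ahead of the host shutdown would park each VM in a saved state instead of pulling the plug on it. Saving state is a design choice here: it does not depend on the guest O/S responding to a shutdown request, which matters when the UPS battery is the clock.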

So on to restoring the trashed VM: the good news is that I found that my boxbackup repository was up to date; the bad news is that it contained a copy of the crashed VM in it. I am a belt-and-suspenders kind of guy, so I have a manual local back up to check: yes, I have an image in the on-site backup which is a few months old, but that is ok: these VMs do not change much over time. So, problem solved in the short-term but not the long-term.

In the long-term, I need to get some professional sys admin time or I need to change into my super tech costume and chase down these issues myself. I strongly favor the first option: the more I know, the more I value the knowledge of others. There is also the factor of money, though: the longer the Great Recession goes on, the less inclined I am to shell out real money unless I have to.

The 21st Century Data Center So Far: Boo

I bring up the plight of my desktop to emphasize that the rant that follows is not simply a screed against any particular sys admin but an observation about the environment in which most sys admins have to do their jobs.

As we struggle to deliver software and service on our clients' hardware, we constantly run into misconfiguration of hardware, virtual hardware, operating system software and services such as web servers and database servers.

We also run into poorly implemented policy and self-contradictory policy which doesn't help and somehow offends me more: can't we at least agree on a usable definition of what we are trying to do?

As VMs become more and more common, and the ability to deploy them correctly more and more rare, we are being forced to return to the "my software, my hardware, my responsibility" model of the past. Especially when we find that off-brand Unix distributions such as AIX seem to lag so woefully behind current.

In the 1980s and 1990s, we used to drop "departmental servers" into our client's work areas because the company mainframe was too expensive, too dedicated and too central to use. In what is now known as the Apple model, life was good: the client had one contact point and we had a known, stable environment.

In the 2000s, we tried to get with the program and use existing infrastructure such as database servers, DNS and DHCP. This was a nightmare: for one thing, every new IT administration seemed to want to do things differently: Microsoft! Open Source! Back to Microsoft, but maybe running some Open Source software on the Windows server! Ok, how about thin clients which were sort of Windows? Oh, were you still using the old DNS? Sorry about that--wait, let's use Active Directory for authentication! Is it set up correctly? Who knows!

A Computer System of One's Own

Now we are worn out debugging other people's hardware configurations and system software deployments. We are looking to provide software-as-a-service on our hardware. We are currently mulling over the following options:

  1. We charge a monthly usage fee for access to a working system that is on our premises, under our control and accessed over the wild and wooly Internet via a VPN
  2. We charge a monthly usage fee for access to a working system on a host or hosts dropped by us into the data center: we set it up, keep it up and back it up: you provide power, A/C and a lack of ambient water. Seriously. From experience, we can say that the flooded machine room is a no-go.

We are cautiously excited about The Cloud; we might very well end up using Amazon's offerings in this area, once we are sure that our privacy-conscious, mostly health care clientele can dig it.

Internal IT have been very resistant so far, but we hope that accounting issues and procedural clarity will triumph. We tried to play nicely with the other children, but they kept peeing into the sandbox.