For the sake of brevity, let us choose an example of a hot new development technology: Ruby on Rails (http://rubyonrails.org/). This assumes, of course, that something can still be considered "hot" if I have heard of it.
If you hire a group of neophytes who "grew up with" Ruby on Rails, what do you get? You get a group of programmers who can hit the ground running in your chosen development technology, which is good. What do you lack? You lack domain knowledge in your target domain, you lack real-world project experience, you lack experience in all that a working, commercial system must do other than meet a technical spec; in fact, you lack just about all relevant experience outside the narrow domain of Ruby on Rails development.
The only scenario I can imagine being well-served by a neophyte team of Ruby on Rails programmers is a contest of some sort where the goal is to create a Ruby on Rails app in a short time that will impress a panel of college students.
Organizations keep going down this path and keep being disappointed that the software they get isn't very good. In fact, it is often worse than the software it replaces. Nor is there usually a progression within a domain, with software getting better and better and feature sets evolving to keep the best features and lose the apparently-cool-but-actually-useless ones.
This is because evolution is a cumulative process: if you keep starting from scratch, you keep making the same mistakes. This is the unavoidable truth even if you are using ever-more-hip environments to make those mistakes: perhaps you will be able to make those mistakes faster and cheaper and on more platforms than you could before, but you will still make those mistakes.
I know that there are software Mozarts out there, able to write great code from an early age, but "great code" does not mean "great software." "Great code" usually means "code that impresses other programmers." All too often, users of the software built with that code are left wondering how the author of an irritatingly hard-to-use program came to be regarded so highly by his technical peers.
As a practical example from my own experience, I offer this common task: loading data into a database. I am an expert in "moving data around," so I am often called in to assist with this particular job. I often find a young, inexperienced team of programmers who are expert in a particular technology, but not computer system experts per se. Increasingly, the home team are all experts in some interactive-oriented technology, such as Ruby on Rails or Visual Basic, and not at all comfortable with batch-oriented technology such as SQL transaction sets.
The common mistake for the young and inexperienced is to try to force a batch operation into an interactive paradigm, to write code to take the input data and force it through the system as if a superhuman typist had entered it at the keyboard. This mistake has two possible failure modes; sometimes it fails in both ways at the same time:
- The loading does not work reliably
- The loading takes an unusably long time
When the loading is unreliable, the issues are almost always exception-handling and/or data processing.
Experience tells me that the right way to handle a batch update is usually to apply the transactions one by one, putting aside the failures for later human review, correction and reapplication. Only rarely does the user want an entire data set rejected because of a typo in one of the fields of one of the records.
Users are generally far more comfortable with a UI that tells them how many records were in the batch, how many were applied to the database and how many were not. They really like being able to inspect and correct the bad records and re-process them until the batch is closed. Does this fit the textbook SQL COMMIT & ROLLBACK paradigm? No, it does not. Sadly, our team of neophytes will have to learn the hard way why the users are so unhappy with their work. Maybe as our team gets more experience, they will get better, if they can find a job after they hit 28.
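To make the record-by-record approach concrete, here is a minimal sketch in Python using the standard sqlite3 module. The customers table, its columns and the sample records are hypothetical, invented only for illustration; a real loader would write the rejects somewhere a human can review and correct them.

```python
import sqlite3

def load_batch(conn, records):
    """Apply each record in its own transaction; set failures aside for review."""
    applied = 0
    rejected = []  # (record, reason) pairs kept for human correction
    for rec in records:
        try:
            with conn:  # commits on success, rolls back only this record on error
                conn.execute(
                    "INSERT INTO customers (id, name, balance) VALUES (?, ?, ?)",
                    (rec["id"], rec["name"], float(rec["balance"])),
                )
            applied += 1
        except (sqlite3.Error, ValueError, KeyError) as exc:
            rejected.append((rec, str(exc)))
    return {"total": len(records), "applied": applied, "rejected": rejected}

# Hypothetical schema and data, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL)"
)
batch = [
    {"id": 1, "name": "Ada", "balance": "10.50"},
    {"id": 2, "name": "Bob", "balance": "12,00"},    # typo: comma instead of point
    {"id": 1, "name": "Ada again", "balance": "3"},  # duplicate key
    {"id": 3, "name": "Cy", "balance": "7"},
]
summary = load_batch(conn, batch)
print(summary["total"], summary["applied"], len(summary["rejected"]))
```

The summary carries exactly the three numbers users ask for, and the rejected list can be corrected and passed back through load_batch until it comes back empty.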
When the loading is slow, the issue is almost always trying to put a large number of transactions through a pipe designed to service human input.
The overhead in servicing human input is vast, but the human data rate is so low that this overhead is quite acceptable. Turn that equation on its head--fast data rate, no user to react to the various edit checks--and you have a very poor computing paradigm.
Instead of trying to put data through the human channel, experience tells me that it is often quick and easy and reliable to process the incoming data directly into SQL INSERT, DELETE or UPDATE statements and let the database manager do its thing. If you need to make multiple passes over the data to establish relationships, then do that. If you need to sort through transactions to weed out duplicates or just take the most recent change, do that too.
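As a rough illustration of that idea, the sketch below makes one pass to keep only the most recent change per key and then hands the survivors to the database as plain SQL statements. It uses Python's standard sqlite3 module; the prices table, the op and ts fields and the sample changes are all assumptions made for this example, and INSERT OR REPLACE is SQLite-specific syntax standing in for an INSERT-or-UPDATE decision.

```python
import sqlite3

def latest_per_key(changes):
    """Keep only the most recent change for each id (assumes a sortable 'ts' field)."""
    latest = {}
    for ch in sorted(changes, key=lambda c: c["ts"]):
        latest[ch["id"]] = ch  # later timestamps overwrite earlier ones
    return list(latest.values())

def apply_changes(conn, changes):
    """Turn the surviving changes straight into SQL and let the database do the work."""
    rows = latest_per_key(changes)
    deletes = [(r["id"],) for r in rows if r["op"] == "delete"]
    upserts = [(r["id"], r["price"]) for r in rows if r["op"] == "upsert"]
    with conn:  # a single transaction for this pass
        conn.executemany("DELETE FROM prices WHERE id = ?", deletes)
        conn.executemany(
            "INSERT OR REPLACE INTO prices (id, price) VALUES (?, ?)", upserts
        )
    return len(rows)

# Hypothetical table and change feed, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO prices (id, price) VALUES (2, 4.25)")
conn.commit()
changes = [
    {"id": 1, "op": "upsert", "price": 9.99, "ts": 1},
    {"id": 1, "op": "upsert", "price": 11.5, "ts": 3},  # supersedes ts=1
    {"id": 2, "op": "upsert", "price": 4.0, "ts": 2},
    {"id": 2, "op": "delete", "price": None, "ts": 4},  # supersedes ts=2
]
kept = apply_changes(conn, changes)
print(kept, conn.execute("SELECT id, price FROM prices ORDER BY id").fetchall())
```

No form, edit check or screen refresh sits between the input and the database here; the only per-record work is the de-duplication pass.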
When you are done fine-tuning your batch process, you can schedule the job to run in the background and voila! yesterday's data is loaded before the start of work today. The computer doesn't mind the extra work, so make the process redundant and have it try over and over again so that delays are not a problem and so that accidental resubmission of data is rejected.
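One possible way to get both properties, sketched in Python with the standard sqlite3 and hashlib modules: record a fingerprint of every batch in the same transaction as its data, so that a retry after a failure is harmless and an identical batch submitted twice is turned away. The loaded_batches and readings tables, the function names and the sample lines are hypothetical, and the retry policy is only a placeholder.

```python
import hashlib
import sqlite3
import time

def batch_fingerprint(lines):
    """Identify a batch by its content so a resubmission can be recognised."""
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

def load_once(conn, lines):
    """Load a batch unless an identical one was already loaded. True if loaded."""
    fp = batch_fingerprint(lines)
    try:
        with conn:  # ledger entry and data rows commit or roll back together
            conn.execute("INSERT INTO loaded_batches (fingerprint) VALUES (?)", (fp,))
            conn.executemany(
                "INSERT INTO readings (line) VALUES (?)", [(ln,) for ln in lines]
            )
    except sqlite3.IntegrityError:
        return False  # fingerprint already in the ledger: reject the resubmission
    return True

def run_with_retries(job, attempts=3, delay_seconds=0.0):
    """Re-run a job after transient database errors (for example, a locked file)."""
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except sqlite3.OperationalError:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)

# Hypothetical tables and data, assumed for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loaded_batches (fingerprint TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE readings (line TEXT)")
lines = ["sensor-a,42", "sensor-b,17"]
first = run_with_retries(lambda: load_once(conn, lines))
second = run_with_retries(lambda: load_once(conn, lines))  # accidental resubmission
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(first, second, count)
```

Because the ledger row and the data rows share one transaction, a run that dies halfway leaves no trace, and the next scheduled attempt simply starts over.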
Experience tells me that these operational aspects are what makes a data loading process robust and useful. I don't see how being expert with a given development environment, but ignorant of real-world issues, would make me better at my job.
Now if I could only convince the ever-growing cadre of number-crunching, budget-watching managers who came up through the management track that this delightful fallacy is rarely a good strategy. Maybe I need to find a Ruby on Rails book and add some more buzzwords to my CV.