What People Don’t Understand About Software

To celebrate Independence Day weekend, with all of its cold beer, grilled meat, fireworks, and patriotic remembrances, I offer the following list of fundamental ways in which most people misunderstand software:

Software is Machinery

It’s common for non-engineers to miss the most fundamental characteristic of software, because what comes out of the development pipeline isn’t “real” stuff. It’s just text, and files, and numbers in a database. Nevertheless, software is machinery. Real machinery, in the sense that it is composed of individual parts or components, each of which has to be engineered to relate and interact properly with the other parts, such that the whole machine functions according to its design. You could view software as simulated machinery, but this distinction doesn’t offer any comfort. Is a simulation of a lawnmower less complex than a lawnmower? In fact it is more complex, since you can’t simply machine each part from some material and set it on a shelf, certain that when you pick it up again it will still work. In a simulation the parts are… well, simulated. This implies that they can be defined and redefined. Anything that can be redefined can change.

Most machines, and thus most software systems, are closed-state devices: this is just a fancy way of saying that they are designed to do a certain thing, reliably, and don’t suddenly go off and develop new capabilities. Push the eject button, the tray comes out. Push it again, the tray goes back in. Continue doing this and the same thing happens over and over. A well-designed machine does not have failure conditions that send it off into realms of unknown behavior. In the real world of metal and plastic you can be relatively certain that your machine will not fail because two parts suddenly decide to interact in an undefined way. Instead, assuming the basic design is correct, you are looking at problems that might arise from physical failure, wear and tear, operation outside of tolerances, and the like. In software, by contrast, we don’t suffer physical failure (hardware failure happens at a meta-level outside our application’s realm of responsibility – it’s more analogous to an automobile failing because the laws of physics suddenly stopped working), but we do have to deal with the mutability of our parts. Much of software engineering over the last three decades has been focused on finding better and more controllable ways to define and maintain parts.
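
To make the closed-state idea concrete, here is a minimal sketch in Python of the eject-button example (the class and names are my own invention, purely for illustration): a tray with exactly two defined states and one defined transition between them, no matter how many times you push the button.

```python
# A minimal sketch of a "closed-state" device: a disc tray that can only
# ever be OPEN or CLOSED, and whose eject button toggles between the two.
# The names here are illustrative, not taken from any real API.
from enum import Enum


class TrayState(Enum):
    CLOSED = "closed"
    OPEN = "open"


class DiscTray:
    def __init__(self) -> None:
        self.state = TrayState.CLOSED

    def press_eject(self) -> TrayState:
        # Every input maps to a defined transition; there is no path to an
        # undefined state, no matter how often the button is pushed.
        if self.state is TrayState.CLOSED:
            self.state = TrayState.OPEN
        else:
            self.state = TrayState.CLOSED
        return self.state


tray = DiscTray()
print(tray.press_eject())  # TrayState.OPEN
print(tray.press_eject())  # TrayState.CLOSED
```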

The Devil is in the Semantics

I love the word “semantics.” It’s a nicely pedantic way to speak about “all the knowledge of what something is and what it is supposed to do.” Lately the word has come to mind when I read about Service Oriented Architectures and XML, and I am sure I feel a little like the structured programming guys felt back when I was one of a legion of fresh-faced young C++ programmers prattling on about how OO was about to usher in a world of replaceable componentry. Well it did, for all the stuff with stable semantics (i.e., semantics that don’t change from application to application). Microsoft’s .Net framework is a great example of the current state of the art in layering simplicity on top of complexity, and I like it.

So why is it still so hard to build software systems? Because all that stuff that is captured in frameworks, the stuff with stable semantics, is the easy part. It’s the domain semantics that are hard to get right; the stuff that is unique to each business process or environment. If you are setting out to build an America’s Cup-winning 12-meter sailing yacht, it’s nice that you can buy winches off the shelf, but the winches aren’t what makes your yacht special; they aren’t the hard part. And they aren’t where the risk lies.

Not surprisingly, domain semantics tend to express themselves in the interfaces between systems, and between pieces of individual systems (distributed or otherwise). The stable stuff, like how to create a database connection, is buried in the individual apps. This is worth bearing in mind when people wax rhapsodic about XML and the power of self-describing data. XML is cool, but in terms of it being self-describing… to whom does it describe itself? Systems that receive XML messages still have to know what to do with them. Suppose I finish my dinner at a restaurant, and place my credit card on the table. The waiter takes the card, and brings the ticket back for my signature. I can accomplish this transaction in virtually any restaurant in the world without describing the data involved to anyone. The semantics are understood by both parties. On the other hand, what if I have an accountant who handles all my bills (I don’t), and I write his number on a napkin? I am not likely to get my bill paid without further describing the semantics of what I am trying to do (and possibly not even then), even if the information is clearly labelled as <acct_who_pays_stuff>xxx-xxx-xxxx</acct_who_pays_stuff>.
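
To put the napkin example in code, here is a small sketch (the tag name is borrowed from the example above; everything else is invented for illustration) showing that a receiver can happily parse a “self-describing” message and still have no idea what to do with it unless somebody coded that knowledge in ahead of time.

```python
# A sketch showing that "self-describing" XML still needs a receiver that
# already understands the domain semantics. Tag and handler names are
# illustrative only.
import xml.etree.ElementTree as ET

message = (
    "<payment>"
    "<acct_who_pays_stuff>xxx-xxx-xxxx</acct_who_pays_stuff>"
    "</payment>"
)
root = ET.fromstring(message)

# Parsing reveals the structure: an element named 'acct_who_pays_stuff'
# containing some text. It does not reveal that a bill should be paid, or
# by whom. That knowledge has to be coded into the receiver in advance.
handlers = {
    "acct_who_pays_stuff": lambda value: print(f"Pay the bill via account {value}"),
}

for element in root:
    handler = handlers.get(element.tag)
    if handler is None:
        # An unrecognized tag "describes itself" syntactically, but the
        # receiver still has no idea what to do with it.
        print(f"Don't know what to do with <{element.tag}>")
    else:
        handler(element.text)
```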

Here’s a statement to liven up your next watercooler discussion. Tell your colleagues there is no semantic difference between an XML message and a parameterized function call. The only advantage in the XML message is that the receiver can figure out the data format from the message: it doesn’t need to know how many bytes of account information are followed by how many bytes of address information. That’s it. And it ain’t a new idea. A SOAP message is an incredibly verbose way to format the parameters to a function call. Nevertheless, it can be a good thing in the right applications. XML messaging can greatly reduce the coupling between interacting applications and systems at the boundaries of an enterprise, and between units within the enterprise. But it won’t lead to applications that are able to figure out all by themselves what needs to be done with the information in the messages they receive. XML and SOA won’t free us from the devil in the semantics of the problem domain.
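
To make the comparison concrete, here is a sketch that expresses the same request as a plain function call and as a simplified SOAP-style message (the function, its parameters, and the envelope are all invented for illustration, and the envelope is not a complete, schema-valid SOAP document). Note that the receiver still has to map the element name onto a function it already knows about; the markup only spares it from counting bytes.

```python
# The same request expressed two ways: as a parameterized function call and
# as a SOAP-style XML message. The semantics are identical; only the wire
# format differs.
import xml.etree.ElementTree as ET


def transfer_funds(account: str, amount: float) -> None:
    print(f"Transferring {amount} from account {account}")


# Direct call: the caller must already know the function and its parameters.
transfer_funds(account="12345", amount=250.00)

# SOAP-style message: the receiver can discover the parameter names and
# boundaries from the markup, but it still has to know that a
# <TransferFunds> element means "call transfer_funds with these values".
soap_request = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <TransferFunds>
      <account>12345</account>
      <amount>250.00</amount>
    </TransferFunds>
  </soap:Body>
</soap:Envelope>
"""

root = ET.fromstring(soap_request)
body = root.find("{http://schemas.xmlsoap.org/soap/envelope/}Body")
call = body.find("TransferFunds")
transfer_funds(
    account=call.findtext("account"),
    amount=float(call.findtext("amount")),
)
```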

Risk is Fundamental

If you accept that software is machinery, and that each new application entails specific semantics from a problem domain, then you have to come to the conclusion that each new software project is essentially the design, construction, and testing of a new and complicated piece of machinery. All of the neat stuff we get in frameworks like .Net is nothing more than solenoids, switches, bolts, and nuts. I don’t mean to suggest that the availability of these parts has not sped up the development of software systems, because previously we were making each bolt and nut by hand, without standards. Having standard bolts and nuts is a good thing. But as I pointed out above, they don’t reduce the real risk of development.

Have you ever had a house built? Did the contractor have to use custom parts, or was everything he needed pretty much available off the shelf? Unless you made a mint in the dot-com boom you probably don’t have much custom stuff in your house, and yet I would wager that it: a) wasn’t done on time; b) didn’t follow a predictable timeline with respect to milestones; and c) had a fair number of errors in it at completion. Highly customized houses are even riskier and less predictable. Scale that up to an office building, or a skyscraper, or a bridge, all of which typically have a lot of custom content. The Brooklyn Bridge was pretty custom for its time, and was designed and executed by John Roebling and his son, who had long experience building bridges. It was proposed for execution on a five-year timeline, and took nearly fifteen years to complete. Along the way they invented new technologies, discovered new risks, and eventually got the thing put together.

I leave it as an exercise for the reader to judge where on a continuum from, say, a lawnmower at one end to the Brooklyn Bridge at the other, most software projects lie in terms of risk and complexity. I have my own opinion, but any such measurement is horrendously subjective. Well, maybe not completely. There are good complexity metrics out there, and applying them to most non-trivial software projects will show a very high degree of complexity and risk. An equities trading system is more complex, and more risky, than a marketing information system, which in turn is more complex than a meeting-room scheduling system. Being able to judge the relative complexity of a system is key to judging the level of risk. But one thing is certain: software is not immune from the inherent complexities of every large-scale engineering project. There is no way, at present, to take the risk out of it. What this tells us, I think, is this: don’t undertake the development of new software systems unless the potential business benefits return a significant multiple of the estimated development costs.

Business is Proprietary

The only way to get broad benefit out of self-describing data markups is to standardize schemas. That is, it is a little help to be able to easily find an account number in a message, but it is a lot of help to know that every message concerning an account will have an account number, an address, the name of the institution, the name of the account holder, etc. But businesses are notoriously temperamental about engaging in the process of debating and validating standard definitions of business data. They are most likely to be willing to do so as regards data they exchange with suppliers and partners, and least willing to do so when it involves their interfaces to customers. The bug in the ointment is that you are your supplier’s customer. Why should they invest in making it easier for you to switch vendors?
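
As a sketch of what a standardized schema buys you (the field names below are invented for illustration; a real standardization effort would agree on a schema language such as XSD), a receiver can at least check that an incoming account message honors the agreed contract before acting on it:

```python
# A sketch of the benefit of a standardized schema: every party agrees that
# an account message carries these fields, so a receiver can check the
# contract up front. Field names are invented for illustration.
import xml.etree.ElementTree as ET

REQUIRED_FIELDS = {"account_number", "address", "institution", "account_holder"}


def missing_account_fields(xml_text: str) -> set:
    """Return the set of agreed-upon fields missing from an account message."""
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return REQUIRED_FIELDS - present


message = """
<account>
  <account_number>12345</account_number>
  <institution>Example Bank</institution>
  <account_holder>J. Smith</account_holder>
</account>
"""

print(missing_account_fields(message))  # {'address'} -- easy to spot the gap
```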

A few years back Microsoft and Intuit collaborated with a few banks and other partners to create Open Financial Exchange (OFX). OFX is a set of XML messages that standardize information about a bank account. Both Microsoft and Intuit had selfish motives: they wanted to be able to easily support bank account data retrieval in their personal financial management products, Money and Quicken. These two programs, however, have only about 15 million or so customers. Think that is a lot? There are a billion credit card accounts in the U.S., and tens of millions of checking and savings accounts. Many banks have put OFX capabilities into place, essentially deeming it cheaper than losing the few customers who use it, but none vigorously promote it, and why should they? Banks are no more likely than other businesses to want to de-brand themselves and turn their customer relationships into standardized messages that can be exchanged with any of their competitors.

If you want to judge the potential acceptance, or take-up rate, of a proposed messaging standard, then judge to what extent it intervenes between a business and its customers. If the data being standardized represents the core of the relationship, as account data does to banks, then the chances of an aggressive take-up rate are low, and the chances of eventual failure high. Businesses want proprietary relationships with their customers, to the extent that their markets will allow it. Service oriented architectures will continue to find applications within large enterprises, perhaps their native habitat, but adoption will be slow for external interfaces, and we’re not likely to see an era, ever, where most businesses present themselves to the world in the form of standardized service protocols. Nobody wants to be a component; they want to use others as components.

The moral of the story is… well there isn’t one really. This is just the way I see the world, and I offer it as no more than that. Have a great Fourth of July weekend.