The Accuracy of Future Predictions

One thing we know about software design is that the future is important. However, we also know that the future is very hard to predict.

I think that I have come up with a way to explain exactly how hard it is to predict the future of software. The most basic version of this theory is:

The accuracy of future predictions decreases relative to the complexity of the system and the distance into the future you are trying to predict.

As your system becomes more and more complex, you can predict smaller and smaller pieces of the future with any accuracy. As it becomes simpler, you can predict further and further into the future with accuracy.

For example, it’s fairly easy to predict the behavior of a “Hello, World” program quite far into the future. It will, most likely, continue to print “Hello, World” when you run it. Remember that this is a sliding scale–sort of a probability of how much you can say about what the future holds. You could be 99% sure that it will still work the same way two days from now, but there is still that 1% chance that it won’t.

However, after a certain point, even the behavior of “Hello, World” becomes unpredictable. For example, here is “Hello, World” in Python 2.0, in the year 2000:

print "Hello, World!"

But if you tried to run that in Python 3, it would be a syntax error. In Python 3 it’s:

print("Hello, World!")

You couldn’t have predicted that in the year 2000, and there isn’t even anything you could have done about it if you had predicted it. With changes like this, your only hope is keeping your system simple enough that you can update it easily to use the new syntax. Not “flexible,” not “generic,” but simply simple to understand and modify.
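As a side note, for this particular transition there was a simple bridge. A program this small could be written so that the exact same code runs under both Python 2.6+ and Python 3. This is just a sketch of one option, not the only way the change could have been handled:

# The __future__ import turns print into a function even on Python 2,
# so this exact file runs unchanged on Python 2.6+ and Python 3.
from __future__ import print_function

print("Hello, World!")

Either way, the point stands: the simpler the program, the more trivial the eventual update.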

In reality, there’s a more expanded logical sequence to the rule above (with a toy numeric sketch after the list):

  1. The difficulty of predicting the future increases relative to the total amount of change that occurs in the system and its environment across the future one is attempting to predict. (Note that the effect of the environment is inversely proportional to its logical distance from the system.)
  2. The amount of change a system will undergo is relative to the total complexity of that system.
  3. Thus: the rate at which prediction becomes difficult increases relative to the complexity of the system one is attempting to predict the behavior of.
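Here’s a toy way to picture that sequence (my numbers are made up, and only the shape of the result matters): treat the system as some number of independent pieces, each with a small chance of changing on any given day, and say a prediction holds only if nothing relevant has changed by the time you check it.

# A toy illustration only, not a real model. Each "piece" of the system has
# a small independent chance of changing per day; a prediction holds only
# if none of the pieces has changed by the time you check it.

def prediction_accuracy(complexity, days, daily_change_chance=0.005):
    # Probability that none of the `complexity` pieces changes in `days` days.
    return (1 - daily_change_chance) ** (complexity * days)

print(prediction_accuracy(complexity=1, days=2))      # "Hello, World", two days out: about 0.99
print(prediction_accuracy(complexity=500, days=2))    # a complex system, two days out: about 0.007
print(prediction_accuracy(complexity=500, days=365))  # the same system, a year out: essentially zero

In this toy model, complexity and distance into the future multiply together, which is one way of seeing why prediction becomes hopeless so quickly for complex systems.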

Now, despite this rule, I want to caution you against basing design decisions around what you think will happen in the future. Remember that all of these future events are only probabilities, and that any prediction carries the possibility of being wrong. When we look only at the present, the data that we have, and the software system that we have now, we are much more likely to make a correct decision than if we try to predict where our software is going in the future. Most mistakes in software design result from assuming that you will need to do something (or never do something) in the future.

The time that this rule is useful is when you have some piece of software that you can’t easily change as the future goes on. You can never completely avoid change, but if you simplify your software down to the level of being stupid, dumb simple, then you’re less likely to have to change it. It will probably still degrade in quality and usefulness over time (because you aren’t changing it to cope with the demands of the changing environment), but it will degrade more slowly than if it were very complex.

It’s true that ideally, we’d be able to update our software whenever we like. This is one of the great promises of the web, that we can update our web apps and web sites instantaneously without having to ask anybody to “upgrade.” But this isn’t always true for all platforms. Sometimes, we need to create some piece of code (like an API) that will have to stick around for a decade or more with very little change. In this case, we can see that if we want it to still be useful far into the future, our only hope is to simplify. Otherwise, we’re building in a future unpleasant experience for our users, and dooming our systems to obsolescence, failure, and chaos.

The funny part about all this is that writing simple software usually takes less work than writing complex software does. It sometimes requires a bit more thought, but usually less time and effort overall. So let’s take a win for ourselves, a win for our users, and a win for the future, and keep it as simple as we reasonably can.

-Max

9 Comments

  1. Well, you *could* spend a lot of effort on designing and implementing a system to adapt to any possible change of circumstances that might occur in the future. Which is probably the only possible way of making sure that such a change will never happen… 🙂

    • Nope, that is impossible. The future is infinitely complex, and software cannot be. Attempts to do this are one of the primary causes of overly-complex, unmaintainable systems.

      • What you really end up with, when you try to do that, is a lot of extra complexity for a future that doesn’t arrive, and a ton of work to now adapt it to the future that really did show up.

  2. Of course the Python 2 code will still work in Python 2, including 2.7.3. There are a few things in Python 2.0 that don’t work in 2.7, but this isn’t one of them. Python 3 is deliberately incompatible.

    I think that the culture of language implementors and designers has gone a little wrong since the 1980s – there used to be a very serious concern about backwards compatibility. For instance any C compiler worthy of the name had better compile “ANSI C” without any semantic changes. I write “ANSI C”, technically ambiguous, because that is *still* the shorthand used by C programmers to mean ISO/IEC 9899:1990, which was basically settled in about 1987. C compilers mostly also do a fine job of “K&R C”, vintage 1973, although there is more variation because the semantics were not so clearly nailed down. Ditto implementations for a number of other languages from the 1970s and 1980s (e.g. Common Lisp: the Lisp implementations I use are strictly compliant with the 1994 Common Lisp standard, which was the result of a convergent standards process begun in the mid 1980s). The implementations have moved on, and can *also* handle other languages (e.g. C11), but have not dropped or impaired in any way their support for languages defined more than 20 years ago. This is a huge strength that the implementors of languages such as Python should emulate.

    • So you know, once upon a time, I wrote two blog posts about almost this exact subject. (Backwards compatibility in general, but it’s relevant to what you’ve said.)

      http://www.codesimplicity.com/post/ways-to-create-complexity-break-your-api/

      http://www.codesimplicity.com/post/when-is-backwards-compatibility-not-worth-it/

      In truth, I think that the level of backwards-compatibility that C has maintained is likely a weakness at this point. The more and more you nail down a piece of software, the more and more it will degrade over time by not being able to adjust itself or improve itself as time passes. You’re certainly correct that backwards-compatibility is a boon to developers, and it should never be broken without reason. But as I try to get across in the two articles above, the ultimate decision should be a balance between simplicity for existing users and simplicity for future users.

      -Max

    • By that criterion, Microsoft Visual C++ failed for quite a while. Only recently did it get into line with the rule that a variable declared in a for loop has the scope of the loop itself (i.e., the following was not legal in VC++ but is perfectly good ANSI):

      for (int i = 1; i < 10; i++) {
          // do something
      }

      for (int i = 1; i < 20; i++) {
          // do something else
      }

      In VC++ it used to complain that you were trying to define "i" twice in the same scope.

  3. You’re absolutely right that more mistakes will be made by trying to predict things you will need to do in the future. It is much more important to get it right for what you need now – not what you may need in years to come.

    Great article!

  4. I disagree (respectfully) with the premise of this article. After 20 years in software, I’ve watched the same mistakes get made over and over and over again, with the same statement. “no way to predict the future, bro”. This is the typical case of an over-reaction to an issue (over-developing). The argument against “building for the future” becomes “can’t know every scenario” so to heck with it. Give up and build code that is non-scalable.

    My philosophy is that a little preparation up front can save a ton of refactoring in the future. For instance:

    — build your code so that it is segregated, so that it can be easily refactored into services (see the sketch after these examples). Do you need to make it service-oriented on day one? No. Will you need to eventually? Probably. Every company I’ve been at has hit that wall where they had unbuildable, un-deployable massive monoliths of code and half a year to refactor.

    — build code/db so that it is shardable. Again, do you need to shard on day one? Probably not. On day 1000 you may need to.
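    For illustration only, here is a minimal sketch of the kind of segregation described above; the names (OrderStore and so on) are hypothetical, not anyone’s real code:

# Callers depend on a small OrderStore interface rather than on how orders
# are actually stored, so the storage (or a future service boundary) can
# change without touching them. All names here are hypothetical.

class OrderStore(object):
    """The only boundary the rest of the code is allowed to know about."""
    def save(self, order):
        raise NotImplementedError
    def find(self, order_id):
        raise NotImplementedError

class InMemoryOrderStore(OrderStore):
    """Day-one implementation: a dict. Could later be swapped for a
    database-backed store or a remote service without changing callers."""
    def __init__(self):
        self._orders = {}
    def save(self, order):
        self._orders[order["id"]] = order
    def find(self, order_id):
        return self._orders.get(order_id)

def checkout(store, order):
    # Application code talks only to the OrderStore interface.
    store.save(order)
    return store.find(order["id"])

print(checkout(InMemoryOrderStore(), {"id": 1, "total": 42}))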

    These are just examples, so don’t get stuck on the solutions. The point is, engineering is the science of using data to predict and build for the future. We are certainly (hopefully) intelligent and educated enough to gather data and make some reasonable predictions.
