Code Simplicity

What Is Overengineering?

Software developers throw around this word, “overengineering,” quite a bit. “That code was overengineered.” “This is an overengineered solution.” Strangely enough, though, it’s hard to find an actual definition for the word online! People are always giving examples of overengineered code, but rarely do they say what the word actually means.

The dictionary just defines it as a combination of “over” (meaning “too much”) and “engineer” (meaning “design and build”). So per the dictionary, it would mean that you designed or built too much.

Wait, designed or built too much? What’s “too much”? And isn’t design a good thing?

Well, yeah, most projects could use more design. They suffer from underengineering. But once in a while, somebody really gets into it and just designs too much. Basically, this is like when somebody builds an orbital laser to destroy an anthill. An orbital laser is really cool, but it (a) costs too much, (b) takes too much time, and (c) is a maintenance nightmare. I mean, somebody’s going to have to go up there and fix it when it breaks.

The tricky part is–how do you know when you’re overengineering? What’s the line between good design and too much design?

Well, my criterion is this: When your design actually makes things more complex instead of simplifying things, you’re overengineering. An orbital laser would hugely complicate the life of somebody who just needed to destroy some anthills, whereas some simple ant poison would greatly simplify their life by removing their ant problem (if it worked).

This isn’t to say that all complexity is caused by overengineering. In fact, most complexity is caused by underengineering. If you have to choose, the safer side to be on is overengineering. But that’s kind of like saying it’s safer to be facing away from an atom bomb blast than toward it. It’s true (because it protects your eyes more), but really, either way, it’s going to suck pretty bad.

The best way to avoid overengineering is simple: don’t design too far into the future. Overengineering tends to happen like this: “Okay, I need some code to reverse a string. Well, might as well make a whole system for rearranging and modifying the letters in a string, since we might need that some day.” Essentially, somebody imagined a requirement without having any idea whether it was actually needed. They designed too far into the future, without actually knowing the future.
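
To make that concrete, here is a minimal sketch in Python. The language and every name in it (reverse_string, StringRearranger, and so on) are illustrative assumptions, not code from any real project. The function is all the stated requirement asks for; the class is roughly what gets built for the imagined future.

```python
from typing import Callable, Dict, List


def reverse_string(text: str) -> str:
    """All the requirement asks for: return the string reversed."""
    return text[::-1]


class StringRearranger:
    """Hypothetical overengineered version: a registry of pluggable
    letter-rearranging strategies, built for needs nobody has stated."""

    def __init__(self) -> None:
        self._strategies: Dict[str, Callable[[List[str]], List[str]]] = {}

    def register(self, name: str, strategy: Callable[[List[str]], List[str]]) -> None:
        self._strategies[name] = strategy

    def apply(self, name: str, text: str) -> str:
        letters = self._strategies[name](list(text))
        return "".join(letters)


# Both paths give the same answer for the one thing we actually need.
rearranger = StringRearranger()
rearranger.register("reverse", lambda letters: letters[::-1])

simple_result = reverse_string("hello")
complex_result = rearranger.apply("reverse", "hello")
```

The second version does nothing more for today's caller, but anyone reading it has to learn about a registry, strategy names, and a lookup that can fail before they find out that all it does is reverse a string.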

Now, if that developer really did know he’d need such a system in the future, it would be a mistake to design the code in such a way that the system couldn’t be added later. It doesn’t need to be there now, but you’d be underengineering if you made it impossible to add it later.

With overengineering, the big problem is that it makes it difficult for people to understand your code. There’s some piece built into the system that doesn’t really need to be there, and the person reading the code can’t figure out why it’s there, or even how the whole system works (since it’s now so complicated). It also has all the other flaws that designing too far into the future has, such as locking you into a particular design before you can actually be certain it’s the right one.

There are lots of common ways to overengineer. Probably the most common ways are: making something extensible that won’t ever need to be extended, and making something way more generic than it needs to be.

A good example of the first (making something extensible that doesn’t need to be) would be making a web server that could support an unlimited number of other protocols in addition to HTTP. That’s kind of silly, because if you’re a web server, then you’re sending HTTP. You’re not an “every possible protocol” server.

However, in that same situation, underengineering would be not allowing for any future extension of the HTTP standard. That’s something that does need to be extensible, because it really might change.
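
One way to picture the difference is the rough Python sketch below. Everything in it is a hypothetical illustration (the handler names, the METHOD_HANDLERS table, and the dispatch function are assumptions, not taken from any real web server). The server assumes it will only ever speak HTTP, so there is no protocol plug-in layer, but the set of HTTP methods lives in a table so that a change to HTTP itself is cheap to absorb.

```python
from typing import Callable, Dict, Tuple

# Hypothetical handler type: takes a request path, returns (status, body).
Handler = Callable[[str], Tuple[int, str]]


def handle_get(path: str) -> Tuple[int, str]:
    return 200, f"contents of {path}"


def handle_head(path: str) -> Tuple[int, str]:
    return 200, ""


# The stable assumption: this server only ever speaks HTTP.
# The extension point: HTTP itself may grow, so methods live in a table.
METHOD_HANDLERS: Dict[str, Handler] = {
    "GET": handle_get,
    "HEAD": handle_head,
}


def dispatch(request_line: str) -> Tuple[int, str]:
    """Route a request line such as 'GET /index.html HTTP/1.1'."""
    parts = request_line.split()
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return 400, "Bad Request"
    method, path, _version = parts
    handler = METHOD_HANDLERS.get(method)
    if handler is None:
        return 501, "Not Implemented"
    return handler(path)
```

Supporting a new HTTP method later means adding one entry to the table. Supporting a completely different protocol would mean a redesign, and under the assumption above that is an acceptable price for keeping the code simple now.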

The issue here is “How likely is it that this thing’s going to change?” If you can be 99.999% certain that some part of your system is never going to change (for example, the letters available in the English language probably won’t be changing much–that’s a fairly good certainty) you don’t need to make that part of the system very extensible. (Even so, it’s still good to leave a tiny little room to expand, in the very rare chance that somebody adds something like, say, the Euro symbol to the language.)

There are just some things you have to assume won’t change (like, “We will never be serving any other protocol than HTTP”)–otherwise your system just gets too complex, trying to take into account every possible unknown future change, when there probably won’t even be any future differences. This is the exception, rather than the rule (you should assume most things will change), but you have to have a few stable, unchanging things to build your system around.

The second way (making something too generic) goes like this: Imagine that the Bugzilla Project suddenly went insane, and instead of saying that Bugzilla was a “bug tracking system”, we decided to make it into a “generic system for managing data in a database through forms.” It would become terribly complex, and it would also stop being a very good bug tracker, because it would be trying to be “everything to everyone” instead of just focusing on adding good bug-tracking features. That would definitely be overengineering–we’re just trying to track bugs, but suddenly we’re a generic form system? Yep, sounds like “orbital lasers” to me. :-)

In addition to being too generic on the whole-program level, individual components of the program can also be too generic. A function that processes strings doesn’t also have to process integers and arrays, if you’re never going to be getting arrays and integers as input.
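
As a small illustration in Python (both functions are made-up examples, and counting vowels is just a stand-in for any string-processing task), compare a function that handles what it is actually given with one that handles everything it might conceivably be given:

```python
def count_vowels(text: str) -> int:
    """Focused version: accepts a string, because strings are all we get."""
    return sum(1 for ch in text.lower() if ch in "aeiou")


def count_vowels_generic(value) -> int:
    """Hypothetical too-generic version: also accepts ints and nested
    lists or tuples that no caller ever passes."""
    if isinstance(value, str):
        return sum(1 for ch in value.lower() if ch in "aeiou")
    if isinstance(value, int):
        return count_vowels_generic(str(value))
    if isinstance(value, (list, tuple)):
        return sum(count_vowels_generic(item) for item in value)
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Each extra branch is more code to read and test, and it also raises questions nobody needed answered. Python treats a boolean as a kind of integer, for instance, so the generic version quietly accepts True and has to produce some answer for it.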

You don’t have to overengineer in a huge way, either, to mess up your system. Little by little, tiny bits of overengineering can stack up into one huge complex mass.

Good design is design that leads to simplicity in implementation and maintenance, and makes it easy to understand the code. Overengineered design is design that leads to difficulty in implementation, makes maintenance a nightmare, and turns otherwise simple code into a twisty maze of complexity. It’s not nearly as common as underengineering, but it’s still important to watch out for.

-Max

15 Responses to What Is Overengineering?

  1. Pingback: zibaldone.info » Blog Archive » Sii prolifico, non perfetto. Stupidi più bravi a fare soldi.

  2. Chris says:

    Good post! I think your examples aren’t realistic enough though. Here’s a much more tangible example of overengineering:

    Say that you need to reverse a string in ONE place in your code, and to make it reusable you decide to make a function:

    function string_reverse(string)
    {
    // reverse the string
    // return the reversed string
    }

    somestring = string_reverse(somestring)

    While you are coding this, you think to yourself “wouldn’t it be AWESOME if this could handle a whole array of strings and process each of them? that COULD come in handy SOMETIME”, so you code:

    function string_reverse(string)
    {
    // if this is an array of STRINGS, loop through it:
    // {
    // –> reverse the string in the current element
    // –> add the reversed string back to the array
    // –> return the reversed array once the loop is complete
    // }
    // else (this is a string)
    // {
    // –> reverse the string
    // –> return the reversed string
    // }
    }

    somestring = string_reverse(somestring)

    Notice the ONLY thing that changed here? You just wrote a bunch of completely useless lines, and you STILL ended up with a function that did just one thing for YOU, reversing that ONE string you have.

    That’s why less overengineering means more productivity: you spent X minutes coding something you don’t need now, and could have JUST AS EASILY added the array handling LATER if you ever needed it. Instead you wasted time coding features into a function that you A) only used once in your code and B) only ever NEED to handle strings for the foreseeable future.

    I think this is a prime example of real-world overengineering. ;-)

  3. ph says:

    This string_reverse example is not great; it’s stupid, because a single string should be the same as a one-element array.

  4. Christine says:

    Hi, this is very interesting.

    I got a deliverable in a research project asking me to make sure I avoid over-engineering in the project. I worked as a programmer before and am totally with you on the way you discussed the topic above, but thinking about it: Does over-engineering not go much further than just writing complicated code rather than a few lines which would do the job? Doesn’t it start already on a management level? Is it not over-engineering in software projects in general if I, for instance, impose processes and regulations (such as unit tests) for all possible cases even if it might not be necessary?

    If anyone has any ideas or pointers to literature, that would be great!!

    Christine

    • Max Kanat-Alexander says:

      That’s a good question. I think if we’re talking about software engineering, then the over-engineering is something that happens in the software. But we could be talking about process, and the same general principle would carry over–basically, building complex structures that don’t need to be there to accomplish the purpose within the reasonably foreseeable future.

      -Max

  5. Pingback: Code Simplicity What Is Overengineering | debt solutions

  6. Palani Nama says:

    Max & others, thank you for the valuable comments. I see that we started with over “Designing” and ended up illustrating over “Coding”. Well, I would like to bring the focus back to over “Design”. In my experience, I have come across problems that arise from either extreme, and have also observed a pattern in where these extremes occur. I see that we under-design applications targeted towards solving specific business problems, for example, a pre-process for an upload job, or a standalone application doing a specific job and interacting with a limited number of systems. By contrast, enterprise solutions are over-designed on purpose, to provide interfaces for interacting with distinct, non-existent (future) systems that depend on future roadmaps.

    Well, my point here is, I couldn’t figure out a hard-and-fast rule to categorize a system as over-designed or under-designed. If you have some guidelines for classifying software systems this way, please let us know.

    • Max Kanat-Alexander says:

      Well, I think the problem with “enterprise” solutions is that they’re just trying to do too much, and often in the wrong way.

      For example, it’s just wrong to build a monolithic application instead of a collection of single, focused applications. We’ve proved this over and over for almost 40 years now in software development–it’s time that people just took the obvious proof and stopped designing systems to “do everything”.

      Once we get down to the single, focused application, that’s where under-engineering (or as you correctly put it, under-designing) comes into play too often. When you’re designing a focused application, you can actually be a lot more flexible than people think. In fact, the smaller your focus is, the more flexible you can be. For example, if you break things out into libraries, then you can really focus on the quality of the library as though it were a library that anybody was going to use.

      For example, I just wrote Parse::StackTrace–a library that I needed for actually only one purpose, but I wrote it as though lots of other people were going to use it–I gave it a full test suite, a complete API, etc. As a result, it was a lot easier for me to use, too!

      Anyhow, I could go on and on about this, but that’s the basics.

      -Max

  7. Simon Berg says:

    Hilarious! So ironic. The first post from Chris is a classic example of overengineering a comment! The analogy of the laser and the anthill was perfectly fine. It’s an analogy! It’s supposed to be short, simple, and easy for a non-experienced person to relate to. Not a literal example! Funny!

  8. Warren says:

    Here are some examples. You can decide for yourselves if they are necessarily complex, or overengineered.

    Early web browsers had to parse HTML and present it reasonably. Modern browsers must present HTML precisely, support XML, support XSLT, support a lexer/parser and (fast) compiler for JavaScript, a DOM object model, a plugin infrastructure, CSS, SSL, SVG, embedded Java, Flash, O3D.

    C++ supports ‘generics’ (templates). Generics let you define a data structure once, and then make multiple versions of it for different payload types. You write a generic linked list, and instantiate one to carry INTS, another to carry CHARACTERS, and another to carry a user-defined structure. You end up with 3x the object code, but it is typesafe. This was done in C by using void* (a pointer to an object of unknown type) as the datatype, a solution which needs 0 new lines of code.

    Programming languages are conventionally a lexer (to turn TEXT into tokens), parser (to turn TOKENS into AST [abstract syntax tree]) and compiler (to optimize the AST and produce COMPUTER INSTRUCTIONS). An alternative is to just write a text -> AST translation, which can be done easily if the language is not complicated (i.e. PREFIX: (+ 1 4 3) or POSTFIX: 1 4 + 3 +, but not INFIX: 1 + 4 + 3)
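
    A minimal Python sketch of the text -> AST shortcut for the prefix case follows. It is only an illustration under stated assumptions: the function name parse_prefix and the choice to represent the AST as nested lists are made up here, and malformed input is not handled (unbalanced parentheses raise an IndexError).

```python
from typing import List, Union

# An AST node is a number, an operator symbol, or a list of nodes:
# [operator, operand, operand, ...].
Node = Union[float, str, list]


def parse_prefix(text: str) -> Node:
    """Translate prefix text such as '(+ 1 4 3)' straight into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    position = 0

    def read() -> Node:
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            node: List = []
            while tokens[position] != ")":
                node.append(read())
            position += 1  # skip the closing ")"
            return node
        try:
            return float(token)
        except ValueError:
            return token  # an operator symbol such as "+"

    return read()
```

    The split call is still doing the job of a lexer, just a trivial one. Because every operator application is wrapped in parentheses, the parser needs no precedence rules, which is what makes the prefix form so much easier to translate than infix.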

  9. Erik says:

    Good analogy in the original post. The comments seem a bit overengineered though.

  10. Thomas says:

    Great Post! Thank you :)

  11. John says:

    I know the topic is stale, but I must chime in with a telltale sign that your devs (or you yourself) are overengineering:

    Code > DataModel

    When developers are more focused on macro-intricacies of their code VS retaining the integrity of the data model.

    I.e., when your code is building relationships to form data models VS the data models being normalized entities that identify their relationships using primary identifiers.

    Most of the time, this is a result of the “Second System” effect where you need to prove your ninja status by producing the most robust backend operations that utilize every design pattern that the gang-of-four called out.

    My view: You can always re-factor and redesign shitty code, it’s not so easy to roll back transactions after your data model is corrupted and/or fragmented.

    DataModel > Code

    Presentation > To customer/user VS multi-threaded COM apartments using a polymorphic factory pattern.
