Two is Too Many

There is a key rule that I personally operate by when I’m doing incremental development and design, which I call “two is too many.” It’s how I implement the “be only as generic as you need to be” rule from the Three Flaws of Software Design.

Essentially, I know exactly how generic my code needs to be by noticing that I’m tempted to cut and paste some code, and then instead of cutting and pasting it, designing a generic solution that meets just those two specific needs. I do this as soon as I’m tempted to have two implementations of something.

For example, let’s say I was designing an audio decoder, and at first I only supported WAV files. Then I wanted to add an MP3 parser to the code. There would definitely be common parts to the WAV and MP3 parsing code, and instead of copying and pasting any of it, I would immediately make a superclass or utility library that did only what I needed for those two implementations.

The key aspect of this is that I did it right away—I didn’t allow there to be two competing implementations; I immediately made one generic solution. The next important aspect of this is that I didn’t make it too generic—the solution only supports WAV and MP3 and doesn’t expect other formats in any way.

Another part of this rule is that a developer should ideally never have to modify one part of the code in a similar or identical way to how they just modified a different part of it. They should not have to “remember” to update Class A when they update Class B. They should not have to know that if Constant X changes, you have to update File Y. In other words, it’s not just two implementations that are bad, but also two locations. It isn’t always possible to implement systems this way, but it’s something to strive for.

If you find yourself in a situation where you have to have two locations for something, make sure that the system fails loudly and visibly when they are not “in sync.” Compilation should fail, a test that always gets run should fail, etc. It should be impossible to let them get out of sync.

And of course, the simplest part of this rule is the classic “Don’t Repeat Yourself” principle—don’t have two constants that represent the same exact thing, don’t have two functions that do the same exact thing, etc.

There are likely other ways that this rule applies. The general idea is that when you want to have two implementations of a single concept, you should somehow make that into a single implementation instead.

When refactoring, this rule helps find things that could be improved and gives some guidance on how to go about it. When you see duplicate logic in the system, you should attempt to combine those two locations into one. Then if there is another location, combine that one into the new generic system, and proceed in that manner. That is, if there are many different implementations that need to be combined into one, you can do incremental refactoring by combining two implementations at a time, as long as combining them does actually make the system simpler (easier to understand and maintain). Sometimes you have to figure out the best order in which to combine them to make this most efficient, but if you can’t figure that out, don’t worry about it—just combine two at a time and usually you’ll wind up with a single good solution to all the problems.

It’s also important not to combine things when they shouldn’t be combined. There are times when combining two implementations into one would cause more complexity for the system as a whole or violate the Single Responsibility Principle. For example, if your system’s representation of a Car and a Person have some slightly similar code, don’t solve this “problem” by combining them into a single CarPerson class. That’s not likely to decrease complexity, because a CarPerson is actually two different things and should be represented by two separate classes.

This isn’t a hard and fast law of the universe—it’s a more of a strong guideline that I use for making judgments about design as I develop incrementally. However, it’s quite useful in refactoring a legacy system, developing a new system, and just generally improving code simplicity.

-Max

How to Handle Code Complexity in a Software Company

Here’s an obvious statement that has some subtle consequences:

Only an individual programmer can resolve code complexity.

That is, resolving code complexity requires the attention of an individual person on that code. They can certainly use appropriate tools to make the task easier, but ultimately it’s the application of human intelligence, attention, and work that simplifies code.

So what? Why does this matter? Well, to be clearer:

Resolving code complexity usually requires detailed work at the level of the individual contributor.

If a manager just says “simplify the code!” and leaves it at that, usually nothing happens, because (a) they’re not being specific enough, (b) they don’t necessarily have the knowledge required about each individual piece of code in order to be that specific, and (c) part of understanding the problem is actually going through the process of solving it, and the manager isn’t the person writing the solution.

The higher a manager’s level in the company, the more true this is. When a CTO, Vice President, or Engineering Director gives an instruction like “improve code quality” but doesn’t get much more specific than that, what tends to happen is that a lot of motion occurs in the company but the codebase doesn’t significantly improve.

It’s very tempting, if you’re a software engineering manager, to propose broad, sweeping solutions to problems that affect large areas. The problem with that approach to code complexity is that the problem is usually composed of many different small projects that require detailed work from individual programmers. So, if you try to handle everything with the same broad solution, that solution won’t fit most of the situations that need to be handled. Your attempt at a broad solution will actually backfire, with software engineers feeling like they did a lot of work but didn’t actually produce a maintainable, simple codebase. (This is a common pattern in software management, and it contributes to the mistaken belief that code complexity is inevitable and nothing can be done about it.)

So what can you do as a manager, if you have a complex codebase and want to resolve it? Continue reading

Test-Driven Development and the Cycle of Observation

Today there was an interesting discussion between Kent Beck, Martin Fowler, and David Heinemeier Hansson on the nature and use of Test-Driven Development (TDD), where one writes tests first and then writes code.

Each participant in the conversation had different personal preferences for how they write code, which makes sense. However, from each participant’s personal preference you could extract an identical principle: “I need to observe something before I can make a decision.” Kent often (though not always) liked writing tests first so that he could observe their behavior while coding. David often (though not always) wanted to write some initial code, observe that to decide on how to write more code, and so on. Even when they talked about their alternative methods (Kent talking about times he doesn’t use TDD, for example) they still always talked about having something to look at as an inherent part of the development process.

It’s possible to minimize this point and say it’s only relevant to debugging or testing. It’s true that it’s useful in those areas, but when you talk to many senior developers you find that this idea is actually a fundamental basis of their whole development workflow. They want to see something that will help them make decisions about their code. It’s not something that only happens when code is complete or when there’s an bug—it’s something that happens at every moment of the software lifecycle.

This is such a broad principle that you could say the cycle of all software development is:

Observation → Decision → Action → Observation → Decision → Action → etc.

If you want a term for this, you could call it the “Cycle of Observation” or “ODA.”

Example

What do I mean by all of this? Well, let’s take some examples to make it clearer. When doing TDD, the cycle looks like: Continue reading

The Secret of Fast Programming: Stop Thinking

When I talk to developers about code complexity, they often say that they want to write simple code, but deadline pressure or underlying issues mean that they just don’t have the time or knowledge necessary to both complete the task and refine it to simplicity.

Well, it’s certainly true that putting time pressure on developers tends to lead to them writing complex code. However, deadlines don’t have to lead to complexity. Instead of saying “This deadline prevents me from writing simple code,” one could equally say, “I am not a fast-enough programmer to make this simple.” That is, the faster you are as a programmer, the less your code quality has to be affected by deadlines.

Now, that’s nice to say, but how does one actually become faster? Is it a magic skill that people are born with? Do you become fast by being somehow “smarter” than other people?

No, it’s not magic or in-born at all. In fact, there is just one simple rule that, if followed, will eventually solve the problem entirely: Continue reading