Code Simplicity

Two is Too Many

There is a key rule that I personally operate by when I’m doing incremental development and design, which I call “two is too many.” It’s how I implement the “be only as generic as you need to be” rule from the Three Flaws of Software Design.

Essentially, I know exactly how generic my code needs to be by noticing that I’m tempted to cut and paste some code, and then instead of cutting and pasting it, designing a generic solution that meets just those two specific needs. I do this as soon as I’m tempted to have two implementations of something.

For example, let’s say I was designing an audio decoder, and at first I only supported WAV files. Then I wanted to add an MP3 parser to the code. There would definitely be common parts to the WAV and MP3 parsing code, and instead of copying and pasting any of it, I would immediately make a superclass or utility library that did only what I needed for those two implementations.

The key aspect of this is that I did it right away—I didn’t allow there to be two competing implementations; I immediately made one generic solution. The next important aspect of this is that I didn’t make it too generic—the solution only supports WAV and MP3 and doesn’t expect other formats in any way.

Another part of this rule is that a developer should ideally never have to modify one part of the code in a similar or identical way to how they just modified a different part of it. They should not have to “remember” to update Class A when they update Class B. They should not have to know that if Constant X changes, you have to update File Y. In other words, it’s not just two implementations that are bad, but also two locations. It isn’t always possible to implement systems this way, but it’s something to strive for.

If you find yourself in a situation where you have to have two locations for something, make sure that the system fails loudly and visibly when they are not “in sync.” Compilation should fail, a test that always gets run should fail, etc. It should be impossible to let them get out of sync.

And of course, the simplest part of this rule is the classic “Don’t Repeat Yourself” principle—don’t have two constants that represent the same exact thing, don’t have two functions that do the same exact thing, etc.

There are likely other ways that this rule applies. The general idea is that when you want to have two implementations of a single concept, you should somehow make that into a single implementation instead.

When refactoring, this rule helps find things that could be improved and gives some guidance on how to go about it. When you see duplicate logic in the system, you should attempt to combine those two locations into one. Then if there is another location, combine that one into the new generic system, and proceed in that manner. That is, if there are many different implementations that need to be combined into one, you can do incremental refactoring by combining two implementations at a time, as long as combining them does actually make the system simpler (easier to understand and maintain). Sometimes you have to figure out the best order in which to combine them to make this most efficient, but if you can’t figure that out, don’t worry about it—just combine two at a time and usually you’ll wind up with a single good solution to all the problems.

It’s also important not to combine things when they shouldn’t be combined. There are times when combining two implementations into one would cause more complexity for the system as a whole or violate the Single Responsibility Principle. For example, if your system’s representation of a Car and a Person have some slightly similar code, don’t solve this “problem” by combining them into a single CarPerson class. That’s not likely to decrease complexity, because a CarPerson is actually two different things and should be represented by two separate classes.

This isn’t a hard and fast law of the universe—it’s a more of a strong guideline that I use for making judgments about design as I develop incrementally. However, it’s quite useful in refactoring a legacy system, developing a new system, and just generally improving code simplicity.

-Max

17 Comments

How to Handle Code Complexity in a Software Company

Here’s an obvious statement that has some subtle consequences:

Only an individual programmer can resolve code complexity.

That is, resolving code complexity requires the attention of an individual person on that code. They can certainly use appropriate tools to make the task easier, but ultimately it’s the application of human intelligence, attention, and work that simplifies code.

So what? Why does this matter? Well, to be clearer:

Resolving code complexity usually requires detailed work at the level of the individual contributor.

If a manager just says “simplify the code!” and leaves it at that, usually nothing happens, because (a) they’re not being specific enough, (b) they don’t necessarily have the knowledge required about each individual piece of code in order to be that specific, and (c) part of understanding the problem is actually going through the process of solving it, and the manager isn’t the person writing the solution.

The higher a manager’s level in the company, the more true this is. When a CTO, Vice President, or Engineering Director gives an instruction like “improve code quality” but doesn’t get much more specific than that, what tends to happen is that a lot of motion occurs in the company but the codebase doesn’t significantly improve.

It’s very tempting, if you’re a software engineering manager, to propose broad, sweeping solutions to problems that affect large areas. The problem with that approach to code complexity is that the problem is usually composed of many different small projects that require detailed work from individual programmers. So, if you try to handle everything with the same broad solution, that solution won’t fit most of the situations that need to be handled. Your attempt at a broad solution will actually backfire, with software engineers feeling like they did a lot of work but didn’t actually produce a maintainable, simple codebase. (This is a common pattern in software management, and it contributes to the mistaken belief that code complexity is inevitable and nothing can be done about it.)

So what can you do as a manager, if you have a complex codebase and want to resolve it? Continue reading

9 Comments

Test-Driven Development and the Cycle of Observation

Today there was an interesting discussion between Kent Beck, Martin Fowler, and David Heinemeier Hansson on the nature and use of Test-Driven Development (TDD), where one writes tests first and then writes code.

Each participant in the conversation had different personal preferences for how they write code, which makes sense. However, from each participant’s personal preference you could extract an identical principle: “I need to observe something before I can make a decision.” Kent often (though not always) liked writing tests first so that he could observe their behavior while coding. David often (though not always) wanted to write some initial code, observe that to decide on how to write more code, and so on. Even when they talked about their alternative methods (Kent talking about times he doesn’t use TDD, for example) they still always talked about having something to look at as an inherent part of the development process.

It’s possible to minimize this point and say it’s only relevant to debugging or testing. It’s true that it’s useful in those areas, but when you talk to many senior developers you find that this idea is actually a fundamental basis of their whole development workflow. They want to see something that will help them make decisions about their code. It’s not something that only happens when code is complete or when there’s an bug—it’s something that happens at every moment of the software lifecycle.

This is such a broad principle that you could say the cycle of all software development is:

Observation → Decision → Action → Observation → Decision → Action → etc.

If you want a term for this, you could call it the “Cycle of Observation” or “ODA.”

Example

What do I mean by all of this? Well, let’s take some examples to make it clearer. When doing TDD, the cycle looks like: Continue reading

6 Comments

The Purpose of Technology

In general, when technology attempts to solve problems of matter, energy, space, or time, it is successful. When it attempts to solve human problems of the mind, communication, ability, etc. it fails or backfires dangerously. Continue reading

9 Comments

The Secret of Fast Programming: Stop Thinking

When I talk to developers about code complexity, they often say that they want to write simple code, but deadline pressure or underlying issues mean that they just don’t have the time or knowledge necessary to both complete the task and refine it to simplicity.

Well, it’s certainly true that putting time pressure on developers tends to lead to them writing complex code. However, deadlines don’t have to lead to complexity. Instead of saying “This deadline prevents me from writing simple code,” one could equally say, “I am not a fast-enough programmer to make this simple.” That is, the faster you are as a programmer, the less your code quality has to be affected by deadlines.

Now, that’s nice to say, but how does one actually become faster? Is it a magic skill that people are born with? Do you become fast by being somehow “smarter” than other people?

No, it’s not magic or in-born at all. In fact, there is just one simple rule that, if followed, will eventually solve the problem entirely: Continue reading

33 Comments

Make It Never Come Back

When solving a problem in a codebase, you’re not done when the symptoms stop. You’re done when the problem has disappeared and will never come back.

It’s very easy to stop solving a problem when it no longer has any visible symptoms. You’ve fixed the bug, nobody is complaining, and there seem to be other pressing issues. So why continue to do work on it? It’s fine for now, right?

No. Remember that what we care about the most in software is the future. The way that software companies get into unmanageable situations with their codebases is not really handling problems until they are done.

This also explains why some organizations cannot get their tangled codebase back into a good state. Continue reading

4 Comments

The Philosophy of Testing

Much like we gain knowledge about the behavior of the physical universe via the scientific method, we gain knowledge about the behavior of our software via a system of assertion, observation, and experimentation called “testing.”

There are many things one could desire to know about a software system. It seems that most often we want to know if it actually behaves like we intended it to behave. That is, we wrote some code with a particular intention in mind, does it actually do that when we run it?

In a sense, testing software is the reverse of the traditional scientific method, where you test the universe and then use the results of that experiment to refine your hypothesis. Instead, with software, if our “experiments” (tests) don’t prove out our hypothesis (the assertions the test is making), we change the system we are testing. That is, if a test fails, it hopefully means that our software needs to be changed, not that our test needs to be changed. Sometimes we do also need to change our tests in order to properly reflect the current state of our software, though. It can seem like a frustrating and useless waste of time to do such test adjustment, but in reality it’s a natural part of this two-way scientific method–sometimes we’re learning that our tests are wrong, and sometimes our tests are telling us that our system is out of whack and needs to be repaired.

This tells us a few things about testing:

  1. The purpose of a test is to deliver us knowledge about the system, and knowledge has different levels of value. Continue reading
5 Comments

Users Have Problems, Developers Have Solutions

In the world of software, it is the job of software developers to solve the problems of users. Users present a problem, and the developers solve it. Whenever these roles are reversed, trouble ensues. Continue reading

3 Comments

The Accuracy of Future Predictions

One thing we know about software design is that the future is important. However, we also know that the future is very hard to predict.

I think that I have come up with a way to explain exactly how hard it is to predict the future of software. The most basic version of this theory is:

The accuracy of future predictions decreases relative to the complexity of the system and the distance into the future you are trying to predict.

As your system becomes more and more complex, you can predict smaller and smaller pieces of the future with any accuracy. As it becomes simpler, you can predict further and further into the future with accuracy.

For example, it’s fairly easy to predict the behavior of a “Hello, World” program quite far into the future. It will, most likely, continue to print “Hello, World” when you run it. Remember that this is a sliding scale–sort of a probability of how much you can say about what the future holds. You could be 99% sure that it will still work the same way two days from now, but there is still that 1% chance that it won’t.

However, after a certain point, even the behavior of “Hello World” becomes unpredictable. For example, “Hello World” in Python 2.0 the year 2000:

print "Hello, World!"

But if you tried to run that in Python 3, it would be a syntax error. In Python 3 it’s:

print("Hello, World!")

You couldn’t have predicted that in the year 2000, and there isn’t even anything you could have done about it if you did predict it. With things like this, your only hope is keeping your system simple enough that you can update it easily to use the new syntax. Not “flexible,” not “generic,” but simply simple to understand and modify.

In reality, there’s a more expanded logical sequence to the rule above: Continue reading

8 Comments

Code Simplicity, Second Revision

In June, I released a second revision of Code Simplicity. Some of you probably already know, but I thought that I should let everybody else know, too.

The most important change is that book gets into the laws and rules of software design much more quickly now. It starts with a completely re-written Preface that tells the story of how I developed the principles in Code Simplicity, and why you might be interested in them. Then it gets into a much shorter Chapter 1 that distills everything from the old Chapter 1 into a few short pages, skips the old Chapter 2 (which was a long discussion about what it means for something to be a science) and goes right into the laws.

Particularly if you’ve read the original version, I’d really love to hear your feedback on how the starting content of the new revision feels to you!

-Max

P.S. If you bought the ebook from O’Reilly, you get every new revision for free, and there will probably be even more revisions than this one! If you got the ebook elsewhere, there’s a little link inside of the book itself that will let you “upgrade” to the O’Reilly editions for pretty cheap, so that you can get this revision and every other future revision for free, too. I’m not partial to any particular method of you getting the book, but the O’Reilly editions are definitely the best way to get the new revisions as they come out.

5 Comments