Code Simplicity

The Helpful Developer

2026-02-05T10:55:42Z

Over the past few years, we have been so focused on worrying about the future of our own profession that most of us have failed to notice the most important thing that’s about to happen:

Everybody will be able to write software.

This isn’t a theoretical idea. There are already people in HR, Finance, and Sales at companies across the world using AI assistants to write complete computer programs. This isn’t entirely new—everybody is familiar with “that giant spreadsheet that somebody in Finance wrote for their own use.” The difference is that AI assistants unlock the full power of computer programming for everybody. However, they also unlock the full dangers and consequences of computer programming—a series of lessons that we as a profession have learned the hard way over a period of 60+ years.

We can worry about this, complain about it, try to ignore it, etc. But what we really should ask ourselves is: what are we going to do about it?

Well, there are a series of things that we need to do, but today I’m going to focus on just one:

We need to help the new builders.

Helping Others Understand

Too often, we have been dismissive of people with less technical background who seek to build software. We tell them it’s too hard to understand why we do what we do. We tell them they could never understand and they just have to trust us.

We do this in part out of arrogance (we really do have a craft that we have spent years learning, practicing, and polishing) but also because it’s actually really hard to explain to people all the lessons of software engineering without making them actually do it for years.

All of this needs to change in this new world. People with all sorts of backgrounds, technical or not, are going to be writing real software programs. We need to make it our mission to help them, not to tell them they’re “holding it wrong.”

We need to get good not just at building software, but at explaining to others why software is built in a certain way. Otherwise, millions of people are going to be building things that will come back to bite them in ways they can’t currently imagine or predict. And the greatest problem is that the consequences of bad decisions in software sometimes don’t show up for months or years, so people will not simply learn these lessons on their own. By the time they learn them, it will be too late. The damage will be large enough that it becomes your problem anyway: you will have to deal with security issues, massive infrastructure costs, unmaintainable systems, and numerous other consequences that by then are causing serious issues for your business.

The solution is not to forbid people from using these tools. The tools give them too much power, too much convenience, and too much legitimate value. They will find some way to use them, or revolt against your restrictions until you give up. We have to lean into how we will help them, how we will teach them.

The very minimum bar is that we have to help them understand when they need to reach out to an engineer. How far should they go without help (which could be pretty far)? At what point do they need professional assistance, advice, or a little bit of education?

Won’t The Agents Just Do It?

I’m sure that some people will argue that this isn’t necessary, that the AI agents will handle the whole process, and so forth. Or that the agents will get so good at writing code that they will avoid all the consequences, and so everybody can write software of any level of complexity perfectly safely.

I’m almost entirely confident that won’t happen, because there are some problems that are only solved by understanding they exist and having the intention to solve or prevent them. Even if the agent wrote perfect code that did exactly what you said to do, you’d have to know to tell it to do certain things.

Just as an example, recently I talked to a non-technical person who had been building software using a modern AI agent. This person had no background in software engineering at all. They described how they had “discovered” that they got better results if they experimented with multiple solutions before choosing one. I’m sure at some point they will also “discover” that they get better results if they define a set of requirements, if they research more up front, if they write out clearer specifications, if they make more incremental changes to the system, if they release more frequently, and so on and so on. While agents will get better at doing some of this on their own, there will always be some areas where the agent simply can’t execute an intention you didn’t express, or where you even told it explicitly to do the wrong thing.

The problem with software development is that it can go wrong in an infinite number of ways. Recently I heard a story of a team that racked up a huge infrastructure bill in a single month, caused by bugs in software that had been written by non-developers. Sure, you could design a system to prevent that specific failure mode, but there are infinitely many more failure modes hiding in the bushes, just waiting to jump out and attack unsuspecting citizen developers.

There are solutions to a lot of this. There’s infrastructure you could build to make it much safer for such “citizen developers” to write software. But not everybody will have that infrastructure, and even when they do, it will never cover every possible edge case. Plus, some people will intentionally or unintentionally “work around” the infrastructure to do the wrong things anyway.

I could provide all the healthy food in the world to people, make it easily accessible, and some people would still eat terribly, because they don’t understand nutrition. This isn’t a technical problem, and so it doesn’t have a technical solution.

This is a Repeating Pattern

In the history of software engineering, every ten years somebody comes along and invents some new paradigm of development and re-discovers all the lessons we have learned about software development over the last many decades.

Once upon a time we had structured programming (blocks and loops). Then we developed object-oriented programming, and everybody re-learned all the lessons we had already learned from structured programming. We were putting software on mainframes, and then we were shipping boxed software for personal computers and had to re-learn all the lessons about development processes that had already been learned in the mainframe era. When web development started to rise, the whole world of web development rediscovered all the tools, structures, and processes of software development that software engineers had already known for decades.

This pattern has repeated over and over. When mobile development rose, it went through the same cycle. Machine learning engineers went through it and are still going through it.

This pattern repeats because there are aspects of building software systems that are inherent in solving the problem, no matter what shape the problem takes. It’s also worth understanding that this pattern isn’t new, and that every time it’s happened, we have been dismissive of the new developers much to their detriment and ours.

This time, though, it’s not just a small group discovering a new paradigm of software development. Potentially, it’s everybody in the world discovering software development itself.

So here is my plea to you. When you find somebody using an AI agent to build software with no understanding of it, don’t dismiss them. Don’t throw your hands up and walk away. Don’t lecture them about how they are doing everything wrong and how they have to work and study for years to do it the right way. Listen to their troubles, gently give them specific suggestions to help with their immediate situation, give them a “tip” or two that will help them do better in the future, and let them know you’re there when they have more questions.

Over time, I suspect that this will actually become more and more of our role as software developers—not just to build things, but to help mentor and guide a world that is newly coming to discover both the joys and the pains of building real programs.

-Max

The post The Helpful Developer appeared first on Code Simplicity.

What is an Engineer?

2026-01-12T13:13:56Z

As I write this, it’s the start of 2026 and many people are in a crisis about what their job is. “If AI writes all the code, am I still an engineer?” “Will the world still need software engineers?” “What really is an engineer, anyway?”

Really, all of this comes down to that last question: what really is an engineer, anyway? Well, I have an answer to that, and I think it will help bring clarity to a lot of people. To get there, however, we’re going to have to go through a few logical steps that will seem stupidly obvious at first. At the end, though, it will be super eye-opening.

Let’s start off with a stupidly obvious definition of “engineer”:

An engineer (noun) is a person that engineers (verb).

Pretty stupid, right? Okay, so next step, what does “engineer (verb)” mean?

Engineer (verb) means “do engineering (the activity).”

Well, not interesting yet. It’s the next step where this starts to get interesting: what is engineering?

What is Engineering?

Usually I love dictionaries, but in this case, the dictionary is unhelpful. It says that the basic definition of engineering is: “the branch of science and technology concerned with the design, building, and use of engines, machines, and structures.” (New Oxford American Dictionary) This definition is why some people say “software engineers are not engineers.” They don’t build things in the physical universe. You could say that software engineers are building machines, they’re just logical machines instead of physical ones, but that really starts to feel like you’re stretching the definition. This is why many people insist on using the word “developer” instead of “engineer” when talking about people who write software.

The real problem with this definition, though, is that it doesn’t clarify anything for us. As AI becomes better and better at the design and building of software, is it the engineer now? That doesn’t sound right. We are still left wondering: what really is engineering?

Well, I have long had a definition of engineering that I really like, and I think it clarifies this tremendously. We will start off with the most basic version of this definition, and then build on it from there to understand what the job of engineers really is. The simplest statement is:

Engineering is taking the right actions in the right sequence.

You can use it exactly like that, and I often do. However, there are a lot of words doing a lot of heavy lifting in that definition, in particular, the word “right.” What is a “right action” and what is a “right sequence?” And why is “sequence” in there? Well, there’s a lot more to learn here, so let’s get into it.

Right Action

Most simply, a right action is one that successfully accomplishes your intention. If I intend to get dirt off my hands, washing them with soap and water is a right action. Putting my hands back in the dirt would be a wrong action.

Actions can be more or less right. If my hands are very messy, I can partially clean them with water alone. So that’s somewhat right. But using both water and soap is better.

People can have multiple intentions. They can say, “I want to wash my hands but I also want to conserve water and soap.” So then the right action would involve getting the hands clean with minimal use of water and soap. Every real engineering project I’ve ever worked on has multiple intentions. For example, “we need to solve this problem, but do it in a certain time and with a certain limited set of resources.” In fact, those specific constraints are so common that some engineers believe those are the only constraints they ever have to think about: deliver the expected result on time and under budget. But there can actually be an infinite number of different intentions, and real projects have many, many complex intentions. Just as an example, here is a set of intentions that could exist for a project:

Spend less money on infrastructure than the product makes.
Provide engineers interesting problems so that they keep working here.
Split up the work so that the four people on the team can all be contributing.
Ensure the CEO is happy with the product.
Make the product appealing to a whole new set of users.
Don’t lose any of our existing users.
Get a person on the team promoted who really deserves it, by showing they can tackle challenging problems.

And so on, and so on. Real projects often have lots of intentions.

Taking this all into account, we see that we can expand our definition of engineering:

Engineering is taking the right actions in the right sequence in order to accomplish a set of intentions.

Intentions

Realizing that the definition has to do with intentions has all sorts of fascinating consequences that you can see play out in real engineering projects. Do we know what our intentions actually are? Does everybody agree about what the intentions are? Many teams have built the wrong things due to not knowing the answers to those questions.

You might also wonder, “how do we know if we have the right intentions?” If you think it through, you will realize that there is some higher-level, more senior intention that you’re trying to accomplish:

“Why are we building this user interface that lets people add numbers?”
“Oh, we’re building it because we are trying to build a calculator.”
“Why are we trying to build a calculator?”
“Because our users need to do math while they are working in this other part of the system.”
“Why do our users need to do math while they are working there?”

And so on and so forth. You can keep playing this “why” game until you get up to something like “because we have to survive” or “because we want to help people” if you really want. Usually you only need to do it until you understand the overall intention of a system you’re trying to build, or the overall intentions of the company/group you’re in. For example, you know, “our company exists to build bikes for kids,” and so at some level you can say “we are doing this because it helps us build bikes for kids.” This process of asking “why do we want to accomplish this intention” is how you determine if your intentions are “right” or not.

Doing this “why” game often tremendously improves the quality of engineering work that gets done. You very frequently discover that you are taking the wrong actions (or were about to take the wrong actions) by discovering that your intentions were not aligned with the more senior intentions.

Thus, you could actually say:

Engineering is taking the right actions in the right sequence in order to accomplish the right intentions.

Right Sequence

So why is “sequence” in the definition? Well, this is one of the key things that makes the definition of “engineering” different from “do good stuff.” Engineering projects always involve more than one step, and in order to accomplish one’s intentions, the order in which those steps are done matters a great deal.

Let’s say the intention is: “every time our hands get dirty, we will get them effectively clean.” Here’s a wildly wrong sequence:

Put hands under the faucet.
Remove hands from under the faucet.
Turn on water.
Put soap on hands.
Turn off water.
Rub hands together.
Get hands dirty.

If you look at that sequence and you just feel like you have to fix it, you may be an engineer. This is part of the art and science of what we do: we put many actions in sequence to accomplish intentions. Knowing what that right sequence is is a skill we develop through education and experience.
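
One way to make "wrong sequence" concrete is to write down which steps depend on which, and then check a proposed order against those dependencies. The sketch below is only an illustration. The step names come from the list above, but the prerequisite table, the "right" ordering, and the function name sequence_problems are assumptions made for this example. Other orderings could be just as valid.

```python
# Hypothetical prerequisite table: each step maps to the steps that should
# already have happened before it. This is one plausible set of dependencies,
# assumed for illustration only.
PREREQUISITES = {
    "Get hands dirty": [],
    "Turn on water": ["Get hands dirty"],
    "Put hands under the faucet": ["Turn on water"],
    "Put soap on hands": ["Put hands under the faucet"],
    "Rub hands together": ["Put soap on hands"],
    "Remove hands from under the faucet": ["Rub hands together"],
    "Turn off water": ["Remove hands from under the faucet"],
}

# The wildly wrong sequence from the text, in the order given there.
WRONG_SEQUENCE = [
    "Put hands under the faucet",
    "Remove hands from under the faucet",
    "Turn on water",
    "Put soap on hands",
    "Turn off water",
    "Rub hands together",
    "Get hands dirty",
]

# One ordering that satisfies every assumed prerequisite.
RIGHT_SEQUENCE = [
    "Get hands dirty",
    "Turn on water",
    "Put hands under the faucet",
    "Put soap on hands",
    "Rub hands together",
    "Remove hands from under the faucet",
    "Turn off water",
]


def sequence_problems(steps):
    """Return (step, missing_prerequisite) pairs for steps taken too early."""
    done = set()
    problems = []
    for step in steps:
        for needed in PREREQUISITES[step]:
            if needed not in done:
                problems.append((step, needed))
        done.add(step)
    return problems


if __name__ == "__main__":
    for step, needed in sequence_problems(WRONG_SEQUENCE):
        print(f"'{step}' happened before '{needed}'")
```

Both lists contain exactly the same actions. Only the order differs, and that alone is enough to separate a sequence that works from one that doesn't.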

Sequences of Sequences

Sequences almost always contain other smaller sequences. There are sequences inside of sequences inside of sequences. For example, even “Turn on water” from above has a sequence:

Move hand to faucet.
Turn faucet in the right direction to turn it on.
Observe water as it comes out and adjust faucet until the intended volume is coming out.

And then each of those points has a sequence, and so on and so on.

In real engineering projects, this manifests in very large sequences that get broken down into smaller sequences. For example, if you want to build a website that has your resume on it so that you can get a new job, you might:

Investigate resume formats that have been successful at getting people jobs.
Build a website that has my resume in a format that will successfully get me hired.

Then you would have to break down each of those steps into smaller projects. For example, the “investigate” step might look like:

Search the Internet for advice on resumes.
Read advice until I feel like I have a good idea of what to build.
List out friends who recently got jobs in the same field.
Ask those friends how they structured their resumes.
Write up brief notes on what I learned.

In that list, we start to see a little bit of the art involved in sequencing actions. Should we search the Internet first or should we ask our friends first? Well, the logic here is that searching the Internet can be done quickly, so I will get whatever benefit I can from that. But maybe actually the right sequence is to make the list of friends first, then email them and then do my Internet searches. That way I’m doing something productive while I wait for them to respond. This also exposes another intention that I have: “use my time efficiently.”
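
The trade-off between those two orderings can be put into rough numbers. This is a back-of-the-envelope sketch: every duration below is invented for illustration, and the function names are hypothetical. It only shows that when one step involves waiting, doing other work during the wait shortens the total elapsed time.

```python
# All durations are made-up numbers, in hours, purely for illustration.
SEARCH_AND_READ = 3.0          # search the Internet and read advice
LIST_AND_EMAIL_FRIENDS = 0.5   # list friends and send them a question
WAIT_FOR_REPLIES = 24.0        # time until friends answer (no work for me)
WRITE_NOTES = 1.0              # write up what I learned


def search_first():
    """Search, then email friends, then wait idle until replies arrive."""
    return (SEARCH_AND_READ + LIST_AND_EMAIL_FRIENDS
            + WAIT_FOR_REPLIES + WRITE_NOTES)


def email_first():
    """Email friends first, so searching overlaps with waiting for replies."""
    overlapped = max(WAIT_FOR_REPLIES, SEARCH_AND_READ)
    return LIST_AND_EMAIL_FRIENDS + overlapped + WRITE_NOTES


if __name__ == "__main__":
    print("search first:", search_first(), "hours elapsed")
    print("email first: ", email_first(), "hours elapsed")
```

With these assumed numbers, the second ordering finishes earlier by exactly the time spent searching, because that work is hidden inside the wait.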

Engineers vs. Non-Engineers

Sequencing is where we start to see the core difference between how an engineer approaches a problem and how a non-engineer approaches a problem. A non-engineer will take any winding set of steps that vaguely accomplishes an intention, usually some narrow, simpler intention. Their process for building a resume might be, “I will look at one resume and copy it.”

Non-engineers are sometimes not aware of sequences at all, or only realize that there are additional steps when those additional steps hit them in the face. They might think, “I want to have a website,” and that the whole sequence is: “I will build this website tomorrow and be done.” Then a year later, something goes wrong with the website. Only then do they realize they also have to maintain that website. A professional software engineer would have known that from the start.

To be fair, this isn’t a black and white thing. Everybody has the experience of discovering there was some step they should have done, and only learning about it when it hits them in the face. Think how many engineers have realized, “Oops, I should have written more tests while I was writing my code, and now it’s really hard to go back and add them.” That doesn’t mean they are non-engineers. It just means they have more to learn, like the rest of us.

In fact, all of us are probably non-engineers in some part of our lives. I am not a plumber; I have attached shower heads in non-optimal ways and only realized the problem later. Knowing the right sequence requires expertise, and nobody has expertise in everything.

Sequences Are Not Absolutely Right or Wrong

Just like actions, some sequences are better than others, and some are worse than others.

Sometimes, the difference simply doesn’t matter that much. Like, if you’re washing your hands, you can argue whether you should get them wet first and then put soap on them, or whether you should rub soap on dry hands and then wash it off with water. The truth is, both of those sequences are going to get your hands sufficiently clean for most purposes, in the same amount of time, with the same amount of effort. There is a point beyond which optimizing the sequence just doesn’t matter.

However, that doesn’t mean that all sequences are equivalent. Too many times, I’ve heard people say things like, “It doesn’t matter how we get there as long as we get there.” There’s so much confusion in the world about “does it matter how we get the result?” If you’re doing engineering, the answer is: how we get the result is the fundamental nature of my job. If you think “how we get the result doesn’t matter,” then you simply are not doing engineering and you are not an engineer. There are, unquestionably, better ways and worse ways to accomplish a set of intentions.

(If you’re curious about how to figure out the right sequence of actions in software engineering, that’s essentially the core message of my first book, Code Simplicity: The Fundamentals of Software, which is free.)

All that said, there’s also no perfect sequence. It is possible to spend so much time trying to optimize the sequence in advance that you end up impeding your intentions. For example, if you spend three months trying to figure out the right sequence of actions for something that takes one day to build, you’ve wasted three months. If your intention was, “I’d like to engage in an academic exercise to try to optimize a sequence,” that’s fine. But if you had the intention, “I would like to have the maximum impact for the least investment of my time,” then you would totally fail at that intention.

Most engineering teams I’ve worked with do not spend enough time ensuring they are taking their actions in the right sequence. So usually, the problem here is that there’s not enough optimization of the sequence. But you can overdo it, so I wanted to mention that too.

Prerequisites to Action

Before one acts, one must decide how to act and then decide that one will act. Let’s say I have the intention to pick up a piece of food. Am I going to do it with my fingers or with a fork? Then after I decide between those, I have to actually decide to do it, and then do it.

Some decision always precedes an action. Before you found a company to build a software product, you have to decide to do that. The same goes for designing a new feature in your software, fixing a bug, writing a single function, or any other action you take as an engineer. There is always some decision involved.

Then, before decision, there is always some observation, perception, or data that triggered the decision or that helps us make the decision. You have seen some problem, you’ve learned some data, you have some reasoning, you feel some way, etc. For example, “we observed a user get confused about this button, so we have decided to make it clearer,” or “I have written a lot of software and I know that if you name your variables this way it will confuse the other programmers on this project.”

In life, a person could make a decision with no input (like “I have decided to yell for no reason”) but those aren’t the sort of decisions or actions that occur in engineering. In engineering, all decisions are based on some form of input.

I nearly always simplify this and just say that “observation” is required before a decision. Theoretically that could mean observation of your own thoughts or feelings, but if we want to be effective in the physical universe, it nearly always means “something I have perceived outside of me in the physical universe.” I find that the word “observe” tends to be the best for communicating that.

Thus, there is a universal sequence that is actually involved in all action: Observe, Decide, Act. Before I pick up a piece of food, I have to observe the food, observe that I’m hungry, observe the environment, observe my hand, and observe the fork. Then I have to decide whether to use my hand or fork based on those observations. Then I have to decide to actually pick up the food, and then do it.
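
For readers who think in code, the Observe, Decide, Act sequence can be sketched as three small functions called in order. This is a toy model of the food example, not a claim about how any real system should be structured. The Observation fields, the decision rule, and all function names are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical inputs to the decision; a real one would have many more.
    hungry: bool
    food_is_messy: bool
    fork_available: bool


def observe(surroundings):
    """Observe: gather the inputs the decision will depend on."""
    return Observation(
        hungry=surroundings["hungry"],
        food_is_messy=surroundings["food_is_messy"],
        fork_available=surroundings["fork_available"],
    )


def decide(observation):
    """Decide: choose how to act, or return None to not act at all."""
    if not observation.hungry:
        return None
    if observation.food_is_messy and observation.fork_available:
        return "pick up food with fork"
    return "pick up food with fingers"


def act(action):
    """Act: carry out the decision (here it is only reported)."""
    return f"done: {action}"


def observe_decide_act(surroundings):
    """Run the three steps in their fixed order."""
    observation = observe(surroundings)
    action = decide(observation)
    if action is None:
        return "no action taken"
    return act(action)


if __name__ == "__main__":
    print(observe_decide_act(
        {"hungry": True, "food_is_messy": True, "fork_available": True}
    ))
```

Note that decide can only use what observe handed it. If an input is missing or wrong, the decision and the action built on it inherit the problem.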

Observation is King

You will find that the quality and quantity of observation is what matters the most for taking the right actions. Having the right inputs to a decision is what lets you make optimal decisions. The better your observation is, the better decision you will make, and the better the actions will be as a result.

If you wondered above “how can I know that I’m taking the right action before I take it,” or “how can I know that I took the right action after I took it,” this is the answer to both questions.

Often, people refer to this as “data” or “being data-driven.” In the full definition of the word “data,” that is accurate. However, people often don’t fully realize what data is—sometimes they think it’s just numbers in a table or in a graph. Now, don’t get me wrong—having real numbers in a table or a graph can be super powerful, if it helps you make the right decision. But there are so many types of observations that go into making a good decision that it’s dangerous to believe there’s only one type of observation that’s valid. Since people so often misinterpret the word “data,” I use “observe” and “observation” instead.

On the other side, there are many people who believe that their own thoughts and feelings are sufficient to engineer a solution that will be used by many other people. That’s almost always an insufficient level of observation for making a good decision.

In fact, we are starting to even see what observations we need, and what the purpose of observation is, in engineering:

In engineering, the purpose of observation is for making decisions that lead to action.

Observations that don’t help us make decisions are useless. If I want to build software for tax accountants to do taxes, observations about a fish in the sea don’t help me. They might have other value for me personally (maybe I want to learn about cool fish) but they aren’t useful for engineering on my software.

Some observations are more useful than others in helping us make decisions, too. For example, let’s imagine I’m building that tax accounting software and I’m trying to understand the problems those accountants have. A story from one of my friends who once talked to a tax accountant has some value, though it’s pretty minor. On the other hand, if a professional survey analysis company did a very thorough survey and analysis of my exact target audience, pinpointing the exact problems that tax accountants want solved, that would be tremendously valuable for me.

So we can say:

In engineering, observations are valuable to the degree that they help us make the right decisions that lead to us taking the right actions in the right sequence.

The Full Definition

This gets us to the final form of our definition:

Engineering is making the right observations, making the right decisions, and taking the right actions, in the right sequence, in order to accomplish the right intentions.

This gives us a huge range of things that engineers do that make them engineers. It shows us where we can grow, how we can improve our craft, and how we can know if we are doing our job well or less well. Maybe we know how to code, but not how to gather the data required to make those decisions, so we can grow there. Maybe we haven’t really studied up on how to sequence actions in our projects, so we can improve there. Maybe we are great at planning, but not so great at execution (taking action) and so that should be our focus.

As a side note, one of the reasons that I am so skeptical of standardized processes for software development (people showing up from consulting companies and saying “this is the exact, rigid process everybody must follow for all software engineering forevermore”) is that determining these sequences is one of the core jobs of an engineer. I have seen too many times where some overly-rigid process causes a team to do something stupid. Then I go and ask that team if they knew they were doing something stupid, and an engineer on the team tells me, “I totally knew we were doing something stupid and I knew what the right thing to do was instead, but I have been told I must do these things this way.”

That said, large-scale software development really is a team sport. It involves the coordination of lots of people. There isn’t a single person that’s going to do all of the observation, decision, action, and sequencing. There do have to be some agreed-upon methods of collaboration within a team. You can’t have everybody operating on a different sequence, a different set of intentions, conflicting observations, etc. So I’m supportive of setting up agreements about how a team or company will work, as long as they exist so that they enable engineers to take the right actions in the right sequence to accomplish the right intentions. There’s a lot more I could say about this and how to design a process like that appropriately, but that’s enough text that it would require a whole other blog post.

For now, I hope that I’ve helped clarify that an engineer isn’t just “a person who writes code.” It’s a person with a very deep set of skills that go far beyond that. An engineer has almost infinite potential for growth across every area of observation, decision, action, sequencing, and intention.

Exactly how that work gets done might change. How much effort is spent on each of those parts might be different at different points in the future. How much one engineer can accomplish might change. But there will always be engineers, as long as there are people living on Earth.

The post What is an Engineer? appeared first on Code Simplicity.

How to Build a Great Platform

2025-12-09T07:41:52Z

In the field of software engineering, we often talk about using or building a “platform.” Let’s talk about what that really means and the principles for how to build a great platform.

What is a Platform?

Sometimes people think of a platform as being “a thing a lot of people use” or “a system built by one team that is depended on by a lot of other teams.” I have a more nuanced definition that I think helps us differentiate between a product and a platform:

A platform is a system that has multiple independent customers, where the customers of that system can use the system and modify its behavior without requiring coordination with other customers.

Often, we think of a platform as a system that “runs” other people’s systems. A good example of that type of platform is Amazon Web Services. Amazon provides a way for you to run your own systems (including various pieces you would need to run those systems, like storage, routing, etc.) “on top” of AWS. When you use AWS, you don’t have to ask every other AWS customer things like, “Is it okay if I deploy right now?” or “Can I get five hosts for running my service?” It’s very hard for what you are doing to negatively affect other customers, and it’s very hard for what other customers are doing to negatively affect you.

There are other types of platforms, though. For example, I once helped develop a platform that allowed people to display their own metrics in a common dashboard. They wrote the code that generated the metric, ran that code somewhere else, and our system just displayed the metric in a standardized fashion on a company-wide dashboard. Nothing was really “running” on our platform, but customers still could use the system and modify its behavior for their own needs without requiring coordination with other customers.
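
To show what "without requiring coordination with other customers" can look like in code, here is a minimal sketch loosely modeled on the dashboard example. It is not the system described above. The Dashboard class, its method names, and the team and metric names are all hypothetical. The one design point it illustrates is that each customer writes only into their own namespace, so two customers never need to agree on anything.

```python
class Dashboard:
    """Toy metrics dashboard: customers publish values, the platform displays them."""

    def __init__(self):
        # Keyed by (customer, metric) so customers cannot collide with each
        # other, even if they pick the same metric name.
        self._values = {}

    def publish(self, customer, metric, value):
        """Record the latest value a customer computed somewhere else."""
        self._values[(customer, metric)] = value

    def render(self):
        """Return every metric in one standardized format, sorted by customer."""
        return [
            f"{customer}/{metric}: {value}"
            for (customer, metric), value in sorted(self._values.items())
        ]


if __name__ == "__main__":
    dashboard = Dashboard()
    # Two teams choose the same metric name without talking to each other.
    dashboard.publish("payments", "latency_ms", 120)
    dashboard.publish("search", "latency_ms", 45)
    for line in dashboard.render():
        print(line)
```

The platform decides how metrics are displayed, and the customers decide what gets displayed. Neither side has to ask the other for permission.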

Broad and Narrow Platforms

Theoretically, a website like YouTube is a platform in the most limited sense. You upload a video, and that “modifies the behavior of the system” so that it now shows your video. You don’t have to check with every other person uploading a video on YouTube if it’s okay for you to do it right now. And you have some power to configure how that video displays and how your YouTube channel works.

So from this, we can see that platforms have different levels of freedom that they allow. Something like AWS would be a very broad platform—you can run almost any type of software on it. YouTube, on the other hand, is a very narrow platform—you have only limited freedom to modify how the system works.

Getting this trade-off right for your platform is one of the most important things you will have to do as a platform owner. How many “degrees of freedom” you choose to allow your customers will determine how difficult it is to design, build, and maintain your platform successfully.

Automated vs Manual Platforms

There is another property that the best platforms have:

The best platforms do not require manual intervention from the platform owner in order for the customer to use them successfully.

Imagine that you want to release new versions of your system five times a day to your customers. What if you had to email a person at Amazon and ask them to manually deploy your code every time you wanted to do that? Not only would that be a terrible experience for you, it would dramatically harm the success of AWS as a product. Amazon’s customers would rapidly look for a different platform to use.

There are different levels of manual intervention that a platform can require. Two of the most common are:

Requiring manual intervention from the platform owner in order to onboard to the platform.
Requiring manual intervention from the platform owner in order to execute a common configuration change or fix a common support issue.

These have two problems:

You make your customers wait until a human technician is available to support them or onboard them. Often that’s not acceptable for their timelines and so they will search for other solutions. At the very least, it slows your business down significantly.
It drowns the platform owner in manual work and they become less and less able to actually work on the platform over time. You can even get into a vicious cycle where there’s so much manual work that you no longer have engineers available to develop the automation necessary to eliminate that manual work.

And of course, manual work can also be more prone to error and so you open up your system to more malfunctions. That is usually a minor issue, in practice, but it does happen.

Overall, if you have a platform that requires manual intervention, it’s still a platform, but it’s not ideal, and you’re at risk of it degrading over time as your engineers spend more and more of their time doing manual work and less and less of their time improving the platform itself.

Building a Platform

Okay, now we know what a platform is. How do you make one? What I always tell teams is:

You start with a product and then you turn that into a platform over time.

A “product” here would be a system that helps people accomplish some goal, but where the users don’t control the fundamental behavior of the system. Most software that we interact with is a product, at its core: Gmail, Microsoft Word, Zoom, Photoshop, etc.

With a product, you, the developer of the product, are focused on creating a great experience for your users. You’re focused on helping them accomplish a known, specific goal, and you are curating the entire experience they have of using your product. Microsoft has put incredible resources into optimizing the user interface of Office, as an example, so that people writing documents or creating spreadsheets can focus simply on the task they need to do and have the product “get out of their way” as much as possible.

Above we talked about how YouTube is a very narrow platform, for video creators. But for video viewers, YouTube is entirely a product: you show up, it recommends videos for you to watch, you watch them, done. I worked at YouTube, and I can tell you, the company cares deeply about curating that experience for you.

If you intend to set out to build a platform, you should first figure out: what’s the core value that I intend to deliver to my customers? What are the parts of that that need to be a carefully curated experience in order for my system to be successful?

For example, let’s imagine you wanted to build a deployment platform for your company, meaning a system that takes code and ships it to production. The core value that such a system delivers is: I no longer have to do manual work to successfully and safely push my code into production. So the first thing you should build is not a platform, but a product that provides that carefully curated experience.

What would that look like in actual practice? Well, you would find a team that’s experiencing some pain in terms of manual deployments. You don’t want them to be too much of an edge case in the company (like, if there’s one team of three mobile developers and the rest of the company writes Java backend services, don’t start with building something for the three mobile developers). You also don’t want them to be too important to the business—never start with the largest customer first. You’re going to be building a new product and you need some leeway to experiment, learn, and make mistakes. If you try building this product for some team that the whole company’s revenue depends on, you won’t have that space to experiment—instead, you’ll get pushed on hard for deadlines that cause you to cut corners, and be forced to build features that you shouldn’t support or create this early on in the lifecycle of your product.

So basically, you’re looking for a customer that is representative of how a lot of the company works, who’s willing and able to take some risks, and who’s able to work with you closely to provide feedback on what you’re building.

You then do your best to build a great deployment system for just that customer. At this point, it doesn’t matter if all the work to modify or reconfigure the system is manual. In fact, it probably should be, because what’s going to happen is you’re going to build the wrong thing, give it to your customer, discover it’s the wrong thing, and re-work it until it’s the right thing. You will never get the product right on the first try. So it’s not worth it to try to automate everything up front, because you’ll have to throw out that automation and rewrite it again. The deployments themselves happen automatically for the customer, but if they need to change how those deployments work, it’s fine if that requires you, the product owner, to go make manual changes to code.

The one thing that’s really important to keep in mind while you’re building this product, though, is this: don’t back yourself into a corner where you can only support this one customer, ever. It should be possible to modify the system safely to take another customer in the future. The primary way you do this is by following the laws of software design. Don’t make things generic before they need to be. Just keep in mind that you will have to onboard other customers in the future, and don’t lock yourself out of being able to do that. Really, the best way to do that is to keep the system simple so that you can easily modify it in the future.

Getting this product right and polishing it is supremely important, especially during this first customer engagement. You want to work closely with this customer until you are sure that you have created a great experience for this customer. Then you go out and you find some other customers that are representative in different ways—people who have different requirements from your original customer—and go through this same process with them, still building a product.

You will know that you can expand beyond the first customer when you stop learning significant new things on a regular basis. That’s true for each of your other pilot customers, too—you can expand beyond that set of users when you stop learning significant new things from them that cause significant changes in your product. You don’t have to be completely “done” with each customer before expanding, you just have to be confident the product is successful for that customer and that you aren’t discovering new requirements very often anymore.

All of this will form the core foundation of your future platform. Most platforms that are clunky, difficult, or failing did not spend enough time polishing their core product before they “opened the floodgates” and became a platform.

Transitioning to a Platform

If you have done a great job of building a product, people will start knocking on your door and demanding that they be allowed to use your product, too. You might have to do some light marketing of the product, but I usually find that for platforms that are internal in a business, if you’ve built a great product, word of mouth will spread and people will just start demanding that they be allowed to use it too.

The first people you want to onboard are those whose requirements are nearly identical to your initial customers. Essentially, at first you want to stay a product, and you want to see what it’s like when you have to scale out that product to many customers. You will learn a lot about your product and improve it dramatically at this stage. You may even start to automate onboarding and provide some configuration settings for very common configuration requests that were requiring manual work.

However, eventually you will reach a point where you feel like the “barbarians are at the gates.” You will see a few things happen that indicate that you’re at this stage:

You start to have more and more insistent demands to onboard from your customers. They may start complaining that you’re actually harming them by not letting them onboard.
You will start to get more and more feature requests that seem questionable to implement in the core product. They address only small edge cases, but if you implemented them in the core product, they would be disruptive to all your customers.

This is the point at which you build a platform. Note:

It is very important that you work very hard and very fast during the “product” stage to polish the product before you get to “the barbarians are at the gates” stage. Otherwise you will be forced to build a platform before you are ready, and your platform will suffer for the rest of its lifetime, as will the experience of all your users.

What Do We “Platform-ize?”

One of the key decisions you will have to make over and over as a platform owner is: what do we implement in the core product, and what do we allow our customers to develop or configure on their own as part of our platform?

This should always be driven by the requirements of your customers, but be wary of customers who simply demand that they be allowed to have total control and configure everything. If you really let them configure everything, you would be totally defeating the value of your platform—you’re just making them write their own software in a new and different way. So how much leeway do you give to your customers?

Well, there are a few principles that drive this.

Non-Interference

Our first principle is:

One customer must not be able to interfere with another customer.

In our deployment platform example, let’s imagine that a customer came to us and said, “I want to be able to pause all other deployments at the company other than mine, whenever I want to.” Under almost all circumstances, you would say no to that. The only people who should be able to affect all customers (or even more than one customer at a time) are the platform owners. If that really were a legitimate requirement (I personally would dig into it more and ask a lot of “what problem are you trying to solve?” questions), you would say, “If you need to do that, please let us know and we will do it on your behalf.” That lets you maintain control and puts responsibility in the right place: you know best what’s going on with your system, so if anybody is going to do something that affects the whole system, it should be you.

However, if you think about this more deeply, you’ll realize that this principle of non-interference also means you need to implement guardrails. For example, if you’re AWS, it should not be possible for one customer to request so many resources in a data center that suddenly no other customer can run their critical workloads. If you’re a deployment system, it should not be possible for one customer to put so much load on your system that it breaks other people’s deployments. Building these guardrails is often most of the work of building a platform. You have to think through “how could one customer break another customer” and design the system so that that isn’t possible.
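As a rough illustration of one such guardrail, here is a minimal sketch of a per-customer limit on in-flight deployments. The class name DeploymentQueue, the QuotaExceeded exception, and the limit of two are all hypothetical, chosen only for the example. A real guardrail would also need to cover things like persistence and concurrency.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a request would push a customer past their own limit."""


@dataclass
class DeploymentQueue:
    # Hypothetical limit: the most deployments one customer may have in flight.
    max_in_flight_per_customer: int = 3
    _in_flight: dict = field(default_factory=dict, init=False)

    def start(self, customer: str) -> None:
        current = self._in_flight.get(customer, 0)
        if current >= self.max_in_flight_per_customer:
            # Only the customer at their limit is refused; others are unaffected.
            raise QuotaExceeded(
                f"{customer} already has {current} deployments in flight"
            )
        self._in_flight[customer] = current + 1

    def finish(self, customer: str) -> None:
        self._in_flight[customer] = max(0, self._in_flight.get(customer, 0) - 1)


queue = DeploymentQueue(max_in_flight_per_customer=2)
queue.start("team-a")
queue.start("team-a")

try:
    queue.start("team-a")  # third concurrent deployment for the same customer
    team_a_blocked = False
except QuotaExceeded:
    team_a_blocked = True

queue.start("team-b")  # a different customer can still deploy
```

The point of the sketch is that the limit is counted per customer, so one team hitting its ceiling never consumes capacity that another team needs.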

Reasoning About the System

One of the most common ways to kill a platform is to make it difficult or impossible for the platform owners to evolve the platform over time.

Whenever you provide some freedom to your customers, ask yourself this question: “What happens if we change our mind about how this feature works, or how the core product works?” Would you be able to make that change in a safe, automated fashion just by yourself as the platform team, or would you have to go ask all your customers to do manual work? Even more importantly, how would you even know if a change is safe to make in the future? Would you have to go around and talk to every customer and ask them “how are you using the platform?” and “is this change I want to make safe for you?” Or is there some way to do all analysis of your customers in a programmatic way that gives you total confidence about how your customers are using your product and what changes are safe to make?

In essence:

You must continue to be able to reason about how your system behaves, no matter what customers do.

It needs to be easy to make logical statements like, “If we change the tool that pushes code onto a server, for everybody, I know that’s safe without having to ask our customers, because ______________.”
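One way to fill in that blank is to have customers describe their use of the platform as structured configuration that the platform team can query. The sketch below assumes exactly that. The CustomerConfig fields, the tool name pusher-v1, and the --legacy-auth flag are invented for illustration and do not refer to any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerConfig:
    customer: str
    push_tool: str                    # which tool pushes code onto servers
    custom_push_flags: tuple = ()     # extra flags this customer relies on


# Hypothetical registry; in practice this would be loaded from wherever
# customer configuration is stored.
CONFIGS = [
    CustomerConfig("payments", push_tool="pusher-v1"),
    CustomerConfig("search", push_tool="pusher-v1",
                   custom_push_flags=("--legacy-auth",)),
    CustomerConfig("profiles", push_tool="pusher-v1"),
]


def customers_blocking_change(configs, removed_flags):
    """Return customers whose config depends on something the change removes."""
    removed = set(removed_flags)
    return sorted(
        config.customer
        for config in configs
        if set(config.custom_push_flags) & removed
    )


blocking = customers_blocking_change(CONFIGS, removed_flags={"--legacy-auth"})
print(blocking)  # ['search']
```

If the list comes back empty, the platform team can make the change without asking anybody. If it doesn't, they know exactly who to talk to.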

You will violate this principle if you allow too much freedom to your customers. For example, let’s imagine that you are designing a platform that takes your customer’s code, builds it into a binary, and runs its tests (a continuous integration system, basically). What if you let every customer write completely different build scripts that can do anything they want? You’ve stopped being a continuous integration system and become a totally generic task orchestrator. You can no longer provide any value to your customers that would be specific to continuous integration. You can’t even reason about tradeoffs between security and value (like “should we let these scripts access the Internet?”) anymore. It even “infects” the testing part of your system, because the testing part of the system has no idea what it’s getting—it could be any output of any script. So then the testing system also becomes a generic task orchestrator. If that’s all you wanted, you could have just used one of the many open-source task orchestrators and provided very little value to your customers right from the start.

It’s also extremely difficult to get out of that situation once you’re in it. Imagine that you want to move from that state into having a more restricted, more standardized build system. You have hundreds or thousands of different build scripts all across the company, and now you have to either go manually look at all of them yourself or ask all of your customers to go “fix” their scripts (good luck with that). This brings up another thing you learn after working on platforms like this for a long time:

It is much harder to go from freedom to restriction than from restriction to freedom.

Essentially, you never want to give people something and then have to take it away. You always want to be “giving” them things that you will never have to take back. And when I say the “freedom to restriction” path is much harder, I mean orders of magnitude more effort—sometimes to the point that it’s impossible and you just have to abandon all hope of ever having control of that part of your platform ever again. (This is what causes platform owners to create backwards-incompatible “Version 2.0” editions of their platforms where they abandon all their existing customers, but boy oh boy, does that have its own whole new set of terrible problems!)

Overall, you must never allow so much freedom to your customers that you can no longer reason about the behavior of the system, and thus can no longer evolve it or enhance it usefully.

Escape Hatches

You may have heard of the “80/20 rule” for platforms. This means the core product offering should handle 80% of the use cases successfully in a way that’s really simple and great for your customers. Then the 20% of your customers with really unique requirements get more power in the platform and are able to service their more complex needs, even though that means they have a more complex user experience.

Here’s what really ends up happening a lot of the time: that 20% of users have a lot of power. Often, the largest and most important customer systems are in that 20% of users. They show up and demand total freedom, and you get into a situation where you can no longer reason about or maintain your platform.

All platforms need to keep this principle in mind:

There must be some way for customers to fulfill their valid requirements.

Your customers have real needs and they need to be able to execute them. They don’t live in a magical walled garden that you may have designed in your mind for your platform. There will always be customers whose requirements are way beyond what you ever intended to support in your system.

However, and I cannot stress this enough: the way those customers fulfill their requirements does not have to be your platform. There just needs to be some way that they can fulfill them.

All platforms need “escape hatches” that allow a limited set of customers to do anything they need to do. However, the burden of maintenance and support for that must go on those customers. As much as possible, the platform owners must be isolated from the cost of supporting those customers.

Usually, you accomplish that by having essentially two layers:

A set of tools that let people do anything they need to do. For developer platforms, these are often command line tools with complex interfaces and immense power.
A “platform” built on top of those tools that creates a great user experience but severely limits the power of the system compared to the tools. This solves 80% or more of the common use cases, but requires customers to conform to a specific, standardized way of working in order to get the benefits. We often call this the “paved path” or “golden path.” It’s a little like a freeway—it doesn’t go everywhere, but it goes to the places that most people want to go in a fashion that’s much more streamlined than taking the back roads.

The vast majority of your users should be using the platform. For those who aren’t, you want some place where you record which customers have taken the “escape hatch” and been allowed to use the underlying tools due to their complex requirements. There will be a lot of people who will want to know who those teams are in the future, like your security team (“I need a list of everybody who uses the escape hatch, because there’s a vulnerability that we fixed centrally for everybody who uses the platform, but all the escape hatch users need to fix it on their own.”).
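The record of escape hatch users does not have to be fancy. Here is a minimal sketch of such a registry, assuming an in-memory store. The class names, team names, reasons, and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EscapeHatchRecord:
    team: str
    reason: str        # the valid requirement the platform could not meet
    approved_on: date


class EscapeHatchRegistry:
    """Single place that records who is using the underlying tools directly."""

    def __init__(self):
        self._records = {}

    def grant(self, team, reason, approved_on):
        self._records[team] = EscapeHatchRecord(team, reason, approved_on)

    def revoke(self, team):
        # Called when a team migrates back onto the platform.
        self._records.pop(team, None)

    def teams(self):
        return sorted(self._records)


registry = EscapeHatchRegistry()
registry.grant("ml-training", "needs custom GPU scheduling", date(2025, 3, 1))
registry.grant("legacy-billing", "build cannot run in the standard sandbox",
               date(2025, 3, 8))
registry.revoke("legacy-billing")

print(registry.teams())  # ['ml-training']
```

Storing the reason alongside the team name is a deliberate choice here: it tells you which requirement the platform would have to support before that team could come back onto the paved path.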

If you don’t have these escape hatches, you can’t say “no” to any customer. And you must be able to say “no” to functionality that shouldn’t be in the platform. Otherwise, your platform will degrade over time—sometimes even to the point that it would be better if you had never had the platform in the first place. When you have to be “everything to everyone,” your product can become so complex that it actually slows customers down rather than speeds them up.

One thing to watch out for, though, is when a lot of your users start using the escape hatch instead of the platform. This usually indicates one of two situations:

A lot of teams have legitimate requirements that you haven’t implemented successfully in the platform.
There are no incentives in the business that cause teams to get onto the platform and so they “take the shortcut” that is less immediate effort on their part (using the escape hatch) even though it will be more expensive for them in the long run.

Almost always it’s an issue with your platform. Never be too confident that you’ve gotten everything right and there’s nothing left to learn. Even when it seems like the issue is incentives (people feeling they won’t get rewarded individually for doing the work to migrate to your platform, leaders feeling like the work isn’t important, etc.) often the real issue is something like: it’s much too hard to onboard to the platform.

People make decisions based on the data they have and the purpose they are trying to accomplish. If the data they have about your platform is “it will be hard to accomplish my purpose if I use that platform,” you’re going to have a hard time getting people to adopt it.

What is a Valid Requirement?

Above, I often talk about “valid requirements” or “legitimate requirements.” What are those?

A valid requirement is something that would be best for the company as a whole, and which comes from a real problem the customer has.

As a platform owner, I would also generally say that there is a timeline aspect here: requirements are more valid when they are the right thing in the long term, not just in the short term. Once in a while there is a really compelling reason to compromise for the short term, but if you do, think about how you could eventually build toward the right long-term solution.

This is true even when you’re building your first product for a single customer. I don’t mean make guesses about the future. Don’t worry about “am I implementing this feature of my product in the best way for all other possible future customers?” That road leads to bloated, terrible products. Instead, just ask, “Is what I’m doing for this customer going to deliver the best outcome for the business?”

Later on, when you’re building a platform, it’s still the same question: “is what this customer wants to do the right thing for the company as a whole?” For example, imagine you are a company that has written everything in Java. A customer comes to you and says, “I intend to build this system in Python and so I need you to support Python systems.” Well, that could be legitimate. Let’s find out more: why do they want to use Python? “Oh, I just like it more.” So this customer wants to break the company’s standards based on a personal preference. That’s not a legitimate requirement.

On the other hand, let’s say that same customer had a different answer for that same question: “We are building a machine learning system and Python is the standard language for machine learning systems across the industry. The tooling we have available to us in Python makes a huge difference for the success of this product.” That sounds like a very legitimate requirement. Also, we just learned a lot more—the customer wants to deploy a machine learning system, which actually is going to have a whole additional set of requirements.

As a platform owner, you must say no to implementing requirements that are not the right thing for the company as a whole. Not only do those requirements cause trouble for the company, they also tend to degrade the quality of your platform over time. They add “cruft” that most customers don’t really want in order to provide functionality that the company doesn’t really need.

Invalid Requirements

It is very surprising how many platform requirements are not actually legitimate when you look into them. A common one is “it would be slightly more convenient for us if you did a huge amount of work to save us a little bit of work.” Sometimes customers don’t even look at how much work it would be to migrate their systems onto your platform. They are busy and they don’t even want to think about doing more work. It could be a few hours’ work for them to fix their system to be more standardized, but they feel overwhelmed and haven’t even developed that estimate.

If you had 1000 customers who would all be saved one hour of work if you did 20 hours of work, that sounds like a great tradeoff that you absolutely should do! But if you would be doing 20 hours of work to save 10 hours of work for one customer, that’s obviously not the right decision for the company as a whole, so don’t do it.
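The arithmetic behind those two cases is simple enough to write down. This sketch uses the numbers from the paragraph above; the function name is just for illustration.

```python
def net_hours_saved(platform_hours, customers, hours_saved_per_customer):
    """Hours saved across the company minus hours the platform team spends."""
    return customers * hours_saved_per_customer - platform_hours


many_customers = net_hours_saved(platform_hours=20, customers=1000,
                                 hours_saved_per_customer=1)
one_customer = net_hours_saved(platform_hours=20, customers=1,
                               hours_saved_per_customer=10)

print(many_customers)  # 980
print(one_customer)    # -10
```

A positive number means the company as a whole comes out ahead. A negative number means the platform team would spend more time than the customer saves.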

Another common issue is imaginary requirements. Customers sometimes believe they have requirements that they don’t actually have. A customer comes and says, “This must scale to one million queries per second.” You ask them why, and they say, “because our Product Manager said so.” You ask the Product Manager and they say, “I just want to be sure the system can do way more than what we need, in case we expand a little faster than what we expected.” When you make people do the real math, you find out that even if we exceed our wildest dreams in terms of usage, the system will actually be doing one thousand queries per second.
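“Doing the real math” usually means a back-of-the-envelope estimate like the one sketched below. Every input here is a made-up placeholder, not a number from any real system; the point is only the shape of the calculation.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def peak_queries_per_second(daily_users, queries_per_user_per_day, peak_factor):
    """Average load over a day, scaled up by how spiky the traffic is."""
    average_qps = daily_users * queries_per_user_per_day / SECONDS_PER_DAY
    return average_qps * peak_factor


# Hypothetical "wildest dreams" inputs supplied by the customer.
peak = round(peak_queries_per_second(
    daily_users=10_000_000,
    queries_per_user_per_day=5,
    peak_factor=2,
))

print(peak)  # 1157
```

Even with generous inputs like these, the result lands around a thousand queries per second, three orders of magnitude below the stated requirement.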

So you need to make sure your customer’s requirements are grounded in the reality of the actual problems they need to solve. Be polite about it; don’t go saying that they are imagining things. Just ask for clarification so that you understand the requirement as fully as you can. Often, even if it’s a legitimate requirement, you’ll discover a lot more about it by doing this. For example, if that was a legitimate requirement to serve one million queries per second, you’d learn a lot about things like: what latency do we have to serve those queries at? How many different regions do we have to host this system in? Does traffic peak at certain times of day? And so forth.

One of the most dangerous types of requirements to accept is “please build this solution for us.” Like, imagine that you have built a security platform that scans code using different tools to find vulnerabilities, but the outcome you’re going for is successfully finding vulnerabilities, not just “running tools.” A customer comes to you and says, “Please start running this new vendor tool called QuxBugs.” Your response should always be, “Oh, thanks for your request. Can you tell me more about the problem you’re trying to solve?” Never accept solutions from your platform customers, only accept descriptions of the problems they are trying to solve. Once in a rare while, the solution they proposed will be the right solution, but most of the time it’s not. And unfortunately, even if you deliver that solution to them, they will end up hating it eventually because it’s the wrong solution.

Most engineers are not experts at specifying their requirements. You will have to help them. Usually, you know when you have a valid requirement because the solution becomes obvious (at least, obvious to you, the platform owner). Even when the solution doesn’t become fully obvious, having the right requirement opens the door to figuring out the right solution, and helps clarify the answer to any questions that come up when you’re developing the solution. In general, if you aren’t sure how you should implement something, there’s probably something about the problem you don’t fully understand.

The Challenges of Platforms

The above lays out what I know about the basic principles of how to design and develop great platforms that users love. The challenges in doing these things are mostly organizational and social challenges: changing people’s minds, getting leaders on board with “doing it the right way,” getting the funding and time required to develop the product and eventually the platform, etc. All of those are very real problems, but at the very least, if you understand and apply the principles and processes above, you’ll know how you should get to your destination. I hope that you do, and get to experience the joy of implementing, operating, and maintaining a truly great platform.


What Makes a Great Developer Experience?

2025-04-15T11:03:33Z

I’ve been working for over 20 years in the field of “developer experience,” where we help developers be more effective, efficient, and happy, by improving tools, systems, and processes. I have been intimately involved in designing key aspects of the developer experience at Google and LinkedIn, have been very involved with the research community in this space, and I’m constantly in touch with developer experience leaders at every major tech company.

I’d like to spell out for you the fundamental principles of what makes a great developer experience—the most important things to understand in the space. I’m only going to give an overview of each item, and it’s possible that I will miss some points (because there is a lot to cover) but hopefully this is a good overview of the key points.

Basic Concepts

There are three primary things you’re trying to optimize for with a developer experience:

Cycle Time (aka Iteration Time): The time between when a developer has any intention and the time when that intention is accomplished.
Focus (aka “Flow”): The ability for a developer to stay focused on the task they are working on, and not be interrupted.
Cognitive Load (aka Required Knowledge and Decisions): How much a developer must know in order to do the task they are doing, and how many decisions they have to make to accomplish their task.

Let’s talk about these points in more detail.

Cycle Time (aka Iteration Time)

The process of writing software involves large and small cycles, which are basically the time between when a developer intends something and when the result materializes in the physical universe. The smallest cycles are things like “I intend to write this line of code,” or “I intend to run this command line tool and see the output.” The largest cycles are things like “I intend to solve a problem by creating a product and having people use it,” or “we intend to redesign our system in a way that will take 100 people a year.”

In general, the way that you speed up the large cycles is by speeding up the small cycles. If you only focus on speeding up the large cycles, very often the quality of the result will suffer. For example, let’s say that I see it’s taking a team two weeks to ship any change to production. If I just go in there and say, “ship faster no matter what it takes,” and I don’t say anything else, then the team will likely cut corners in a way that harms the quality of the end result and the maintainability of the software.

However, if I investigate more deeply, I will find that there are smaller cycles that are taking too long. Perhaps code reviewers are taking forever to respond. Maybe the tests take too long to run, and so developers have to wait too long every time they make small changes. Maybe the deployment tool is too hard to use and so developers struggle with it and avoid it.

On the other hand, it’s always safe to speed up the smaller iterations (as long as you discover there really is some problem with them). For example, let’s say code reviews are taking too long. Often, what you discover is that the reviewers just aren’t responding fast enough. The solution is to focus on reviewer response time more than on the overall code review time. When you dig into why responses are slow, you often find that code authors are sending out PRs that are way too large, so it takes the reviewer forever to review them.

Code review is fundamentally a quality process, so you want the end result to be high-quality code. Sometimes it takes five rounds back and forth between the developer and the reviewer, with the developer submitting changes and the reviewer requesting modifications, until the end result is good. You want all five of those rounds to happen. But you want each of them to happen quickly. If each round happens quickly, you find that PR review times are only as long as they should be, and both the developers and reviewers are happy with the process.
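To make the distinction between per-round time and overall time concrete, here is a small sketch that measures both from a PR’s event log. The event format and the timestamps are hypothetical. A real code review tool would expose this data in its own shape.

```python
from datetime import datetime
from statistics import median

# Hypothetical event log for one PR: (timestamp, actor, action).
events = [
    (datetime(2025, 4, 1, 9, 0), "author", "requested_review"),
    (datetime(2025, 4, 1, 9, 40), "reviewer", "responded"),
    (datetime(2025, 4, 1, 11, 0), "author", "requested_review"),
    (datetime(2025, 4, 1, 11, 20), "reviewer", "responded"),
    (datetime(2025, 4, 1, 14, 0), "author", "requested_review"),
    (datetime(2025, 4, 1, 14, 30), "reviewer", "responded"),
]


def reviewer_response_times(events):
    """Time from each review request to the reviewer's next response."""
    waits = []
    requested_at = None
    for timestamp, _actor, action in events:
        if action == "requested_review":
            requested_at = timestamp
        elif action == "responded" and requested_at is not None:
            waits.append(timestamp - requested_at)
            requested_at = None
    return waits


waits = reviewer_response_times(events)
total_review_time = events[-1][0] - events[0][0]

print(median(waits))       # 0:30:00
print(total_review_time)   # 5:30:00
```

In this made-up log the whole review took five and a half hours across three rounds, but the number to watch is the half-hour median wait per round.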

This applies in all parts of software development. If you find that it’s taking a long time for people to code things, it’s worth looking into things like: how long do their builds take? How long does it take to run their tests locally while they are developing? How quickly can they get the information they need to make a decision?

It’s also worth noting that every cycle is some form of an Observe, Decide, Act loop, which I’ve written about extensively elsewhere. If you work on improving developer experience, one of the best ways to improve is to make it easier for developers to observe things. Put the information they need right in front of them right at the moment they need it. (Never put information they don’t need in front of them, because that is bad for cognitive load, as we will discuss later.) For example, when a test fails, the failure should clearly indicate what’s wrong so that a developer doesn’t have to go digging to figure out how to fix it. When they are creating a new project, make it clear what their options are (like what language, framework, how to name the project, etc.) and where to start. Just think: “How could I make it so that a developer never has to think and only has to look at something to know what to do next?”
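As a minimal sketch of the test-failure point above, here is one way a check could put the needed information right in front of the developer. The helper names (describe_mismatch, check_equal) and the message layout are hypothetical, chosen only for illustration; they are not from any particular test framework.

```python
def describe_mismatch(name: str, expected: object, actual: object, hint: str) -> str:
    """Build a failure message that says what broke and what to look at next."""
    return (
        f"{name}: expected {expected!r} but got {actual!r}. "
        f"Next step: {hint}"
    )


def check_equal(name: str, expected: object, actual: object, hint: str) -> None:
    """Raise AssertionError with a self-explanatory message on mismatch."""
    if expected != actual:
        raise AssertionError(describe_mismatch(name, expected, actual, hint))
```

A failing call such as check_equal("build timeout (seconds)", 300, 30, "check the timeout value in the build settings") raises an error whose text names the value that was wrong and where to look, so the developer only has to read it to know what to do next.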

Focus (aka “Flow”)

Software development is an activity that requires many hours of deep focus in order to accomplish successfully. Developers are building up a very complex mental structure of the thing they are working on and using that mental structure to make decisions and write code. They are constantly thinking (or writing in their notes) “Oh, don’t forget to do that other thing after we’re done with this thing.” This state of focus is often referred to as being “in flow.”

When developers get interrupted too thoroughly or too often, this complex mental structure vanishes and has to be re-built when they go back to the task. They risk forgetting that “do this next” thing that was in their mind, and actually shipping software that’s broken because of it.

We call this form of interruption “context switching,” where the developer’s focus is now primarily on something other than writing code. When you force a context switch on a developer, it can take them ten or fifteen minutes to fully rebuild the mental structure they had when they were focused on coding. So small interruptions can be very expensive.

Exactly how long an interruption can be before it causes a context switch varies between people. I’ve seen numbers between 30 seconds and 2 minutes. It also depends on what the interruption is. If the interruption is a phone notification that requires no thought, you look at it for 30 seconds, and dismiss it, that might not break focus. If the interruption is some very complex failure message that you get as an alert, it’s probably going to break focus no matter how long it takes to read it, because it requires so much mental work that it breaks the developer’s mental structure they had about their current task.

In my experience, interruptions can also be more disruptive if they cause a bad emotional reaction. Being upset about something is so distracting that it makes it hard to keep focusing on the work you’re doing. You feel compelled to do something about the upset, or at least think about it. If somebody sends me an upsetting message, or one of my tools behaves in a wildly frustrating way, that can break my focus even if the interruption is 5 seconds long.

If you work on developer experience and you want to improve flow, think about things like:

Do my tools force developers to engage in some complex task that isn’t coding, testing, or debugging, while they are doing those tasks?
Are we ensuring that developers have multiple uninterrupted hours to focus on coding?
Do my tools, systems, or processes do something that’s so frustrating to developers that it breaks their focus?

Another key point here is: do developers clearly know the purpose of the work they are doing and have a clear direction to go? That is, have I been given a clear task where I know why I’m doing the task, who it’s for, and what the intended result is? Otherwise, my own confusion will frequently break my focus. I’ll have to keep asking my co-workers what I’m supposed to be doing. I’ll sit there and wonder about my task instead of actually doing it. Plus, I probably won’t build the best possible thing, since I don’t have all the data I need to make good decisions about it.

Focus is another area where you want to put clear information right in front of people that they can use to Observe, Decide, and Act. Don’t make them interrupt their work to go hunting for the data they need right now. Sometimes, “I’m going to go learn about something” is a developer’s whole task, and in that case, making them search for information isn’t an interruption, as long as they can easily find what they are looking for. But if they are coding and they just need to know something like “what does this function do,” that information should be available directly in their editor as fast as possible.

Cognitive Load (aka Required Knowledge and Decisions)

“Cognitive Load” is a term that I don’t love, because it’s not clear what it means. For the purpose of developer experience, what we mean when we say that is:

How much does a developer have to know in order to perform a task?
How many decisions is the developer forced to make when performing a task?

I’ll talk about these points separately.

Reducing Required Knowledge

There are many people that would argue that developers should know as much as possible about what they are doing, and I agree with them. However, the productivity and experience of developers is dramatically improved by removing things they must know to do a task. Developers were far less productive when they had to write 1’s and 0’s to code. Writing in assembly language is far less productive than writing in a higher-level language like Java or Python. Does it make you a better programmer to understand how those languages translate down into assembly? Yes, it does. Should you have to know that in order to do every programming task? No.

This is an area where most teams that own developer infrastructure at a company fall down very hard. They make tools that require the developer to gain a deep understanding of that tool in order to do their job. This would be fine if there weren’t a hundred of these tools that the developer has to interact with in order to do their job. Sure, sometimes there are going to be more complex tasks that require a developer to go deep and really learn about one of the tools. But ideally, each tool should be intuitive enough that the developer doesn’t have to learn much about it at all in order to use it successfully. Even more ideally, the infrastructure should be redesigned so they don’t have to use that tool at all unless they really need to.

Reducing Choices

This is one of the most controversial aspects of developer experience, but after long experience at multiple companies, I can confidently say:

Developers should only have to make the choices they need to make.

Some developers believe they must be allowed to make every choice about their system—what programming language it’s written in, what package manager to use for installing dependencies, what build tool to use, how to indent the code, what monitoring system to use, what deployment system to use, etc. I understand that—I have strong opinions about what I like the best, too. But most of the time, decisions like that actually just force a team to do work that it should not have to do—work that distracts them from the core task they are trying to accomplish.

Taking it to extremes, imagine that every time you were going to start working on a change to your system, you had to do full research on what programming language to choose, which library to use out of 10 options that all do the same thing, what build tool you would use, argue with everybody about tabs vs. spaces, decide what code review tool to use, and generally make every other possible decision involved in software engineering. You’d never get anything done, right? What makes developers actually the most effective, efficient, and happy is just getting to focus on the task they need to do, not having to constantly make decisions about a zillion things outside of that.

Within a company I would even go so far as to say that developers should not be allowed to make decisions they don’t need to make. That sounds extreme, but actually it leads to a great developer experience, when done right.

The problem is that developers will often believe they need to make decisions about things that they actually don’t, but they won’t see the consequences of that for months or years (or they can’t understand how the decision will affect the larger business). For example, developers should not be allowed to choose any programming language in the world to do their task—it forces them to develop a bunch of new libraries or infrastructure around that language instead of focusing on the task they need to actually accomplish. It has a ton of other bad long-term consequences for a business, too, that are hard to see when you’re just an individual developer who really likes a particular language. (Which, once again, I understand! I am a programming language nerd and have lots of strong personal preferences.)

However, one must be very careful with this principle—developers need to be allowed to make the decisions they need to make. I have watched a lot of people code, and how each person uses their editor and the tools around it is wildly different, because the way people think is wildly different. You can’t put too many hard constraints on certain parts of people’s workflows, because it would dramatically harm productivity without any real benefit to the business as a whole. I’ve never seen anything good come of a company saying, “everybody must use this editor.” I have seen good things come out of, “we are going to provide more support for one editor than another, but we will still expose all functionality as command-line tools so you can use whatever you want.”

Another thing you have to be careful of is that if your infrastructure restricts the decisions developers can make, you need to do a really good job of maintaining that infrastructure centrally. That is, a central team needs to make those decisions and be responsible for the results on an ongoing basis. If you say, “everybody must use this build tool for their Java projects,” then you’d better make sure that tool actually works for everybody, now and continuously into the future. And sometimes you need to provide a set of limited options. For example, if you just say “everybody can only write everything in Java,” you’re going to find it’s not an appropriate language for everything.

The key to doing this well is to deeply understand the requirements of your developers and the requirements of the systems they work on. You need to deeply understand the types of problems they work on solving, so that you know what decisions do or don’t need to be made. And you need to be able to adjust this appropriately over time (change policies) without opening everything up and allowing chaos to reign.

The truth is, most developers do not want to make every decision about their system and then be forced to do a bunch of custom low-level work before they can even get to their task. They love programming because they get to tell the computer to do something and have it do it, and because they enjoy solving the problems that help their users. Let them focus on what they need to do to accomplish that, not every single other possible decision or task in the whole universe of software engineering.

I wrote more about this in Reasoning and Choice, including a discussion of how this principle makes it much easier to reason about systems. That’s one of the most important aspects of any software system: the ability to reason about its behavior without running it, and the principle of Reducing Choices helps make that a reality.

Note that I talked about this above mostly in terms of infrastructure-level choices, because that’s what often matters the most if you’re a central team working on developer experience, but the principle applies broadly. Within a team, you can say “these are our code patterns, please don’t use other ones,” (as long as that doesn’t restrict necessary choices). When developing a tool, you can set sensible defaults so people don’t have to make a decision about every possible option the tool allows. There are many, many areas where this principle can be applied.
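To illustrate the point about sensible defaults, here is a small sketch of a tool whose options all have defaults, so the common case requires no decisions at all. The tool, the FormatOptions class, and the specific default values are hypothetical; they stand in for whatever a central team would actually choose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatOptions:
    """Options for a hypothetical code-formatting tool; every field has a default."""
    line_length: int = 100     # assumed default, decided once by the tool owners
    indent_width: int = 4      # assumed default
    sort_imports: bool = True  # assumed default


def describe_run(path: str, options: FormatOptions = FormatOptions()) -> str:
    """Describe what the tool would do; a stand-in for the real formatting work."""
    return (
        f"format {path} (line_length={options.line_length}, "
        f"indent_width={options.indent_width}, sort_imports={options.sort_imports})"
    )
```

Calling describe_run("app.py") needs no configuration, while a team with a real need can still pass FormatOptions(line_length=120) and leave every other choice alone.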

The Challenges of Developer Experience

Most of the difficult aspects of developer experience are human aspects, not technical aspects. Some of these have a technical component, but much of the difficulty is about how human beings make decisions, adapt to change, etc. The primary difficulties are:

Understanding the Problem: What are the most important things to work on to improve developer experience right now? How do we get enough understanding of those problems so that we build the right solution?
Managing Change: How do you roll out new changes? How do you get people to adopt new systems or behaviors? How do you deal with people who are upset about changes?
Providing Leverage: The tools and systems you build for developers must not take so much work to use or adopt that they create more work than they save.
Saying No: You can’t solve every problem in the world all at once. You have to be able to prioritize. Plus, sometimes you get demands from a team that really wants some tool or feature that would actually harm developer experience for the company.
Putting the Pain on the People Who Cause It: If one team does something that causes pain to another team, and the team creating the pain never feels any of it, the pain will grow unbounded.

Let’s cover these in more detail.

Understanding the Problem

The most important thing to know here is that problems come from users, and solutions come from the developers of systems. You must never, never reverse this relationship, so that your users tell you what solution to build and you build it, while you sit around imagining problems that nobody actually told you they have. If you work on developer experience, developers are the “users,” and you are the “developer.”

This can be very tricky because sometimes your users include senior executives or technical leads who have strong opinions about exactly what they want you to build. You still need to hold your ground, though: get from them what the problems are, and design the solution yourself based on good data collection from all your different users. The weird thing is (I’ve seen this happen so many times) if you build exactly the solution your users told you to build, instead of doing good research and then designing the best solution, your users will not like it. They might like it at first, but over time they will come to hate it, or the executive who “designed” it for you will leave and the new one won’t like it.

It’s also very tempting to think, “Hey, I’m a developer, so I know what developers want,” and entirely skip talking to your users. That doesn’t work, because developers all work very differently, and often have very different requirements from you. For example, if you’ve never spoken to a Machine Learning engineer, you will be shocked to discover how different that experience is from being, say, a frontend web developer.

If you want to understand how to gather data and feedback from developers, I worked extensively on the LinkedIn Developer Productivity and Happiness Framework, which provides some guidelines for how to do that. However, if you work at a small company, you can also just talk to people. It’s pretty easy.

Feedback and Complaints

After you’ve built something, it’s really important to seek out data and feedback about the system. Even though we love the things we have built, we should always be searching out problems that developers have so that we can solve them—including problems with the thing we just made! This ends up creating the best developer experience for the company over time.

One thing to understand here is that in the field of Developer Experience, people will rarely send you positive feedback. When developer tools “just work,” developers mostly don’t think about them. They are focused on the tasks they are doing, not on your tools, and that’s how it should be. However, if some tool or system made it hard for them to do their job, even a little, they’re very aware of that.

Also, not every developer is a great feedback provider. Often, they just had a very frustrating experience with a tool, and they are going to provide you some very emotional feedback that isn’t going to be very considerate of your feelings. Certainly, people shouldn’t do that, and if you’re going to write feedback, I encourage you to realize that the person you’re sending it to is a smart person with good intentions, so please respect them. But sometimes it happens, and you shouldn’t take it as a personal attack on you.

When you get very harsh feedback, the key to handling it is: let the user know they have been heard. If there are genuine deficiencies in your tool, agree with the user about those! You don’t have to insult yourself or your own work, just be honest about the state of the stuff you own. Users actually respect you more when you’re willing to say, “Yep, I agree that’s not a good experience.” That doesn’t mean you have to fix it right away; your team sets its own priorities. Just let people know they have been heard, that their complaints are valid, and a lot of the noise will die down.

Sometimes people just keep being mean after you’ve heard them and acknowledged the pain, and in that case you should actually just tell them it’s not okay to behave that way, and talk to their manager if they keep doing it. Only a small minority of people behave that way, though. Never ever do this as your first response to feedback.

Managing Change

Growing up in America, I was taught in history class about the great revolutions that changed history: the American Revolution of 1776, the French Revolution, the Russian Revolution, and so forth. The stories made it sound like leaders caused a violent revolution and then the world changed almost overnight. However, when you look into what actually happened in many of those situations, the “violent overnight revolution” actually just re-established a government much like the previous one (or worse) and true, positive change happened through a slower evolution over time.

The analogy I like to use is turning a ship at sea. There’s only a certain rate at which you can turn a very large ship before it actually breaks in half. The larger the ship, the slower this rate of turning is. A company or team is much like this. The larger it is, the slower it will safely “turn” from one direction to another. Now, to be clear, I’m not saying that changes have to take forever. But they also don’t happen overnight with the wave of a magic wand.

There’s a lot to know about how to roll out changes safely and successfully.

Change Aversion

One of the major factors you have to deal with, when improving developer experience, is what we call “change aversion.” Any change you make to a system will receive negative feedback from somebody when you roll it out. This happens not because there’s something actually wrong with the change, but because the user was used to the way it used to work, and they don’t like that it changed. They might be critical of the new user interface, or have some emotional argument about why the new thing “sucks,” but often they are really just upset that any change happened at all. This isn’t something unusual; it’s a factor that exists in almost all human beings. People tend to have a certain appetite for how much change they are willing to experience in a period of time, and if you go beyond that, they get upset.

It’s important to recognize when people’s feedback is simply change aversion vs. real feedback about something valuable. There are a few ways to tell:

Change aversion usually lasts 3 to 10 days. If you get feedback from a user within the first 3 to 10 days after you roll out a change, and that same feedback is not being provided by a huge number of your users, it’s worth considering whether the user’s response is just change aversion.
Change aversion feedback is often emotional. It might be insulting. It might be expressed as just an opinion (“The color of the new menu is ugly”) vs a fact (“I ran the new command line tool and it’s 10 times slower than the old command line tool.”).
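The two signals above can be written down as a rough triage rule. This is only a sketch: the Feedback fields, the function name, and especially the 20% cutoff for “a huge number of your users” are assumptions made for illustration, not figures from any study. A True result means only that change aversion is worth considering; the feedback should still be acknowledged.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    days_since_rollout: int  # when the feedback arrived, relative to the rollout
    share_of_users: float    # fraction of users reporting the same thing (0.0 to 1.0)
    is_factual: bool         # states something checkable, e.g. "10 times slower"


def likely_change_aversion(fb: Feedback, window_days: int = 10,
                           widespread: float = 0.2) -> bool:
    """Return True when feedback matches the change-aversion pattern.

    window_days follows the 3-to-10-day range in the text; the widespread
    threshold is an assumed cutoff for "a huge number" of users.
    """
    within_window = fb.days_since_rollout <= window_days
    not_widespread = fb.share_of_users < widespread
    return within_window and not_widespread and not fb.is_factual
```

For example, an opinion about menu colors from one person two days after launch matches the pattern, while a measured slowdown reported by half of your users does not.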

What you are trying to avoid, when managing change, is creating so much change aversion that you cause a revolt. A revolt basically looks like so many people getting angry that they go to your management and they stop your work. When in doubt, roll out smaller changes to fewer people with a slower rate of expansion. Over time, you will learn how much change you can roll out, how quickly. Never, ever do a “big bang” release where all of your users experience massive change all at once.

Now, all of this said, never attribute all feedback to “change aversion.” Often people do have feedback that is legitimate. If many people provide you the same factual feedback, and you look at it and their feedback makes sense and you think fixing it would improve the product, that’s very likely not change aversion. Plus, saying “this is just change aversion” doesn’t mean you should totally ignore the feedback. You should at least acknowledge it, let people know they were heard. It does mean you should not argue with the person or try to reason with them about it, because they are having an emotional reaction where you trying to “logic” them out of it won’t help anybody. If you think it’s change aversion, you acknowledge them, and they just keep fighting with you, sometimes you should just ignore them and get on with doing your job. If they come back 3 days later with a more reasoned argument, then you know it’s not just change aversion, and also they’re probably being more helpful by then, anyway.

Incremental Rollouts

In addition to incremental development and design, one has to know how to incrementally roll out changes so that not all of your users experience all change all at once. This basically has three components:

Managing releases so that they are as minimally disruptive as possible. This means small changes that try their hardest to not require your users to do work to adopt the changes.
Picking a good initial cohort of users to roll out to, that is big enough to get useful feedback from, but not too large.
Understanding when to expand the cohort and how quickly to expand it.

Minimal disruption means things like: don’t break your users’ systems. Don’t require your users to do manual work to adopt the change. Provide clear documentation when necessary. Make sure the system provides very clear error messages so developers know what to do if it fails when they first try it. Basically, put yourself in your users’ shoes and think, “I have a lot to do and I use 30 tools every day to do it. What experience do I want to have when there’s a new feature or new tool for me to use?”

How you choose the initial cohort is mostly about who’s going to provide useful feedback, and how many people you need to have in the group in order to get that feedback. While you’re developing, often you and your team are enough. Then perhaps your tech leads, a few senior folks in your org, and then one other team outside your area that really wants to adopt the tool. You want your initial users to be genuinely engaged, not forced to use the system, because that’s going to get you the most active usage and the best feedback. Then you can expand to more teams, then some percent of the company, and then the whole company.

The speed at which you can expand depends on a few things:

How much feedback you’re getting. If you’re getting enough feedback from the current set of people and you have a lot of work to do to address that feedback, then don’t expand the cohort. After all, what value are you going to get from getting the same feedback from even more people? Also, rolling out a product that you already know needs major work will harm your credibility.
How much manual support or manual onboarding work you have to do for every new team. This is something that needs to be reduced before you roll out broadly. Otherwise your team will get overwhelmed by manual work and won’t be able to make forward progress on the tool. The time to automate (or fix the problems that cause a huge support load) is when you only have a small number of users and you know you’re about to roll out to a large number.
How disruptive the change is, or how much change aversion it causes. Obviously, the more disruptive it is, the slower you will have to roll it out. You have to consider how much work you’re putting on the rest of the company, and how that affects the rest of the company’s ability to achieve their own goals.
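One common way to implement this kind of cohort expansion is a percentage gate with a hand-picked early group. The sketch below assumes users are identified by a string ID and that a hash-based bucket is acceptable; the function names and the feature name are hypothetical. Hashing keeps each user in the same bucket, so raising the percentage only adds people and never takes the change away from someone who already has it.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for one feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, feature: str, percent: int,
               early_adopters: frozenset = frozenset()) -> bool:
    """Return True if this user should see the new behavior."""
    if user_id in early_adopters:
        # The engaged initial cohort gets the change regardless of percentage.
        return True
    return rollout_bucket(user_id, feature) < percent
```

You would start at percent=0 with only the early adopters listed, then raise the percentage as the feedback, support load, and disruption allow.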

Driving Adoption

Let’s say that you’ve done a great job of incremental rollout, but you just can’t get people to use your new thing. How do you fix that?

First off, you have to be sure that you have truly understood the problem and that the problem is one that developers know about and really want solved. Then you have to make sure that the solution really handles that problem, without causing a lot of new problems for the developer (like the new system is hard to use, fails often, doesn’t do things as well as the old system, etc.). You have to listen to feedback and address it.

However, there is only one solution that gets you to 100% adoption across an entire company:

The only way you ever get 100% adoption is to make your tool take zero steps to use.

What does that mean? Basically, it means that instead of giving developers some new, manual thing to do, put the functionality directly into their workflow. Instead of making them run a tool to check their PR, you make it happen automatically during code review. Instead of making them run a tool to format their code, you make the IDE automatically format their code at an appropriate point in the workflow. Basically, you find something the developer already naturally does, and you improve the experience of that instead of making the developer do something new.

If it takes one step to use, maybe you’ll reach 80% adoption. If it takes two steps, you’ll be lucky to hit 30%. If it requires the developer to read a long document and follow a set of complex instructions, well, now you know why you can’t get anybody to use your thing.

You should never start by making a tool take zero steps to use. It’s a lot of work to get that right, and it’s not worth it at the start of a tool’s lifetime. Early on in its lifecycle, it’s fine if the tool is harder to adopt. You’re only giving it to small, enthusiastic groups for initial feedback, anyway. You want to focus on getting the tool right before integrating it straight into people’s everyday lives. But once you do have it right, “zero steps” is the mantra that will drive you to 100% adoption.

Providing Leverage

The whole point of working on a tool or an improvement to developer experience is that you should be saving more effort for the team/company than the amount of effort you put into creating and maintaining the tool. This is even true if your tool primarily focuses on improving the quality of work rather than the velocity of work—think about how much work it would have required to get the same quality result without your tool. (Sometimes the answer is “infinite,” because it would have been impossible to get that high-quality of a result without the tool!)

Now, remember that it’s more important to reduce the effort of maintenance than the effort of creation, in software. So you have to consider not just the immediate impact of your tool (and the immediate cost of building it) but how much effort it will save over time (and how much it will cost to maintain the tool, over time). Different companies have a different “time horizon” for making this calculation—that is, if I put in X hours now, I save Y hours of total effort at the company across Z years. Z is the “time horizon”—how far out a company is willing to look at savings. At Google we usually set that time horizon at two years, so any investment in developer productivity had to “pay off” within two years.

Thus, whenever you look at an investment in developer productivity, quality, etc. you have to consider how much work it will be to develop and maintain the solution vs how much value it will actually deliver to the company.
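The calculation can be sketched as simple arithmetic. The function below is an illustration, not a formula used at any particular company: it assumes maintenance cost and savings are roughly constant per year, and every number in the example is made up.

```python
def net_hours_saved(build_hours: float, maintain_hours_per_year: float,
                    saved_hours_per_year: float, horizon_years: float) -> float:
    """Hours saved minus hours spent over the chosen time horizon.

    A positive result means the investment pays off within the horizon.
    """
    cost = build_hours + maintain_hours_per_year * horizon_years
    benefit = saved_hours_per_year * horizon_years
    return benefit - cost


# Hypothetical tool: 400 hours to build, 100 hours a year to maintain,
# saving 500 developer hours a year, judged over a two-year horizon.
print(net_hours_saved(400, 100, 500, 2))  # 1000 saved - 600 spent = 400
```

With the same costs but only 250 hours saved per year, the result is negative (500 saved against 600 spent), so that tool would not pay off within two years.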

Human Time

One of the most common mistakes in this area is saving machine time (the cost of CPU usage, memory usage, and disk space over time) with some optimization and forgetting about how much human time you’ve spent to do it. In most situations, an hour of human time costs hundreds of times more than an hour of all the machine time involved. At Google, one of the most popular documents I ever wrote was essentially a proof of this principle, showing that you’d have to save tens of thousands of hours of machine time in order for it to be worth a few hours of your own work. There are plenty of legitimate reasons to optimize machine time, but improving developer experience is rarely one of them.

When you think about the cost and value of developer productivity improvements, think first about how much human time you would be saving.
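A rough break-even check makes the comparison concrete. The hourly costs below are purely illustrative assumptions, not the figures from the Google document mentioned above; plug in your own.

```python
def breakeven_machine_hours(human_hours: float, human_cost_per_hour: float,
                            machine_cost_per_hour: float) -> float:
    """Machine hours an optimization must save to repay the human time spent on it."""
    return human_hours * human_cost_per_hour / machine_cost_per_hour


# Assumed costs: 100 per human hour, 0.25 per machine hour (a 400x ratio).
# One 8-hour day of optimization work must then save 3,200 machine hours.
print(breakeven_machine_hours(8, 100, 0.25))  # 3200.0
```

The larger the gap between the two hourly costs, the more machine time an optimization has to save before it is worth anyone’s day.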

Serving Your Developers

If you work on developer productivity, one of your mantras should be “we serve our developers, our developers don’t serve us.” It is all too easy to ship some tool that makes developers do more work than it saves them, or just more work than they ought to have to do. The tool might make them fill out forms of information that could be automated. It might expose a lot of complex implementation details that developers have to learn about, instead of allowing the developer to just express their intent and having the system work out how to accomplish that.

If you work on developer experience, it’s a lot more work for you to make the tools behave this way. It’s way easier to ship something that forces the developer to do all the work. But it defeats the whole purpose of what we are doing—providing leverage.

Saying No

Teams that work on developer experience tend to have a very close relationship with their customers. Your customers may sit in the same room or the same building as you. As mentioned above, they may include senior leaders at the company who have a lot of authority. Plus, your customers are developers, who tend to have very strong opinions about what they want, have numerous complex requirements, and be under time pressure to deliver results for their own projects.

Balancing this out, there is a limited amount of work that can be done by humans on developer experience, just based on the number of hours in the day, how much time it takes to design good solutions, how many people work on solving the problems, how much the solution can actually be split between multiple engineers to solve it, and so forth.

As a result, when requests come in to the team, there has to be a way of providing one of three answers: “yes, we will do it now,” “we will do it later,” or “sorry, we will not do this.” You can abbreviate these as “yes,” “not now,” and “no.”

Yes

There are many situations in which you should be able to say “yes, we will do it now.” In the world of developer experience, there is a lot of legitimate, valuable unplanned work that comes up. Often this is small, uncontroversial fixes to a tool or system that can be handled in an hour or two. There are also things that can come up that take a day or two but provide immediate and obvious leverage, and don’t need to wait and go into a planning process.

In order to be able to say “yes” to small amounts of immediate work, you have to allow for unplanned work in your planning process. If you do a quarterly plan that allocates 100% of your team’s time, you will destroy your relationship with the rest of the company. You never want to be in a situation where one of your users comes to you and says, “Can you fix this typo?” and you have to respond, “Please put in a request and we will put it into our planning process and consider it in three months.”

Keep in mind that any team does have a limited capacity for doing immediate work, though. Developers on the team need to understand how much immediate work they can actually say “yes” to, and when to instead answer “not now” or “no.”

Not Now

For work that takes longer than a few hours or days, requires requirements research, might be controversial, or would be difficult to roll out, there has to be some process that takes in customer requests and prioritizes them for deeper work. Developer experience teams and their managers must not say “yes” to every single request. If you do, you never deliver anything of great value and high quality, because you’re always running around fighting immediate fires, delivering half-baked solutions, and then fighting the new fires that the half-baked solutions cause.

There’s a lot to know about prioritization, but it should be based on your understanding of the problem, how much leverage the solution will provide, and how much total effort it will be to create and maintain the solution.

One of the trickiest pieces here is that engineers on a developer experience team are often the “front line” for receiving these requests, and they have to have some way to politely defer requests into the “not now” category (often called the “backlog”). Engineering managers, product managers, and project managers need to establish a known process for putting requests into the backlog of a team, so that engineers on the ground don’t feel awkward about saying “not now.” Engineers on the ground also need to understand how to escalate requests to a manager who can say “not now” on behalf of the team, if the conversation gets difficult.

Most customers are fine with “not now,” as long as they know vaguely when they’re getting the result. You don’t have to provide exact dates for all requests. If they are in the far future, you can say “next year.” If they are in the near future, your customers likely want more precise dates, though everybody should understand that delivery dates in software engineering are always an estimate. Never force the developer experience team to deliver a bad product just to meet an arbitrary deadline. That bad solution will cost the company far more than just waiting another week for the developer experience team to build something better.

No

Saying “we will never do this” is the hardest response to give to a customer. It feels like you are being unkind. However, you have to be able to say “sorry, no” to requests that would genuinely harm developer experience at the company more than they would help it.

For example, maybe you are replacing an old system, and your efforts will take a year, and you know from understanding the problem that the new system will bring dramatic improvements to developer experience. A team comes by and says, “we need you to put in a full quarter’s worth of work on the old system to make our experience better.” Most of the time, that’s simply a “no.” If they are getting an awesome result in a year, it doesn’t make sense to defer that awesome result by a full quarter to get a minor improvement in the existing system.

However, you can only say “no” when there is some other way for the customer to accomplish the things they actually need to do. If you have made your tool into the only way to do things at the company, and there is some feature that your customer must have in order to do their work at all, then you are forced to implement the feature no matter what. This is why I advocate for there to be multiple layers to any developer experience system:

A set of tools, libraries, and infrastructure at a low level that allow people to do anything they need to do, but which are complex to use.
A high-level platform that is simple to use, but requires customers to work in a very specific way in order to take advantage of it.

Usually the high-level platform will cover 80% of the use cases, and 20% of the use cases will need to “break the glass” and use the tools that exist beneath the platform.

The great part of this system is that the high-level platform can (and should) be very aggressive about saying “no” to feature requests that would harm the experience of using the platform. Of course, the owners of the high-level platform have to have a very good understanding of the problem, so they know what really needs to be built. But the whole point of the high-level platform is to provide an incredible experience for 80% of developer use cases at the company, which means that the whole experience needs to be curated carefully and you need to be able to say no to a lot of feature requests that would degrade the experience of using it.

That’s only possible, though, when there are low-level tools, libraries, and infrastructure (sometimes we call these “low-level primitives”) that users can use to accomplish anything they need, so that you can tell them, “Sorry, the high-level platform won’t do that, but you can still use the low-level primitives to do what you need to do.”
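As a rough illustration of the two layers, here is a minimal Python sketch. Every name in it (BuildRequest, run_build, build_service, the flags and paths) is hypothetical and invented for this example; it only shows the shape of a curated high-level entry point sitting on top of a flexible low-level primitive.

```python
from dataclasses import dataclass, field


@dataclass
class BuildRequest:
    """Input to the low-level primitive: every knob is exposed (hypothetical)."""
    source_dir: str
    compiler_flags: list[str] = field(default_factory=list)
    env: dict[str, str] = field(default_factory=dict)
    cache_enabled: bool = True


def run_build(request: BuildRequest) -> str:
    """Low-level primitive: can do anything, but the caller must know every option."""
    flags = " ".join(request.compiler_flags)
    return f"build {request.source_dir} [{flags}] cache={request.cache_enabled}"


def build_service(name: str) -> str:
    """High-level platform: one supported way to build, with curated defaults."""
    request = BuildRequest(source_dir=f"services/{name}", compiler_flags=["-O2"])
    return run_build(request)


# The common case goes through the platform.
print(build_service("checkout"))

# The unusual case "breaks the glass" and calls the primitive directly.
print(run_build(BuildRequest("experiments/gpu", ["-O0", "-g"], cache_enabled=False)))
```

Because run_build stays available, the owners of build_service can decline to add a debug-flags option without blocking the team that needs one.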

Kindness

A note about kindness: it may feel mean to say “no” (or even “not now”) to the person you’re talking to. But it’s far more unkind to harm the developer experience of all the people you’re not talking to, by saying “yes” to things that you shouldn’t say yes to.

Putting the Pain on the People Who Cause It

If one team can do something that causes difficulty (“pain”) to another team, and the team creating the difficulty never experiences any difficulty themselves as a result, then the amount of pain created will just grow and grow and grow.

This isn’t about teams being ill-intentioned. Nobody wants to hurt their co-workers. It’s about natural incentives and how humans respond to those.

Let’s say you have a team that owns a library that is used by ten other teams. The library team has their own goals, their own deadlines, and their own priorities. They are focused on what they have to do to accomplish those. If they are allowed to ship breaking changes to their customers every day and they experience no consequences themselves for doing so, then they will ship breaking changes every day, because that’s the most efficient and effective way to accomplish their goals. There might be people on the library team who feel they shouldn’t behave this way, but over time the pressures of the business will force them to behave that way no matter what.

There are different forms of “consequences” a team can experience that are more or less effective. What matters the most is how quickly they experience those consequences and the degree of difficulty they experience compared to how much difficulty they foist off onto their customers.

Surprisingly, in this situation, human consequences like bad performance reviews or people getting mad at the team are the least effective. They happen way too long after the pain is caused, and they don’t shift the real-world incentives that cause the team to behave the way it’s behaving. They just make the team feel oppressed—they did what they had to do, and they got yelled at for it. That’s injustice.

Responsibility

The right way to fix the problem is to change who has to do the work, which shifts the incentives and the fundamental “laws of physics” for teams. The rule I’ve found to work the most consistently is:

If a team wants to make a change, they are responsible for performing that change on behalf of the company.

If you are a library team and you want to make a breaking change, you have to refactor the codebases of all your customers in order to adopt the new, breaking change. If you are a cybersecurity team and you want a new control to be implemented on all the codebases at the company, you have to install it yourself in all those codebases and make sure it works.

This requires a lot of tooling to make it possible. You have to be able to make large-scale changes across the company using centralized tooling. You have to have some centralized way to know if you’re breaking people when you roll out some change. You have to be able to understand every consumer that depends on one of your libraries or systems, so you know where you have to make changes. There’s a lot to do here. But it dramatically changes the culture of the business, the effectiveness of infrastructure teams (including developer experience teams), and the overall ability of the company to do work. It changes infrastructure teams from work-creators to leverage providers, reduces the total amount of effort required at the company to adopt a change, and dramatically increases the rate of helpful change across the business.
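One piece of that tooling, finding every consumer of a library, can be sketched as a walk over a reverse dependency graph. This is only a sketch under assumptions: the DEPENDS_ON data and the project names are made up, and a real system would read this information from its build or packaging metadata.

```python
from collections import deque

# Hypothetical dependency data: each project maps to the projects it depends on.
DEPENDS_ON = {
    "app-a": ["lib-http", "lib-log"],
    "app-b": ["lib-http"],
    "lib-http": ["lib-log"],
    "lib-log": [],
}


def consumers_of(target: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every project that depends on `target`, directly or transitively."""
    # Invert the graph so we can walk from a library to the projects that use it.
    used_by: dict[str, set[str]] = {}
    for project, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(project)

    found: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for consumer in used_by.get(current, set()):
            if consumer not in found:
                found.add(consumer)
                queue.append(consumer)
    return found


# A breaking change to lib-log touches lib-http and both apps.
print(sorted(consumers_of("lib-log", DEPENDS_ON)))
```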

Limitations

There’s a lot to know about this paradigm. For example, there have to be contracts between infrastructure providers and their customers as to what changes will be automated and who is responsible for fixing things when a change breaks them. At Google, we had what we jokingly called the “Beyoncé Rule,” which said, “if you liked it then you should have put a test on it.” In other words, if an infrastructure team tries to roll something out and an automated test breaks in one of the customers, then it’s the infrastructure team’s job to fix the problem. If an infrastructure team tries to roll something out and no test fails even though the customer’s system is actually broken, then it’s the customer’s job to fix the problem.
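The rule as described above reduces to a small decision function. The sketch below is an illustration only; the function name and the returned labels are made up, and a real contract would have more cases than these.

```python
def who_fixes(customer_test_failed: bool, customer_actually_broken: bool) -> str:
    """Decide who owns the fix after an infrastructure rollout (illustrative only)."""
    if customer_test_failed:
        # A test caught the breakage, so the team rolling out the change owns it.
        return "infrastructure team"
    if customer_actually_broken:
        # Nothing was testing the behavior, so the customer owns the fix.
        return "customer team"
    return "nobody"


print(who_fixes(customer_test_failed=True, customer_actually_broken=True))
print(who_fixes(customer_test_failed=False, customer_actually_broken=True))
```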

Now, you do want to push central teams to refactor as much as possible themselves—they have a tendency to avoid looking at their customers’ codebases, because it’s hard to confront a codebase you’re not familiar with. So teams have to be pushed to actually do work centrally. (For example, you can have a check in your tools that enforces “if you’re going to send out more than 100 PRs or file tickets against more than 10 teams, there’s a committee that has to review your change, which has certain guidelines that will require you to automate as much of the work as possible.”)
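A check like the one in that parenthetical might look something like this. The thresholds come from the example above; the RolloutPlan type and the function name are hypothetical.

```python
from dataclasses import dataclass

# Thresholds taken from the example above; tune them for your own company.
MAX_PRS_WITHOUT_REVIEW = 100
MAX_TEAMS_WITHOUT_REVIEW = 10


@dataclass
class RolloutPlan:
    """Hypothetical summary of the work a change pushes onto other teams."""
    pull_requests: int
    teams_ticketed: int


def needs_committee_review(plan: RolloutPlan) -> bool:
    """True when the change is large enough that a committee must look at it first."""
    return (
        plan.pull_requests > MAX_PRS_WITHOUT_REVIEW
        or plan.teams_ticketed > MAX_TEAMS_WITHOUT_REVIEW
    )


print(needs_committee_review(RolloutPlan(pull_requests=250, teams_ticketed=4)))
```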

However, there is a limit to what a central team can actually do. There are legitimate times when customers have to do manual work to adopt a change. You need to make sure that the central team has automated as much as possible, first. If you require customers to do manual work, you need to make sure that the instructions you’re providing are as clear as possible—ideally, try it out yourself at least once to make sure it can actually be done, makes sense, etc. There also has to be some control on how much manual work is being required for the company at any given time. So if you need a large amount of manual work done, that has to go through some central committee that makes sure the company isn’t experiencing too much “churn” (manual or disruptive change) at once.

All of that said, there are times when automation isn’t worth it. If adopting your change requires two teams to do three hours of work, and automating it would require 20 hours of work from your own team, don’t automate it.
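The arithmetic behind that judgment can be written down directly. This sketch assumes the three hours is per team (six hours of manual work in total), and it deliberately ignores things like whether the automation could be reused for later changes. The function name is made up.

```python
def worth_automating(teams: int, manual_hours_per_team: float, automation_hours: float) -> bool:
    """Automate only if it costs less than the total manual work it replaces."""
    total_manual_hours = teams * manual_hours_per_team
    return automation_hours < total_manual_hours


# Two teams, three hours each, versus 20 hours of automation work.
print(worth_automating(teams=2, manual_hours_per_team=3, automation_hours=20))

# The same automation effort across 50 teams is a different story.
print(worth_automating(teams=50, manual_hours_per_team=3, automation_hours=20))
```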

Final Words

That’s what I can think of for all the fundamental principles of how to make a great developer experience. There’s a lot more to know about each point above, enough to write a book, or talk for ten hours straight (at least). But the above covers all the core concepts of what needs to be known about each area, as far as I can think at the moment.

Now, of course, you also want to ensure that the result of developers’ work is high quality, because the ultimate purpose of software is to help people as much as possible. And you want to ensure the resulting system is simple (meaning easy to read, understand, and correctly modify) so that it can be easily maintained over time. However, these points are mostly something to just keep in mind as you implement the systems described above. They are mostly accomplished not by tooling, but by every software developer understanding and applying the fundamental principles of software design in their day-to-day work. Tools are very important for enabling these things to happen, and tool developers can’t forget about them, but it’s not like you just release new tools, libraries, infrastructure, or platforms into the world and these qualities magically appear in every codebase. They are the responsibility of every software engineer at the company on an ongoing basis.

That said, I hope that what I’ve written here is helpful, and I’d love to know if it helps make life better for you or developers at your company, or just hear your thoughts about any of it.

-Max

The post What Makes a Great Developer Experience? appeared first on Code Simplicity.

An Analogy for Software Development

2025-03-07T08:38:51Z

Sometimes, I have to explain software development to people who are not software developers. Over the years, I have come up with an analogy that explains what software development is like and its processes. I have successfully used it to explain software development to a 9-year-old kid, including advanced concepts like cybersecurity and so forth. I figure others might benefit from it, so here it is:

Imagine that you live in a world without computers. You are the owner of a custom car factory, where people can write down on pieces of paper any car they want in the world, and your factory will build it for them. However, these cars are not built by humans. Instead, they are built by special robots that can read and follow instructions (but not think for themselves).

The way that the robots know how to build the car is that there is a special book of instructions that describes how to build each piece and how to put together these pieces into any car that anybody could want. This book is huge—100,000 pages of exact instructions for the exact steps that the robots need to do, taking into account any situation the robots could potentially find themselves in while building the car. (For example, what if one of the robots breaks while they are building a piece? What if the factory runs out of materials for the piece they are building? What if a customer asks to put together two pieces that no customer has ever asked for before, and the instructions for how to do that aren’t in the book? And so on—every possible circumstance that you can imagine. That’s why the book is so long.)

It is obviously impossible for one person to create the instructions in a book with 100,000 pages. So this book is written by 1000 people, all working together on the same book. Some people write instructions for how to build car engines. Other people write instructions for how to build car doors. And so on, for every piece of the car and every way of combining them. All the instructions in the book have to “agree” with each other. For example, if I build a door, it has to be able to attach to the body of the car. If I build tires, they have to fit the wheels. So all the authors of this book are constantly working together to make sure that all the instructions in the book work correctly together.

As new types of car parts come out, new instructions have to be written. As the authors of this book discover new situations the robots could find themselves in (“Oh, we didn’t realize that when it rains, some of the pieces can rust!”) they have to update the instructions in the book. In other words, not only does the book have to be written once, it is actually constantly changing. In fact, there is almost always more work needed to keep the book up to date than there is work being done to write new instructions.

Now, if this sounds like an impossible problem to solve, don’t worry. No single human being could possibly comprehend the entirety of the book. So if you feel overwhelmed by the problem, that’s normal! Every human being in the world feels that way about the problem. It’s 100,000 pages of constantly-changing instructions. So how do we do it? We set rules about how the book will be written. For example, we say “doors will always attach to car bodies the same way.” Like, the doors will always have the same type of hooks that connect them to the body, and the bodies will always have the same place for those hooks to fit in. That way, no matter what door you build or what body you build, they always fit together. Now we don’t have to think about that problem any more. We set lots of rules like this so that each person writing the book can safely work on their part of the book without worrying they’re going to break the whole car just by changing one instruction about the wheels or the engine. As long as each author follows the rules, they can change anything in the book and the whole car will keep working. This limits what types of cars we can possibly build, but it makes solving the problem possible, where otherwise it is impossible.

These authors, the people writing and maintaining this book, are software developers. The problems they encounter are nearly identical to the problems that software developers and development teams encounter. For example, how do you know that you wrote instructions that actually work? When somebody new starts working on the book, how do they learn the rules? What happens when there are so many rules that the authors can’t remember all of them? How do the authors learn about new cars and new car parts that customers want to build? You can explain essentially all the processes, problems, and principles of software development to anybody, using this analogy.

I hope it helps!

-Max

The post An Analogy for Software Development appeared first on Code Simplicity.

Code Simplicity: The Fundamentals of Software is Now Free

2022-06-07T10:09:39Z

About a year ago, a Twitter user tagged me and some other programming authors in a thread where they described the barriers to accessing computer programming books in their country. I’ve been made distantly aware of these problems before—there are many countries in the world where the cost of a book in USD could be a person’s entire weekly salary.

I didn’t write any of my books to make money—I wrote them to get a message out and to help people. I usually think that people are more likely to actually read a book if they pay for it, and the point was to get people to read the book, because that was the only way I was going to change the software industry for the better. The book does still sell copies (which is unusual for a computing book, since it’s been ten years since its release) but the money it makes is not important to me—it’s getting people to read the book that’s important to me.

Once I realized that there was a huge population of the planet that was entirely barred from reading the book legally if they had to pay for it at all, I worked with my editors at O’Reilly to see if we could make the book completely free.

It turns out that for complex reasons beyond their control, they can’t make the book free on Amazon or in the O’Reilly store. But they can give me the distribution rights to the book, take the cover off, and let me distribute it for free!

So here you go, you can now download Code Simplicity: The Fundamentals of Software for free! I hope that this gets more people to read and understand the fundamental laws of software design, and that it helps make the world of software development a better place.

The post Code Simplicity: The Fundamentals of Software is Now Free appeared first on Code Simplicity.

What is a Monorepo, Really?

2022-03-11T16:20:14Z

There are often discussions at software companies about whether they should or shouldn’t have a “monorepo,” meaning “a single, version-controlled repository for all code at the company.” Very often, people base this decision on the fact that this is how Google stores its code.

I have now worked in developer productivity organizations at a company with a very advanced monorepo (Google) and a company with a very advanced multi-repo system (LinkedIn), and I have to tell you: most of the valuable properties that people associate with a monorepo have nothing to do with how many source control repositories you have. In fact, what people (and Google) consider a monorepo is actually multiple different concepts:

Atomic commits across different projects. (And thus an atomic “head” commit that moves forward atomically for all code.)
A universal directory hierarchy and a single view of all source code.
The single place where you go to check out or commit code. (Including all tools that read or write stuff.)
(Sometimes) The smallest unit of check out, commit, and dependency is a file.
(Usually) No concept of a project, only concepts of directories and files.
(Sometimes) The One Version Rule: There may only be one version of any dependency in the repository at any one time.
The ability to require library maintainers to solve the problems they cause.

I’ll talk about these in more detail, including some of their upsides and downsides.

Atomic Commits Across Projects

Let’s say we have two separate projects, A and B. We want to make a change that affects both of them. Part of a “monorepo” is the guarantee that you can commit atomically to both of these projects simultaneously. There is no view of the repository where Project A is at Commit #1 but Project B is at Commit #2.

This is especially important where you want to make a change where either Project A or B would be broken if they are not changed at exactly the same time. For example, let’s say we have one project called App, and it depends on a project called Library. We want to change the signature of a function in Library and update App at the same time. If we just update Library or just update App, then App is broken.

This is the feature that most depends on things being in a single source code repository, because practically the definition of “a repository” is “a location to which you can commit multiple files atomically, which tracks those atomic commits, and from which you can check out at any point in that atomic commit history.”

This feature also implies that there is a single definition of “head” (the most recent commit) for the entire repository. This is important to think about because when developers check out from a repository, they usually check out at “head.” This means that when developers check out, they are guaranteed a consistent view of the entire source code tree, no matter how many projects they check out simultaneously. They never have to think about whether they checked out App and Library at two different versions that are incompatible with each other. For the most part (as long as you have a good testing system that validates that all commits actually work, which is a complex problem in and of itself) code checked out at any given commit should all work together.

A Standardized Cross-Project Directory Structure

All code in a monorepo is thought of as being in a single directory structure. This has advantages when you are developing, and advantages when you are browsing through code.

While Developing: Checking Out Is Standardized

During development, if Project A is stored at /path/to/project/A in the repository and Project B is stored at /path/to/project/B in the repository, they will be in directories right next to each other when I check them both out. I can guarantee that that will be the directory structure. I never have to think about where I should place Project A on the disk in relationship to Project B, if I need to have them work together while I am developing.

For those who are used to a monorepo, this may seem like a small detail. However, in most multi-repo systems, this can be very confusing. If I am working on an App that depends on a Library, and I want to modify them both on my disk to test how the two modifications will work together, it can be very confusing to figure out how to get the App to consume my modified Library.

All this said, there’s nothing about this principle that actually requires a single source code repository. There could be a standardized way, provided by tools, that projects are always checked out, even if you have multiple repos.

A Uniform Way of Browsing Code

Since you have a single directory structure, it’s relatively straightforward to browse through directories in your code search tool, and to have a single code search tool that searches that one repository.

However, there’s nothing preventing you from having a single, universal view of a multi-repo system via some UI tool or some virtual filesystem. It’s more complicated because there isn’t an atomic “head” for a multi-repo system: all repositories are at different versions at different times. However, you could either (a) account for that in the UI of your code search tool (such as by making the version number part of the “path” people see when they are browsing, or letting people choose versions somehow) or (b) decide that when you’re browsing or searching, you always see the “head” commit of every repository (which is how most code search tools work today anyway).

A Single Place to Check Out and Commit

This may seem unimportant, but one of the values of a monorepo is not having to think “which repository do I check out from?” Instead developers just have to think about what code they need to check out. Similarly, all commits go to that same repository.

This also means that you have a single view of all the commits throughout history, which can sometimes be helpful (such as when you are trying to figure out everything that could have changed between Time A and Time B, for debugging purposes).

And finally, all the tools only have to worry about accessing a single repository—all they have to care about is directory and file names.

Once again, this doesn’t really require having just one repository. You could have a facade in front of your multi-repo system that provides the important parts of this functionality, such as a unified view of history, a single place to check out from, and a single place to commit to, if that was really important.
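To make the "unified view of history" part concrete, here is a small sketch that merges per-repository commit logs into one timeline. It assumes each log is already sorted oldest-first and that commit timestamps are comparable across repositories; the Commit type and the sample data are invented. Note that ordering by timestamp gives you a single timeline, not atomic cross-repo commits.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Commit(NamedTuple):
    timestamp: int  # seconds since the epoch
    repo: str
    summary: str


def unified_history(*repo_logs: Iterable[Commit]) -> Iterator[Commit]:
    """Merge per-repo logs (each sorted oldest-first) into one timeline."""
    return heapq.merge(*repo_logs, key=lambda commit: commit.timestamp)


app_log = [Commit(100, "app", "add login"), Commit(300, "app", "fix typo")]
lib_log = [Commit(200, "library", "change signature")]

for commit in unified_history(app_log, lib_log):
    print(commit.timestamp, commit.repo, commit.summary)
```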

Files Are the Smallest Unit of Checkout, Commit, and Dependency

In most monorepos the smallest thing you can commit to, that is tracked by the versioning system, is a file. The system knows that “a file” is what changed. It might seem to be aware of lines in a file, but that’s only because it can reproduce the changes to a file as a “diff” by comparing the previous version to the current version. When you commit, the new commit actually contains an entirely new copy of the file you modified.

In some monorepos, you can also check out individual files without checking out the entire repository. In fact, if the repository gets very large, this becomes a very important productivity feature. Otherwise you could be forced to check out gigabytes of code that have nothing to do with what you’re working on.

Also, in some monorepos (Google’s in particular) the smallest unit of dependency is a file. That means that the build system can be aware that one file depends upon another file. It can’t be aware that one function depends on another function, or that one class depends on another class. This means that when you build, you only have to build the specific files that you need, transitively across all of your dependencies. (It should be noted that in Google’s monorepo, sometimes you can only depend upon a group of files or an entire directory, and sometimes that makes more sense.)
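Here is a minimal sketch of what file-level dependencies buy you at build time: given a target file, compute only the files it transitively needs. The FILE_DEPS mapping and the paths are hypothetical, and this is not how any particular build system represents its graph.

```python
# Hypothetical file-level dependency graph: each file maps to the files it uses.
FILE_DEPS = {
    "app/main.py": ["lib/http.py", "lib/log.py"],
    "lib/http.py": ["lib/log.py"],
    "lib/log.py": [],
    "tools/unrelated.py": ["lib/log.py"],
}


def files_to_build(target: str, deps: dict[str, list[str]]) -> set[str]:
    """Return `target` plus every file it depends on, transitively."""
    needed: set[str] = set()
    stack = [target]
    while stack:
        path = stack.pop()
        if path in needed:
            continue  # already visited through another dependency path
        needed.add(path)
        stack.extend(deps.get(path, []))
    return needed


# tools/unrelated.py is never touched when building app/main.py.
print(sorted(files_to_build("app/main.py", FILE_DEPS)))
```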

None of this requires having a single repository, at all.

No Concept of a Project

Since everything is in the same repository, there’s no inherent concept that a collection of different directories could all represent a single “project.” The build system probably knows that some directories are compiled together to produce a particular artifact, but there’s no universal way of easily seeing that just by looking at the directory structure or something like that. Any level of the directory hierarchy could have any significance. There could be a top-level directory in the repository that is a whole project. There could be a directory three levels down that’s a project, like /code/team/project. There are no inherent rules (except that top-level directories are usually mandated to be very broad categories that contain many projects in their tree).

In contrast, a multi-repo system could say that each repository is a project, which would give you a more concrete artifact to represent a project. However, there’s also nothing really enforcing this in a multi-repo system either. There could be four projects in one repo and two projects in another.

In reality, most of this ends up being defined by your build system’s configuration files, not by your source code repository.

The One Version Rule

Often, a monorepo will mandate that only one version of any given piece of software can exist in the repository at the same time. If you check in a library, you may only check in one version of that library in the entire repository. Since you have a monorepo, that ends up meaning that only one version of that library may exist at the company at any given time. This is the way (mostly) that Google’s monorepo works.

This is done for multiple reasons.

First off, it makes it much easier to reason about the behavior of your system. You understand which version of your dependencies you’re going to get, always. You don’t have to inspect your transitive dependency tree every time you check out a piece of code to understand what you’re actually getting, because you’re getting the version of that dependency that exists in the repository when you check out.

But perhaps the most important reason this is done is that most programming languages mandate having only one version of any particular dependency exist in a final program. Otherwise, they end up having weird behavior at runtime when you include multiple versions of the same thing. For example, in Java, it’s essentially random (from the viewpoint of the programmer) which version of a dependency will get used, if you include both in your binary. Including multiple versions in a program can lead to some very complex and difficult-to-debug errors at runtime.

This problem can be solved, and many dependency-resolution systems in modern languages or frameworks do solve this. Some systems allow for multiple versions of a dependency to exist, and for calling code to actually “know” which version they expect to be calling. Other systems will “force upgrade” all versions of a dependency to be the most recent one, or “force downgrade” all versions to be the oldest one.
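The "force upgrade" and "force downgrade" strategies can be sketched in a few lines. This is a toy model, not the algorithm of any real dependency resolver: versions are plain tuples, the library names are made up, and compatibility between versions is not checked at all.

```python
Version = tuple[int, ...]


def resolve(requested: dict[str, list[Version]], strategy: str) -> dict[str, Version]:
    """Pick one version per dependency from all the versions that were requested."""
    pick = {"force_upgrade": max, "force_downgrade": min}[strategy]
    return {name: pick(versions) for name, versions in requested.items()}


# Two parts of the program asked for different versions of json-lib.
requested = {
    "json-lib": [(1, 2, 0), (1, 4, 1)],
    "http-lib": [(2, 0, 0)],
}

print(resolve(requested, "force_upgrade"))
print(resolve(requested, "force_downgrade"))
```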

However, all of that only exists if your system has the concept of projects and versions of those projects, which most monorepos don’t have.

This rule has some pretty significant downsides. If you own a piece of code that a lot of people depend on, it can be very difficult to upgrade that piece of code, because any change you make will break somebody. You can’t fork your codebase, move everybody who depends on you incrementally to the new version, and then delete the old version. Instead, when you make a breaking change you have to either:

(a) commit to every project that depends on you, all at once;
(b) do a dance where you create a new function with no callers, commit that, then move your callers to use the new function over lots of commits, then delete the old function; or
(c) decide never to make breaking changes even though you’re an internal library.

Honestly, option (b) above is not that bad. It’s actually kind of a good software practice, but it can be a lot of work for a library maintainer, sometimes so much work that maintainers opt for (c) by default and let their systems stagnate more and more over time.
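For readers who haven't seen the option (b) dance, here is a sketch of the library in its middle state, after the new function has been committed and while callers are still being moved over. The function names and the timeout parameter are hypothetical.

```python
import warnings


def fetch_user(user_id: int, timeout_seconds: float = 30.0) -> dict:
    """New function: first committed with no callers at all."""
    return {"id": user_id, "timeout": timeout_seconds}


def get_user(user_id: int) -> dict:
    """Old function: kept working while callers migrate, deleted in the last step."""
    warnings.warn(
        "get_user is deprecated; call fetch_user instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return fetch_user(user_id)


# Callers are moved from get_user to fetch_user a few commits at a time.
# Once nothing calls get_user any more, a final commit removes it.
print(fetch_user(7))
```

Having the old function delegate to the new one means there is only one implementation to maintain during the migration.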

Where this really becomes a problem is third-party libraries. If all code must live in your repository, that means you have to check third-party libraries into your repository. And there can be only one version of them, for everybody in the company at once. But you’re not the maintainer of those libraries, and you can’t realistically do the function dance of option (b) above.

Plus, the outside world is not a monorepo. Libraries out there depend on specific versions of other libraries. Let’s say you check in Library A that causes you to have to check in Library B, C, and D as dependencies. But then somebody wants to check in Library X that requires a newer version of C. But that requires them to now have to upgrade Library A. But the upgrade to Library A breaks all of the people who depend on Library A, so now the person who just wants to check in a single library so that they can use it has to upgrade everybody who depends on Library A.

This gets even worse when you have a very-broadly-used third-party library inside of the repository. Often, they get “stuck” at a particular version and never get upgraded, because upgrading them is just so hard. Instead, people start bringing in selective patches to the library that they know won’t break it. Or they start making their own fixes to it and diverging from upstream, making it difficult or impossible to upgrade to the external version later.

One other thing about the one-version rule is that systems in production in a complex multi-service environment were all built at different versions, so the reality is that you’re actually always experiencing multiple versions of things in production. The one-version rule provides a polite fiction that makes life easier at development time for most situations, but it can also make you forget that it’s not actually true when you have multiple programs interacting with each other.

It’s worth noting that this rule doesn’t really require a monorepo. You could allow only one version of a dependency to exist across all of your repositories. Then you just have to mandate that all repositories across your company always build at head and only consume each other’s code at head, and you would have essentially the same effect. I’m not recommending that you do so, just pointing out that you could. Whether you do it is up to you.

Making Library Maintainers Solve the Problems They Cause

In a monorepo world, if you own a library, you can break the builds of every project that depends on you by checking in something incompatible with those projects. This is especially true in a one-version world, where library owners must check in to the single version of the library that everybody depends on. This means that library maintainers can’t just force their consumers to do all the work of upgrading to a new version of the library. The library maintainers have to dig in and do the work themselves. If they think that making a breaking change is worthwhile, they have to bear the cost for the business. Otherwise, library maintainers could create a lot of unplanned work for their consumers without talking to their consumers. (Sometimes those consumers represent projects that don’t even have developers on them anymore, but are still important to the business, so there’s nobody even there to do upgrade work.)

This is mostly a matter of company policy, but it’s much easier to do in a world where you can actually enforce it, and where there is some system that causes pain for the library developers when they cause pain to others. For example, having a lot of teams complain that their builds are broken can be that pain. In some monorepos, you can actually prevent the library maintainers from checking in their change at all, because the test system runs the tests of all their consumers and stops breaking changes from going in.
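As a sketch of how a test system could decide whose tests to run before letting a change in, consider the following. This is an assumption about the general shape of such a system, not a description of any real one; the project names, the `depends_on` mapping, and the `run_tests` callback are all hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List, Set


def affected_projects(depends_on: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return every project that depends on `changed`, directly or transitively."""
    # Invert the graph: for each dependency, who consumes it?
    consumers: Dict[str, List[str]] = {}
    for project, deps in depends_on.items():
        for dep in deps:
            consumers.setdefault(dep, []).append(project)

    # Breadth-first walk outward from the changed project.
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in affected:
                affected.add(consumer)
                queue.append(consumer)
    return affected


def may_submit(
    depends_on: Dict[str, List[str]],
    changed: str,
    run_tests: Callable[[str], bool],
) -> bool:
    """Allow the change only if the changed project and all its consumers pass."""
    to_test = affected_projects(depends_on, changed) | {changed}
    return all(run_tests(project) for project in sorted(to_test))


# Hypothetical company: reports depends on billing, which depends on core_lib.
graph = {"billing": ["core_lib"], "reports": ["billing"], "website": []}
print(sorted(affected_projects(graph, "core_lib")))  # ['billing', 'reports']
```

The point is that the library maintainer feels the breakage before the consumers do: if any consumer's tests fail, `may_submit` returns False and the change never lands.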

This enforcement doesn’t exactly require a single source repository. There are various ways to accomplish this, or parts of it, in a multi-repository system.

Summary

So you can see that a “monorepo” is actually a lot more than having just one source code repository where you put all your stuff. Some people have grouped all of these things together, because the above is basically a description of the Google monorepo, and most people seem to be thinking of that system when they talk about “a monorepo.” But it’s important to separate out these concepts, because a lot of them can be implemented in the systems you have today. Plus, maybe not all of these things are actually good, and perhaps you should be intentional about which ones of them you are trying to adopt at your business.

-Max

The post What is a Monorepo, Really? appeared first on Code Simplicity.

Reasoning and Choice

2020-08-11T06:16:21Z

One of the most important properties about any software system is the ability to understand what it is going to do without having to run it. This concept is usually referred to as the ability to “reason about the system.” Basically, you want to make statements about the structures, actions, and results of the system without having to see them in action first.

To understand why this is important, imagine a system with a hundred different pieces. To keep this simple, let’s pretend it’s an actual physical system, and not a computer. Let’s say that we have an automated plant that produces cars, with 100 steps from raw materials to finished car. Each of these parts makes some change to the input materials to produce an output product. There are various ways we could configure this system and each of its pieces:

We could make each piece do multiple actions, and depending on which action was taken, the next machine we choose is actually different. For example, let’s say we are converting metal into circular rods. Each car has a different number of circular rods it needs, and our rods could be made out of 5 different kinds of metal. So the machine has a program that decides, each time it gets a bar of metal, which rod it will make. This is different depending on the time of day and the current demand for our cars. Then, depending on which rod was made, that rod goes to one of five different next machines.

Now imagine that every single machine in the entire system was like that–it took a complex set of inputs and produced a complex possible set of outputs which went to a complex possible set of next machines. Not only would it be impossible for a human being to make statements about (i.e., reason about) the exact behavior of the whole system at any given time, it would even be difficult to reason about the behavior of the individual pieces.

Now imagine a different setup, where each machine takes one input, provides one output, and each machine only “talks” to one other machine (that is, its input always comes from one, specific machine and its output always goes to another single machine). Although it might be hard to think about the whole system all at once, because it’s still 100 machines, it’s easy to look at each individual piece, and from there, reason about both the individual pieces and the logical behavior of the whole system.

This is a core part of simplicity–the ability to reason about systems like this. When you look at any individual piece of a software system, you should be able to make statements about its behavior, guarantees, structure, and potential results, without having to run that piece. It should be clear exactly how that piece can interface with the rest of the system–either we should know exactly what calls into it and what it calls, or we should understand the structure that creates the boundaries of how the piece can be used. For example, this is why the concepts of “private” and “public” functions in many programming languages ease the ability to reason about the system–they are boundaries that tell us what can and can’t possibly happen. And when you look at the actual implementation of a function or class, it should be easily possible to understand the actions it’s taking by reading the code and comments. This is, for example, why naming is so important for functions and variables–because good naming allows the reader to reason about the behavior and boundaries of the system.

Choice

There is another very important component to enabling systems to have this quality, though. To explain this part, imagine that each of the machines in our imaginary car factory was not automated, but was instead run by a person. This is more like a software engineer who is typing actual code, “running” the machine of their IDE, computer, compiler, programming language, etc.

In our first example, where we have complex machines making complex decisions, imagine that all of the choices the automated machine was making before, now a human being has to make. That is, every time a piece of metal comes into our machine, a human being has to look at it, decide what type of metal it is, decide what rod to make, and all based on looking up the current demand for cars and noting the time of day. Now, in a real factory, some of that might actually be acceptable. It does at least create an interesting job for a person to do. But even there, you can see that you would be opening the door to a lot of mistakes and bad results.

Compare that to our latter example, where we have simple machines that have simple inputs and outputs. They would be so easy for a person to operate that you could have one person operate multiple machines, probably, and you would eliminate almost all potential for mistakes or bad results.

Now take into account that in programming, the programmer is often operating tens or hundreds of these “machines” in terms of the classes and functions that they maintain. So a better analogy for the complex car factory is having one person run all one hundred machines. As you can see, if each part of the system offers too many decisions that the operator has to make, creating our “car” quickly becomes impossible. Even if you could do it, you would be manufacturing cars tremendously slowly and burning out the people operating your machines. And lo and behold, that is exactly what happens to teams that have to maintain software systems that have that level of complexity.

What’s the key point here that we introduced, though, when we added human beings to our “factory?” We introduced the factors of decision (something a human being does with their mind) and choice (options that are presented to a human being).

There are some schools of thought that say that all developers should be empowered to make every possible decision about their software system, at all times. This sounds great, because it sounds like it’s providing intellectual freedom to intelligent people—something that we all want. However, if you take this principle too far, you actually end up creating the complex car factory for your developers—a system where there are so many choices to make that they either become paralyzed, are guaranteed to do it wrong, or develop wildly inconsistent systems that others can’t easily make heads or tails of.

So what’s the solution here? Is it to remove all choice from everybody, and make them into mindless automatons carrying out the will of your Chief Architect? Well, I’m sure there are some Software Architects out there who would like that, but actually, that’s a bit extreme of a solution. The answer is to instead recognize which choices are important for a developer to be able to make, and which are unimportant.

This differs depending on who you are in a software team and what point you’re at in the lifecycle of your software. For example, if you’re just starting up a new company and you’re the first developer, it’s important that you be able to choose almost everything about the basic platform your company will run on–the language you’re using, the frameworks, the libraries, etc. But even then, you don’t want those frameworks and libraries to present you with decisions you don’t need to be making. Imagine if a compiler stopped and asked you exactly how it should optimize each piece of code. Would that help you or aid your productivity? Would that actually be a net benefit for your company or the goals you’re trying to achieve? I don’t think so.

Then, at a different point in the lifecycle of a project, once you have standardized on a language and a specific framework you’re using, you usually wouldn’t want to allow a random junior developer to choose a different language or framework for their part of your codebase. It’s a decision that they don’t need to spend time making–it’s more productive for them to just go with the flow. Even if there is a better language or framework they could be using, re-writing your entire system just to implement this junior developer’s one feature doesn’t seem like a good use of your resources.

In the aggregate, if you can remove enough choices that developers don’t need to have, you can actually save quite a bit of developer time across the scope of an entire company. Imagine if every team in your company had to spend two weeks going through a review of different frameworks before they could start developing their system. Now imagine that you standardized on a framework that was good (that is, it was capable of fulfilling all the business needs of everybody who was going to use it) even if not perfect, and nobody had to make that decision anymore. How much engineering time would you have saved the whole company? That’s huge–bigger than almost any other productivity improvement you could make, in the long term.

Now, it is important to keep in mind that there are decisions that developers need to make. They absolutely need to be able to decide how the business logic of their system functions—that’s the core requirement for them to be able to do their jobs. There have been frameworks and libraries in the past that simply don’t allow people to actually write the systems they need, and that’s a level of restriction that’s detrimental to productivity. For example, imagine that your company standardized on a framework that supported HTTP but somehow fundamentally could not support SSL (that is, no HTTPS). That would be disastrous when you needed to encrypt your connections for security purposes. So that would be a very bad restriction.

This is a very tricky line to walk, sometimes, but in general I have found that erring on the side of deleting choices actually makes developers happier in the long run, because it makes them more productive. This is very tough at first when you take away certain choices from people, because they feel like you are impacting their personal freedom. And in a way, in the short term, you are. But the truth of the matter is that you’re trying to provide much more freedom to create—the freedom that that developer actually wants, fundamentally. The purpose of restricting choice should always be to improve the ability to create systems. You’re not killing production, you’re deleting distractions, barriers, and confusions in the form of choices that somebody simply doesn’t need to be making.

-Max

The post Reasoning and Choice appeared first on Code Simplicity.

The Definition of Simplicity

2020-05-20T03:07:02Z

Many years ago, I wrote a blog post explaining what was wrong with computers, and essentially saying the problem was complexity. Several years after that, I published Code Simplicity, which was essentially a thesis describing how and why simplicity was the most important quality of software.

Many years after that, I was sitting in a room of some of the world’s most experienced software engineers, coming up with guidelines and principles around which we wanted to structure software development, and after questioning the room, I came to a terrible realization: nobody had ever defined what “simplicity” was for software.

I thought, perhaps naively, that this was simply a known fact—that when I said “simplicity,” everybody just knew what I meant. To some degree, honestly, this was true. When you say the word “simplicity,” people at least get some idea. But I noticed that people would apply it in many different ways, some of them very much not what I intended. I would see people point at a function or file and say, “Look, it has fewer lines of code now, thus it is simpler!” Or say, “Look, this system uses such-and-such design pattern, thus it is now simpler!” Or worse, “This system is now completely generic and follows all of the things that ‘everybody knows’ you’re supposed to do with software, so that’s simple, right?”

So, I went on a search to try to find some sort of valid definition for simplicity. Eventually, I had to come up with it. I actually came up with this several years ago now, and I’ve been meaning to write a blog post about it, but simply haven’t done so. So what is the answer to this great mystery? For software, what is simplicity?

For software, “simple” means easy to read, understand, and correctly modify.

There are several important things about this definition.

First off, simplicity is fundamentally a human factor. It is not caused by machines and it is not done for machines. Only human beings read and understand software. In terms of “correctly modify,” yes, that can be done by a computer, and there are aspects of simplicity where you might be making things easier to be modified by automated refactoring tools or something like that. But the important part there is that we want people to be able to correctly modify software.

This tells you at once that you will never write a computer program that will magically do all simplifications of software. Tools absolutely help people in their quest to simplify and make code understandable. But they cannot do all the work. When somebody wants to take on a quest to simplify software at their company, be very skeptical if their only solution is a tooling solution. If they say, “We want to encourage this better practice by improving the tooling,” great! That’s a human factor. The practice involves people. But never lose sight of the fact that simplifying software always involves human beings acting causatively to develop that simplicity (usually by deleting or modifying something that’s difficult to understand into something easier to understand).

It also tells us that there will never be an automated analysis system that tells us whether or not software is complex. This is a constant question among people who are working on software simplicity–how do I measure how simple something is? Simplicity is a quality experienced only by human beings. It has no inherent truth to it other than the viewpoint of the beholder. There is no inherent quality to code called “simplicity.” You cannot put a number down on paper that says how simple your code is. What you can do is find out from people how simple or complex they find a piece of code to be. (As a side note, you often can’t ask them directly how complex it is, but you can ask them for their emotional reaction to it, which is often the best indicator of how complex it is. If they find some code or system frustrating, frightening, angering, hopeless, etc. that’s often a good sign of complexity.) So any measurement of software simplicity must include a measurement via finding things out from people. Once you’ve done that measurement via people, you might find that certain patterns or systems are almost universally bad or complex, and you can write tooling that bans or fixes those patterns. But the understanding of complexity and its validity comes from understanding what people are doing with the code, what they think about it, how they feel about it, etc.

One of the things that this tells us is that simplicity tends to come when a person spends some time paying attention to simplicity for code. That sounds obvious, but if you observe the behavior of many software teams, you will find out that they do not operate on this fact. Specifically, what this means is that some person being responsible for a piece of code (not responsible in the sense of “to blame for it” but responsible in the sense of “has ownership of it and actively works on it”) is almost necessary to developing simplicity. And this plays out in the physical universe. You can see that reading, understanding, and modifying unmaintained, unowned code almost always becomes more and more difficult over time. I only say “almost always” because I haven’t looked at every single piece of code in the world, and because in order for this to be fully true, you would have to look at it over infinite time (that is, it tends to become more and more complex the longer it’s unmaintained, so sometimes it has to be unmaintained for a very long time before you really encounter this). But it’s been true in every situation I’ve ever seen–I don’t know of any counter-examples where a piece of code got simpler the longer it was unmaintained. When you let software be unowned, you are allowing it to become more difficult to read, understand, and maintain over time.

The other primary cause of complexity, besides lack of ownership, is time compression. In fact, this is the most common cause of complexity–possibly the only true cause of complexity. By “time compression” I basically mean making people feel like they don’t have enough time. Funnily enough, this is most often done to programmers by themselves. They say it’s being done to them by their management (“They gave me these deadlines and now I have to cut corners,”) and sometimes that is true. But more often, developers simply feel a pressure themselves to “be done” with this work. They feel like they need to finish something or somebody will be mad, or they will be in trouble, or they won’t be fast enough, or somebody will think they are a bad programmer, or, or, or…. But often none of that is true, and the actual truth is that they’re perfectly free to take a little longer to do something the right way.

If you don’t think this is true, ask a developer why they made a hack the next time you see a hack. They will either tell you that they didn’t understand what they were doing (another primary cause of complexity) or they will tell you something like, “Well, that other library is very hard to use and doesn’t work right so I had to do it this way.” But think that through. The developer is often saying, “I did not want to spend the time to fix that other library.” Wow, you say, that’s pretty harsh! Fixing that other library could have been a lot of work! True, but did they actually have the time to do it? Maybe they did. One could also say, “I did not feel responsible for that library,” which is where you see once again responsibility coming in for causes of complexity. Maybe it was “outside of their control,” like it was at another company or something. Okay, that’s still a declaration of the level of responsibility they’re willing to take. I’m not saying the decision there was bad, just that you have to recognize that you’re making a conscious decision of how responsible to be (you could have reached out to the other company, reported it to them, worked with them to fix it, etc.) and that you’re intentionally deciding to make complexity instead of spend the time to develop simplicity.

Anyhow, I hope that gives you a better understanding of what simplicity is and what actually causes it. Remember, when you’re talking about simplicity with somebody, make sure they know first that what you’re really talking about is making the code easier to read, understand, and correctly modify!

-Max

The post The Definition of Simplicity appeared first on Code Simplicity.

Fires vs. Strategy

2020-03-15T09:15:50Z

There’s a point that I’ve been making to engineers recently that I realized would be valuable if shared more widely.

When you do engineering work, there are different types of tasks that get given to you. Some tasks are emergencies or short-term work. We sometimes call this “putting out fires,” especially when the work relates to handling something that is urgently broken or immediately needed without delay.

Other tasks are strategic in nature. You have collected information about what is needed and/or wanted from your users, you’ve designed a solution, and you’re working toward it methodically and intelligently.

It is important to understand when you are doing which type of work, and to think about them differently.

Fires

When you’re putting out a fire, the goal is to put out the fire. You basically want to do whatever the minimal work is to put out the fire so that you can get back to your long-term strategic work. You don’t want to get involved in building huge, complex systems that will live forever, just to put out a fire. Emergencies are the time when you want to do “quick and dirty” work. It doesn’t mean you should do bad work. But you shouldn’t be building up some long-term, high-maintenance system around putting out a fire.

There are different types of fires. Sometimes an executive or another team comes to you with an immediate demand–something that must get done in the next few weeks or so. What you want to do is figure out how to get that task done and out of the way so that you can get back to your long-term strategic work.

Other times, you have some sort of actual emergency, like an outage. It’s clearer in that case that you should just fix the outage and not run around doing a bunch of other stuff. An outage is not the time when you want to say, “Well, let’s wait to write a design document and review it next week with our senior engineers.” The same is true of any fire, though–a fire is not the time to apply the methods and systems of long-term software design.

Example

Let’s get a more concrete example to show what I’m talking about. Let’s say that an executive comes to you and says, “We have a customer who wants to give us a million dollars next week, but before they do this, we have to produce a graph that shows how our servers stand up under high load.” But, let’s say, you don’t even have any systems for recording the load on your servers.

If you were thinking about this in a long-term, strategic manner, you might say, “Ah, well, we should have a system that tracks the load on our servers. We have to work out in detail how the storage for this would work, how we can be sure that it’s accurate, how we monitor it, and how we test it. Then we should work with a user experience designer to make sure that the graphs it produces can be understood well by its users, by conducting standard user research and working out a UI design from that.”

That’s not going to get done in a week. Also, it’s a waste of time. You actually have no idea if this fire is going to be happening again or not. Just because somebody has come to you with some urgent demand once doesn’t mean that this will be a long-term need at all. It might seem like it will be, and you could guess that it will be, but why are you guessing about long-term strategic designs? There’s no need to guess about long-term work–when you’re doing long-term work, you have the luxury of doing research to find out what the actual user needs and requirements are. So do that and build things based on that, not based on your guesses.

Instead, what you should say is something like, “Okay, I will work out a very basic load test that I can run manually from a script on my machine, tomorrow. I will roll out a new version of the server that just writes information about its load to a log file, and then I will manually make a graph based on parsing that log.” All of that was basically the minimum work required to solve the problem.
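To show just how small that "minimum work" can be, here is a sketch of the log-parsing half in Python. Everything specific in it is an assumption made up for this example: the log line format (`<timestamp> load=<number>`), the function names, and the choice to print a text graph instead of drawing a real one.

```python
import re
from typing import List, Tuple

# Assumed log line format, invented for this sketch:
#   2020-03-15T09:15:50Z load=0.83
LOAD_LINE = re.compile(r"^(?P<timestamp>\S+) load=(?P<load>\d+(?:\.\d+)?)$")


def parse_load_log(lines: List[str]) -> List[Tuple[str, float]]:
    """Pull (timestamp, load) pairs out of the log, skipping unrelated lines."""
    samples = []
    for line in lines:
        match = LOAD_LINE.match(line.strip())
        if match:
            samples.append((match.group("timestamp"), float(match.group("load"))))
    return samples


def text_graph(samples: List[Tuple[str, float]], width: int = 40) -> str:
    """Draw one bar of '#' characters per sample, scaled to the peak load."""
    if not samples:
        return ""
    peak = max(load for _, load in samples) or 1.0  # avoid dividing by zero
    rows = []
    for timestamp, load in samples:
        bar = "#" * round(load / peak * width)
        rows.append(f"{timestamp} {bar} {load:.2f}")
    return "\n".join(rows)


log_lines = [
    "2020-03-15T09:15:50Z load=0.25",
    "some unrelated server message",
    "2020-03-15T09:16:50Z load=1.00",
]
print(text_graph(parse_load_log(log_lines), width=20))
```

This is deliberately throwaway code. It has no storage, no monitoring, and no UI design, because none of that is needed to put out this particular fire.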

Even that solution comes with a risk, though–you instrumented the server to log something related to load. There is a chance that later, somebody will come along and think that you intended this to be a long-term, supported mechanism for tracking the load of the system, and rely on it being well-designed and well-thought-out when it isn’t. This highlights a very important point:

Never make long-term decisions or implement long-term solutions during a fire.

In fact, you might even want to intentionally undo all the work you did during the fire, like remove that log line, just so nobody else thinks that you made some long-term decision.

This rule doesn’t just apply to technical implementation details, but also to organizational changes, or really any decision. For example, let’s say that there is an outage ongoing. During the outage is not the time to talk about how you will prevent it from happening in the future, or how you should change your normal, everyday processes.

The one time that it is safe to make long-term decisions based on a fire is when you’re doing a “postmortem”–a rational review of the situation after the fire has been “put out.” Then you can sit down and say, “Okay, what sort of strategic work do we want to do to prevent fires like this from happening again?” or “What did we learn from this that we could use to change how we work?”

This rule is extremely important. Violating it builds up insanities that can destroy groups. If you built up all of your company’s policies and work patterns based only on decisions made during times of extreme emergency, it would eventually look to be a totally crazy company, and would probably fail.

Strategic Work

The other end of the spectrum (and it is a spectrum, it’s not black and white) from “putting out fires” is: doing strategic work. Basically, you have a known goal and you’re working toward it, applying all of the basic principles of software design, making sure that you’re thinking about the long-term, and working together with your group intelligently to create something sustainable.

Similarly, if you apply the methods and systems of “putting out fires” to strategic work, you will cause a disaster. If you treat every single project as though it’s an emergency and just dash it out “quick and dirty” because it “has to be done tomorrow” (even though it really doesn’t), you’ll end up with a mess. What will actually happen is you will create fires! Your system will be so poorly designed that it will fall over, cause trouble, be hard to maintain, and eventually consume you entirely in putting out fires around this poorly-designed mess.

When you apply the principles of Fires to Strategy work, you never actually get your strategic work done. If you see an engineering organization that just can’t seem to get things done over the long-term, this is very often the reason why–they have been treating everything like the world is on fire, and so can never actually move forward.

Strategic work requires a lot of saying, “Okay, we understand your requirements. Thanks for telling us what your problems are. We are building a solution for you, we are doing it the right way, and it will take a little bit of time. Not forever, but it will take some time to get it done.”

I think that sometimes, executives get worried that if they tell engineers to “take enough time,” the engineers will get lazy and just never complete the work. This might be a legitimate concern in some companies, and certainly executives have an interest in keeping things moving along so that the company can deliver its products! But there has to be a balance between encouraging people to deliver on time and making sure that they follow the processes and procedures of long-term software development. In general, it’s best, when doing strategic work, to err on the side of doing a little too much design, a little too much review, etc. I’m not saying go overboard and stop building things, or put everybody through unnecessary reviews just because something “might need it.” I’m just saying that if you’re uncertain, this is the direction you should err in.

Doing Both

As long as you apply the general principles above, it’s possible for one team (or one person) to handle both strategic work and fires simultaneously (at least, within the same week or month). The trick is doing minimal work on the fires, to make sure that emergencies are handled and the business keeps chugging along, and then focusing back on the strategic work once the fire is put out.

After all, if you’re doing it right, the strategic work should be the stuff that’s most important to the business–the things that you’ve researched and know will make the highest impact if you deliver them, in the long run. So put out the fires and get back to doing what’s actually going to be important in the long term.

-Max

The post Fires vs. Strategy appeared first on Code Simplicity.

How to Learn to Program

2020-03-04T05:35:52Z

One question that people ask me all the time is, “How do I become a programmer?” Or, “How do I learn to program?” There are a lot of possible answers to this, depending on the person and how you want to go about it. I figured that since people ask me this so often, I had better finally write an article about it.

Find the Best Way

One rule that has served me well when I’ve been learning to program, no matter the method I was using, was to always ask, “What’s the best way to accomplish this?” or “What’s the right way to accomplish this?” That is, in programming there are many different routes you can take to accomplish your task. But usually, only one of those is the recommended way, either in terms of the most modern way to do it in your programming language, or the best practices that the community of programmers have agreed on based on experience. Usually, you can find out this information by reading the documentation of the programming language you’re learning, or searching online for a best practice via Google or Stack Overflow. If you can’t find the answers, ask the question on a forum, mailing list, or Stack Overflow. I still do all of this, to this day, when I’m given a task where I have to learn something that I don’t know about.

The advantage here is that you don’t just learn to program, but you learn to be a good programmer. Also, it forces you to really dive deep and understand the tools and languages that you’re using. If you do this continuously as you go, you eventually develop a good, deep understanding of the systems you’re working with, while maintaining enough practicality to keep yourself interested. (That is, you are only diving this deep on the things you’re actually doing something with—not random theoretical stuff that you’re never going to use. That can be interesting too, but it’s not an educational system that you can use forever to really learn to program.)

Of course, you have to make sure that you really understand everything you are reading when you do this. That might mean diving down into more documentation, and then even more documentation, until you understand all the words and symbols being used. That’s okay! That’s a big part of what makes this system work—that you gain a true, deep understanding of the symbols and concepts you’re working with.

Now look, to be clear, when you’re first learning to program and you’re given a challenging task, it’s okay to just try to get it done any old way you can. You’re learning the basics, not the best practices. This piece of advice is for once you get over the hump of learning the basics of how to get anything done at all. For a project that you’re doing purely to learn something, the most important part is that you learn the thing you’re setting out to learn. But once you’ve got that in hand, delve deeper into it and try to see if you’ve done things the best possible way.

Okay, now that said, let’s talk about the different methods that people actually go through to learn to program.

University

The most commonly-taught subject that’s related to programming in universities is called “computer science.” I say that it’s related to programming because very little of what people learn in most computer science courses will actually end up being useful in their day-to-day lives as a professional, working programmer. That’s not always true—there are some fields where computer science comes very much in handy. But in general, the field we call “software engineering” or “development” is different than what universities cover as “computer science.”

Usually, the basics of computer science that universities cover are very useful. I went to university and studied computer science, and my first two years of study were very useful to me, especially the first few classes. I got a great grounding in some of the basic concepts of software development.

What I didn’t realize at the time, though, is that computer science is only partly a study of programming. The other part is a study of algorithms. (For those who are reading this and don’t know, an algorithm is a series of steps for accomplishing some task. That’s really all it means. Even a shopping list is a sort of algorithm.) The study of algorithms usually involves learning the most efficient way to do something. That is, figuring out how to accomplish a task like sorting a list of integers using the fewest number of steps or using the least amount of memory. There are some problems that can’t be solved by computers at all unless you know the right algorithm, and once in a while you run into a problem in programming that requires this knowledge. So it does have some use. But solving these problems of algorithms is not what you will be spending most of your time doing.
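To make “the fewest number of steps” a little more concrete, here is a small sketch in Python that sorts the same list of integers two different ways and counts how many comparisons each one makes. The function names and the 64-item list are just things I made up for illustration, and counting comparisons is only one rough way of measuring “steps.”

```python
def bubble_sort(items):
    """Sort by repeatedly swapping adjacent out-of-order pairs.

    Returns (sorted_list, number_of_comparisons).
    """
    result = list(items)
    comparisons = 0
    for end in range(len(result) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
    return result, comparisons


def merge_sort(items):
    """Sort by splitting the list in half, sorting each half, and merging.

    Returns (sorted_list, number_of_comparisons).
    """
    if len(items) <= 1:
        return list(items), 0
    middle = len(items) // 2
    left, left_count = merge_sort(items[:middle])
    right, right_count = merge_sort(items[middle:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; the rest of the other side is already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


numbers = list(range(64, 0, -1))  # 64 integers in reverse order
sorted_bubble, bubble_steps = bubble_sort(numbers)
sorted_merge, merge_steps = merge_sort(numbers)
print("bubble sort comparisons:", bubble_steps)
print("merge sort comparisons:", merge_steps)
```

Both functions produce the same sorted list, but the second one gets there with far fewer comparisons on this input. That difference is the sort of thing the study of algorithms is concerned with.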

Even at universities that offer courses in “software engineering,” those courses are rarely a full experience of what the real world will be like. The reason is that most courses last only a few months at most, and have you collaborate with only a few people at best, on a codebase of a few thousand lines of code. In the real world, you will be working with a large number of people on a codebase that is at least tens of thousands of lines long and that will last for years. However, these software engineering courses are still far better than nothing.

All this said, there are a few universities that do turn out excellent programmers, whether that be by computer science courses or software engineering courses. And there is always some value in learning to program by taking classes at a university. At the very least, being at a university provides a structure and discipline that encourages you to get through the class.

Self-Taught

On the other end of the spectrum from university training, many programmers are self-taught. They read up on some documentation online, mess around with things, somehow get those things to work, and eventually become competent programmers through a lot of painful, hard experience.

A lot of how I learned to program was this way. The most important thing for me was to have some task that I wanted to accomplish. You see, programming is a tool, it’s not an end in itself. It’s a system that you use to accomplish something else. So you have to have something you want to accomplish. Sometimes you have to make up that task yourself. For example, a friend of mine had an idea for a very simple game. So one summer, I spent several weeks learning Java so that I could write that game. I already knew the basics of programming in another language, C, from my university classes. So I wasn’t entirely new to programming, which helped a lot.

What I did (and what I recommend that most people do if they want to learn some language) is I just went through the official tutorial that the creators of the language supplied. For me, this was an older version of the Java tutorial. Almost every language has one of these official tutorials, or at least recommends some website where you can learn. Usually you just have to search Google for the name of the language you want to learn followed by the word “tutorial” and you’ll find what you’re looking for. Or go to the main web site for the language you want to learn and look over their “getting started” links.

Now, if you don’t know how to program at all, there are some other things you might have to learn first, depending on your level of experience with computers. You might have to learn a bit about how computers work, how to edit text files, and how to run programs via the command line (since that’s usually how you’ll be running your first programs). A lot of this is simplified nowadays by using a web-based code editor, though, where you can just write code and run it right there in your web browser. To find one of these, just search for the name of your language followed by the words “web editor” (without quotes) in Google.

Once you’ve taken the tutorial for some language, there are a lot of ways to go about teaching yourself how to program. Google and Stack Overflow are definitely your friends, as are the official docs for the language that you’re using. When I was teaching myself, a lot of what I did was search through the official docs of the language for words related to what I was trying to do, and then just read those docs until I understood how to accomplish it. But the key here, as I pointed out above, is that you have to actually understand. If you’re very, very new to programming, sometimes it’s okay to just copy and paste some code and not understand what it does. But usually that gets you into hot water very quickly, even in terms of your own understanding. You’ll find that something doesn’t work, and you will have no idea why.

In fact, not understanding why something is broken is one of the most frustrating parts of teaching yourself to program. I would guess that this is where most people give up. Something doesn’t work, the system outputs some totally cryptic error message, and they don’t understand how to fix it. This is always, always because the programmer didn’t understand something about the programming language, the tools they were using, or what the words in the error message mean. This happens most often when you have cut and pasted something without fully understanding every word and symbol in the code that you copied.
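Here is a tiny sketch in Python of the sort of thing I mean. The helper function and the values are made up for illustration. Imagine that add_to_count was pasted in from somewhere without noticing that it expects a number, and then it gets handed text that a user typed.

```python
def add_to_count(raw_count, extra):
    """A pasted-in helper that assumes raw_count is already a number."""
    return raw_count + extra


user_input = "3"  # text typed by a user is a string, not a number

try:
    add_to_count(user_input, 4)
    error_message = None
except TypeError as error:
    # Python refuses to add a string and an integer together.
    error_message = str(error)

# Once you understand that "3" is text, the fix is to convert it first.
fixed = add_to_count(int(user_input), 4)

print(error_message)
print(fixed)
```

The exact wording of the error varies between Python versions, but it will be a complaint about mixing a string with an integer. That message is cryptic only until you know what “str” and “int” mean and which of your values is which. Once you know that, it tells you exactly what went wrong.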

Now yes, it might seem like a lot of work to go through and learn what every single word and symbol really means, but that’s how you learn to program. Programmers aren’t valuable to society just because they know where and how to copy and paste stuff from Stack Overflow. They are valuable because they have learned and understood certain concepts, and how to apply them. You’re actually increasing your value as a programmer by doing this.

However, it’s still often very frustrating to try to get through this process without any kind of help. Hopefully, if you’re doing this, you have somebody you can reach out to when you get totally stuck, who can help answer your questions. I didn’t have anybody, and sometimes that was very rough, where I would spend 30 hours staring at a piece of code, tearing my hair out trying to understand why the dang thing didn’t work. So if you have somebody you can talk to, that’s often better than trying to just figure it out on your own, especially when you’re just starting out. Sometimes reading all the documentation for a thing when you’re brand new to programming can be overwhelming. It helps to have a guide, or at least a helper.

Bootcamps

Somewhere between university training and teaching yourself comes “coding bootcamps,” a phenomenon that has arisen in the last few years where an organization claims that it can teach people to program in X weeks or Y months. Very often, these programs actually provide a better and more practical education than universities do, although some of them don’t quite give you the theoretical depth that university training would give you. That is, you don’t learn as much of the theory behind programming (in particular, the basics of how to program well) as you might learn from more formal training.

Here’s what I would say as a guideline: if a program promises to teach you to program in less than a month, it’s probably garbage. You could maybe learn some of the basics of web development in that time, but you could never learn to be a professional programmer with just four weeks of experience. Gaining basic skill as a good programmer is a process that takes at least months, judging by the results I’ve seen from various bootcamps. You might get enough of the basics in a few-week class that you could then go on and teach yourself the rest, but a month of programming isn’t going to turn you into a professional developer.

Now, this all might sound like I’m pretty down on bootcamps, and that the only way to be a “real programmer” is to suffer through the pain of teaching yourself. However, nothing could be further from the truth. In fact, I’ve seen great products come out of bootcamps. Personally, the best programming school of any kind that I’ve seen would probably be called a “bootcamp” these days, and that’s The Tech Academy. They do a good job of taking people who don’t know how to program and making them into programmers, and it’s what I always recommend to people when they ask me how to program if they don’t want to just teach themselves. There are probably a lot of other decent bootcamps out there, too.

If I were evaluating a bootcamp today, I would find out how many of their graduates got good jobs as professional programmers and try to find some of those graduates and ask them how relevant their training was to what they are actually doing on a day-to-day basis.

Experience

Once you’ve taught yourself the basics of programming, either through university, self-teaching, or a coding bootcamp, one of the most important things you can do in terms of your career is pick some good first experiences that can help you grow as a software developer.

Internships

If you are a university student, I strongly recommend that you take advantage of the summer internships that many software companies offer. It’s the best way to have a low-risk experience of what working at an actual software company is like. Most of the companies will actually pay you, and it is also the easiest way to get hired at that company after you graduate. Since they already have experience with you, the hiring process is much easier.

Open Source

If there is some open source project that you really want to work on, try to see what they need and just go work on it. Very often, open source projects have a list of available tasks for newcomers, and you can just pick some task and start working on it. The advantage here is that there are usually no deadlines, and there is usually a community of very supportive volunteers who can answer your questions on mailing lists, in chat rooms, etc. Since most open source projects engage in a process called “code review,” you will also get feedback on your code from a more senior engineer, which will help you grow as a programmer. On top of all of this, working in open source provides evidence of your work that you can show to any employer. (When you work on code for a company, you usually can’t then take it outside the company and show it to prospective future employers as evidence that you are a good programmer.) Also, companies usually like to see open source participation on your resume, since it shows that you’re interested in and passionate about programming.

Mentorship

When you take a first job as a programmer, consider the opportunities that it will provide you to become a better programmer. Does the company have more experienced software engineers who can help you grow, or is it composed entirely of new university graduates with little to no professional experience? Does the company do code review, where each of your changes is reviewed by a more senior engineer to help you get better at programming?

And perhaps most importantly, when you accept a job, make sure that it’s a company that cares about software best practices. This is more frequently true in software companies—that is, companies where their primary product is software—than it is at other companies (for example, financial institutions employ a lot of programmers, but whether or not they care deeply about software quality depends a lot on the company and even what part of the company you work in). You’re more likely to have this experience at a company that’s a bit more established, or at least at a company that’s not under terrific deadline pressure and about to die (like a startup that’s running out of money, for example).

I’m not saying that you should refuse the only offer you have just because they might not be the perfect company to work for. But keep in mind that if you’re looking for career and skill growth as a programmer, you’re not going to get it by hacking out poor code on tight deadlines for users who hate your product but are somehow forced to use it anyway.

Reading

There is a lot of content online about software best practices. There are also a lot of good books. (I even wrote some of those books, which you would probably like if you’re learning to program.) It’s worth reading up online about various software best practices, especially when they pertain directly to some practical problem you’re actually trying to solve. Continuing to read books, blogs, websites, etc. is a good way to stay up to date, sometimes even when they’re not directly practical. Once in a while, for example, I’ll go read up about some new programming language or some CPU feature, not because it has anything to do with what I’m doing, but just because it seems like good and useful information for me to know as a programmer.

Summary

That’s the basics of the advice I usually give people when they ask me how to program. There are probably other things to know, especially the stuff I covered in Why Programmers Suck. And of course, all of the above is just a summary of a process that takes months or years to really get good at. But hopefully, this helps you learn to program, or provides you a resource that you can send other people when they ask you the same question!

-Max

The post How to Learn to Program appeared first on Code Simplicity.

How to be a Great Programmer: Awareness, Understanding, and Responsibility

2017-12-21T10:54:12Z

There are three key factors to being or becoming a great programmer: awareness, understanding, and responsibility.

I’ve talked a lot about the subject of understanding. Heck, I even named my most recent book Understanding Software. In particular, I’ve pointed out many times that the better you understand something, the better you will do it.

However, there are two other factors that go along with understanding to make somebody into an excellent software developer. In brief, if you aren’t aware of a problem, there’s no understanding to be had. And if you don’t take action to solve the problem (which starts with taking responsibility for it) then you can’t do anything about it.

There’s a lot to know about each of these points, though, and they are key points when one looks at how to become a better software developer. So I want to go into each one in more depth with you, here.

Awareness

In a way, this is a specialized form of understanding. “Awareness” means, essentially, that you can perceive something and know that it exists. There are a lot of ways that this applies to programming.

Let’s take one of the simplest: if you aren’t aware of a bug, you can’t possibly fix it.

Sometimes, people use this to get out of having to do something about a problem. If they aren’t aware of it, they think, then they don’t have to do anything about it. That may or may not be true, but what is true is that the problem exists whether or not you personally know about it. If your system has a bug that is affecting millions of users, then you should be aware of it. Otherwise your users are going to suffer. Their suffering is not justified by you being unaware of their suffering—that is, it doesn’t suddenly become okay in the universe just because you don’t know about it.

There are more subtle forms of awareness. For example, some people are not actually aware of code complexity. They don’t realize that the code that they just wrote is complex. Sometimes they aren’t aware that complexity is even an issue, or aware of what the viewpoint would be of another programmer who has to read the code they just wrote. They don’t see that the code is missing comments, that it’s hard to read, or that there is another way to solve the problem. All of these are aspects of awareness for a programmer.

The simplest way that you can improve your awareness as a programmer is to be willing to go out and read code that’s outside your normal realm of control. That is, maybe on a day to day basis you work on code that encodes video. But that video also comes from somewhere—users upload it into your system. If you encounter a problem with how the videos are coming into your encoder, then maybe you should choose to go look at how that code actually works.

This sounds simple but it can be quite emotionally difficult for many people, sometimes even including me. It’s easy to think, “they’re all idiots over there and this sucks and I just have to work around it.” But how do you even know that if you haven’t looked at their code? Maybe there’s something you would learn that would help you.

Awareness can also be about knowing the existence of libraries, frameworks, or tools that you can use in your code. Maybe you don’t know everything about the latest web development libraries, but if you’re a web developer, you should at least be aware that they exist. Then, when you need to make a decision about which framework to use, you know which ones to go look at.

In general, one can improve awareness simply by being willing to experience new programming languages, new design patterns, ideas about testing, new frameworks, etc. I personally want to know about new systems and methods of programming. When I hear about a new language that’s all the rage, I go and at least read the front page of its web site, and sometimes even delve into its documentation somewhat. That way I at least know the thing exists, and I could learn more about it if I needed to. If I’m not aware of different ways of solving problems, then I have far fewer tools at my disposal as a software engineer. Increasing my awareness in this way increases my options when I’m faced with a difficult situation.

This all may sound overly simple, but it is quite true:

You can improve your ability as a programmer by improving your awareness of programming and programs.

Understanding

As I have mentioned elsewhere:

The better you understand something, the better you can do it.

Awareness is a first step: you know a thing exists. After that, you want to understand it in order to do well. Let me give you an example in a made-up pseudocode language:

function print_hello() {
   clear_screen()
   display_hello()
}

What do you think this program does? It looks like it clears the terminal out and then prints the word “Hello.” But honestly, I’m just guessing. I haven’t looked at the documentation or code of clear_screen or display_hello. If I looked at the code, I would see that clear_screen actually turns the whole screen white, and display_hello prints the word “HELLO” in fifteen different languages and colors across the screen. But I never could have known that without actually looking at something in order to understand it.

That may seem like an oversimplified example, but this happens all the time in programming. Now that I know what clear_screen does, I can use it elsewhere without having to think about it or spend time re-writing it. I can write a program that does what I intend it to do, instead of a buggy program. I could write a test for it that makes sense. All of this adds up to simplicity, stability, and all the other qualities that we think of as being “good programming.” And all of that stemmed from understanding.
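To show what “a test that makes sense” could look like, here is a rough sketch of the example in Python. Everything in it is an assumption for illustration. The pseudocode above has no real implementation, so I invented the function bodies to match the behavior described (a white screen, greetings in several colors). I also used three greetings instead of fifteen and standard ANSI terminal codes for the colors. Whether the erase code really fills the screen with white depends on the terminal.

```python
import io
from contextlib import redirect_stdout

WHITE_BACKGROUND = "\033[47m"    # ANSI code: switch to a white background
ERASE_DISPLAY = "\033[2J"        # ANSI code: erase the whole display
DEFAULT_FOREGROUND = "\033[39m"  # ANSI code: back to the default text color

# Stand-ins for the fifteen languages mentioned in the text.
GREETINGS = ["HELLO", "HOLA", "BONJOUR"]


def clear_screen():
    """Despite the name, this paints the screen white instead of blanking it."""
    print(WHITE_BACKGROUND + ERASE_DISPLAY, end="")


def display_hello():
    """Print a greeting in several languages, each in a different color."""
    for index, word in enumerate(GREETINGS):
        color = f"\033[{31 + index}m"  # 31 = red, 32 = green, 33 = yellow
        print(color + word + DEFAULT_FOREGROUND)


def print_hello():
    clear_screen()
    display_hello()


def captured_output(function):
    """Run function and return everything it printed, as a string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        function()
    return buffer.getvalue()


output = captured_output(print_hello)

# These checks only make sense once you know what the functions really do.
assert output.startswith(WHITE_BACKGROUND + ERASE_DISPLAY)
assert all(word in output for word in GREETINGS)
```

If I had only guessed at what clear_screen did, I would probably have written a test that expected a blank screen and the single word “Hello,” and it would have failed for reasons I didn’t understand.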

There are other forms of understanding, which I’ve gone over quite a bit in several articles, most particularly “Why Programmers Suck.” It really is the key point which makes or breaks a programmer. There are other skills that are useful—typing quickly, communicating well, managing your priorities appropriately, and many others. But the key point is understanding. Without that, you could be the fastest programmer in the world, but only write terrible, unmaintainable systems. You could be the most charming person on the team, but fail utterly to write a working line of code. I’m not saying it’s bad to be fast or charming—those are also useful skills—but that without understanding, they are not going to make you into a great developer.

Responsibility

I talked about this a bit in an article on the O’Reilly blog that I wrote shortly after the first book came out, but I’m not sure how many codesimplicity.com readers have read that post, even though it’s actually a very key point about being a great programmer.

Now, first off, I want to clarify something. By “responsibility” here, I do not mean, “taking the blame for something that went wrong.” That’s a common definition of responsibility, but not the one I’m using. What I mean in this case by “responsibility” is, “the willingness to be at cause over, or in control of, some thing.” For example, if I have responsibility for my room, I’m willing to clean it up. It doesn’t mean I have to clean it up, just that I’m completely willing to. And not just a big PR statement about “oh yes, I’m completely willing to clean my room” and then go off and lie down in the mess for three hours. No, I mean really, actually willing. Like, are you willing to eat your favorite dessert? That’s willing. I’m not saying you have to be as excited about everything as you might be about your favorite dessert, I’m just saying that there has to be actual willingness involved.

So how does this apply to programming? Well, all too often I run into this negative example:

“I don’t want to refactor that code because I don’t own it.”

That’s silly. Are you capable of editing it? Can you read the source code? Are you allowed to send changes to the owners? Then you’re theoretically capable of fixing it. Now, that doesn’t mean that you should go around fixing all the code in the whole world all the time, because sometimes that’s not the right time trade-off to be making in terms of your development goals. But I’m loath to have said even that sentence, because all too often people use it as an excuse to not clean things up, too:

“I can’t spend any time cleaning up that team’s code because it will make my project take longer.” Or, “Well, cleaning up this code that I depend on isn’t part of my team’s goals.”

Okay, there are two flaws with those theories. (And believe me, they really are theories—not facts.)

The first is that usually, it takes longer to do something correctly when you’re depending on bad, crazy, or complex code. Often, it actually ends up taking less time to refactor and then write your feature than it does to try to write your feature on top of some messy or horrible library that doesn’t work at all the way you need it to. So you’re not usually adding a lot of extra time to your goals. The reason that it might seem like you are is that you’re not thinking about the difference between getting something “done” and getting something actually done.

What do I mean by “actually done?” Well, I mean that it works, it’s not full of bugs all the time, you can maintain it easily, it isn’t sucking your whole life up as a developer, and you can move on to doing other productive things.

Think of the problem this way. Let’s imagine that you’re building a wall, and you start off at the bottom with some shoddy bricks. Then you put a few more layers of bricks on top of those. But after you get to about the fourth layer of bricks, the shoddy bricks at the bottom start breaking. So then you go and you patch up the shoddy bricks by adding some shoddy wood over them. You add a few more layers of bricks, and now the wall starts to fall over. So you prop it up with some rusty iron bars and go on with your work. If you continue this way, you’re going to be spending your whole life simply maintaining the wall. It’s going to become an immense problem in your life. You’re probably going to eventually walk away from it and leave somebody else with this terrible disaster of a wall that they now have to maintain, but that they don’t even understand because it looks like a crazy amalgamation of shoddy building materials that nobody in their right mind would ever combine. That’s pretty cruel, honestly.

When you “hack” or “patch” around bad code in your underlying dependencies, you’re building your software in the same way that you would be building that wall. It’s less obvious, because programs don’t have a huge physical structure that can fall on your face. But nonetheless, the same principle of building applies—when you solve an underlying complexity by adding a complexity on top of it, you increase the complexity of the system. When you instead resolve the underlying complexity, you decrease the complexity of the system.
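Here is a small sketch in Python of the difference between the two choices. The functions are hypothetical, invented just for this illustration. Imagine a dependency that reports video sizes as text instead of as numbers.

```python
# A hypothetical dependency with a quirk: it returns each size as a string.
def get_video_size_quirky(sizes_by_path, path):
    return str(sizes_by_path[path])


# Choice 1: hack around the quirk. The dependency stays confusing, and
# every caller now carries an extra conversion that it has to remember.
def total_size_with_workaround(sizes_by_path, paths):
    total = 0
    for path in paths:
        raw_size = get_video_size_quirky(sizes_by_path, path)
        total += int(raw_size)  # complexity added on top of complexity
    return total


# Choice 2: fix the dependency itself so that it returns a number.
def get_video_size(sizes_by_path, path):
    return sizes_by_path[path]


# Now the caller is simpler, and so is every future caller.
def total_size(sizes_by_path, paths):
    return sum(get_video_size(sizes_by_path, path) for path in paths)


sizes = {"a.mp4": 10, "b.mp4": 32}
print(total_size_with_workaround(sizes, ["a.mp4", "b.mp4"]))
print(total_size(sizes, ["a.mp4", "b.mp4"]))
```

Both versions give the same answer today. The difference is that with the workaround, the system now contains the original quirk plus a conversion in every place that calls it, while with the fix, the quirk is simply gone.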

And I want to point out that it doesn’t matter who made the thing complex in the first place. The fact that “somebody else did it” doesn’t change any of the above rules. And it doesn’t matter who owns the complexity now. If you hack around it, you will make the system more complex. If you fix it, you will make the system less complex. You are making a choice—you have this power, you can be responsible. Yes, sometimes that involves getting somebody else to fix it. But in my experience, even more often it involves you being willing to make the change yourself.

And sure, maybe you know all that. But still, you might think, “Oh, well, I can just make this one tiny piece a little bit more complex this one time, because it’s just this one thing that I’m doing.” And you know, sometimes, you might be right, especially if you have some sort of legitimate emergency where you have to hack something out for just a short time. But more often, you’re actually contributing to a huge mess of a broken wall that will become a nightmare for you and everybody else to maintain. It’s these little complexities, these little choices to not be responsible, that eventually add up to the giant broken messes that nobody wants to deal with.

So when I say “responsibility” in this way, a big part of what I mean is, “be willing to change things outside your normal perimeter.” It doesn’t have to be an infinite perimeter. You can draw a line somewhere and say, “Beyond this point, it really is somebody else’s problem.” For example, on my projects I have often drawn a line that says, “Okay, I’m not going to do as much work on code that comes from outside the company entirely, because that’s not as productive of a use of my time.” But I have from time to time even gone as far as filing bugs against programming languages or sending patches to developer tools when I thought they were causing complexity or making my life more difficult. I mean heck, I actually became the primary architect of Bugzilla because I thought there were a lot of things about it that needed to be fixed. I’m not saying that everybody should go that far. But I am saying that you will become a much better programmer by accepting responsibility for some code beyond the scope of your project. And that the wider you spread this scope out, the better of a programmer you will become.

Oh by the way, I said there were two flaws with the theories above. The second one is that you actually belong to a group, even if that group is just humanity. You’re not the only person around. It’s okay to contribute to other people’s code. We’re actually all kind of on the same team.

If you’re working for a company, you’re helping the company as a whole by fixing these things when you come across them. And if you’re just an individual developer out in the world, you’re making the world a little bit of a better place for every other programmer when you fix some widely-used library, some popular tool, or some bad sample code on the web somewhere. And honestly, it makes you a better programmer. That’s the whole point of what I’m saying here. The best programmers that I’ve known are the ones who are the most willing to take broad responsibility for everything going right no matter what piece of code they have to touch or what team they have to talk to to get things done. I’m telling you this because it will help you.

And by the way, as a consumer of a library or API, often you are in the best place to refactor it, because you know better than the authors how that library or API should be consumed. At the very least, file a bug against the code saying what problem you had with it. Otherwise how will the authors ever know that there is some problem? You might just expect them to magically know, but believe me, very often they do not know. Your experience could be very valuable to them, who knows!

Summary

Overall, if you want to be a better programmer, the thing to ask yourself is whether you should improve awareness, understanding, or responsibility, and then focus on that for a little while. If you aren’t sure, start with awareness, then go on to understanding, and finally take more responsibility. It’s very difficult to do it in the other direction—you can’t easily take responsibility for something you don’t understand (it would just be very confusing and you’d not be very good at it), and it’s impossible to understand something you’re not aware of. So awareness, understanding, and then responsibility is the right sequence.

If it’s awareness that you should improve, just read some new code, find a new programming blog, look around for book titles, talk to some other programmers about the latest technology—anything that just helps you become more aware of problems, solutions, knowledge, patterns, people, organizations, principles, or anything else that would help you in your job.

If it’s understanding you’d like to focus on, then read some more documentation, spend more time understanding how each function works, ask more questions of your fellow programmers, look up some words in the dictionary, read some articles about the technology you’re using, or read a book—any method of coming to a complete and full understanding of some knowledge related to your job.

And finally, responsibility is achieved mostly by just deciding to take on more. When a problem comes your way, decide to solve it rather than put it off. When there’s a difficulty outside your team, decide to help resolve it rather than be part of the problem. And there is a special type of responsibility, too—when somebody else has a problem to solve, be willing to help them, or, be willing to let them solve it on their own if that’s what they should do. Responsibility doesn’t just mean that you have to do everything. It can also mean being willing to help other people get things done, too.

It’s not even hard to do the above—as long as you do it in simple, individual steps. You don’t have to become aware of the whole universe overnight. You don’t have to understand every word of every program ever written, tomorrow. And you don’t have to be willing to change every piece of software in existence just because you read this blog post. Start with something small. Then move on to the next thing, and the next thing, and the next thing, and in time, you’ll be just as good of a programmer as you want to be.

Enjoy.

-Max

The post How to be a Great Programmer: Awareness, Understanding, and Responsibility appeared first on Code Simplicity.

Understanding Software

2017-10-18T07:03:48Z

Hey everybody. I’ve published a new book! It’s called Understanding Software.

The book contains all of the content that I’ve written on software development and working in teams since the publication of Code Simplicity, plus some entirely new content that’s never been published anywhere. In fact, it contains one of my favorite essays that I ever wrote from back in 2008 but never published before. So there’s that for you. All the content has been put into a beautiful layout, then curated and organized for maximum readability.

It’s something I’m actually really happy with, and I’m looking forward to hearing what you have to say about it, too.

From the Publisher

Understanding Software covers many areas of programming, from how to write simple code to profound insights into programming, and then how to suck less at what you do! You’ll discover the problems with software complexity, the root of its causes, and how to use simplicity to create great software. You’ll examine debugging like you’ve never done before, and how to get a handle on being happy while working in teams.

Max brings a selection of carefully crafted essays, thoughts, and advice about working and succeeding in the software industry, from his legendary blog Code Simplicity. Max has crafted forty-three essays which have the power to help you avoid complexity and embrace simplicity, so you can be a happier and more successful developer.

Max’s technical knowledge, insight, and kindness have earned him a status as a code guru, and his ideas will inspire you and help refresh your approach to the challenges of being a developer.

What you will learn

See how to bring simplicity and success to your programming world
Clues to complexity – and how to build excellent software
Simplicity and software design
Principles for programmers
The secrets of rockstar programmers
Max’s views and interpretation of the Software industry
Why Programmers suck and how to suck less as a programmer
Software design in two sentences
What is a bug? Go deep into debugging

You can get it on Amazon, direct from the publisher, or in any other place where programming books are sold.

-Max

The post Understanding Software appeared first on Code Simplicity.

Kindness and Code

2017-08-12T16:07:07Z

It is very easy to think of software development as being an entirely technical activity, where humans don’t really matter and everything is about the computer. However, the opposite is actually true.

Software engineering is fundamentally a human discipline.

Many of the mistakes made over the years in trying to fix software development have been made by focusing purely on the technical aspects of the system without thinking about the fact that it is human beings who write the code. When you see somebody who cares about optimization more than readability of code, when you see somebody who won’t write a comment but will spend all day tweaking their shell scripts to be fewer lines, when you have somebody who can’t communicate but worships small binaries, you’re seeing various symptoms of this problem.

In reality, software systems are written by people. They are read by people, modified by people, understood or not by people. They represent the mind of the developers that wrote them. They are the closest thing to a raw representation of thought that we have on Earth. They are not themselves human, alive, intelligent, emotional, evil, or good. It’s people that have those qualities. Software is used entirely and only to serve people. Software systems are the product of people, and they are usually the product of a group of those people who had to work together, communicate, understand each other, and collaborate effectively. As such, there’s an important point to be made about working with a group of software engineers:

There is no value to being cruel to other people in the development community.

It doesn’t help to be rude to the people that you work with. It doesn’t help to angrily tell them that they are wrong and that they shouldn’t be doing what they are doing. It does help to make sure that the laws of software design are applied, and that people follow a good path in terms of making systems that can be easily read, understood, and maintained. It doesn’t require that you be cruel to do this, though. Sometimes you do have to tell people that they haven’t done the right thing. But you can just be matter of fact about it—you don’t have to get up in their face or attack them personally for it.

For example, let’s say somebody has written a bad piece of code. You have two ways you could comment on this:

“I can’t believe you think this is a good idea. Have you ever read a book on software design? Obviously you don’t do this.”

That’s the rude way—it’s an attack on the person themselves. Another way you could tell them what’s wrong is this:

“This line of code is hard to understand, and this looks like code duplication. Can you refactor this so that it’s clearer?”

In some ways, the key point here is that you’re commenting on the code, and not on the developer. But also, the key point is that you’re not being a jerk. I mean, come on. The first response is obviously rude. Does it make the person want to work with you, want to contribute more code, or want to get better? No. The second response, on the other hand, lets the person know that they’re taking a bad path and that you’re not going to let that bad code into the codebase.

The whole reason that you’re preventing that programmer from submitting bad code has to do with people in the first place. Either it’s about your users or it’s about the other developers who will have to read the system. Usually, it’s about both, since making a more maintainable system is done entirely so that you can keep on helping users effectively. But one way or another, your work as a software engineer has to do with people.

Yes, a lot of people are going to read the code and use the program, and the person whose code you’re reviewing is just one person. So it’s possible to think that you can sacrifice some kindness in the name of making this system good for everybody. Maybe you’re right. But why be rude or cruel when you don’t have to be? Why create that environment on your team that makes people scared of doing the wrong thing, instead of making them happy for doing the right thing?

This extends beyond just code reviews, too. Other software engineers have things to say. You should listen to them, whether you agree or not. Acknowledge their statements politely. Communicate your ideas to them in some constructive fashion.

And look, sometimes people get angry. Be understanding. Sometimes you’re going to get angry too, and you’d probably like your teammates to be understanding when you do.

This might all sound kind of airy-fairy, like some sort of unimportant psychobabble BS. But look. I’m not saying, “Everybody is always right! You should agree with everybody all the time! Don’t ever tell anybody that they are wrong! Nobody ever does anything bad!” No, people are frequently wrong and there are many bad things in the world and in software engineering that you have to say no to. The world is not a good place, always. It’s full of stupid people. Some of those stupid people are your co-workers. But even so, you’re not going to be doing anything effective by being rude to those stupid people. They don’t need your hatred—they need your compassion and your assistance. And most of your co-workers are probably not stupid people. They are probably intelligent, well-meaning individuals who sometimes make mistakes, just like you do. Give them the benefit of the doubt. Work with them, be kind, and make better software as a result.

-Max

The post Kindness and Code appeared first on Code Simplicity.

The Fundamental Philosophy of Debugging

2017-07-18T00:54:04Z

Sometimes people have a very hard time debugging. Mostly, these are people who believe that in order to debug a system, you have to think about it instead of looking at it.

Let me give you an example of what I mean. Let’s say you have a web server that is silently failing to serve pages to users 5% of the time. What is your reaction to this question: “Why?”

Do you immediately try to come up with some answer? Do you start guessing? If so, you are doing the wrong thing.

The right answer to that question is: “I don’t know.”

So this gives us the first step to successful debugging:

When you start debugging, realize that you do not already know the answer.

It can be tempting to think that you already know the answer. Sometimes you can guess and you’re right. It doesn’t happen very often, but it happens often enough to trick people into thinking that guessing the answer is a good method of debugging. However, most of the time, you will spend hours, days, or weeks guessing the answer and trying different fixes with no result other than complicating the code. In fact, some codebases are full of “solutions” to “bugs” that are actually just guesses—and these “solutions” are a significant source of complexity in the codebase.

Actually, as a side note, I’ll tell you an interesting principle. Usually, if you’ve done a good job of fixing a bug, you’ve actually caused some part of the system to go away, become simpler, have better design, etc. as part of your fix. I’ll probably go into that more at some point, but for now, there it is. Very often, the best fix for a bug is a fix that actually deletes code or simplifies the system.

But getting back to the process of debugging itself, what should you do? Guessing is a waste of time, imagining reasons for the problem is a waste of time—basically most of the activity that happens in your mind when first presented with the problem is a waste of time. The only things you have to do with your mind are:

Remember what a working system behaves like.
Figure out what you need to look at in order to get more data.

Because you see, this brings us to the most important principle of debugging:

Debugging is accomplished by gathering data until you understand the cause of the problem.

The way that you gather data is, almost always, by looking at something. In the case of the web server that’s not serving pages, perhaps you would look at its logs. Or you could try to reproduce the problem so that you can look at what happens with the server when the problem is happening. This is why people often want a “reproduction case” (a series of steps that allow you to reproduce the exact problem)—so that they can look at what is happening when the bug occurs.
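
As a minimal sketch of what "looking" can mean here, suppose the web server writes an access log in which the last field on each line is the HTTP status code. That log format, the sample lines, and the `summarize_statuses` name are all assumptions made up for illustration, not the output of any particular server. Tallying the statuses tells you what the server actually returned, rather than what you imagine it returned:

```python
from collections import Counter


def summarize_statuses(log_lines):
    """Count responses by status code.

    Assumes a hypothetical log format in which the last
    whitespace-separated field on each line is the status code.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[-1]] += 1
    return counts


# Hypothetical sample data, standing in for a real log file.
sample_log = [
    "GET /index.html 200",
    "GET /about.html 200",
    "GET /index.html 500",
    "GET /contact.html 200",
]

counts = summarize_statuses(sample_log)
failure_rate = 1 - counts["200"] / sum(counts.values())
```

A tally like this does not explain the failure. It only tells you which requests to look at next, which is the point of gathering data.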

Sometimes the first piece of data you need to gather is what the bug actually is. Often users file bug reports that have insufficient data. For example, let’s say a user files the bug, “When I load the page, the web server doesn’t return anything.” That’s not sufficient information. What page did they try to load? What do they mean by “doesn’t return anything?” Is it just a white page? You might assume that’s what the user meant, but very often your assumptions will be incorrect. The less experienced your user is as a programmer or computer technician, the less well they will be able to express specifically what happened without you questioning them. In these cases, unless it’s an emergency, the first thing that I do is just send the user back specific requests to clarify their bug report, and leave it at that until they respond. I don’t look into it at all until they clarify things. If I did go off and try to solve the problem before I understood it fully, I could be wasting my time looking into random corners of the system that have nothing to do with any problem at all. It’s better to go spend my time on something productive while I wait for the user to respond, and then when I do have a complete bug report, to go research the cause of the now-understood bug.

As a note on this, though, don’t be rude or unfriendly to users just because they have filed an incomplete bug report. The fact that you know more about the system and they know less about the system doesn’t make you a superior being who should look down upon all users with disdain from your high castle on the shimmering peak of Smarter-Than-You Mountain. Instead, ask your questions in a kind or straightforward manner and just get the information. Bug filers are rarely intentionally being stupid—rather, they simply don’t know and it’s part of your job to help them provide the right information. If people frequently don’t provide the right information, you can even include a little questionnaire or form on the bug-filing page that makes them fill in the right information. The point is to be helpful to them so that they can be helpful to you, and so that you can easily resolve the issues that come in.
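
One possible shape for such a questionnaire, sketched under assumptions: the field names below and the `missing_fields` helper are hypothetical, chosen to match the questions asked about the example bug report above. The idea is simply to work out which questions still need to be sent back to the user:

```python
from dataclasses import dataclass, fields


@dataclass
class BugReport:
    # Hypothetical form fields; a real form would use whatever
    # questions matter for your own system.
    page_url: str = ""
    what_you_did: str = ""
    what_you_expected: str = ""
    what_happened: str = ""


def missing_fields(report):
    """Return the names of fields the user left blank, in form order."""
    return [
        f.name
        for f in fields(report)
        if not getattr(report, f.name).strip()
    ]


report = BugReport(what_happened="The web server doesn't return anything.")
to_ask = missing_fields(report)
```

Here `to_ask` lists the three questions the user has not answered yet, which can be turned into a polite follow-up request.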

Once you’ve clarified the bug, you have to go and look at various parts of the system. Which parts of the system to look at is based on your knowledge of the system. Usually it’s logs, monitoring, error messages, core dumps, or some other output of the system. If you don’t have these things, you might have to launch or release a new version of the system that provides the information before you can fully debug the system. Although that might seem like a lot of work just to fix a bug, in reality it often ends up being faster to release a new version that provides sufficient information than to spend your time hunting around the system and guessing what’s going on without information. This is also another good argument for having fast, frequent releases—that way you can get out a new version that provides new debugging information quickly. Sometimes you can get a new build of your system out to just the user who is experiencing the problem, too, as a shortcut to get the information that you need.

Now, remember above that I mentioned that you have to remember what a working system looks like? This is because there is another principle of debugging:

Debugging is accomplished by comparing the data that you have to what you know the data from a working system should look like.

When you see a message in a log, is that a normal message or is it actually an error? Maybe the log says, “Warning: all the user data is missing.” That looks like an error, but really your web server prints that every single time it starts. You have to know that a working web server does that. You’re looking for behavior or output that a working system does not display. Also, you have to understand what these messages mean. Maybe the web server optionally has some user database that you aren’t using, which is why you get that warning—because you intend for all the “user data” to be missing.
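
A rough sketch of that comparison, assuming you have saved the messages a known-working server prints: the `unexpected_messages` helper and every message except the user-data warning are invented for illustration.

```python
def unexpected_messages(observed, known_good):
    """Return observed messages that a working system never prints,
    preserving the order in which they appeared."""
    baseline = set(known_good)
    return [msg for msg in observed if msg not in baseline]


# Messages captured from a working server (hypothetical, apart from
# the user-data warning discussed in the text).
known_good = [
    "Server starting",
    "Warning: all the user data is missing.",
    "Listening on port 80",
]

# Messages captured while the problem was happening (hypothetical).
observed = [
    "Server starting",
    "Warning: all the user data is missing.",
    "Listening on port 80",
    "Error: connection pool exhausted",
]

suspects = unexpected_messages(observed, known_good)
```

The scary-looking warning drops out because the working system prints it too. What remains is only a place to look next, not yet a confirmed cause.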

Eventually you will find something that a working system does not do. You shouldn’t immediately assume you’ve found the cause of the problem when you see this, though. For example, maybe it logs a message saying, “Error: insects are eating all the cookies.” One way that you could “fix” that behavior would be to delete the log message. Now the behavior is like normal, right? No, wrong—the actual bug is still happening. That’s a pretty stupid example, but people do less-stupid versions of this that don’t fix the bug. They don’t get down to the basic cause of the problem and instead they paper over the bug with some workaround that lives in the codebase forever and causes complexity for everybody who works on that area of the code from then on. It’s not even sufficient to say “You will know that you have found the real cause because fixing that fixes the bug.” That’s pretty close to the truth, but a closer statement is, “You will know that you have found a real cause when you are confident that fixing it will make the problem never come back.” This isn’t an absolute statement—there is a sort of scale of how “fixed” a bug is. A bug can be more fixed or less fixed, usually based on how “deep” you want to go with your solution, and how much time you want to spend on it. Usually you’ll know when you’ve found a decent cause of the problem and can now declare the bug fixed—it’s pretty obvious. But I wanted to warn you against papering over a bug by eliminating the symptoms but not handling the cause.

And of course, once you have the cause, you fix it. That’s actually the simplest step, if you’ve done everything else right.

So basically this gives us four primary steps to debugging:

Familiarity with what a working system does.
Understanding that you don’t already know the cause of the problem.
Looking at data until you know what causes the problem.
Fixing the cause and not the symptoms.

This sounds pretty simple, but I see people violate this formula all the time. In my experience, most programmers, when faced with a bug, want to sit around and think about it or talk about what might be causing it—both forms of guessing. It’s okay to talk to other people who might have information about the system or advice on where to look for data that would help you debug. But sitting around and collectively guessing what could cause the bug isn’t really any better than sitting around and doing it yourself, except perhaps that you get to chat with your co-workers, which could be good if you like them. Mostly though what you’re doing in that case is wasting a bunch of people’s time instead of just wasting your own time.

So don’t waste people’s time, and don’t create more complexity than you need to in your codebase. This debugging method works. It works every time, on every codebase, with every system. Sometimes the “data gathering” step is pretty hard, particularly with bugs that you can’t reproduce. But at the worst, you can gather data by looking at the code and trying to see if you can see a bug in it, or draw a diagram of how the system behaves and see if you can perceive a problem there. I would only recommend that as a last resort, but if you have to, it’s still better than guessing what’s wrong or assuming you already know.

Sometimes, it’s almost magical how a bug resolves just by looking at the right data until you know. Try it for yourself and see. It can actually be fun, even.

-Max

The post The Fundamental Philosophy of Debugging appeared first on Code Simplicity.

Refactoring is About Features

2017-05-02T15:58:44Z

There’s a point that I made in the book but which I have had to point out to people a few times since then, and so I wanted to emphasize it a bit more.

When you clean up code, you are always doing it in the service of the product. Refactoring is essentially an organizational process (not the definition of “organizational” meaning “having to do with a business” but the definition meaning “having to do with putting things in order”). That is, you’re putting things in order so that you can do something.

When you start refactoring for the sake of refactoring alone, refactoring gets a bad name. People start to think that you’re wasting your time, you lose your credibility, and your manager or peers will stop you from continuing your work.

When I say “refactoring for the sake of refactoring alone,” what I mean is looking at a piece of code that has nothing to do with what you’re actually working on, saying, “I don’t like the way that this is designed,” and moving parts of the design around without affecting the functionality of the system. This is like watering the lawn when your house is on fire. If your codebase is like most of the codebases I’ve seen, “your house is on fire” is probably even an appropriate analogy. Even so, if things aren’t that bad, the point is that you’re focusing on something that doesn’t need to be focused on. You might feel like you’re doing a great job of reorganizing the code, and probably you are, but the point of watering your lawn is to have a nice lawn in front of your house. If your refactoring has nothing to do with the current product or feature goals of your system, you’re not actually accomplishing anything other than re-ordering something that nobody is using, involved with, or cares about.

So what is it that you want to do? Well, usually, what you want to do is pick a feature that you want to get implemented, and figure out what you could refactor that would make it easier to implement that. Or you find an area of the code that is frequently being worked on and get some reorganization done in that area. This will make people appreciate your work. It’s not just about that—it’s really about the fact that they will appreciate it because you are doing something effective. But getting appreciation for the work that you’ve done—or at least some form of polite acknowledgment—can help encourage you to continue, can show you that other people are starting to care about your work, and hopefully help spread good development practices across your company.

Is there ever a time when you would tackle a refactoring project that doesn’t have something directly to do with the work that you have to do? Well, sometimes you would refactor something that has to do indirectly with the goal that you have. Sometimes when you start looking at a particularly complex problem, it’s like trying to pick up rocks on the beach to get down to the sand at the bottom. You try to move a rock, and figure out that first, you have to move some other rock. Then you discover that that rock is up against a large boulder, and there are rocks all around that boulder that prevent it from being moved, and so forth.

So within reason, you have to handle the issues that are blocking you from doing refactoring. If these problems get large enough, you will need a dedicated engineer whose job it is to resolve these problems—in particular the problems that block refactoring itself. (For example, maybe the dependencies of your code or its build system are so complex that nobody can move any code anywhere, and if that’s a big enough problem, it could be months of work for one person.) Of course, ideally you’d never get into a situation where your problems are so big that they can’t be moved by an individual doing their normal job. The way that you accomplish that is by following the principles of incremental development and design and always making the system look like it was designed to do the job that it’s doing now.

But assuming that you are like most of the software projects in the world who didn’t do that, you’re now in some sort of bad situation and need to be dug out of the pile of rocks that your system has buried itself under. I wouldn’t feel bad about this, mostly because feeling bad about it doesn’t really accomplish anything. Instead of feeling bad about it or feeling confused about it, what you need to do is to have some sort of system that will let you attack the problem incrementally and get to a better state from where you are. This is a lot more complex than keeping the system well-designed as you go, but it can be done.

The key principle to cleaning up a complex codebase is to always refactor in the service of a feature.

See, the problem is that you have this mountain of “rocks.” You have something like a house on fire, except that the house is the size of several mountains and it’s all on fire all the time. You need to figure out which part of the “mountain” or “house” you actually need right now, and get that into good shape so that it can be “used,” in a series of small steps. This isn’t a perfect analogy, since a fire is temporary, dangerous, and life-threatening. It will also destroy things faster than you can clean them up. But sometimes a codebase is actually in that state: it’s getting worse faster than it’s getting better. That’s another principle:

Your first goal is to get the system into a place where it’s getting better over time, instead of getting worse.

These are practically the same principle, even though they sound completely different. How can that be? Because the way that you get the codebase to get better over time instead of getting worse is that you get people to refactor the code that they are about to add features to right before they add features to it.

You look at a piece of code. Let’s say that it’s a piece of code that generates a list of employee names at your company. You have to add a new feature to sort the list by the date they were hired. You’re reading the code, and you can’t figure out what the variable names mean. So the first thing you’d do, before adding the new feature, is to make a separate, self-contained change that improves the variable names. After you do that, you still can’t understand the code, because it’s all in one function that contains 1000 lines of code. So you split it up into several functions. Maybe now it’s good enough, and you feel like it would be pretty simple to add the new sorting feature. Maybe you want to change those functions into well-designed objects before you continue, though, if you’re in an object-oriented language. It’s all sort of up to you—the basic point is that you should be making things better and they should be getting better faster than they’re getting worse. It’s a judgment point as to how far you go. You have to balance the fact that you do need to make forward progress on your feature goals, and that you can’t just refactor your code forever.
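
Here is a small sketch of where that sequence might end up, in Python. The `Employee` type, the function names, and the sample data are all hypothetical. The original 1000-line function is not shown, so this is only the "after" picture: clear names, small functions, and the new sorting feature as a short, separate addition.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    name: str
    hire_date: date


def employee_names(employees):
    """Existing behavior, after renaming and splitting: list the names."""
    return [employee.name for employee in employees]


def sort_by_hire_date(employees):
    """The new feature: order employees from earliest to latest hire."""
    return sorted(employees, key=lambda employee: employee.hire_date)


# Hypothetical sample data.
staff = [
    Employee("Ada", date(2015, 3, 1)),
    Employee("Grace", date(2012, 7, 15)),
    Employee("Linus", date(2019, 1, 10)),
]

names_by_seniority = employee_names(sort_by_hire_date(staff))
```

Once the existing code is readable, the feature is a few lines added in its own change, and the earlier cleanups can each be reviewed separately.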

In general, I set some boundary around my code, like “I’m not going to refactor anything outside of my project to get this feature done,” or “I’m not going to wait for a change to the programming language itself before I can release this feature.” But within my boundary, I try to do a good job. And I try to set the boundary as wide as possible without getting into a situation where I won’t be able to actually develop my feature. Usually that’s a time boundary as well as a “scope of codebase” (like, how far outside of my codebase) boundary. The time part is often the most important, like “I’m not going to do a three-month project to develop a two-day feature.” But even with that I balance things on the side of spending time on the refactoring, especially when I first start doing this in a codebase and it’s a new thing and the whole thing is very messy.

And that brings us to another point: even though you might think that it’s going to take more time to refactor and then develop your feature, in my experience it usually takes less time or the same amount of time overall. “Overall” here includes all the time that you would spend debugging, rolling back releases, sending out bug fixes, writing tests for complex systems, etc. It might seem faster to write a feature in your complex system without refactoring, and sometimes it is, but most of the time you’ll spend less time overall if you do a good job of putting the system in order first before you start adding new features. This isn’t just theoretical. I’ve demonstrated it to be the case many times. I’ve actually had my team finish projects faster than teams who were working on newer codebases with better tools when we did this. (That is, the other team should have been able to out-develop us, but we refactored continuously in the service of the product, and always got our releases out faster and were actually ahead in terms of features, with roughly the same number of developers on both projects working on very similar features.)

There’s another point that I use to decide when I’m “done” with refactoring a particular piece of code, which is that I think that other people will be able to clearly see the pattern I’ve designed and will maintain the code in that pattern from then on. Sometimes I have to write a little piece of documentation that describes the intended design of the system, so that people will follow it, but in general my theory (and this one really is just a theory—I don’t have enough evidence for it yet) is that if I design a piece of code well enough, it shouldn’t need a piece of documentation describing how it’s supposed to be designed. It should probably be visible just from reading the code how it’s designed, and it should be so obvious how you’d add a new feature within that design that nobody would ever do it otherwise. Obviously, perfectly achieving that goal would be impossible, but that’s a general truth in software design:

There is no perfect design, there is only a better design.

So that’s another way that you know that you’re “bikeshedding” or over-engineering or spending too much time on figuring out how to refactor something—that you’re trying to make it “perfect.” It’s not going to be “perfect,” because there is no “perfect.” There’s “does a good job for the purpose that it has.” That is, you can’t even really judge whether or not a design is good without understanding the purpose the code is being designed for. One design would be good for one purpose, another design would be good for another purpose. Yes, there are generic libraries, but even that is a purpose. And the best generic libraries are designed by actual experimentation with real codebases where you can verify that they serve specific purposes very well. When you’re refactoring, the idea is to change the design from one that doesn’t currently suit the purpose well to a design that fits the current purpose that piece of code has. That’s not all there is to know about refactoring, but it’s a pretty good basic principle to start with.

So, in brief, refactoring is an organizational process that you go through in order to make production possible. If you aren’t going toward production when you refactor, you’re going to run into lots of different kinds of trouble. I can’t even tell you all of the things that are going to go wrong, but they’re going to happen. On the other hand, if you just try to produce a system and you never reorganize it, you’re going to get yourself into such a mess that production becomes difficult or impossible. So both of these things have to be done—you must produce a product, and you must organize the system in such a way that the product can be produced quickly, reliably, simply, and well. If you leave out organization, you won’t get the product that you want, and if you leave out production, then there’s literally no reason to even be doing the refactoring in the first place.

Yes, it’s nice to water the lawn, but let’s put out some fires, first.

-Max

The post Refactoring is About Features appeared first on Code Simplicity.

Effective Engineering Productivity

2017-04-05T15:57:58Z

Often, people who work on engineering productivity either come into conflict with the developers they are attempting to help, or spend a long time working on some project that ends up not mattering because nobody actually cares about it.

This comes about because the problem that you see that a development team has is not necessarily the problem that they know exists. For example, you could come into the team and see that they have hopelessly complex code and so they can’t write good tests or maintain the system easily. However, the developers aren’t really that aware that they have complex code or that this complexity is causing the trouble that they are having. What they are aware of is something like, “we can only release once a month and the whole team has to stay at work until 10:00 PM to get the release out on the day that we release.”

When engineering productivity workers encounter this situation, some of them just try to ignore the developers’ complaints and just go start refactoring code. This doesn’t really work, for several reasons. The first is that both management and some other developers will resist you, making it more difficult than it needs to be to get the job done. But if just simple resistance was the problem, you could overcome it. The real problem is that you will become unreal and irrelevant to the company, even if you’re doing the best job that anybody’s ever seen. Your management will try to dissuade you from doing your job, or even try to get rid of you. When you’re already tackling technical complexity, you don’t need to also be tackling a whole company that’s opposed to you.

In time, many engineering productivity workers develop an adversarial attitude toward the developers that they are working with. They feel that if the engineers would “just use the tool that I wrote” then surely all would be well. But the developers aren’t using the tool that you wrote, so why does your tool even matter? The problem here is that when you start off ignoring developer complaints (or don’t even find out what problems developers think they have) that’s already inherently adversarial. That is, it’s not that everything started off great and then somehow became this big conflict. It actually started off with a conflict by you thinking that there was one problem and the developers thinking there was a different problem.

And it’s not just that the company will be resistive—this situation is also highly demoralizing to the individual engineering productivity worker. In general, people like to get things done. They like for their work to have some result, to have some effect. If you do a bunch of refactoring but nobody maintains the code’s simplicity, or you write some tool/framework that nobody uses, then ultimately you’re not really doing anything, and that’s disheartening.

So what should you do? Well, we’ve established that if you simply disagree with (or don’t know) the problem that developers think they have, then you’ll most likely end up frustrated, demoralized, and possibly even out of a job. So what’s the solution? Should you just do whatever the developers tell you to do? After all, that would probably make them happy and keep you employed and all that.

Well, yes, you will accomplish that (keeping your job and making some people happy)…well, maybe for a little while. You see, this approach is actually very shortsighted. If the developers you are working with knew exactly how to resolve the situation they are in, it’s probable that they would never have gotten themselves into it in the first place. That isn’t always true—sometimes you’re working with a new group of people who have taken over an old codebase, but in that case then usually this new group is the “productivity worker” that I’m talking about, or maybe you are one of these new developers. Or some other situation. But even then, if you only provide the solutions that are suggested to you, you’ll end up with the same problems that I describe in Users Have Problems, Developers Have Solutions. That is, when you work in developer productivity, the developers are your users. You can’t just accept any suggestion they have for how you should implement your solutions. It might make some people happy for a little while, but you end up with a system that’s not only hard to maintain, it also only represents the needs of the loudest users—who are probably not the majority of your users. So then you have a poorly-designed system that doesn’t even have the features its actual users want, which once again leads to you not getting promoted, being frustrated, etc.

Also, there’s a particular problem that happens in this space with developer productivity. If you only provide the solutions that developers specify, you usually never get around to resolving the actual underlying problems. For example, if the developers think the release of their 10-million-lines-of-code monolithic binary is taking too long, and you just spend all your time making the release tools faster, you’re never going to get to a good state. You might get to a better state (somewhat faster releases) but you’ll never resolve the real problem, which is that the binary is just too damn large.

So what, then? Not doing what they say means failing, and doing what they say means only mediocre success. Where’s the middle ground here?

The correct solution is very similar to Users Have Problems, Developers Have Solutions, but it has a few extra pieces. Using this method, I have not only solved significant underlying problems in vast codebases, I have actually changed the development culture of significant engineering organizations. So it works pretty well, when done correctly.

The first thing to do is to find out what problems the developers think they have. Don’t make any judgments. Go around and talk to people. Don’t just ask the managers or the senior executives. They usually say something completely different from what the real software engineers say. Go around and talk to a lot of people who work directly on the codebase. If you can’t get everybody, get the technical lead from each team. And then yes, also do talk to the management, because they also have problems that you want to address and you should understand what those are. But if you want to solve developer problems, you have to find out what those problems are from developers.

There’s a trick that I use here during this phase. In general, developers aren’t very good at saying where code complexity lies if you just ask them directly. Like, if you just ask, “What is too complex?” or “What do you find difficult?” they will think for a bit and may or may not come up with anything. But if you ask most developers for an emotional reaction to the code that they work on or work with, they will almost always have something. I ask questions like, “Is there some part of your job that you find really annoying?” “Is there some piece of code that’s always been frustrating to work with?” “Is there some part of the codebase that you’re afraid to touch because you think you’ll break it?” And to managers, “Is there some part of the codebase that developers are always complaining about?” You can adjust these questions to your situation, and remember that you want to be having a real conversation with developers—not just robotically reading off a list of questions. They are going to say things that you’re going to want more specifics on. You’ll probably want to take notes, and so forth.

After a while of doing this, you’ll start to get the idea that there is a common theme (or a few common themes) between the complaints. If you’ve read my book or if you’ve worked in engineering productivity for a while, you’ll usually realize that the real underlying cause of the problems is some sort of code complexity. But that’s not purely the theme we’re looking for—we could have figured that out without even talking to anybody. We’re looking for something a bit higher level, like “building the binary is slow.” There might be several themes that come up.

Now, you’ll have a bunch of data, and there are a few things you can do with it. Usually engineering management will be interested in some of this information that you’ve collected, and presenting it to them will make you real to the managers and hopefully foster some agreement that something needs to be done about the problem. That’s not necessary to do as part of this solution, but sometimes you’ll want to do it, based on your own judgment of the situation.

The first thing you should do with the data is find some problem that developers know they have, that you know you can do something about in a short period of time (like a month or two) and deliver that solution. This doesn’t have to be life-changing or completely alter the way that everybody works. In fact, it really should not do that. Because the point of this change is to make your work credible.

When you work in engineering productivity, you live or die by your personal credibility.

You see, at some point you need to be able to get down to the real problem. And the only way that you’re going to be able to do that is if the developers find you credible enough to believe you and trust you when you want to make some change. So you need to do something at first to become credible to the team. It’s not some huge, all-out change. It’s something that you know you can do, even if it’s a bit difficult. It helps if it’s something that other people have tried to do and failed, because then you also demonstrate that in fact something can be done about this mess that other people perhaps failed to handle (and then everybody felt hopeless about the whole thing and just decided they’d have to live with the mess forever, and it can’t be fixed and blah blah blah so on and so on).

Once you’ve established your basic credibility by handling this initial problem, then you can start to look at what problem the developers have and what you think the best solution to that would be. Now, often, this is not something you can implement all at once. And this is another important point: you can’t change everything about a team’s culture or development process all at once. You have to do it incrementally, deal with the “fallout” of the change (people getting mad because you changed something, or because it’s all different now, or because your first iteration of the change doesn’t work well) and wait for that to calm down before moving on to the next step. If you tried to change everything all at once, you’d essentially have a rebellion on your hands, one that would result in the end of your credibility and the failure of all your efforts. You’d be right back in the same pit that the two non-working approaches above land you in: demoralized or ineffective. So you have to work in steps. Some teams can accept larger steps, and some can only accept smaller ones. Usually, the larger the team, the more slowly you have to go.

Now, sometimes at this point you run into somebody who is such a curmudgeon that you just can’t seem to make forward progress. Sometimes there is some person who is very senior who is either very set in their ways or just kind of crazy. (You can usually tell the latter because the crazy ones are frequently insulting or rude.) How much progress you can make in this case depends partly on your communication skills, partly on your willingness to persist, but also partly on how you go about resolving this situation. In general, what you want to do is find your allies and create a core support group for the efforts you are making. Almost always, the majority of developers want sanity to prevail, even if they aren’t saying anything.

Just being publicly encouraging when somebody says they want to improve something goes a long way. Don’t demand that everybody make the perfect change—you’re gathering your “team” and validating the idea that code cleanup, productivity improvements, etc. are valuable. And you have something like a volunteer culture or an open-source project—you have to be very encouraging and kind in order to foster its growth. That doesn’t mean you should accept bad changes, but if somebody wants to make things better, then you should at least acknowledge them and say that’s great.

Sometimes 9 out of 10 people all want to do the right thing, but they are being overruled by the one loud person who they feel they must bow down to or respect beyond rationality, for some reason. So you basically do what you can with the group of people who do support you, and make the progress that you can make that way. Usually, it’s actually even possible to ignore the one loud person and just get on with making things better anyway.

If you ultimately get totally stopped by some senior person, then either (a) you didn’t go about this the right way (meaning that you didn’t follow my recommendations above, there’s some communication difficulty, you’re genuinely trying to do something that would be bad for developers, etc.) or (b) the person stopping you is outright insane, no matter how “normal” they seem.

If you’re blocked because you’re doing the wrong thing, then figure out what would help developers the most and do that instead. Sometimes this is as simple as doing a better job of communicating with the person who’s blocking you. Like, for example, stop being adversarial or argumentative, but listen to what the person has to say and see if you can work with them. Being kind, interested, and helpful goes a long way. But if it’s not that, and you’re being stopped by a crazy person, and you can’t make any progress even with your supporters, then you should probably find another team to work with. It’s not worth your sanity and happiness to go up against somebody who will never listen to reason and who is dead set on stopping you at all costs. Go somewhere where you can make a difference in the world rather than hitting your head up against a brick wall forever.

That’s not everything there is to know about handling that sort of situation with a person who’s blocking your work, but it gives you the basics. Persist, be kind, form a group of your supporters, don’t do things that would cause you to lose credibility, and find the things that you can do to help. Usually the resistance will crumble slowly over time, or the people who don’t like things getting better will leave.

So let’s say that you are making progress improving productivity by incremental steps, and you are in some control over any situations that might stop you. Where do you go from there? Well, make sure that you’re moving towards the fundamental problem with your incremental steps. At some point, you need to start changing the way that people write software in order to solve the problem. There is a lot to know about this, which I’ve either written up before or I’ll write up later. But at some point you’re going to need to get down to simplifying code. When do you get to do that? Usually, when you’ve incrementally gotten to the point where there is a problem for which you can credibly propose refactoring as part of the solution. Don’t promise the world, and don’t say that you’re going to start making a graph of improved developer productivity from the refactoring work that you are going to do. Managers (and some developers) will want various things from you, sometimes unreasonable demands born out of a lack of understanding of what you do (or sometimes from the outright desire to block you by placing unreasonable requirements on your work). No, you have to have some problem where you can say, “Hey, it would be nice to refactor this piece of code so that we can write feature X more easily,” or something like that.

From there, you keep proposing refactorings where you can. This doesn’t mean that you stop working on tooling, testing, process, etc. But your persistence on refactoring is what changes the culture the most. What you want is for people to think “we always clean up code when we work on things,” or “code quality is important,” or whatever it takes to get the culture that you want.

Once you have a culture where things are getting better rather than getting worse, the problem will tend to eventually fix itself over time, even if you don’t work on it anymore. This doesn’t mean you should stop at this point, but at the worst, once everybody cares about code quality, testing, productivity, etc. you’ll see things start to resolve themselves without you having to be actively involved.

Remember, this whole process isn’t about “building consensus.” You’re not going for total agreement from everybody in the group about how you should do your job. It’s about finding out what people know is broken and giving them solutions to that, solutions that they can accept and which improve your credibility with the team, but also solutions which incrementally work toward resolving the real underlying problems of the codebase, not just pandering to whatever developer need happens to be the loudest at the moment. If you had to keep only one thing in mind, it’s:

Solve the problems that people know they have, not the problems you think they have.

One last thing that I’ll point out is that I’ve talked a lot about this as though you were personally responsible for the engineering productivity of a whole company or a whole team. That’s not always the case. In fact, it’s probably not the case for most people who work in engineering productivity. Some people work on a smaller part of a tool, a framework, a sub-team, etc. This point about solving the problems that are real still applies. Actually, probably most of what I wrote above can be adapted to this particular case, but the most important thing is that you not go off and solve the problem that you think developers have, but that instead you solve a problem that (a) you can prove exists and (b) the developers know exists. Many of the engineering productivity teams that I’ve worked with have violated this so badly that they have spent years writing tools or frameworks that developers didn’t want, never used, and which the developers actually worked to delete when the person who designed them was gone. What a pointless waste of time!

So don’t waste your time. Be effective. And change the world.

-Max

The post Effective Engineering Productivity appeared first on Code Simplicity.

Measuring Developer Productivity

2017-01-03T20:42:41Z

Almost as long as I have been working to make the lives of software engineers better, people have been asking me how to measure developer productivity. How do we tell where there are productivity problems? How do we know if a team is doing worse or better over time? How does a manager explain to senior managers how productive the developers are? And so on and so on.

In general, I tended to focus on code simplicity first, and put a lower priority on measuring every single thing that developers do. Almost all software problems can be traced back to some failure to apply software engineering principles and practices. So even without measurements, if you simply get good software engineering practices applied across a company, most productivity problems and development issues disappear.

Now, that said, there is tremendous value in measuring things. It helps you pinpoint areas of difficulty, allows you to reward those whose productivity improves, justifies spending more time on developer productivity work where that is necessary, and has many other advantages.

But programming is not like other professions. You can’t measure it like you would measure some manufacturing process, where you could just count the number of correctly-made items rolling off the assembly line.

So how would you measure the production of a programmer?

The Definition of “Productivity”

The secret is in appropriately defining the word “productivity.” Many people say that they want to “measure productivity,” but have never thought about what productivity actually is. How can you measure something if you haven’t even defined it?

The key to understanding productivity is realizing that it has to do with products. A person who is productive is a person who regularly and efficiently produces products.

The way to measure the productivity of a developer is to measure the product that they produce.

That statement alone probably isn’t enough to resolve the problem, though. So let me give you some examples of things you wouldn’t measure, and then some things you would, to give you a general idea.

Why Not “Lines of Code?”

Probably the most common metrics that the software industry has attempted to develop have been centered around how many lines of code (abbreviated LoC) a developer writes. I understand why people have tried to do this—it seems to be something that you can measure, so why not keep track of it? A coder who writes more code is more productive, right?

Well, no. Part of the trick here is:

“Computer programmer” is not actually a job.

Wait, what? But I see ads all over the place for “programmer” as a job! Well, yes, but you also see ads for “carpenter” all over the place. But what does “a carpenter” produce? Unless you get more specific, it’s hard to say. You might say that a carpenter makes “cut pieces of wood,” but that’s not a product—nobody’s going to hire you to pointlessly cut or shape pieces of wood. So what would be a job that “a carpenter” could do? Well, the job might be furniture repair, or building houses, or making tables. In each case, the carpenter’s product is different. If he’s a Furniture Repairman (a valid job) then you would measure how much furniture he repaired well. If he was building houses, you might measure how many rooms he completed that didn’t have any carpentry defects.

The point here is that “computer programmer,” like “carpenter,” is a skill, not a job. You don’t measure the practice of a skill if you want to know how much a person is producing. You measure something about the product that that skill produces. To take this to an absurd level—just to illustrate the point—part of the skill of computer programming these days involves typing on a keyboard, but would you measure a programmer’s productivity by how many keys they hit on the keyboard per day? Obviously not.

Measuring lines of code is less absurd than measuring keys hit on a keyboard, because it does seem like one of the things a programmer produces—a line of code seems like a finished thing that can be delivered, even if it’s small. But is it really a product, all by itself? If I estimated a job as taking 1000 lines of code, and I was going to charge $1000 for it, would my client pay me $1 if I only delivered one line of code? No, my client would pay me nothing, because I didn’t deliver any product at all.

So how would you apply this principle in the real world to correctly measure the production of a programmer?

Determining a Valid Metric

The first thing to figure out is: what is the program producing that is of value to its users? Usually this is answered by a fast look at the purpose of software—determine what group of people you’re helping do what with your software, and figure out how you would describe the result of that help as a product. For example, if you have accounting software that helps individuals file their taxes, you might measure the total number of tax returns fully and correctly filed by individuals using your software. Yes, other people contribute to that too (such as salespeople) but the programmer is primarily responsible for how easily and successfully the actual work gets done. One might want to pick metrics that focus closely on things that only the programmer has control over, but don’t go overboard on that—the programmer doesn’t have to be the only person who could possibly influence a metric in order for it to be a valid measurement of their personal production.

There could be multiple things to measure for one system, too. Let’s say you’re working on a shopping website. A backend developer of that website might measure something about the number of data requests successfully filled, whereas a frontend developer of a shopping cart for the site might measure how many items are put into carts successfully, how many people get through the checkout flow successfully every day, etc.

Of course, one would also make sure that any metric proposed also aligns with the overall metric(s) of the whole system. For example, if a backend developer is just measuring “number of data requests received at the backend” but not caring if they are correctly filled, how quickly they are filled, or whatever, they could design a poor API that requires too many calls and that actually harms the overall user experience. So you have to make sure that any metric you’re looking at, you compare it to the reality of helping your actual users. In this particular case, a better solution might be to count, say, how many “submit payment” requests are processed correctly, since that’s the end result. (I wouldn’t take that as the only possible metric for the backend of a shopping website, by the way—that’s just one possible thought.)
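To make the difference between those two backend metrics concrete, here is a minimal sketch in Python. The `Request` record, its fields, the endpoint names, and the sample log are all hypothetical, invented only for this illustration; a real system would pull this data from its own logs.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One backend request (hypothetical record, invented for this sketch)."""
    endpoint: str
    succeeded: bool


def requests_received(log):
    """The weak metric: counts every request, whether or not it helped anyone."""
    return len(log)


def payments_processed(log):
    """The stronger metric: counts only 'submit payment' requests that succeeded."""
    return sum(1 for r in log if r.endpoint == "submit_payment" and r.succeeded)


# A chatty API inflates the first number without helping any user check out.
log = [
    Request("get_cart", True),
    Request("get_cart", True),
    Request("get_cart", True),
    Request("submit_payment", False),
    Request("submit_payment", True),
]

print(requests_received(log))   # 5
print(payments_processed(log))  # 1
```

The first number goes up whenever the API gets chattier. The second only goes up when a user actually finishes paying, which is much closer to the real purpose of the system.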

What About When Your Product Is Code?

There are people who deliver code as their product. For example, a library developer’s product is code. But it’s rarely a single line of code—it’s more like an entire function, class, or set of classes. You might measure something like “Number of fully-tested public API functions released for use by programmers” for a library developer. You’d probably have to do something to count new features for existing functions in that case, too, like counting every new feature for a function that improves its API as being a whole new “function” delivered. Of course, since the original metric says “fully tested,” any new feature would have to be fully tested as well, to count. But however you choose to measure it, the point here is that even for the small number of people whose product is code, you’re measuring the product.

What About People Who Work on Developer Productivity?

That does leave one last category, which is people who work on improving developer productivity. If it’s your job to help other developers move more quickly, how do you measure that?

Well, first off, most people who work on developer productivity do have some specific product. Either they work on a test framework (which you would measure in a similar fashion to how you would measure a library) or they work on some tool that developers use, in which case you would measure something about the success or usage of that tool. For example, one thing the developers of a bug tracking system might want to measure is the number of bugs successfully and rapidly resolved. Of course, you would modify that to take into account how the tool was being used in the company; maybe some entries in the bug tracker are intended to live for a long time, so you would measure those entries some other way. In general, you’d ask: what is the product or result that we bring about in the world by working on this tool? That’s what you’d measure.

But what if you don’t work on some specific framework or tool? In that case, perhaps your product has something to do with software engineers themselves. Maybe you would measure the number of times an engineer was assisted by your work. Or the amount of engineering time saved by your changes, if you can reliably measure that (which is rarely possible). In general, though, this work can be much trickier to measure than other types of programming.

One thing that I have proposed in the past (though have not actually attempted to do yet) is, if you have a person who helps particular teams with productivity, measure the improvement in productivity that those teams experience over time. Or perhaps measure the rate at which the team’s metrics improve.

For example, let’s say that we are measuring a product purely in terms of how much money it brings in. (Note: it would be rare to measure a product purely by this metric; this is an artificial example to demonstrate how this all works.) Let’s say in the first week the product brought in $100. Next week $101, and next week $102. That’s an increase, so it’s not that bad, but it’s not that exciting. Then Mary comes along and helps the team with productivity. The product makes $150 that week, then $200, then $350 as Mary continues to work on it. It’s gone from increasing at a rate of $1 a week to increasing by $48, then $50, then $150 a week. That seems like a valid thing to measure for Mary. Of course, there could be other things that contribute to that metric improving, so it’s not perfect, but it’s better than nothing if you really do have a “pure” productivity developer.
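The rate of improvement can be computed directly from the weekly totals. This is only a small sketch: the function name `weekly_increases` is made up for this illustration, and the list holds the six weekly revenue figures from the example above.

```python
def weekly_increases(revenue):
    """Return the change in revenue from each week to the next."""
    return [later - earlier for earlier, later in zip(revenue, revenue[1:])]


# Weekly revenue from the example: three weeks before Mary, three weeks with her.
revenue = [100, 101, 102, 150, 200, 350]

increases = weekly_increases(revenue)
print(increases)  # [1, 1, 48, 50, 150]
```

What you would track for Mary is the second list, not the first: the totals say how the product is doing, while the increases say whether the team is getting better faster.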

Conclusion

There are lots of other things to know about how to measure production of employees, teams, and companies in general. The above points are only intended to discuss how to take a programmer and figure out what general sort of thing you should be measuring. There’s a lot more to know about the right way to do measurements, how to interpret those measurements, and how to choose metrics that don’t suck. Hopefully, though, the above should get you started on solving the great mystery of how to measure the production of individual programmers, teams, and whole software organizations.

-Max

The post Measuring Developer Productivity appeared first on Code Simplicity.

Two is Too Many

2015-12-21T20:21:35Z

There is a key rule that I personally operate by when I’m doing incremental development and design, which I call “two is too many.” It’s how I implement the “be only as generic as you need to be” rule from the Three Flaws of Software Design.

Essentially, I know exactly how generic my code needs to be by noticing that I’m tempted to cut and paste some code, and then instead of cutting and pasting it, designing a generic solution that meets just those two specific needs. I do this as soon as I’m tempted to have two implementations of something.

For example, let’s say I was designing an audio decoder, and at first I only supported WAV files. Then I wanted to add an MP3 parser to the code. There would definitely be common parts to the WAV and MP3 parsing code, and instead of copying and pasting any of it, I would immediately make a superclass or utility library that did only what I needed for those two implementations.

The key aspect of this is that I did it right away—I didn’t allow there to be two competing implementations; I immediately made one generic solution. The next important aspect of this is that I didn’t make it too generic—the solution only supports WAV and MP3 and doesn’t expect other formats in any way.
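Here is a rough sketch of what that might look like in Python. Everything in it is an assumption made for illustration: the class names, the `decode` and `parse_samples` methods, and the idea of checking a leading marker are all hypothetical, and the parsing itself is a placeholder. The marker check is also a deliberate simplification (for instance, not every MP3 file begins with an ID3 tag).

```python
class AudioDecoder:
    """Holds only what the WAV and MP3 decoders both need, and nothing more."""

    def __init__(self, data: bytes):
        self.data = data

    def decode(self) -> dict:
        # Shared flow: check the format marker, then delegate to the subclass.
        if not self.data.startswith(self.magic):
            raise ValueError(f"not a {self.name} file")
        return {"format": self.name, "samples": self.parse_samples()}

    def parse_samples(self) -> bytes:
        raise NotImplementedError


class WavDecoder(AudioDecoder):
    name = "WAV"
    magic = b"RIFF"

    def parse_samples(self) -> bytes:
        # Placeholder: a real parser would read the WAV chunks here.
        return self.data[len(self.magic):]


class Mp3Decoder(AudioDecoder):
    name = "MP3"
    magic = b"ID3"

    def parse_samples(self) -> bytes:
        # Placeholder: a real parser would decode MP3 frames here.
        return self.data[len(self.magic):]


print(WavDecoder(b"RIFFabcd").decode()["format"])  # WAV
```

Note what the superclass does not have: no plugin registry, no format detection, no hooks for formats that don't exist yet. It contains the one piece of flow that the two decoders actually share.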

Another part of this rule is that a developer should ideally never have to modify one part of the code in a similar or identical way to how they just modified a different part of it. They should not have to “remember” to update Class A when they update Class B. They should not have to know that if Constant X changes, you have to update File Y. In other words, it’s not just two implementations that are bad, but also two locations. It isn’t always possible to implement systems this way, but it’s something to strive for.

If you find yourself in a situation where you have to have two locations for something, make sure that the system fails loudly and visibly when they are not “in sync.” Compilation should fail, a test that always gets run should fail, etc. It should be impossible to let them get out of sync.
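As one possible way to do this, here is a minimal Python sketch. The two "locations" (`SUPPORTED_FORMATS` and `PARSERS`) and the `check_in_sync` function are hypothetical names invented for this example; the same idea applies to any pair of places that must agree.

```python
# Location 1: the formats the decoder claims to support (hypothetical).
SUPPORTED_FORMATS = {"wav", "mp3"}

# Location 2: the table that maps each format to its parser (hypothetical).
PARSERS = {
    "wav": lambda data: "parsed wav",
    "mp3": lambda data: "parsed mp3",
}


def check_in_sync(formats, parsers):
    """Raise immediately if the two locations disagree."""
    mismatch = set(formats) ^ set(parsers)
    if mismatch:
        raise RuntimeError(f"formats and parsers out of sync: {sorted(mismatch)}")


# Runs at import time, so a mismatch stops the program before any real work.
check_in_sync(SUPPORTED_FORMATS, PARSERS)
```

If somebody adds "ogg" to one location and forgets the other, the program refuses to start and names the missing entry, instead of failing quietly for some user months later. Putting the same check in a test that always runs would work too.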

And of course, the simplest part of this rule is the classic “Don’t Repeat Yourself” principle—don’t have two constants that represent the same exact thing, don’t have two functions that do the same exact thing, etc.

There are likely other ways that this rule applies. The general idea is that when you want to have two implementations of a single concept, you should somehow make that into a single implementation instead.

When refactoring, this rule helps find things that could be improved and gives some guidance on how to go about it. When you see duplicate logic in the system, you should attempt to combine those two locations into one. Then if there is another location, combine that one into the new generic system, and proceed in that manner. That is, if there are many different implementations that need to be combined into one, you can do incremental refactoring by combining two implementations at a time, as long as combining them does actually make the system simpler (easier to understand and maintain). Sometimes you have to figure out the best order in which to combine them to make this most efficient, but if you can’t figure that out, don’t worry about it—just combine two at a time and usually you’ll wind up with a single good solution to all the problems.

It’s also important not to combine things when they shouldn’t be combined. There are times when combining two implementations into one would cause more complexity for the system as a whole or violate the Single Responsibility Principle. For example, if your system’s representation of a Car and a Person have some slightly similar code, don’t solve this “problem” by combining them into a single CarPerson class. That’s not likely to decrease complexity, because a CarPerson is actually two different things and should be represented by two separate classes.
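
A small sketch of the alternative, with made-up names: the one piece of logic that is actually shared moves into a helper function, while `Car` and `Person` stay separate classes.

```python
from dataclasses import dataclass


def format_label(kind: str, name: str) -> str:
    """The only logic the two classes have in common."""
    return f"{kind}: {name.strip().title()}"


@dataclass
class Car:
    model: str

    def label(self) -> str:
        return format_label("Car", self.model)


@dataclass
class Person:
    full_name: str

    def label(self) -> str:
        return format_label("Person", self.full_name)
```

There is still only one implementation of the label logic, and neither class has to know anything about the other.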

This isn’t a hard and fast law of the universe; it’s more of a strong guideline that I use for making judgments about design as I develop incrementally. However, it’s quite useful in refactoring a legacy system, developing a new system, and just generally improving code simplicity.

-Max

How to Handle Code Complexity in a Software Company

2015-01-04T11:46:25Z

Here’s an obvious statement that has some subtle consequences:

Only an individual programmer can resolve code complexity.

That is, resolving code complexity requires the attention of an individual person on that code. They can certainly use appropriate tools to make the task easier, but ultimately it’s the application of human intelligence, attention, and work that simplifies code.

So what? Why does this matter? Well, to be clearer:

Resolving code complexity usually requires detailed work at the level of the individual contributor.

If a manager just says “simplify the code!” and leaves it at that, usually nothing happens, because (a) they’re not being specific enough, (b) they don’t necessarily have the knowledge required about each individual piece of code in order to be that specific, and (c) part of understanding the problem is actually going through the process of solving it, and the manager isn’t the person writing the solution.

The higher a manager’s level in the company, the more true this is. When a CTO, Vice President, or Engineering Director gives an instruction like “improve code quality” but doesn’t get much more specific than that, what tends to happen is that a lot of motion occurs in the company but the codebase doesn’t significantly improve.

It’s very tempting, if you’re a software engineering manager, to propose broad, sweeping solutions to problems that affect large areas. The problem with that approach to code complexity is that the problem is usually composed of many different small projects that require detailed work from individual programmers. So, if you try to handle everything with the same broad solution, that solution won’t fit most of the situations that need to be handled. Your attempt at a broad solution will actually backfire, with software engineers feeling like they did a lot of work but didn’t actually produce a maintainable, simple codebase. (This is a common pattern in software management, and it contributes to the mistaken belief that code complexity is inevitable and nothing can be done about it.)

So what can you do as a manager, if you have a complex codebase and want to resolve it? Well, the trick is to get the data from the individual contributors and then work with them to help them resolve the issues. The sequence goes roughly like this:

1. Ask each member of your team to write down a list of what frustrates them about the code. The symptoms of code complexity are things like emotional reactions to code, confusions about code, feeling like a piece will break if you touch it, difficulties optimizing, etc. So you want the answers to questions like, “Is there a part of the system that makes you nervous when you modify it?” or “Is there some part of the codebase that frustrates you to work with?”

Each individual software engineer should write their own list. I wouldn’t recommend implementing some system for collecting the lists; just have people write down the issues for themselves in whatever way is easiest for them. Give them a few days to write this list; they might think of other things over time.

The list doesn’t just have to be about your own codebase, but can be about any code that the developer has to work with or use.

You’re looking for symptoms at this point, not causes. Developers can be as general or as specific as they want, for this list.

2. Call a meeting with your team and have each person bring their list and a computer that they can use to access the codebase. The ideal size for a team meeting like this is about six or seven people, so you might want to break things down into sub-teams.

In this meeting you want to go over the lists and get the name of a specific directory, file, class, method, or block of code to associate with each symptom. Even if somebody says something like, “The whole codebase has no unit tests,” then you might say, “Tell me about a specific time that that affected you,” and use the response to that to narrow down what files it’s most important to write unit tests for right away. You also want to be sure that you’re really getting a description of the problem, which might be something more like “It’s difficult to refactor the codebase because I don’t know if I’m breaking other people’s modules.” Then unit tests might be the solution, but you first want to narrow down specifically where the problem lies, as much as possible. (It’s true that almost all code should be unit tested, but if you don’t have any unit tests, you’ll need to start off with some doable task on the subject.)

In general, the idea here is that only code can actually be fixed, so you have to know what piece of code is the problem. It might be true that there’s a broad problem, but that problem can be broken down into specific problems with specific pieces of code that are affected, one by one.

3. Using the information from the meeting, file a bug describing the problem (not the solution, just the problem!) for each directory, file, class, etc. that was named. A bug could be as simple as “FrobberFactory is hard to understand.”

If a solution was suggested during the meeting, you can note that in the bug, but the bug itself should primarily be about the problem.

4. Now it’s time to prioritize. The first thing to do is to look at which issues affect the largest number of developers the most severely. Those are high priority issues. Usually this part of prioritization is done by somebody who has a broad view over developers in the team or company. Often, this is a manager.

That said, sometimes issues have an order that they should be resolved in that is not directly related to their severity. For example, Issue X has to be resolved before Issue Y can be resolved, or resolving Issue A would make resolving Issue B easier. This means that Issue A and Issue X should be fixed first even if they’re not as severe as the issues that they block. Often, there’s a chain of issues like this and the trick is to find the issue at the bottom of the stack. Handling this part of prioritization incorrectly is one of the most common and major mistakes in software design. It may seem like a minor detail, but in fact it is critical to the success of efforts to resolve complexity. The essence of good software design in all situations is taking the right actions in the right sequence. Forcing developers to tackle issues out of sequence (without regard for which problems underlie which other problems) will cause code complexity.

This part of prioritization is a technical task that is usually best done by the technical lead of the team. Sometimes this is a manager, but other times it’s a senior software engineer.

Sometimes you don’t really know which issue to tackle first until you’re doing development on one piece of code and you discover that it would be easier to fix a different piece of code first. With that said, if you can determine the ordering up front, it’s good to do so. But if you find that you’d have to get into actually figuring out solutions in order to determine the ordering, just skip it for now.

Whether you do it up front or during development, it’s important that individual programmers do realize when there is an underlying task to tackle before the one they have been assigned. They must be empowered to switch from their current task to the one that actually blocks them. There is a limit to this (for example, rewriting the whole system into another language just to fix one file is not a good use of time) but generally, “finding the issue at the bottom of the stack” is one of the most important tasks a developer has when doing these sorts of cleanups.

5. Now you assign each bug to an individual contributor. This is a pretty standard managerial process, and while it definitely involves some detailed work and communication, I would imagine that most software engineering managers are already familiar with how to do it.

One tricky piece here is that some of the bugs might be about code that isn’t maintained by your team. In that case you’ll have to work appropriately through the organization to get the appropriate team to take responsibility for the issue. It helps to have buy-in from a manager that you have in common with the other team, higher up the chain, here.

In some organizations, if the other team’s problem is not too complex or detailed, it might also be possible for your team to just make the changes themselves. This is a judgment call that you can make based on what you think is best for overall productivity.

6. Now that you have all of these bugs filed, you have to figure out when to address them. Generally, the right thing to do is to make sure that developers regularly fix some of the code quality issues that you filed along with their feature work.

If your team makes plans for a period of time like a quarter or six weeks, you should include some of the code cleanups in every plan. The best way to do this is to have developers first do cleanups that would make their specific feature work easier, and then have them do that feature work. Usually this doesn’t even slow down their feature work overall. (That is, if this is done correctly, developers can usually accomplish the same amount of feature work in a quarter that they could even if they weren’t also doing code cleanups, providing evidence that the code cleanups are already improving productivity.)

Don’t stop normal feature development entirely to just work on code quality. Instead, make sure that enough code quality work is being done continuously that the codebase’s quality is always improving overall rather than getting worse over time.
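
The dependency-aware part of prioritization described above (fix Issue A before Issue B and Issue X before Issue Y, even when the blocked issues are more severe) can be sketched with Python's standard `graphlib` module. The issue names, severity scores, and `blocked_by` table are invented for illustration. A real team would fill them in from the bugs filed in the earlier steps.

```python
from graphlib import TopologicalSorter

# Hypothetical data: higher severity is worse; blocked_by lists prerequisites.
severity = {"A": 2, "B": 5, "X": 1, "Y": 4}
blocked_by = {"B": {"A"}, "Y": {"X"}}


def fix_order(severity, blocked_by):
    """Order issues so blockers come first, preferring the most severe unblocked issue."""
    graph = {issue: blocked_by.get(issue, set()) for issue in severity}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if issues block each other

    ready = []
    order = []
    while sorter.is_active():
        ready.extend(sorter.get_ready())       # issues whose blockers are all fixed
        ready.sort(key=lambda issue: severity[issue])
        worst = ready.pop()                    # most severe issue that is unblocked
        order.append(worst)
        sorter.done(worst)
    return order


print(fix_order(severity, blocked_by))  # ['A', 'B', 'X', 'Y']
```

In this sketch B is the most severe issue, but A still comes out first because B cannot be resolved until A is. That is the "bottom of the stack" idea expressed as data.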

If you do those things, that should get you well on the road to an actually-improving codebase. There’s actually quite a bit to know about this process in general—perhaps enough for another entire book. However, the above plus some common sense and experience should be enough to make major improvements in the quality of your codebase, and perhaps even improve your life as a software engineer or manager, too.

-Max

P.S. If you do find yourself wanting more help on it, I’d be happy to come speak at your company. Just let me know.

Test-Driven Development and the Cycle of Observation

2014-06-13T23:31:35Z

Today there was an interesting discussion between Kent Beck, Martin Fowler, and David Heinemeier Hansson on the nature and use of Test-Driven Development (TDD), where one writes tests first and then writes code.

Each participant in the conversation had different personal preferences for how they write code, which makes sense. However, from each participant’s personal preference you could extract an identical principle: “I need to observe something before I can make a decision.” Kent often (though not always) liked writing tests first so that he could observe their behavior while coding. David often (though not always) wanted to write some initial code, observe that to decide on how to write more code, and so on. Even when they talked about their alternative methods (Kent talking about times he doesn’t use TDD, for example) they still always talked about having something to look at as an inherent part of the development process.

It’s possible to minimize this point and say it’s only relevant to debugging or testing. It’s true that it’s useful in those areas, but when you talk to many senior developers you find that this idea is actually a fundamental basis of their whole development workflow. They want to see something that will help them make decisions about their code. It’s not something that only happens when code is complete or when there’s a bug; it’s something that happens at every moment of the software lifecycle.

This is such a broad principle that you could say the cycle of all software development is:

Observation → Decision → Action → Observation → Decision → Action → etc.

If you want a term for this, you could call it the “Cycle of Observation” or “ODA.”

Example

What do I mean by all of this? Well, let’s take some examples to make it clearer. When doing TDD, the cycle looks like:

1. See a problem (observation).
2. Decide to solve the problem (decision).
3. Write a test (action).
4. Look at the test and see if the API looks good (observation).
5. If it doesn’t look good, decide how to fix it (decision), change the test (action), and repeat Observation → Decision → Action until you like what the API looks like.
6. Now that the API looks good, run the test and see that it fails (observation).
7. Decide how you’re going to make the test pass (decision).
8. Write some code (action).
9. Run the test and see that it passes or fails (observation).
10. If it fails, decide how to fix it (decision) and write some code (action) until the test passes (observation).
11. Decide what to work on next, based on principles of software design, knowledge of the problem, or the data you gained while writing the previous code (decision).
12. And so on.
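
To make the cycle concrete, here is a small sketch in Python. The `slugify` function and its expected outputs are invented for this example. The test was written first, to observe what calling the API would feel like. The function was written afterward to make the test pass.

```python
def test_slugify():
    # Written first: this is where the API gets looked at and adjusted.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  two  spaces ") == "two-spaces"


def slugify(title: str) -> str:
    # Written second, after seeing the test fail because slugify did not exist.
    # Replace every non-alphanumeric character with a space, then join the words.
    words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
    return "-".join(words)
```

Running `test_slugify()` after each change is the observation step. Whether it passes or fails feeds the next decision.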

Another valid way to go about this would be to write the code first. The difference from the above sequence is that Step 3 would be “write some code” rather than “write a test.” Then you observe the code itself to make further decisions, or you write tests after the code and observe those.

There are many valid processes.

Development Processes and Productivity

What’s interesting is that, as far as I know, every valid development process follows this cycle as its primary guiding principle. Even large-scale processes like Agile that cover a whole team have this built into them. In fact, Agile is to some degree an attempt to have shorter Observation-Decision-Action cycles (every few weeks) for a team than previous broken models (Waterfall, aka “Big Design Up Front”) which took months or years to get through a single cycle.

So, shorter cycles seem to be better than longer cycles. In fact, it’s possible that most of the goal of developer productivity could be accomplished simply by shortening the ODA cycle down to the smallest reasonable time period for the developer, the team, or the organization.

Usually you can accomplish these shorter cycles just by focusing on the Observation step. Once you’ve done that, the other two parts of the cycle tend to speed up on their own. (If they don’t, there are other remedies, but that’s another post.)

There are three key factors to address in Observation:

The speed with which information can be delivered to developers. (For example, having fast tests.)
The completeness of information delivered to the developers. (For example, having enough test coverage.)
The accuracy of information delivered to developers. (For example, having reliable tests.)
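
Two of these factors are easy to measure directly. The sketch below, with a hypothetical `observe` helper, runs a check several times and reports how long each run takes (speed) and whether the check gives the same answer every time (accuracy). Completeness is not something this helper can measure. That takes a coverage tool or a review of what the checks actually exercise.

```python
import time
from typing import Callable


def observe(check: Callable[[], bool], runs: int = 5) -> dict:
    """Run a check repeatedly and summarize its speed and consistency."""
    results = []
    start = time.perf_counter()
    for _ in range(runs):
        results.append(check())
    elapsed = time.perf_counter() - start
    return {
        "seconds_per_run": elapsed / runs,  # speed of the feedback
        "passed": all(results),             # True only if every run passed
        "flaky": len(set(results)) > 1,     # different answers for the same code
    }
```

A check that comes back marked as flaky is a candidate for fixing before anything else, since developers tend to stop trusting information that changes from run to run.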

This helps us understand the reasons behind the success of certain development tools in recent decades. Continuous Integration, production monitoring systems, profilers, debuggers, better error messages in compilers, IDEs that highlight bad code—almost everything that’s “worked” has done so because it made Observation faster, more accurate, or more complete.

There is one catch—you have to deliver the information in such a way that it can actually be received by people. If you dump a huge sea of information on people without making it easy for them to find the specific data they care about, the data becomes useless. If nobody ever receives a production alert, then it doesn’t matter. If a developer is never sure of the accuracy of information received, then they may start to ignore it. You must successfully communicate the information, not just generate it.

The First ODA

There is a “big ODA cycle” that represents the whole process of software development—seeing a problem, deciding on a solution, and delivering it as software. Within that big cycle there are many smaller ones (see the need for a feature, decide on how the feature should work, and then write the feature). There are even smaller cycles within that (observe the requirements for a single change, decide on an implementation, write some code), and so on.

The trickiest part is the first ODA cycle in any of these sequences, because you have to make an observation with no previous decision or action.

For the “big” cycle, it may seem like you start off with nothing to observe. There’s no code or computer output to see yet! But in reality, you start off with at least yourself to observe. You have your environment around you. You have other people to talk to, a world to explore. Your first observations are often not of code, but of something to solve in the real world that will help people somehow.

Then when you’re doing development, sometimes you’ll come to a point where you have to decide “what do I work on next?” This is where knowing the laws of software design can help, because you can apply them to the code you’ve written and the problem you observed, which lets you decide on the sequence to work in. You can think of these principles as a form of observation that comes second-hand—the experience of thousands of person-years compressed into laws and rules that can help you make decisions now. Second-hand observation is completely valid observation, as long as it’s accurate.

You can even view the process of Observation as its own little ODA cycle: look at the world, decide to put your attention on something, put your attention on that thing, observe it, decide based on that to observe something else, etc.

There are likely infinite ways to use this principle; all of the above represents just a few examples.

-Max

The Purpose of Technology

2014-05-10T08:12:12Z

In general, when technology attempts to solve problems of matter, energy, space, or time, it is successful. When it attempts to solve human problems of the mind, communication, ability, etc. it fails or backfires dangerously.

For example, the Internet handled a great problem of space—it allowed us to communicate with anybody in the world, instantly. However, it did not make us better communicators. In fact, it took many poor communicators and gave them a massive platform on which they could spread hatred and fear. This isn’t me saying that the Internet is all bad—I’m actually quite fond of it, personally. I’m just giving an example to demonstrate what types of problems technology does and does not solve successfully.

The reason this principle is useful is that it tells us in advance what kind of software purposes or startup ideas are more likely to be successful. Companies that focus on solving human problems with technology are likely to fail. Companies that focus on resolving problems that can be expressed in terms of material things at least have the possibility of success.

There can be some seeming counter-examples to this rule. For example, isn’t the purpose of Facebook to connect people? That sounds like a human problem, and Facebook is very successful. But connecting people is not actually what Facebook does. It provides a medium through which people can communicate, but it doesn’t actually create or cause human connection. In fact, most people I know seem to have a sort of uncomfortable feeling of addiction surrounding Facebook—the sense that they are spending more time there than is valuable for them as people. So I’d say that it’s exacerbating certain human problems (like a craving for connection) wherever it focuses on solving those problems. But it’s achieving other purposes (removing space and time from broad communication) excellently. Once again, this isn’t an attack on Facebook, which I think is a well-intentioned company; it’s an attempt to make an objective analysis of what aspects of its purpose are successful using the principle that technology only solves physical problems.

This principle is also useful in clarifying whether or not the advance of technology is “good.” I’ve had mixed feelings at times about the advance of technology—was it really giving us a better world, or was it making us all slaves to machines? The answer is that technology is neither inherently good nor bad, but it does tend towards evil when it attempts to solve human problems, and it does tend toward good when it focuses on solving problems of the material universe. Ultimately, our current civilization could not exist without technology, which includes things like public sanitation systems, central heating, running water, electrical grids, and the very computer that I am writing this essay on. Technology is in fact a vital force that is necessary to our existence, but we should remember that it is not the answer to everything—it’s not going to make us better people, but it can make us live in a better world.

-Max

The Secret of Fast Programming: Stop Thinking

2017-04-05T05:43:31Z

When I talk to developers about code complexity, they often say that they want to write simple code, but deadline pressure or underlying issues mean that they just don’t have the time or knowledge necessary to both complete the task and refine it to simplicity.

Well, it’s certainly true that putting time pressure on developers tends to lead to them writing complex code. However, deadlines don’t have to lead to complexity. Instead of saying “This deadline prevents me from writing simple code,” one could equally say, “I am not a fast-enough programmer to make this simple.” That is, the faster you are as a programmer, the less your code quality has to be affected by deadlines.

Now, that’s nice to say, but how does one actually become faster? Is it a magic skill that people are born with? Do you become fast by being somehow “smarter” than other people?

No, it’s not magic or in-born at all. In fact, there is just one simple rule that, if followed, will eventually solve the problem entirely:

Any time you find yourself stopping to think, something is wrong.

Perhaps that sounds incredible, but it works remarkably well. Think about it—when you’re sitting in front of your editor but not coding very quickly, is it because you’re a slow typer? I doubt it—“having to type too much” is rarely a developer’s productivity problem. Instead, the pauses where you’re not typing are what make it slow. And what are developers usually doing during those pauses? Stopping to think—perhaps about the problem, perhaps about the tools, perhaps about email, whatever. But any time this happens, it indicates a problem.

The thinking is not the problem itself—it is a sign of some other problem. It could be one of many different issues:

Understanding

The most common reason developers stop to think is that they did not fully understand some word or symbol.

This happened to me just the other day. It was taking me hours to write what should have been a really simple service. I kept stopping to think about it, trying to work out how it should behave. Finally, I realized that I didn’t understand one of the input variables to the primary function. I knew the name of its type, but I had never gone and read the definition of the type—I didn’t really understand what that variable (a word or symbol) meant. As soon as I looked up the type’s code and docs, everything became clear and I wrote that service like a demon (pun partially intended).

This can happen in almost infinite ways. Many people dive into a programming language without learning what (, ), [, ], {, }, +, *, and % really mean in that language. Some developers don’t understand how the computer really works. Remember when I wrote The Singular Secret of the Rockstar Programmer? This is why! Because when you truly understand, you don’t have to stop to think. It’s also a major motivation behind my book—understanding that there are unshakable laws to software design can eliminate a lot of the “stopping to think” moments.

So if you find that you are stopping to think, don’t try to solve the problem in your mind—search outside of yourself for what you didn’t understand. Then go look at something that will help you understand it. This even applies to questions like “Will a user ever read this text?” You might not have a User Experience Research Department to really answer that question, but you can at least make a drawing, show it to somebody, and ask their opinion. Don’t just sit there and think—do something. Only action leads to understanding.
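
In Python, one low-effort way to "go look" at a type you don't understand is to ask the language for its documentation and fields. The `RetryPolicy` class and the `describe_type` helper below are hypothetical stand-ins. In practice you would point the same calls (or simply `help()`) at whatever type is confusing you.

```python
import inspect
from dataclasses import dataclass, fields


@dataclass
class RetryPolicy:
    """How many times to retry, and how long to wait between attempts."""

    max_attempts: int = 3
    delay_seconds: float = 0.5


def describe_type(cls) -> str:
    """Collect a dataclass's docstring and field names into one summary line."""
    doc = inspect.getdoc(cls) or "(no docstring)"
    names = ", ".join(f.name for f in fields(cls))
    return f"{cls.__name__}: {doc} Fields: {names}"


print(describe_type(RetryPolicy))
```

This only works as written for dataclasses. For other types, `inspect.getdoc` and reading the source are still the quickest route.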

Drawing

Sometimes developers stop to think because they can’t hold enough concepts in their mind at once—lots of things are relating to each other in a complex way and they have to think through it. In this case, it’s almost always more efficient to write or draw something than it is to think about it. What you want is something you can look at, or somehow perceive outside of yourself. This is a form of understanding, but it’s special enough that I wanted to call it out on its own.

Starting

Sometimes the problem is “I have no idea what code to start writing.” The simplest solution here is to just start writing whatever code you know that you can write right now. Pick the part of the problem that you understand completely, and write the solution for that—even if it’s just one function, or an unimportant class.

Often, the simplest piece of code to start with is the “core” of the application. For example, if I was going to write a YouTube app, I would start with the video player. Think of it as an exercise in continuous delivery—write the code that would actually make a product first, no matter how silly or small that product is. A video player without any other UI is a product that does something useful (play video), even if it’s not a complete product yet.

If you’re not sure how to write even that core code yet, then just start with the code you are sure about. Generally I find that once a piece of the problem becomes solved, it’s much easier to solve the rest of it. Sometimes the problem unfolds in steps—you solve one part, which makes the solution of the next part obvious, and so forth. Whichever part doesn’t require much thinking to create, write that part now.

Skipping a Step

Another specialized understanding problem is when you’ve skipped some step in the proper sequence of development. For example, let’s say our Bike object depends on the Wheels, Pedals, and Frame objects. If you try to write the whole Bike object without writing the Wheels, Pedals, or Frame objects, you’re going to have to think a lot about those non-existent classes. On the other hand, if you write the Wheels class when there is no Bike class at all, you might have to think a lot about how the Wheels class is going to be used by the Bike class.

The right solution there would be to implement enough of the Bike class to get to the point where you need Wheels. Then write enough of the Wheels class to satisfy your immediate need in the Bike class. Then go back to the Bike class, and work on that until the next time you need one of the underlying pieces. Just like the “Starting” section, find the part of the problem that you can solve without thinking, and solve that immediately.
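
A sketch of where the code might stand partway through that process. The attribute and method names are assumptions made for illustration. `Wheels` has only what `Bike` has needed so far, and `Pedals` and `Frame` do not exist yet because `Bike` has not reached the point of needing them.

```python
class Wheels:
    """Only what Bike needs right now: a count."""

    def __init__(self, count: int = 2) -> None:
        self.count = count


class Bike:
    def __init__(self) -> None:
        # Writing this line was the moment Wheels became necessary.
        self.wheels = Wheels()

    def describe(self) -> str:
        return f"bike with {self.wheels.count} wheels"
```

The next time `Bike` needs something that is missing, that piece gets written, again only as far as `Bike` requires.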

Don’t jump over steps in the development of your system and expect that you’ll be productive.

Physical Problems

If I haven’t eaten enough, I tend to get distracted and start to think because I’m hungry. It might not be thoughts about my stomach, but I wouldn’t be thinking if I were full—I’d be focused. This can also happen with sleep, illness, or any sort of body problem. It’s not as common as the “understanding” problem from above, so first always look for something you didn’t fully understand. If you’re really sure you understood everything, then physical problems could be a candidate.

Distractions

When a developer becomes distracted by something external, such as noise, it can take some thinking to remember where they were in their solution. The answer here is relatively simple—before you start to develop, make sure that you are in an environment that will not distract you, or make it impossible for distractions to interrupt you. Some people close the door to their office, some people put on headphones, some people put up a “do not disturb” sign—whatever it takes. You might have to work together with your manager or co-workers to create a truly distraction-free environment for development.

Self-doubt

Sometimes a developer sits and thinks because they feel unsure about themselves or their decisions. The solution to this is similar to the solution in the “Understanding” section—whatever you are uncertain about, learn more about it until you become certain enough to write code. If you just feel generally uncertain as a programmer, it might be that there are many things to learn more about, such as the fundamentals listed in Why Programmers Suck. Go through each piece you need to learn until you really understand it, then move on to the next piece, and so on. There will always be learning involved in the process of programming, but as you know more and more about it, you will become faster and faster and have to think less and less.

False Ideas

Many people have been told that thinking is what smart people do, thus, they stop to think in order to make intelligent decisions. However, this is a false idea. If thinking alone made you a genius, then everybody would be Einstein. Truly smart people learn, observe, decide, and act. They gain knowledge and then use that knowledge to address the problems in front of them. If you really want to be smart, use your intelligence to cause action in the physical universe—don’t use it just to think great thoughts to yourself.

Caveat

All of the above is the secret to being a fast programmer when you are sitting and writing code. If you are caught up all day in reading email and going to meetings, then no programming happens whatsoever, and that’s a different problem. Some aspects of it are similar (it’s a bit like the organization “stopping to think”), but it’s not the same.

Still, there are some analogous solutions you could try. Perhaps the organization does not fully understand you or your role, which is why they’re sending you so much email and putting you in so many meetings. Perhaps there’s something about the organization that you don’t fully understand, such as how to go to fewer meetings and get less email. Maybe even some organizational difficulties can be resolved by adapting the solutions in this post to groups of people instead of individuals.

-Max

The post The Secret of Fast Programming: Stop Thinking appeared first on Code Simplicity.

Make It Never Come Back

2013-12-09T05:55:40Z

When solving a problem in a codebase, you’re not done when the symptoms stop. You’re done when the problem has disappeared and will never come back.

It’s very easy to stop solving a problem when it no longer has any visible symptoms. You’ve fixed the bug, nobody is complaining, and there seem to be other pressing issues. So why continue to do work on it? It’s fine for now, right?

No. Remember that what we care about the most in software is the future. The way that software companies get into unmanageable situations with their codebases is by not really handling problems until they are done.

This also explains why some organizations cannot get their tangled codebase back into a good state. They see one problem in the code, they tackle it until nobody’s complaining anymore, and then they move on to tackling the next symptom they see. They don’t put a framework in place to make sure the problem is never coming back. They don’t trace the problem to its source and then make it vanish. Thus their codebase never really becomes “healthy.”

This pattern of failing to fully handle problems is very common. As a result, many developers believe it is impossible for large software projects to stay well-designed–they say, “All software will eventually have to be thrown away and re-written.”

This is not true. I have spent most of my career either designing sustainable codebases from scratch or refactoring bad codebases into good ones. No matter how bad a codebase is, you can resolve its problems. However, you have to understand software design, you need enough manpower, and you have to handle problems until they will never come back.

In general, a good guideline for how resolved a problem has to be is:

A problem is resolved to the degree that no human being will ever have to pay attention to it again.

Accomplishing this in an absolute sense is impossible–you can’t predict the entire future, and so on–but that’s more of a philosophical objection than a practical one. In most practical circumstances you can effectively resolve a problem to the degree that nobody has to pay attention to it now and there’s no immediately-apparent reason they’d have to pay attention to it in the future either.

Example

Let’s say you have a web page and you write a “hit counter” for the site that tracks how many people have visited it. You discover a bug in the hit counter–it’s counting 1.5 times as many visits as it should be counting. You have a few options for how you could solve this:

You could ignore the problem.

The rationale here would be that your site isn’t very popular and so it doesn’t matter if your hit counter is lying. Also, it’s making your site look more successful than it is, which might help you.

The reason this is a bad solution is that there are many future scenarios in which this could again become a problem–particularly if your site becomes very successful. For example, a major news publication publishes your hit numbers–but they are false. This causes a scandal, your users lose trust in you (after all, you knew about the problem and didn’t solve it) and your site becomes unpopular again. One could easily imagine other ways this problem could come back to haunt you.

You could hack a quick solution.

When you display the hits, just divide them by 1.5 and the number is accurate. However, you didn’t investigate the underlying cause, which turns out to be that it counts 3x as many hits from 8:00 to 11:00 in the morning. Later your traffic pattern changes and your counter is completely wrong again. You might not even notice for a while because the hack will make it harder to debug.

Investigate and resolve the underlying cause.

You discover it’s counting 3x hits from 8:00 to 11:00. You discover this happens because your web server deletes many old files from the disk during that time, and that interferes with the hit counter for some reason.

At this point you have another opportunity to hack a solution–you could simply disable the deletion process or make it run less frequently. But that’s not really tracing down the underlying cause. What you want to know is, “Why does it miscount just because something else is happening on the machine?”

Investigating further, you discover that if you interrupt the program and then restart it, it will count the last visit again. The deletion process was using so many resources on the machine that it was interrupting the counter two times for every visit between 8:00 and 11:00. So it counted every visit three times during that period. But actually, the bug could have added infinite (or at least unpredictable) counts depending on the load on the machine.

You redesign the counter so that it counts reliably even when interrupted, and the problem disappears.

Obviously the right choice from that list is to investigate the underlying cause and resolve it. That causes the problem to vanish, and most developers would believe they are done there. However, there’s still more to do if you really want to be sure the problem will never again require human attention.
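To see concretely why the quick hack from the list above breaks, it helps to run the numbers. The sketch below is only an illustration: the function names are hypothetical, and the 25% morning share is not stated in the example but follows from it (if a fraction f of real visits arrive between 8:00 and 11:00 and are counted 3x, the counter reports 1 + 2f times the real total, and 1 + 2f = 1.5 gives f = 0.25).

```python
def counted_hits(real_hits, morning_share, morning_multiplier=3):
    """Hits the buggy counter reports, given the share of real visits
    that arrive in the 8:00-11:00 window (hypothetical model of the bug)."""
    morning = real_hits * morning_share
    rest = real_hits - morning
    return morning * morning_multiplier + rest


def displayed_with_hack(real_hits, morning_share):
    """The quick hack: divide whatever was counted by 1.5 before display."""
    return counted_hits(real_hits, morning_share) / 1.5


# With 25% of traffic in the morning, the hack happens to be right.
print(displayed_with_hack(1000, 0.25))

# If the morning share grows to 50%, the same hack over-reports by a third.
print(displayed_with_hack(1000, 0.50))
```

The correction factor only encodes one traffic pattern, so the displayed number drifts as soon as that pattern changes.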

First off, somebody could come along and change the code of the hit counter, reverting it back to a broken state in the future. Obviously the right solution for that is to add an automated test that assures the correct functioning of the hit counter even when it is interrupted. Then you make sure that test runs continuously and alerts developers when it fails. Now you’re done, right?
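A minimal sketch of what such a test might look like follows. It assumes a design the example does not specify: each visit carries a unique ID, and the counter remembers which IDs it has seen, so replaying a visit after an interruption does not count it again. The class and test names are hypothetical.

```python
import unittest


class HitCounter:
    """Hypothetical counter that counts each visit once, even if the same
    visit is processed again after an interruption."""

    def __init__(self):
        self._seen_visit_ids = set()

    def record(self, visit_id):
        # Adding an ID that is already present changes nothing, so a
        # replayed visit cannot inflate the count.
        self._seen_visit_ids.add(visit_id)

    @property
    def hits(self):
        return len(self._seen_visit_ids)


class HitCounterInterruptionTest(unittest.TestCase):
    def test_replayed_visit_is_counted_once(self):
        counter = HitCounter()
        # Simulate two interruptions: the same visit is processed three times.
        for _ in range(3):
            counter.record("visit-1")
        counter.record("visit-2")
        self.assertEqual(counter.hits, 2)
```

The test checks the guarantee (interrupted visits are counted once) rather than how the counter stores its data, which keeps it from breaking when the implementation changes.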

Nope. Even at this point, there are some future risks that have to be handled.

The next issue is that the test you’ve written has to be easy to maintain. If the test is hard to maintain–it changes a lot when developers change the code, the test code itself is cryptic, it would be easy for it to return a false positive if the code changes, etc.–then there’s a good chance the test will break or somebody will disable it in the future. Then the problem could again require human attention. So you have to assure that you’ve written a maintainable test, and refactor the test if it’s not maintainable. This may lead you down another path of investigation into the test framework or the system under test, to figure out a refactoring that would make the test code simpler.

After this you have concerns like the continuous integration system (the test runner)–is it reliable? Could it fail in a way that would make your test require human attention? This could be another path of investigation.

All of these paths of investigation may turn up other problems that then have to be traced down to their sources, which may turn up more problems to trace down, and so on. You may find that you can discover (and possibly resolve) all your codebase’s major issues just by starting with a few symptoms and being very determined about tracing down underlying causes.

Does anybody really do this? Yes. It might seem difficult at first, but as you resolve more and more of these underlying issues, things really do start to get easier and you can move faster and faster with fewer and fewer problems.

Down the Rabbit Hole

Beyond all of this, if you really want to get adventurous, there’s one more question you can ask: why did the developer write buggy code in the first place? Why was it possible for a bug to ever exist? Is it a problem with the developer’s education? Was it something about their process? Should they be writing tests as they go? Was there some design problem in the system that made it hard to modify? Is the programming language too complex? Are the libraries they’re using not well-written? Is the operating system not behaving well? Was the documentation unclear?

Once you get your answer, you can ask what the underlying cause of that problem is, and continue asking that question until you’re satisfied. But beware: this can take you down a rabbit hole and into a place that changes your whole view of software development. In fact, theoretically this system is unlimited, and would eventually result in resolving the underlying problems of the entire software industry. How far you want to go is up to you.

The Philosophy of Testing

2013-05-10T17:53:36Z

Much like we gain knowledge about the behavior of the physical universe via the scientific method, we gain knowledge about the behavior of our software via a system of assertion, observation, and experimentation called “testing.”

There are many things one could desire to know about a software system. It seems that most often we want to know if it actually behaves like we intended it to behave. That is, we wrote some code with a particular intention in mind; does it actually do that when we run it?

In a sense, testing software is the reverse of the traditional scientific method, where you test the universe and then use the results of that experiment to refine your hypothesis. Instead, with software, if our “experiments” (tests) don’t prove out our hypothesis (the assertions the test is making), we change the system we are testing. That is, if a test fails, it hopefully means that our software needs to be changed, not that our test needs to be changed. Sometimes we do also need to change our tests in order to properly reflect the current state of our software, though. It can seem like a frustrating and useless waste of time to do such test adjustment, but in reality it’s a natural part of this two-way scientific method–sometimes we’re learning that our tests are wrong, and sometimes our tests are telling us that our system is out of whack and needs to be repaired.

This tells us a few things about testing:

The purpose of a test is to deliver us knowledge about the system, and knowledge has different levels of value. For example, testing that 1 + 1 still equals two no matter what time of day it is doesn’t give us valuable knowledge. However, knowing that my code still works despite possible breaking changes in APIs I depend on could be very useful, depending on the context. In general, one must know what knowledge one desires before one can create an effective and useful test, and then must judge the value of that information appropriately to understand where to put time and effort into testing.
Given that we want to know something, in order for a test to be a test, it must be asserting something and then informing us about that assertion. Human testers can make qualitative assertions, such as whether or not a color is attractive. But automated tests must make assertions that computers can reliably make, which usually means asserting that some specific quantitative statement is true or false. We are trying to learn something about the system by running the test–whether the assertion is true or false is the knowledge we are gaining. A test without an assertion is not a test.
Every test has certain boundaries as an inherent part of its definition. Much like you couldn’t design a single experiment to prove all the theories and laws of physics, it would be prohibitively difficult to design a single test that actually validated all the behaviors of any complex software system at once. If it seems that you have made such a test, most likely you’ve combined many tests into one and those tests should be split apart. When designing a test, you should know what it is actually testing and what it is not testing.
Every test has a set of assumptions built into it, which it relies on in order to be effective within its boundaries. For example, if you are testing something that relies on access to a database, your test might make the assumption that the database is up and running (because some other test has already checked that that part of the code works). If the database is not up and running, then the test neither passes nor fails–it instead provides you no knowledge at all. This tells us that all tests have at least three results–pass, fail, and unknown. Tests with an “unknown” result must not say that they failed–otherwise they are claiming to give us knowledge when in fact they are not.
Because of these boundaries and assumptions, we need to design our suite of tests in such a way that the full set, when combined, actually gives us all of the knowledge we want to gain. That is, each individual test only gives us knowledge within its boundaries and assumptions, so how do we overlap those boundaries so that they reliably inform us about the real behavior of the entire system? The answer to this question may also affect the design of the software system being tested, as some designs are harder to completely test than others.

This last point leads us into the many methods of testing being practiced today, in particular end to end testing, integration testing, and unit testing.
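Before looking at those methods, here is one way the “unknown” result described above could be expressed in practice. This is a sketch using Python’s unittest, where a skipped test is reported separately from a pass or a failure. The functions database_is_up and lookup_account are hypothetical, and the database check is hard-coded to False purely to show the skip path.

```python
import unittest


def database_is_up():
    """Hypothetical environment check; hard-coded to show the skip path."""
    return False


def lookup_account(username):
    """Hypothetical code under test; a real version would query the database."""
    raise ConnectionError("database is not running")


class AccountLookupTest(unittest.TestCase):
    def setUp(self):
        # The test assumes a running database. If that assumption does not
        # hold, report "unknown" (a skip) instead of claiming a failure.
        if not database_is_up():
            self.skipTest("database is not running, so this test tells us nothing")

    def test_lookup_returns_saved_account(self):
        self.assertEqual(lookup_account("alice"), {"username": "alice"})


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccountLookupTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Running run_suite() here reports one skipped test and no failures, which matches what we actually know: nothing about the lookup code.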

End to End Testing

“End to end” testing is where you make an assertion that involves one complete “path” through the logic of the system. That is, you start up the whole system, perform some action at the entry point of user input, and check the result that the system produces. You don’t care how things work internally to accomplish this goal; you just care about the input and result. That is generally true for all tests, but here we’re testing only at the outermost point of input into the system and checking only the outermost result that it produces.

An example end to end test for creating a user account in a typical web application would be to start up a web server, a database, and a web browser, and use the web browser to actually load the account creation web page, fill it in, and submit it. Then you would assert that the resulting page somehow tells us the account was created successfully.
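As a rough, scaled-down sketch of that idea, the code below starts a real HTTP server, submits the account creation form over the network, and checks only the page that comes back. Several things here are assumptions made for illustration: the tiny SignupHandler stands in for the whole web application, an in-memory set stands in for the database, and a plain HTTP client stands in for the web browser.

```python
import http.client
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


class SignupHandler(BaseHTTPRequestHandler):
    """Hypothetical account-creation endpoint standing in for the real app."""

    accounts = set()  # stand-in for the application's database

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        username = form.get("username", [""])[0]
        self.accounts.add(username)
        body = f"Account created for {username}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def submit_signup_form(username):
    """Start the whole (tiny) system, submit the form, return the result page."""
    server = HTTPServer(("127.0.0.1", 0), SignupHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        data = urllib.parse.urlencode({"username": username})
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        conn.request("POST", "/signup", body=data, headers=headers)
        page = conn.getresponse().read().decode("utf-8")
        conn.close()
        return page
    finally:
        server.shutdown()
        server.server_close()


def test_account_creation_end_to_end():
    assert "Account created" in submit_signup_form("alice")
```

The assertion looks only at the outermost result (the page the user would see), not at how the server stored the account.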

The idea behind end to end testing is that we gain fully accurate knowledge about our assertions because we are testing a system that is as close to “real” and “complete” as possible. All of its interactions and all of its complexity along the path we are testing are covered by the test.

The problem with using only end to end testing is that it makes it very difficult to actually get all of the knowledge about the system that we might desire. In any complex software system, the number of interacting components and the combinatorial explosion of paths through the code make it difficult or impossible to actually cover all the paths and make all the assertions we want to make.

It can also be difficult to maintain end to end tests, as small changes in the system’s internals lead to many changes in the tests.

End to end tests are valuable, particularly as an initial stopgap for a system that entirely lacks tests. They are also good as sanity checks that your whole system behaves properly when put together. They have an important place in a test suite, but they are not, by themselves, a good long-term solution for gaining full knowledge of a complex system.

If a system is designed in such a way that it can only be tested via end-to-end tests, that is a symptom of broad architectural problems in the code. These issues should be addressed through refactoring until one of the other testing methods can be used.

Integration Testing

This is where you take two or more full “components” of a system and specifically test how they behave when “put together.” A component could be a code module, a library that your system depends on, a remote service that provides you data–essentially any part of the system that can be conceptually isolated from the rest of the system.

For example, in a web application where creating an account sends the new user an email, one might have a test that runs the account creation code (without going through a web page, just exercising the code directly) and checks that an email was sent. Or one might have a test that checks that account creation succeeds when one is using a real database–that “integrates” account creation and the database. Basically this is any test that is explicitly checking that two or more components behave properly when used together.
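Here is a small sketch of such a test, under some stated assumptions: create_account and Outbox are hypothetical names, the “real database” is an in-memory SQLite database, and the outbox records messages in memory rather than delivering them.

```python
import sqlite3


class Outbox:
    """Hypothetical mail component that records messages instead of sending."""

    def __init__(self):
        self.sent = []

    def send(self, to_address, subject):
        self.sent.append((to_address, subject))


def create_account(db, outbox, username, email):
    """Hypothetical account-creation code: store the user, then send a welcome email."""
    db.execute("INSERT INTO users (username, email) VALUES (?, ?)", (username, email))
    db.commit()
    outbox.send(email, "Welcome!")


def test_account_creation_with_database_and_email():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (username TEXT PRIMARY KEY, email TEXT)")
    outbox = Outbox()

    # Exercise the code directly, without going through a web page.
    create_account(db, outbox, "alice", "alice@example.com")

    row = db.execute(
        "SELECT email FROM users WHERE username = ?", ("alice",)
    ).fetchone()
    assert row == ("alice@example.com",)
    assert outbox.sent == [("alice@example.com", "Welcome!")]
    db.close()
```

The test explicitly checks that the three components (account creation, the database, and email sending) behave properly together, which is what makes it an integration test.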

Compared to end to end testing, integration testing involves a bit more isolation of components as opposed to just running a test on the whole system as a “black box.”

Integration testing doesn’t suffer as badly from the combinatorial explosion of test paths that end to end testing faces, particularly when the components being tested are simple and thus their interactions are simple. If two components are hard to integration test due to the complexity of their interactions, this indicates that perhaps one or both of them should be refactored for simplicity.

Integration testing is also usually not a sufficient testing methodology on its own, as doing an analysis of an entire system purely through the interactions of components means that one must test a very large number of interactions in order to have a full picture of the system’s behavior. There is also a maintenance burden with integration testing similar to end to end testing, though not as bad–when one makes a small change in one component’s behavior, one might have to then update the tests for all the other components that interact with it.

Unit Testing

This is where you take one component alone and test that it behaves properly. In our account creation example, we could have a series of unit tests for the account creation code, a separate series of unit tests for the email sending code, a separate series of unit tests for the web page where users fill in their account information, and so on.

Unit testing is most valuable when you have a component that presents strong guarantees to the world outside of itself and you want to validate those guarantees. For example, a function’s documentation says that it will return the number “1” if passed the parameter “0.” A unit test would pass this function the parameter “0” and assert that it returned the number “1.” It would not check how the code inside of the component behaved–it would only check that the function’s guarantees were met.
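As an illustration, factorial is one function whose documentation could make exactly that guarantee. The function below is a hypothetical stand-in chosen for this sketch, not something from a particular system.

```python
import unittest


def factorial(n):
    """Return n! for a non-negative integer n.

    Guarantees that factorial(0) returns 1.
    """
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


class FactorialTest(unittest.TestCase):
    def test_zero_returns_one(self):
        # Checks the documented guarantee only, not how the loop works.
        self.assertEqual(factorial(0), 1)
```

If the loop were later replaced with a recursive implementation or a lookup table, this test would not need to change.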

Usually, a unit test is testing one behavior of one function in one class/module. One creates a set of unit tests for a class/module that, when you run them all, cover all behavior that you want to verify in that module. This almost always means testing only the public API of the system, though–unit tests should be testing the behavior of the component, not its implementation.

Theoretically, if all components of the system fully define their behavior in documentation, then by testing that each component is living up to its documented behavior, you are in fact testing all possible behaviors of the entire system. When you change the behavior of one component, you only have to update a minimal set of tests around that component.

Obviously, unit testing works best when the system’s components are reasonably separate and are simple enough that it’s possible to fully define their behavior.

It is often true that if you cannot fully unit test a system, but instead have to do integration testing or end to end testing to verify behavior, some design change to the system is needed. (For example, components of the system may be too entangled and may need more isolation from each other.) Theoretically, if a system were well-isolated and had guarantees for all of the behavior of every function in the system, then no integration testing or end to end testing would be necessary. Reality is often a little different, though.

Reality

In reality, there is a scale of testing that has infinite stages between Unit Testing and End to End testing. Sometimes you’re a bit between unit testing and integration testing. Sometimes your test falls somewhere between an integration test and an end to end test. Real systems usually require all sorts of tests along this scale in order to understand their behavior reliably.

For example, sometimes you’re testing only one part of the system but its internals depend on other parts of the system, so you’re implicitly testing those too. This doesn’t make your test an Integration Test, it just makes it a unit test that is also testing other internal components implicitly–slightly larger than a unit test, and slightly smaller than an integration test. In fact, this is the sort of testing that is often the most effective.

Fakes

Some people believe that in order to do true “unit testing” you must write code in your tests that isolates the component you are testing from every other component in the system–even that component’s internal dependencies. Some even believe that this “true unit testing” is the holy grail that all testing should aspire to. This approach is often misguided, for the following reasons:

One advantage of having tests for individual components is that when the system changes, you have to update fewer unit tests than you have to update with integration tests or end to end tests. If you make your tests more complex in order to isolate the component under test, that complexity could defeat this advantage, because you’re adding more test code that has to be kept up to date anyway.
For example, imagine you want to test an email sending module that takes an object representing a user of the system, and sends an email to that user. You could invent a “fake” user object–a completely separate class–just for your test, out of the belief that you should be “just testing the email sending code and not the user code.” But then when the real User class changes its behavior, you have to update the behavior of the fake User class–and a developer might even forget to do this, making your email sending test now invalid because its assumptions (the behavior of the User object) are invalid.
The relationships between a component and its internal dependencies are often complex, and if you’re not testing its real dependencies, you might not be testing its real behavior. This sometimes happens when developers fail to keep “fake” objects in sync with real objects, but it can also happen via failing to make a “fake” object as genuinely complex and full-featured as the “real” object.
For example, in our email sending example above, what if real users could have seven different formats of username but the fake object only had one format, and this affected the way email sending worked? (Or worse, what if this didn’t affect email sending behavior when the test was originally written, but it did affect email sending behavior a year later and nobody noticed that they had to update the test?) Sure, you could update the fake object to have equal complexity, but then you’re adding even more of a maintenance burden for the fake object.
Having to add too many “fake” objects to a test indicates that there is a design problem with the system that should be addressed in the code of the system instead of being “worked around” in the tests. For example, it could be that components are too entangled–the rules of “what is allowed to depend on what” or “what are the layers of the system” might not be well-defined enough.

In general, it is not bad to have “overlap” between tests. That is, you have a test for the public APIs of the User code, and you have a test for the public APIs of the email sending code. The email sending code uses real User objects and thus also does a small bit of implicit “testing” on the User objects, but that overlap is okay. It’s better to have overlap than to miss areas that you want to test.
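A sketch of that kind of overlap follows. The User class and build_welcome_email function are hypothetical, invented here only to show an email test that uses the real User object instead of a fake one.

```python
class User:
    """Hypothetical real User class from the system under test."""

    def __init__(self, username, email):
        self.username = username
        self.email = email

    def display_name(self):
        return self.username.title()


def build_welcome_email(user):
    """Hypothetical email-sending code that depends on User behavior."""
    return {"to": user.email, "subject": f"Welcome, {user.display_name()}!"}


def test_welcome_email_uses_real_user():
    # No fake User here: if display_name() changes, this test notices.
    user = User("alice", "alice@example.com")
    message = build_welcome_email(user)
    assert message == {"to": "alice@example.com", "subject": "Welcome, Alice!"}
```

Because the test goes through the real display_name(), there is no second User implementation to keep in sync, at the cost of the email test also being sensitive to changes in User.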

Isolation via “fakes” is sometimes useful, though. One has to make a judgment call and be aware of the trade-offs above, attempting to mitigate them as much as possible via the design of your “fake” instances. In particular, fakes are worthwhile when they add one of two properties to a test–determinism or speed.

Determinism

If nothing about the system or its environment changes, then the result of a test should not change. If a test is passing on my system today but failing tomorrow even though I haven’t changed the system, then that test is unreliable. In fact, it is invalid as a test because its “failures” are not really failures–they’re an “unknown” result disguised as knowledge. We say that such tests are “flaky” or “non-deterministic.”

Some aspects of a system are genuinely non-deterministic. For example, you might generate a random string based on the time of day, and then show that string on a web page. In order to test this reliably, you would need two tests:

A test that uses the random-string generation code over and over to make sure that it properly generates random strings.
A test for the web page that uses a fake random-string generator that always returns the same string, so that the web page test is deterministic.

Of course, you would only need the fake in that second test if verifying the exact string in the web page was an important assertion. It’s not that everything about a test needs to be deterministic–it’s that the assertions it is making need to always be true or always be false if the system itself hasn’t changed. If you weren’t asserting anything about the string, the size of the web page, etc. then you would not need to make the string generation deterministic.
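The two tests might be sketched as follows. This is an illustration under assumptions: the names are hypothetical, and the generator uses Python’s random module instead of the time of day. Note that even the first test only asserts things that are always true of the output (its length and alphabet), not which string comes back.

```python
import random
import string


def random_token(length=8):
    """Hypothetical random-string generator."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def render_page(make_token=random_token):
    """Hypothetical web page that shows the string; the generator is injectable."""
    return f"<p>Your code: {make_token()}</p>"


def test_random_token_generation():
    # Run the generator over and over, asserting only properties that
    # hold for every possible output.
    for _ in range(200):
        token = random_token()
        assert len(token) == 8
        assert all(ch in string.ascii_lowercase for ch in token)


def test_page_shows_token():
    # A fake generator that always returns the same string makes the
    # exact-content assertion deterministic.
    page = render_page(make_token=lambda: "abcdefgh")
    assert page == "<p>Your code: abcdefgh</p>"
```

Passing the generator in as a parameter is one design choice among several; it keeps the fake confined to the one test that needs it.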

Speed

One of the most important uses of tests is that developers run them while they are editing code, to see if the new code they’ve written is actually working. As tests become slower, they become less and less useful for this purpose. Or developers continue to use them but start writing code more and more slowly because they keep having to wait for the tests to finish.

In general, a test suite should not take so long that a developer becomes distracted from their work and loses focus while they wait for it to complete. Existing research indicates that this loss of focus sets in after somewhere between 2 and 30 seconds of waiting for most developers. Thus, a test suite used by developers during code editing should take no longer than roughly that length of time to run. It might be okay for it to take a few minutes, but that wouldn’t be ideal. It would definitely not be okay for it to take ten minutes, under most circumstances.

There are other reasons to have fast tests beyond just the developer’s code editing cycle. At the extreme, slow tests can become completely useless if they only deliver their result after it is needed. For example, imagine a test that took so long, you only got the result after you had already released the product to users. Slow tests affect lots of processes in a software engineering organization–it’s simplest for them just to be fast.

Sometimes there is some behavior that is inherently slow in a test. For example, reading a large file off of a disk. It can be okay to make a test “fake” out this slow behavior–for example, by having the large file in memory instead of on the disk. Like with all fakes, it is important to understand how this affects the validity of your test and how you will maintain this fake behavior properly over time.
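For instance, if the code under test accepts any file-like object, the test can hand it an in-memory file. The sketch below assumes a hypothetical count_error_lines function and uses io.StringIO from the standard library as the in-memory stand-in for a large log file on disk.

```python
import io


def count_error_lines(log_file):
    """Hypothetical code under test: counts lines containing "ERROR".

    Accepts any file-like object that yields lines of text.
    """
    return sum(1 for line in log_file if "ERROR" in line)


def test_count_error_lines_with_in_memory_file():
    # The in-memory file behaves like an open text file but never touches disk.
    fake_file = io.StringIO(
        "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n"
    )
    assert count_error_lines(fake_file) == 2
```

What this fake gives up is any knowledge about real disk behavior (permissions, encoding, very large sizes), so that knowledge would have to come from a different test.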

It is sometimes also useful to have an extra suite of “slow” tests that aren’t run by developers while they edit code, but are run by an automated system after code has been checked in to the version control system, or run by a developer right before they check in their code. That way you get the advantage of a fast test suite that developers can use while editing, but also the more-complete testing of real system behavior even if testing that behavior is slow.

Coverage

There are tools that run a test suite and then tell you which lines of system code actually got run by the tests. They say that this tells you the “test coverage” of the system. These can be useful tools, but it is important to remember that they don’t tell you if those lines were actually tested, they only tell you that those lines of code were run. If there is no assertion about the behavior of that code, then it was never actually tested.
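The difference is easy to see in a small sketch. The function apply_discount is hypothetical; both tests below would give it full line coverage, but only the second one tests anything.

```python
def apply_discount(price, percent):
    """Hypothetical code under test."""
    return price - price * percent / 100


def test_runs_code_but_checks_nothing():
    # Every line of apply_discount is "covered", yet this would still
    # pass if the function returned the wrong number.
    apply_discount(200, 10)


def test_checks_behavior():
    # Same coverage, but now there is an assertion about the result.
    assert apply_discount(200, 10) == 180
```

A coverage tool would report the same number for either test, which is why the coverage figure alone says nothing about what has been verified.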

Overall

There are many ways to gain knowledge about a system, and testing is just one of them. We could also read its code, look at its documentation, talk to its developers, etc., and each of these would give us a belief about how the system behaves. However, testing validates our beliefs, and thus is particularly important out of all of these methods.

The overall goal of testing is to gain valid knowledge about the system. This goal overrides all other principles of testing–any testing method is valid as long as it produces that result. However, some testing methods are more efficient–they make it easier to create and maintain tests which produce all the information we desire. These methods should be understood and used appropriately, as your judgment dictates and as they apply to the specific system you’re testing.

-Max

Users Have Problems, Developers Have Solutions

2013-05-02T10:18:29Z

In the world of software, it is the job of software developers to solve the problems of users. Users present a problem, and the developers solve it. Whenever these roles are reversed, trouble ensues.

If you ever want to see a bloated, useless, complex piece of software, find one where the developers implemented every solution that any user ever suggested. It’s true that the users are the people who know what the problem is, and that sometimes, they have novel ideas for solutions. But the people making the final decision on how a problem is to be solved should always be the developers of the system, not its users.

This problem can be particularly bad when you’re writing software for a small number of users internally at an organization. The users who you are writing for often have inordinate power over you, by virtue of being executives or being close to executives. They can, quite literally, tell you what to do. However, if they want a solution that is actually good for them, they should try to refrain from this practice. If you trust a team enough to have them write software for you, then you should also trust them enough to make decisions about that software. If you don’t trust them, why are they working at your organization? A group of people who distrust each other is usually a highly inefficient group–perhaps not even really a “group” at all, but merely a collection of individuals all trying to defend themselves from each other. That’s no way to run an organization or to have anybody in it lead a happy life.

If a user wants to influence a developer’s decision, the best thing they can do is offer data. Developers need information in order to make good decisions for their users, and that information often comes from the users themselves. If you as a user think that a piece of software is going the wrong direction, provide information about the problem that you would like solved, and explain why the current software doesn’t solve it. Get information about how many other people have this problem. The best is if you can show numbers, but sometimes even anecdotes can be helpful when a developer is trying to make a decision. Developers should judge data appropriately (hard data about lots of users is obviously better than an anecdote from a single user) but they usually appreciate all the information given to them when it’s offered as data and not as a demand for a specific solution.

Developers, on the other hand, often have the opposite problem. If you want to see a piece of software that users hate, find one where the developers simply imagined that the users had a problem, and then started developing a solution for that problem. Problems come from users, not from developers. Sometimes the developers of a piece of software are also users of it, and they can see obvious problems that they themselves are experiencing. That’s fine, but they should offer that up as data, from the viewpoint of a user, and make sure that it’s something that other people are also actually experiencing. Developers should treat their own opinions as somewhat more valuable than the average user’s (because they see lots of user feedback and they work with their program day in and day out) but still as an opinion that came from a user.

When you solve the developers’ problems instead of the users’ problems, you’re putting lots of effort into something that isn’t going to help people in the best possible way. It may be enjoyable to assert one’s opinion, be the smartest person in the room, and cause the team to solve your problem, but it feels terrible to release software that ends up not helping people. Also, I usually find that solving the developers’ problems leads to a lot more complexity than solving the users’ problems. So it would actually have been easier to find out from the user what was wrong and fix that than to imagine a problem and grind away at it.

Now, I’m not saying that no developer has ever come up with a valid problem, and that no user has ever come up with a valid solution. Sometimes these things do happen. But the judgment about these things should lie on the appropriate sides of the equation. Only users (and preferably, a large number of them, or data about a large number of them) can truly tell you what problem they are experiencing, and only somebody on the development side (preferably, an individual who is tasked with making this decision after understanding the problem and possibly getting feedback from his peers) can correctly decide which solution should be implemented.

-Max

The post Users Have Problems, Developers Have Solutions appeared first on Code Simplicity.

The Accuracy of Future Predictions

2013-01-18T07:05:07Z

One thing we know about software design is that the future is important. However, we also know that the future is very hard to predict.

I think that I have come up with a way to explain exactly how hard it is to predict the future of software. The most basic version of this theory is:

The accuracy of future predictions decreases relative to the complexity of the system and the distance into the future you are trying to predict.

As your system becomes more and more complex, you can predict smaller and smaller pieces of the future with any accuracy. As it becomes simpler, you can predict further and further into the future with accuracy.

For example, it’s fairly easy to predict the behavior of a “Hello, World” program quite far into the future. It will, most likely, continue to print “Hello, World” when you run it. Remember that this is a sliding scale–sort of a probability of how much you can say about what the future holds. You could be 99% sure that it will still work the same way two days from now, but there is still that 1% chance that it won’t.

However, after a certain point, even the behavior of “Hello World” becomes unpredictable. For example, here is “Hello World” in Python 2.0, in the year 2000:

print "Hello, World!"

But if you tried to run that in Python 3, it would be a syntax error. In Python 3 it’s:

print("Hello, World!")

You couldn’t have predicted that in the year 2000, and there isn’t even anything you could have done about it if you did predict it. With things like this, your only hope is keeping your system simple enough that you can update it easily to use the new syntax. Not “flexible,” not “generic,” but simply simple to understand and modify.
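If you want to see this break for yourself, here is a small sketch, assuming you run it under Python 3. It uses the built-in compile() function to check whether each version of the program is still valid syntax. The helper name still_compiles is made up for this illustration.

```python
def still_compiles(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<hello>", "exec")
        return True
    except SyntaxError:
        return False


python2_hello = 'print "Hello, World!"'
python3_hello = 'print("Hello, World!")'

# Under Python 3, only the second form is accepted.
for label, source in [("Python 2 style", python2_hello),
                      ("Python 3 style", python3_hello)]:
    print(label, "compiles:", still_compiles(source))
```

The fix is a one-line change precisely because the program is so simple. The same syntax change spread through a large, tangled system is a much bigger job.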

In reality, there’s a more expanded logical sequence to the rule above:

The difficulty of predicting the future increases relative to the total amount of change that occurs in the system and its environment across the future one is attempting to predict. (Note that the effect of the environment is inversely proportional to its logical distance from the system.)

The amount of change a system will undergo is relative to the total complexity of that system.

Thus: the rate at which prediction becomes difficult increases relative to the complexity of the system one is attempting to predict the behavior of.
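To make the shape of this rule concrete, here is a toy model in Python. It is a simplification for illustration only, not a measurement. It assumes that a system is made of independent parts, and that each part has the same fixed chance (yearly_stability, an invented number) of getting through a year without changing in a way that breaks your prediction. Real systems are not this tidy.

```python
def prediction_confidence(parts, years, yearly_stability=0.99):
    """Chance that none of `parts` independent parts changes in `years` years.

    Toy assumption: every part survives each year unchanged with the same
    probability, independently of all the others.
    """
    return yearly_stability ** (parts * years)


# Compare a tiny system with larger ones over short and long time spans.
for parts in (1, 10, 100):
    for years in (1, 5, 10):
        confidence = prediction_confidence(parts, years)
        print(f"{parts:>3} parts, {years:>2} years: {confidence:.2f}")
```

The exact numbers mean nothing, but the shape matches the rule: confidence falls as either the number of parts or the distance into the future grows, and it falls faster for the more complex system.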

Now, despite this rule, I want to caution you against basing design decisions around what you think will happen in the future. Remember that all of these happenings are probabilities and that any amount of prediction includes the ability to be wrong. When we look only at the present, the data that we have, and the software system that we have now, we are much more likely to make a correct decision than if we try to predict where our software is going in the future. Most mistakes in software design result from assuming that you will need to do something (or never do something) in the future.

The time that this rule is useful is when you have some piece of software that you can’t easily change as the future goes on. You can never completely avoid change, but if you simplify your software down to the level of being stupid, dumb simple then you’re less likely to have to change it. It will probably still degrade in quality and usefulness over time (because you aren’t changing it to cope with the demands of the changing environment) but it will degrade more slowly than if it were very complex.

It’s true that ideally, we’d be able to update our software whenever we like. This is one of the great promises of the web, that we can update our web apps and web sites instantaneously without having to ask anybody to “upgrade.” But this isn’t always true, for all platforms. Sometimes, we need to create some piece of code (like an API) that will have to stick around for a decade or more with very little change. In this case, we can see that if we want it to still be useful far into the future, our only hope is to simplify. Otherwise, we’re building in a future unpleasant experience for our users, and dooming our systems to obsolescence, failure, and chaos.

The funny part about all this is that writing simple software usually takes less work than writing complex software does. It sometimes requires a bit more thought, but usually less time and effort overall. So let’s take a win for ourselves, a win for our users, and a win for the future, and keep it as simple as we reasonably can.

-Max

The post The Accuracy of Future Predictions appeared first on Code Simplicity.

Code Simplicity, Second Revision

2012-08-23T06:03:19Z

In June, I released a second revision of Code Simplicity. Some of you probably already know, but I thought that I should let everybody else know, too.

The most important change is that the book gets into the laws and rules of software design much more quickly now. It starts with a completely re-written Preface that tells the story of how I developed the principles in Code Simplicity, and why you might be interested in them. Then it gets into a much shorter Chapter 1 that distills everything from the old Chapter 1 into a few short pages, skips the old Chapter 2 (which was a long discussion about what it means for something to be a science) and goes right into the laws.

Particularly if you’ve read the original version, I’d really love to hear your feedback on how the starting content of the new revision feels to you!

-Max

P.S. If you bought the ebook from O’Reilly, you get every new revision for free, and there will probably be even more revisions than this one! If you got the ebook elsewhere, there’s a little link inside of the book itself that will let you “upgrade” to the O’Reilly editions for pretty cheap, so that you can get this revision and every other future revision for free, too. I’m not partial to any particular method of you getting the book, but the O’Reilly editions are definitely the best way to get the new revisions as they come out.

The post Code Simplicity, Second Revision appeared first on Code Simplicity.

Software as Knowledge

2012-05-13T05:24:41Z

I don’t often dive deep into the philosophical underpinnings of Code Simplicity, but I’ve been realizing more and more that there are a few philosophical principles behind the writings that would be valuable to share. Also, some of these philosophies weren’t fully formed until I had sat with the work for a long time, applied it in a lot of situations, and talked about it with many people. This one–a theory that I have developed over time about how software can be thought of and worked with in the mind–has sort of been percolating with me for quite a while now. It’s time to get at least part of it out on “paper,” in a blog post. So here you go:

Software is, fundamentally, a solid object that is made of knowledge. It follows all the rules and laws of knowledge. It behaves exactly as knowledge behaves in just about any given situation, except that it’s in concrete form. For example, when software is complex it tends to be mis-used. When software is wrong (i.e., has a bug), it tends to cause harm or problems. When people don’t understand some code, they tend to alter it incorrectly. One could say these things of knowledge just as one could say them of software. Bad data causes people to misbehave; bad code causes computers to misbehave. I’m not saying that computers and people can be compared–I’m saying that software and knowledge can be.

One wishes to have knowledge in a sensible and logical form. Similarly, one should also desire to have software–particularly the code–in a sensible and logical form. Because code is knowledge, it should translate to knowledge in one’s mind almost immediately upon viewing it. If it doesn’t, then some part of it is too complex–perhaps the underlying programming language or systems, but more likely the structure of the code as created by its designer.

When we desire knowledge, there are numerous ways to acquire it. One could read about it, think about it, perform observations, do experiments, talk about it, etc. In general, we could divide these methods into acquiring the data for ourselves (via observation, experiment, thought, etc.) or getting data from somebody else (reading, talking, etc.).

There are some situations in which we must get data for ourselves, particularly when it applies to us in some unique way that we couldn’t rely on others to work out correctly. As an extreme example, walking on my own legs likely took tremendous amounts of personal experimentation when my body was much smaller. I probably had some assistance, but that knowledge had to be developed by me.

There are far more situations, however, in which we must rely on secondhand data. If one wants to do a good job at living, there’s a lot to know–one simply could not acquire so much information on their own. This is where the help of others comes in: the data they know, the lessons they’ve learned and can teach us.

It seems likely that these same principles describe when one should write code themselves or use existing code. You pretty much couldn’t write all the code yourself down to the hardware level and come up with some of the most useful software we have today. For sure, there are some things that only we are uniquely qualified to write–usually the specific logic of the product that we’re working on. But there are many more things that we must rely on existing code for, just like we must rely on existing secondhand knowledge to survive as individuals.

It’s also possible we could use this principle somewhat for deciding how to divide up work between developers. Would it be faster for somebody to create a piece of code out of their firsthand knowledge, or would it be faster for a group of people to look at the existing system (secondhand knowledge) and start to contribute their own parts (which will, in time, essentially become their firsthand knowledge)? The answer depends on the situation, obviously, and though the basic idea here may not be too novel (some programmers already know the system better than others and so they’re faster) the way we came to the idea is what matters. We first theorize that software is knowledge, and then suddenly we can see a clear logical line down to some existing principle that is already known to be generally true. Pretty handy, and indicates we could likely derive other, more useful information from this principle.

Of course, this is not, by itself, a science or a scientific system. It’s just an idea, one that seems to work well for deriving principles about development. I would say it is one of the broadest philosophical theories that I’ve been able to develop about software, in fact. It seems to cover all aspects and explain all behaviors. I could actually sit here and theorize about this idea all day, but I’m not here to ramble, just to give a brief summary and then see what you have to say about it.

-Max

The post Software as Knowledge appeared first on Code Simplicity.

Code Simplicity: The Science of Software Development

2012-05-10T07:03:21Z

What if every software developer could gain the knowledge of long experience without having to go through the pain of repeated failure? What if, instead of being a continuous chaos of complexity and argument, the process of software development could be a sane, orderly progression that was well-understood by every single programmer involved? What if all programmers and their managers shared a common ground for discussing software development decisions–a common ground that was based on facts instead of opinion or authority, and that was actually helpful in deciding what to do on a day-to-day basis with your software project?

What if software development was a science–one with laws, rules, facts, and definitions that told you with certainty which directions to take and which directions to avoid? Not a dogmatic system which restricted you only to some particular methodology, but a series of principles that freed you to think for yourself and make the right decisions for your situation?

What if then, all of this was in a book, that book was only 90 pages long, and it was understandable by every single person working in the software industry, programmer or not? Would it make the world a different place? Find out for yourself:

Code Simplicity: The Science of Software Development.

I’ve spent the last several years developing, testing, and refining a series of scientific laws for software development. Some of what I’ve been doing, you’ve seen in this blog, but the book isn’t just a regurgitation of these articles here. It’s a complete, organized treatise on this new science–a series of principles that I hope will not just change your software, but also bring sanity, order, and happiness to your life as a software developer. Then, once your team reads it, it will bring understanding and insight to your group’s direction and discussions. And finally, when every software developer has read it, it will change the world of software development.

But it all starts here, with you. Help me change the world. All you have to do is read a book, and then if you think you got something out of it, tell other people about it, so maybe they will read it too.

Direct from O’Reilly, available in print and for all e-readers, on sale for 50% off until April 4.
In the Kindle Store
And available in print at numerous other online bookstores, as well.

-Max

The post Code Simplicity: The Science of Software Development appeared first on Code Simplicity.

Clues to Complexity

2012-03-27T08:22:03Z

Here are some clues that tell you that your code may be too complex:

You have to add “hacks” to make things keep working.
Other developers keep asking you how some part of the code works.
Other developers keep mis-using your code, and causing bugs.
Reading a line of code takes longer than an instant for an experienced developer.
You feel scared to modify this part of the code.
Management seriously considers hiring more than one developer to work on a single class or file.
It’s hard to figure out how to add a feature.
Developers often argue about how things should be implemented in this part of the code.
People make utterly nonsensical changes to this part of the code very often, which you catch only during code review, or only after the change has been checked in.

That’s what I can come up with off the top of my head. What are some others?

-Max

The post Clues to Complexity appeared first on Code Simplicity.

Developer Hubris

2011-11-15T07:38:46Z

Your program is not important to me. I don’t care about its user interface. I don’t care what its name is. I don’t care that you made it, or what version it is.

The only thing I care about is that your program helps me accomplish my purpose. That’s a truly remarkable feat, and if your program does it, you should be proud. There’s no need to make your program take up more of my attention just because you think it’s important.

Now of course, your program is important to you! When you work on code for a long time, it’s easy to become attached to it. It was so hard to write. Your cleverness is unbounded, shadowing lesser mortals in the mountain of your intellect. You have overcome some of the greatest mental obstacles man has ever faced. Truly, you must shout this from the tops of every tower, through the streets of every city, and even unto the caves of the Earth.

But don’t. Because your users do not care. Your fellow developers might be interested, but your users are not.

When you’re truly clever, what will show up for users is that your program is awesome. It’s so awesome, the user hardly notices it’s there. That is true brilliance.

The worst offenders against this ideal are programs that pop up a window every time my computer starts. I know your software is there. I installed it. You really don’t need to remind me. If my purpose is to start up my computer so I can use it, how is your pop up window helping me accomplish that? It’s not, so get rid of it.

There are smaller ways to cause problems, too, that all revolve around asking for too much time or attention from the user:

“Users will definitely be okay with clicking through three screens of forms before they can use my product.”
“I’m sure that users will want to learn all the icons I invented for this program, so taking away the text labels for those icons is fine!”
“I’m sure it’s okay to stop the user from working by popping up these dialog boxes.”
“Users will totally want to search through this huge page for a tiny little piece of text so they can click on it.”
“Why should we make this simpler? That would be a lot of work, and it’s already pretty easy…for me.”

And so on.

The true humility required of a developer is the willingness to remove their identity from the user’s world. Stop telling the user the program is there. Don’t think that the user cares about your program, wants to spend time using its interface, or wants to learn about it. It’s not your program that they care about–it’s their purpose. Help them accomplish that perfectly, and you will have created the perfect program for them.

-Max

The post Developer Hubris appeared first on Code Simplicity.

Open Source Community, Simplified

2011-02-01T10:00:42Z

Growing and maintaining an open-source community depends essentially on three things:

Getting people interested in contributing
Removing the barriers to entering the project and contributing
Retaining contributors so that they keep contributing

If you can get people interested, then have them actually contribute, and then have them stick around, you have a community. Otherwise, you don’t.

If you are just starting a project or need to improve the community of an existing project, you should address these points in reverse order. If you get people interested in a project before you do the latter two steps, then people won’t be able to enter and won’t stick around when they do enter. You won’t actually expand your community. So first, we want to be sure that we can retain both existing and new contributors. Once we’ve done that, then we want to remove the barriers to entry, so that interested people can actually start contributing. Only then do we start worrying about getting people interested.

So let’s talk about how you accomplish each step in reverse order:

Retaining Contributors

For the Bugzilla Project, this was our biggest challenge. Once somebody started contributing, what made them keep contributing? How did we keep people around?

Well, we had an interesting advantage in answering these questions, in that we are one of the older open-source projects in existence, having been around since late 1998. So we had a tremendous wealth of actual data to work with. We mined this data in two ways: First, we did a survey of all our past developers who had left the project, asking them why they had left. This was just a free-form survey, allowing people to answer any way they wanted. Then, we created a graph of the number of contributors over time, for the whole ten years of the project, and correlated the rise and fall of the graphs to various actions we took or didn’t take over time.
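As a rough illustration of the second kind of analysis, here is a minimal Python sketch of how you might count active contributors per month from your own project history. The function name and the sample data are hypothetical, invented for this example. It is not the code or the data we used for Bugzilla.

```python
from collections import defaultdict
from datetime import date


def contributors_per_month(commits):
    """Count distinct contributors in each (year, month).

    `commits` is an iterable of (author, commit_date) pairs.
    """
    active = defaultdict(set)
    for author, day in commits:
        active[(day.year, day.month)].add(author)
    return {month: len(authors) for month, authors in sorted(active.items())}


# Hypothetical sample history: three people active in January, one in February.
sample_commits = [
    ("alice", date(2009, 1, 5)),
    ("bob", date(2009, 1, 9)),
    ("alice", date(2009, 1, 20)),
    ("carol", date(2009, 1, 28)),
    ("alice", date(2009, 2, 3)),
]

for (year, month), count in contributors_per_month(sample_commits).items():
    print(f"{year}-{month:02d}: {count} contributors")
```

Once you have the counts, you can plot them next to a timeline of your project’s decisions (freezes, releases, policy changes) and see which ones line up with the rises and falls.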

Once all this was done, I sent an email out to the Bugzilla Project developers, describing the results of the research. You can read the whole email if you’d like, but I’ll summarize the findings here:

Don’t freeze the trunk for long periods.
The Bugzilla Project has a fairly-standard system of having stable branches that receive little change (for example, the “3.4” branch where we commit bug fixes and do minor releases like 3.4.1, 3.4.2, etc.), and a main-line “trunk” repository where all new features go, and which eventually becomes our next major release.

In the past, before a major release, we would “freeze” the trunk. This meant that no new features could be developed for several weeks or months until we felt that trunk was stable enough to call a “release candidate.” Then we would create a new stable branch from the trunk and re-open the main-line trunk for features. However, while trunk was frozen, there was no feature development happening anywhere in the Bugzilla Project.

Graph analysis showed very clearly that every time we would freeze, the community would shrink drastically and it would take several months after we un-froze for the size of the community to recover. It happened uniformly, every single time we would freeze, over many years and many releases.

Traditional wisdom in open-source is that people like to work on features and don’t like to fix bugs. I wouldn’t say that that’s exactly true, but I would say that if you only let people fix bugs, then most of them won’t stay around.

We addressed this issue by never freezing the trunk. Instead, we branch immediately at the point that we normally would have “frozen” the trunk. The trunk always stays open for new feature development. Yes, this means that for a while, our attention becomes split between the trunk and the latest branch. We’re committing the same bug fixes to the branch and the trunk. We are also doing feature development on the trunk simultaneously with those bug fixes. However, we’ve found that not only does the community expand more rapidly this way, but we also actually get our releases out more quickly than we used to. So it’s a win-win situation.
Turnover is inevitable.
The survey found that the number one reason that contributors leave is that they no longer have time to contribute, or that they were contributing as part of their job and now they have changed jobs. Essentially, it is inevitable that most contributors eventually leave.

So if community members are definitely going to be leaving, the only way to consistently expand the community is to figure out how to retain new contributors. If you don’t get new members to stick around, then the community will continuously shrink as old contributors leave, no matter what else you do.

So while retaining existing contributors is important–after all, you want people to stick around and contribute for as long as reasonably possible–what matters the most is retaining new contributors. How do you do that? Well, that’s a lot of what the rest of these points are about.
Respond to contributions immediately.
The Bugzilla Project has a system of code reviews that requires that all new contributions be reviewed by an experienced developer before they can become part of Bugzilla. There have been various complaints about the system over the years, but analyzing the survey data showed that people leave the project because getting a review takes too long, not because the reviews are too hard. In fact, the reviews can be as hard as you want as long as they happen almost instantly after somebody submits a contribution.

People don’t (usually) mind having to revise a contribution. They even generally don’t mind revising it several times. But they do mind if they post a patch, don’t get a review for three months, and then they have to revise it, only to wait another three months to be told that they have to revise it again. It’s the delay that matters, not the level of quality control.

There are other ways of responding rapidly to contributions, too. For example, immediately thanking somebody for posting a patch can go a long way toward retaining new contributors and “converting” them into long-term developers.
Be extremely kind and visibly appreciative.
For nearly every person who responded to our survey, the factors involved in not staying–beyond “my job changed” or “I didn’t have time”–were surprisingly personal. I know that we all work with computers, and perhaps we’d like to think that engineering should be a totally cold scientific profession where we all do our jobs correctly according to the requirements of the machine, and not worry about our emotional or personal involvements. However, nothing could be further from the truth–the personal interactions that people have with community members, the amount they feel appreciated, and the amount they feel assaulted, are actually the most important aspects of retaining community members.

When people contribute on a volunteer basis, they aren’t getting paid in money, they are getting paid in admiration, appreciation, the sense of a job well done, and the knowledge that they are helping create a product that affects millions of people. When somebody has contributed a patch, you need to thank them. It doesn’t matter if the patch is total crap and has to be re-written entirely, you need to thank them. They have put some work into this, and if you don’t appreciate that, they will leave before they even start. After all, most people get little enough appreciation at their workplace–they stay there because they get paid in money! They don’t need to work for free with some other organization if it also doesn’t appreciate their work, or even worse, assaults every aspect of their contribution before even thanking them for it.

Of course, you still need to correct people on the faults in their contributions. “Kindness” does not include putting bad code into your system. That isn’t kind to anybody, including the contributor whose skills probably need to improve, and who may go on believing that something they did in error was in fact correct. You have to still be careful reviewers and very good coders.

What this does mean is that in addition to telling people what’s wrong with their contribution, it’s important to appreciate what’s right about their contribution, even if it’s simply the fact that they took the time to contribute. And you have to actually tell the contributor that you appreciate the contribution. The more frequently and genuinely that you do this, the more likely you are to retain the contributor.
Encourage a total absence of personal negativity.
One thing that drives people away from a project with lightning speed is when they get personally attacked for attempting to do something positive. A “personal attack” can be as little as an unpleasant joke about their code, instead of just a straightforward technical description of what is wrong. Saying something like, “What is wrong with you?” instead of actually leaving some helpful comment. Disguising personal criticism as “an attempt to help them code better” or “help them get along with others.” No matter how well-justified these actions may seem to be, they are all personal attacks that are extremely dangerous to your community.

Now truthfully, coding and working on a collaborative project with people who have different viewpoints can get really frustrating sometimes, and I’ve been an offender in this area just as much as anybody has been. But we all have to learn that it’s not okay to insult other developers as people just because we’re personally frustrated with them.

The solution isn’t just to say “everybody, now bottle up your frustrations until you explode,” though. There are lots of practical solutions. One of the best is to set up some specific system for handling problematic contributors. If there’s some contributor that Bob just can’t live with, there needs to be somebody in the community who Bob can go to for help working things out. We’ll call this go-to person the “community moderator.” So Bob tells the moderator about the problem, and maybe the moderator sees that the other contributor really was being a terrible person or bad coder, and so this “community moderator” gently corrects that contributor. But it’s also possible that there was some communication problem between Bob and the other contributor that the moderator just needs to help resolve.

This “moderator” system isn’t the only way to deal with the problem. You can resolve the problem in numerous ways–the most important thing is that you do resolve it. Without some channel or method for dealing with personal frustrations, individual contributors will take these frustrations out on each other. You will in fact foster an environment where it’s okay for one contributor to personally attack another contributor, because that’s the only avenue they have to resolve their problems, and nobody’s stopping them.

Basically, those last two points can be summed up as: be really, abnormally, really, really kind, and don’t be mean.

We’ve been applying all of these principles in the Bugzilla Project for the past several months, and we saw an increase in the number of retained contributors almost immediately after we started applying them. I’m finally starting to feel like the community is growing again, after shrinking almost continuously since 2005 due to violations of all of the above points.
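Since the finding about review delay is the easiest of these to measure, here is a small Python sketch of how you could keep an eye on it in your own project. The function name and the patch data are hypothetical, made up for illustration. You would need to pull the real submission and review dates out of your own bug tracker or review tool.

```python
from datetime import datetime
from statistics import median


def review_delays_in_days(patches):
    """Return the wait, in whole days, for every patch that got a review.

    `patches` is an iterable of (submitted, first_reviewed) datetimes, where
    first_reviewed is None for patches that are still waiting.
    """
    return [(reviewed - submitted).days
            for submitted, reviewed in patches
            if reviewed is not None]


# Hypothetical data: one fast review, one very slow one, one still waiting.
sample_patches = [
    (datetime(2010, 1, 4), datetime(2010, 1, 5)),
    (datetime(2010, 1, 10), datetime(2010, 4, 12)),
    (datetime(2010, 2, 1), None),
]

delays = review_delays_in_days(sample_patches)
still_waiting = sum(1 for _, reviewed in sample_patches if reviewed is None)

print("Median days to first review:", median(delays))
print("Longest wait in days:", max(delays))
print("Patches still waiting:", still_waiting)
```

If the longest wait or the number of patches still waiting starts climbing, that is a warning sign worth acting on before the survey tells you the same thing.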

Removing the Barriers

The next step is to remove the barriers to entry. What prevents people from getting started on the project?

Usually, the biggest barrier is a lack of documentation and direction. When people already want to contribute, their next step is figuring out how to contribute. They will go to your project’s website and look around. They will wonder, “Who do I talk to about this? How do I start contributing? What do you guys want me to work on?”

For the Bugzilla Project, we solved this problem in several ways:

A list of easy starting projects.
Whenever we see a bug or feature request that looks like it would be easy for a newcomer to solve, we tag it as a “good intro bug” in our bug tracker. This gives us a list of good introductory projects that anybody can come and look at without having to ask us “where do I get started?”
Create and document communication channels.
People will almost immediately want to talk to somebody else about the project. You should have email lists and also some method of instantaneous communication like an IRC channel. For example, we have an email list for Bugzilla developers and also an IRC channel where almost all our contributors hang out. In fact, we don’t just have a normal IRC channel–we also have a web page that people can use to chat in that IRC channel. That way, people don’t have to install an IRC client just to come talk to us. Setting up that web page enormously increased the number of new people coming into the channel and communicating with us. (And the increase was entirely positive–I can’t think of a single person who used the web gateway to cause us trouble.)

Then once you have these channels, they need to be documented! People have to know how to get into them, they need to know that they exist. We have a wiki page that explains how to talk to us if you want to contribute. (Note that this is separate from our support page that describes how to get support for the project.)

Also, as a final but perhaps obvious point, the existing community has to use the communication channels. If the main contributors do all their work in an office, just talk to the people next to them, and don’t use the mailing lists or IRC channels, then the community members aren’t going to want to use those communication systems either. After all, the new contributors aren’t there to talk to each other–they’re there to talk to you!
Excellent, complete, and simple documentation describing exactly how a contribution should be done.
Fully document every step of your development process, and put that documentation onto a public web site. Don’t invent a new process, just document what the existing process actually is. How do people get the code? How can they submit patches or other contributions to you? How do those contributions become an official part of the system?

We have a very simple page that describes the basic steps of our whole process, and links to documents that describe each step in more detail. It also specifically encourages people to get into communication with us, so that we know that they are there and want to help.
Make all this documentation easy to find.
This is a simple final step, but sometimes projects forget it! You can have all the wonderful developer documentation in the world, but if new contributors can’t find it super-easily, then you’re not actually removing any barriers to entry! We have a big “Contribute!” button on our website that describes all the different ways that people can contribute (not just code!) and links to more information about each of those.

We saw a definite upswing in the number and quality of contributions once we completed all these steps. Also, having everything documented and clearly stated on a public website meant that we no longer had to personally explain it all, every time, to every new contributor.

Direction and documentation aren’t the only things you can do though. Ask yourself, “What is stopping people from contributing?” and remove all the barriers there that you reasonably can.

Getting People Interested

How do you make people think, “Gee, I want to contribute to this project?” That’s the first step they have to take before they can become contributors. Well, traditional wisdom states that people contribute to open-source projects because:

They like helping.
They enjoy being part of a community.
They want to give back.
They think that something is wrong and they need/want to fix it.

So you may want to make it apparent that help is needed, that an enjoyable community is there, that giving back is appropriate and appreciated, and that there are problems that need solving.

Now, to be fair, this is an area that I don’t have fully mapped out or figured out for the Bugzilla Project, yet. So I don’t have a lot of personal experience to draw on. But if we analyze other projects, we can see that some good ways of getting contributors are:

Be a super-popular product.
This may seem obvious, but it is indeed the primary way of getting new contributors. If a zillion people use your product, it’s statistically likely that many of them will want to contribute. The Linux Kernel and WordPress are good examples of this–they have millions of users, so there’s just bound to be a lot of contributors, provided that the “barriers to entry” and the “retaining contributors” aspects of the project have also been handled.

One way to become a super-popular product–even if you’re just starting out–is to be heavily needed. The Linux Kernel was very much needed when it was first written, which is probably one of the reasons that it became popular as quickly as it did. It desperately needed to exist and didn’t exist yet.
Be written in a popular programming language.
Generally, people are more likely to contribute to a project if it’s written in a language that they already know. WordPress has a huge contributor community, and it’s in PHP. Say what you will about PHP, it is extremely popular. There’s a large number of people who already know the language, which increases the likelihood that some of them will start supplying patches for your code.

This is not the only reason you should choose a particular programming language, but it’s certainly a major motivator if you’re going to have an open-source project. I may think that Eiffel is a remarkable language, but if I wrote an open-source project in it, I would have a very hard time getting contributors.

Beyond those points, there are lots of clever ways of getting people interested in contributing to your projects, including speaking at conferences, publishing blogs, encouraging people on a one-to-one basis, and other methods that basically add up to “contact and encourage.”

I’d love to hear some of your ideas in this area, though. How do you get new people interested in contributing to your project? Has anything been particularly successful?

Summary

An open-source community is somewhat of a fluid thing–there are always going to be people coming and going for one reason or another. What’s important is that the rate of people entering and staying is greater than the rate of people leaving. All of these points help ensure that, and hopefully they also make our communities productive and enjoyable places to be for everybody, even ourselves!

-Max


Readability and Naming Things

2011-01-26T21:12:19Z

Many people think that the readability of code has to do with the letters and symbols used. They believe it is the adding, removing, or changing of those symbols that makes code more readable. In some sense, they’re right. However, the underlying principle is:

Readability of code depends primarily on how space is occupied by letters and symbols.

What does that mean? Well, it means two things:

Code should have the proper amount of white space around it. Not too much, not too little.

There should be the proper amount of space within a line of code to separate out the different parts. Separate actions should generally be on separate lines. Indentation should be used appropriately to group blocks of code.

With this principle, it’s actually the absence of code that makes things readable. This is a general principle of life–for example, if there was no space at all between letters and words in a book, it would be hard to read. On the other hand, it’s easy to see the moon against the clear night, because there’s a lot of clear black space that isn’t the moon. Similarly, when your code has the right amount of space in it, you can tell where and what the code is easily.

For example, this code is hard to read:

x=1+2;y=3+4;z=x+y;print"hello world";print"z is"+z;if(z>y+x){print"error";}

Whereas with the proper spacing in, around, and between the lines, it becomes easy to read:

x = 1 + 2;
y = 3 + 4;
z = x + y;
print "hello world";
print "z is" + z;
if (z > y + x) {
    print "error";
}

There can also be too much or wrong space, however. This code is also hard to read:

    x            =          1+        2;
y = 3            +4;


  z = x    +      y;
print    "hello world"         ;
 print "z is " + z;
if (z  >     y+x)
 {        print "error" ;
        }

Code itself should take up space in proportion to how much meaning it has.

Basically, tiny symbols that mean a lot make code hard to read. Very long names that don’t mean much also make code hard to read. The amount of meaning and the space taken up should be closely related to each other.

For example, this code is unreadable because the names are too small:

q = s(j, f, m);
p(q);

The space those names take up is very little compared to how much meaning they have. However, with appropriately-sized names, it becomes more apparent what that block of code is doing:

quarterly_total = sum(january, february, march);
print(quarterly_total);

On the other hand, if the names are too long compared to how much meaning they represent, then the code becomes hard to read again:

quarterly_total_for_company_x_in_2011_as_of_today = add_all_of_these_together_and_return_the_result(january_total_amount, february_total_amount, march_total_amount);
send_to_screen_and_dont_wait_for_user_to_respond(quarterly_total_for_company_x_in_2011_as_of_today);

This principle applies just as well to entire blocks of code as it does to individual names. We could replace the entire block of code above with a single function call:

print_quarterly_total();

And that is even more readable than any of the previous examples. Even though the name we used–print_quarterly_total–is a bit longer than our other names for things, that’s okay because it represents more meaning than other pieces of code do. In fact, it’s even more readable than our block of code was, by itself. Why is that? Because the code block took up a lot of space for, effectively, very little meaning, and the function takes up a more reasonable amount of space for the same meaning.
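As a rough illustration, here is how that single call might look in Python. This is only a sketch: the figures in MONTHLY_TOTALS are invented, and storing them in a module-level dictionary is an assumption made so that the example runs on its own.

```python
# Hypothetical monthly figures, invented purely for this example.
MONTHLY_TOTALS = {"january": 120, "february": 95, "march": 140}


def quarterly_total(monthly_totals):
    """Add up the first three months of the year."""
    return sum(monthly_totals[month] for month in ("january", "february", "march"))


def print_quarterly_total():
    """Print the first-quarter total, hiding the details of how it is computed."""
    print(quarterly_total(MONTHLY_TOTALS))


# The caller reads one short, meaningful line instead of the whole block.
print_quarterly_total()
```

The long explanation now lives in one place (the function body and its docstring), and every caller gets a name whose length roughly matches how much it means.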

If a block of code takes up a lot of space but doesn’t actually have much meaning, then it’s a good candidate for refactoring. For example, here’s a block of code that handles some user input:

x_pressed = false;
y_pressed = false;
if (input == "x") {
    print "You pressed x!";
    x_pressed = true;
}
else if (input == "y") {
    if (not y_pressed) {
        print "You pressed y for the first time!";
        y_pressed = true;
        if (x_pressed) {
            print "You pressed x and then y!";
        }
    }
}

If that were our whole program, that would probably be readable enough. However, if this is within a lot of other code, we could make it more readable like this:

x_pressed = false;
y_pressed = false;
if (input == "x") {
    handle_x(x_pressed);
}
else if (input == "y") {
    handle_y(x_pressed, y_pressed);
}

And we could make it even more readable by reducing it to this:

handle_input(input);

Reading “handle_input” in the middle of our code is much easier than trying to read that whole first block, above, because “handle_input” is taking up the right amount of space, and the block is taking up too much space. Note, however, if we’d done something like h(input) instead, that would be confusing and unreadable because “h” is too short to properly tell us what the code is doing. Also, handle_this_input_and_figure_out_if_it_is_x_or_y_and_then_do_the_right_thing(input) would not only be annoying for a programmer to type, but would also make for unreadable code.
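Here is one way the refactored version could look as runnable Python. Treat it as a sketch under assumptions: the pseudocode above does not say how handle_x and handle_y would update the flags, so this version keeps them on a small object (the KeyTracker class is a made-up name for this example).

```python
class KeyTracker:
    """Remembers which keys have been pressed so far (hypothetical helper)."""

    def __init__(self):
        self.x_pressed = False
        self.y_pressed = False

    def handle_x(self):
        print("You pressed x!")
        self.x_pressed = True

    def handle_y(self):
        # Only the first press of y is interesting.
        if self.y_pressed:
            return
        print("You pressed y for the first time!")
        self.y_pressed = True
        if self.x_pressed:
            print("You pressed x and then y!")

    def handle_input(self, key):
        """Dispatch a single key press to the matching handler."""
        if key == "x":
            self.handle_x()
        elif key == "y":
            self.handle_y()


tracker = KeyTracker()
for key in ("x", "y", "y"):
    tracker.handle_input(key)
```

The calling code only ever sees tracker.handle_input(key), which takes up about as much space as the amount of meaning it carries at that point in the program.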

Naming Things

It was once said by a famous programmer that naming things was one of the hardest problems in computer science. These principles of readability give us some good clues on how to name things, though. Basically, the name of a variable, function, etc. should be long enough to fully communicate what it is or does, without being so long that it becomes hard to read.

It’s also important to think about how the function or variable is going to be used. Once we start putting it into lines of code, will it make those lines of code too long for how much meaning they actually have? For example, if you have a function that is only called once, on one line all by itself, with no other code in that line, then it can have a fairly long name. However, a function that you’re going to use frequently in complex expressions should probably have a name that is short (though still long enough to fully communicate what it does).

-Max


The Power Of No

2011-01-23T09:05:49Z

How many times have you used a piece of software that was full of incredibly convoluted features, strange decisions, and unusable interfaces? Have you ever wanted to physically or verbally abuse a computer because it just wouldn’t do things right, or you couldn’t figure out how to make it function properly? And how often have you thought, “How could any programmer think this was a sane idea?”

Well if you’ve ever experienced any of those things, your next thought might have been something like “**** this computer” or “**** the silly programmer who made it behave this way”. After all, aren’t programmers and hardware designers to blame for the crazy behavior of the system? Well, yes, to some extent they are. But after being intimately involved in software design for many years, I now have another reaction to poorly-implemented features. Instead of becoming angry with the programmer who implemented the system, I ask myself, “Who was the software designer who authorized this feature?” Who stood by silently and let this feature happen when they had the power to stop it?

Granted, sometimes there is no software designer at all, in which case you’re practically guaranteed to have a broken system. But when there is a software designer, they are ultimately responsible for how the system is put together. Now, quite a bit of this job involves designing the structure of features before they go into the system. But there’s also another part of the job of a software designer–preventing bad ideas from being implemented. In fact, if there’s any lesson I’ve learned from my years in the software industry, it’s this:

The most important word in a software designer’s vocabulary is “no”.

The problem is that if you give a group of humans total freedom to implement any random idea that comes into their mind, then nearly every time they will implement bad ideas. This isn’t a criticism of developers, it’s more of a fact of life. I have great faith in the intelligence and capability of individual developers. I admire developers’ struggles and achievements in software development. It’s just an unfortunate fact of existence that without some central guidance, people in a group tend to evolve complex systems that don’t help their users as well as they could.

An individual designer, however, is usually capable of creating a consistent and enjoyable experience for the users and developers both. But if that individual designer never steps up and says “no” when another developer starts to do something the wrong way, then the system will collapse on itself and become a chaotic mess of bad ideas. So it is very important to have a software designer who has the power to say “no”, and then it’s important for that designer to actually use that power whenever it is appropriate.

It is truly amazing how much you can improve your product just by saying “no” to any idea that really deserves a “no”.

Recognizing Bad Ideas

Before you can apply this principle, there is one thing that you have to know: how to recognize bad ideas. Thankfully, there are a lot of software design principles that help clue you in on what is a bad idea, and lead you to saying “no” when it’s truly needed. For example:

If the implementation of the feature violates the laws of software design (for example, it’s too complex, it can’t be maintained, it won’t be easily changeable, etc.) then that implementation is a bad idea.
If the feature doesn’t help the users, it’s a bad idea.
If the proposal is obviously stupid, it’s a bad idea.
If some change doesn’t fix a proven problem, it’s a bad idea.
If you aren’t certain that it’s a good idea, it’s a bad idea.

Also, one tends to learn over time what is and isn’t a good idea, particularly if you use the above as guidelines and understand the laws of software design.

Having No Better Idea

Now, sometimes a designer can recognize a bad idea, but they still implement it because they can’t think of a better idea right now. This is a mistake. If you can think up only one solution to a problem but it is obviously stupid, then you still need to say no to it.

At first this may seem counter-intuitive–don’t problems need to be solved? Shouldn’t we solve this problem in any way we can?

Well, here’s the problem: if you implement a “bad idea”, your “solution” will rapidly become a worse disaster than the original problem ever was. When you implement something terrible, it “works”, but the users complain, the other programmers all sigh, the system is broken, and the popularity of your software starts to decrease. Eventually, the “solution” becomes such a problem that it requires other bad “solutions” to “fix” it. These “fixes” then become enormous problems in themselves. Continue down this path, and eventually you end up with a system that is bloated, confused, and difficult to maintain, just like many existing software systems today.

If you often find yourself in a situation where you feel forced to accept bad ideas, it’s likely that you’re actually near the end of this chain of events–that is, you’re actually building on a series of pre-existing bad ideas from the system’s past. In that case, the solution is not to keep “patching” over the bad ideas, but to instead find the most fundamental, underlying bad ideas of the system and redesign them to be good, over time.

Now ideally, when you reject a bad idea, you should provide an alternate, good idea in its place–that way you’re being constructive and moving the project forward, instead of being viewed as a roadblock on the path of development. But even if you can’t come up with a better idea right now, it’s still important to say no to bad ideas. A good idea will come eventually. Maybe it will take some study, or perhaps it will suddenly come to you while you’re standing in the shower one day. I have no idea where the idea will come from or what it will be. But don’t worry too much about it. Just trust that there is always some good way to solve every problem, and keep looking for it until you find it. Don’t give up and accept bad ideas.

Clarifications: Acceptance and Politeness

So it’s important to say “no”, but there are a few clarifications required on what I really mean, there. I’m not saying that every suggestion is wrong. In fact, developers are usually very bright people, and sometimes they really do nail it. Many developers make perfect suggestions and do excellent implementations. And even the worst solutions can have good parts, despite not being excellent as a whole. So many times, instead of actually saying “no”, what you’ll be saying is something more like, “Wow, there’s a part of this idea that is really good, but the rest of it is not so great. We should take the best parts of this idea and build them up into something awesome by doing more work on them.” You do have to say no to the parts of an idea that are bad, though. Just because one part of the idea is good doesn’t mean that the whole idea is good. Take what’s intelligent about the idea, refine it, and build good ideas around it until the solution you’ve designed really is great.

Also, it is still critically important that you communicate well with the rest of your team–having the responsibility of saying “no” doesn’t give you the right to be rude or inconsiderate. If you continually say “no” without any hint of kindness, you are going to fracture your team, cause upsets, and ultimately end up wasting hours of your time in argument with the people you’ve upset. So when you have to say “no”, it’s best to find a polite way to communicate it–a way that expresses appreciation for the input, offers positive suggestions for how to improve things, and lets the person down easily. I understand how frustrating it can be to have to slow down and explain things–and even more frustrating to repeat the explanation over and over to somebody who doesn’t get it the first time–but if that’s what it takes to have an effective development team while still saying “no” to bad features, then that’s what you have to do.

-Max


Before You Begin….

2011-01-17T10:11:16Z

One of the major goals that I have with researching software design is the hope that we can take people who are “bad programmers” or mediocre programmers and, with some simple education and only a little experience, bring them into being good programmers or great programmers. I want to know–what are the fundamental things you have to teach somebody to make them into a great programmer? What if somebody’s been programming for years and hasn’t gotten any better–how can you help them? What are they missing? So I’ve written quite a bit about that, particularly in some of my recent articles.

However, before somebody can even start on the path of becoming a better software developer, one thing has to be true:

In order to become an excellent programmer, you must first want to become an excellent programmer. No amount of training will turn somebody who does not want to be excellent into an excellent programmer.

If you are a person who is passionate about software development–or even just somebody who likes being good at their job–it may be hard to understand the viewpoint of somebody who simply doesn’t want to get any better. To fully grasp it, it can be helpful to imagine yourself trying to learn about some area that you personally have no desire to be great in.

For example, although I admire athletes, enjoy playing soccer, and sometimes enjoy watching sports in general, I’ve never had a desire to be a great athlete. There’s no amount of reading or education that will ever turn me into a great athlete, because I simply don’t want to be one. I wouldn’t even read the books in the first place. If you forced me to take some classes or go to some seminar, it would leave my mind as soon as I took it in, because I would simply have no desire to know the data. Even if I was playing sports every day for a living, I’d think, “Ah well, I don’t have any passion for athletics, so this information simply isn’t important to me. Someday I will be doing some other job, or some day I will retire and not have to care, and until then I’m just going to do this because they pay me and it’s better than starving.”

As hard as this can be to imagine, that is what happens in the minds of many “bad” programmers when you tell them how or why they should write better code. If they don’t sincerely want to be the best programmers that they can be, then no matter how much education you give them, how many times you correct them, or how many seminars they go to, they will not get better.

So what do you do? To be fair, I may not be the best person to ask–if I’m going to do something, I feel that I should do my best to excel in it. Perhaps the best thing you can do is encourage people to follow that concept. You could say to them something like: “If you’re going to be doing it anyway, why not do it well? Wouldn’t it at least be more enjoyable to be doing this if you were more skilled at it? What if some other people were impressed with your work, how would that feel? Would it be nice to go home at the end of the day and feel that you had done something well? Would your life be better than it is now, even if only a little? Would your life get worse?”

At the very worst, you could ask somebody to list off all of the consequences of “being a great programmer” until they felt some relief on the subject or started to see the idea differently. You could ask them something like, “What would happen if you were a great programmer?” and keep asking for more answers to that question until they felt better about it or started really seeing how good it could be to be excellent. You don’t have to respond to their answers with any positive or negative comments, just listen and politely acknowledge the things they’re saying. The idea is to give them the chance to really examine the possibility for themselves, and maybe come to some new and interesting conclusions, by themselves–not by you telling them what to think or disagreeing with their answers, but just by communicating to you what would really happen if they became great.

However you do it, the bottom line is that people must be interested in improving themselves before they can get better. How you bring them up to that level of interest doesn’t really matter, as long as they get there before you waste a lot of time giving them an education that they’re just going to throw away as soon as they hear it.

-Max


Software Design, In Two Sentences

2011-08-15T20:01:02Z

In the context of The Equation of Software Design, it is now possible to reduce the primary principles of software design into just two statements:

It is more important to reduce the Effort of Maintenance than it is to reduce the Effort of Implementation.
The Effort of Maintenance is proportional to the complexity of the system.

And that is pretty much it. If all you knew about software design were those two principles and the purpose of software, you could evolve every other general principle of software development.

-Max


The Equation of Software Design

2010-01-08T17:24:38Z

So today I was playing around with a little equation that may in fact explain nearly all of the principles of software design. I don’t know that it’s actually mathematically solvable in terms of numbers, but it does demonstrate the factors present in software development decisions and how they relate to each other. Before I go into the equation, though, I have to define the factors that are present when a designer is deciding whether or not to implement something, or how to implement it:

Potential Value of Implementation (V_i for short): How “valuable” could it be to implement this? For example, if we add something to the program that could directly prevent somebody from dying, that’s very valuable. If it simply might prevent a future typo in a single error message, that’s hardly valuable at all.
The “value” can be for the end user or for other programmers. When we’re talking about design decisions that affect only code, and not the end user, flexibility is one of the major values–how important could this flexibility be?

The potential value is separate from how likely it is that the situation will occur where you will need it. That’s the next issue.
Probability of Value (P_v for short): What is the chance that this value will be, in fact, realized by an end user (for a feature or functional change) or by another developer (for some design decision)? If we’re adding in code flexibility to allow for potential contact with extraterrestrial apes, that’s not a very probable occurrence. If we’re adding in a feature that is immediately useful to every single user, that’s 100% probability. (The number of users that a feature will be useful to is also part of the Probability of Value.)
Effort of Implementation (E_i for short): How hard will it be to implement this? This is a one-time cost–the immediate difficulty of performing the work required to create this thing the very first time. This would probably be measured in person-hours.
Effort of Maintenance (E_m for short): How much effort will it require to maintain this in the future? (This includes any effort added to maintaining the entire program by implementing this.) Will this complicate maintenance for the whole system? This is an amount that increases over time. Similar to Effort of Implementation, this would be most likely measured in person-hours.

What we’re trying to determine is the Desirability of Implementation (D for short). This answers the questions “Is this something we should do or not?” and “What should the priority of implementing this be?”

The simplest form of the equation is:

D = (P_v * V_i) / (E_i + E_m)

Or, in English:

The Desirability of Implementation is directly proportional to the Probability of Value and the Potential Value of Implementation, and inversely proportional to the total effort, consisting of the Effort of Implementation plus the Effort of Maintenance.
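To make the simple form concrete, here is a minimal Python sketch. The function name desirability and every number in it are invented for illustration. Nothing here says how to actually measure value, so the units are arbitrary and only the comparison between the two results means anything.

```python
def desirability(p_v, v_i, e_i, e_m):
    """Simple form of the equation: D = (P_v * V_i) / (E_i + E_m)."""
    total_effort = e_i + e_m
    if total_effort <= 0:
        raise ValueError("total effort must be positive")
    return (p_v * v_i) / total_effort


# Hypothetical feature A: useful to nearly everyone, cheap to build and keep.
feature_a = desirability(p_v=0.9, v_i=100, e_i=10, e_m=5)

# Hypothetical feature B: same potential value, rarely needed, costly to keep.
feature_b = desirability(p_v=0.05, v_i=100, e_i=10, e_m=40)

print(f"feature A: {feature_a:.2f}")
print(f"feature B: {feature_b:.2f}")
```

With these made-up inputs, feature A comes out far more desirable than feature B even though both have the same Potential Value and the same Effort of Implementation. The difference comes entirely from the Probability of Value and the Effort of Maintenance.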

However, there is a critical factor missing from the simple form of this equation: time. What we actually want to know is the limit of this equation as time approaches infinity, and that gives us the true Desirability of Implementation. So let’s look at this from a logical standpoint:

The Effort of Implementation is a one-time cost, and never changes, so is mostly unaffected by time.

The Value of Implementation may increase or decrease over time, depending on the feature. It’s not predictable, and so we can assume for the sake of this equation that it is a static value that does not change with time (though if that’s not the case in your situation, keep this factor in mind as well). One could even consider that the Effort of Maintenance is actually “the effort required to maintain this exact level of Value,” so that the Value would indeed remain totally fixed over time.

The Probability of Value, being a probability, approaches 1 (100%) as time goes to infinity.

The Effort of Maintenance, being based on time, approaches infinity as time goes to infinity.

At first glance, that might sound as though design is hopeless, because maintenance becomes infinite effort–an amount that no Potential Value could surpass–and it seems like every possibility must be accounted for, because given infinite time the probability seems to indicate that every possibility will occur. Those are not true statements, though, because you have to think about the rate at which both of those items increase. If the fundamental effort of maintenance is very small, then even as time goes on, it will remain small. You could say that there is a “coefficient of maintenance” on any design decision or feature, and that that determines how rapidly maintenance effort will accumulate over time. As far as the Probability of Value goes, if it is a very tiny number, it may remain tiny until thousands or millions of years have passed–so if the Effort of Maintenance increases at a great rate, then it will easily outstrip the Probability of Value and the Desirability of Implementation will approach zero as time approaches infinity.
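One way to play with this idea is to pick simple growth rules for the two time-dependent factors and watch what happens. The sketch below is only that, a sketch, and both rules are my assumptions rather than anything the equation itself dictates: the Effort of Maintenance grows linearly (maintenance_per_year stands in for the “coefficient of maintenance”), and the Probability of Value is the chance that the value has been needed at least once, given a fixed chance each year. The Potential Value is held static, as discussed above.

```python
def desirability_over_time(years, v_i, e_i, yearly_chance, maintenance_per_year):
    """Evaluate D after a number of years under two assumed growth rules."""
    # Assumption: the chance that the value has been realized at least once.
    p_v = 1 - (1 - yearly_chance) ** years
    # Assumption: maintenance effort accumulates linearly with time.
    e_m = maintenance_per_year * years
    return (p_v * v_i) / (e_i + e_m)


# Two hypothetical designs with the same value and implementation effort,
# differing only in how fast maintenance effort piles up.
for years in (1, 5, 20):
    simple = desirability_over_time(years, v_i=100, e_i=10,
                                    yearly_chance=0.5, maintenance_per_year=1)
    complicated = desirability_over_time(years, v_i=100, e_i=10,
                                         yearly_chance=0.5, maintenance_per_year=50)
    print(f"after {years:2d} years: simple={simple:.2f} complicated={complicated:.2f}")
```

Under these assumptions the design with the small coefficient stays well ahead at every horizon, while the one with the large coefficient sinks toward zero almost immediately. Different growth rules would give different curves, so the numbers matter much less than the shape of the comparison.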

What this equation actually tells us is that the most important factors to balance, in terms of time, are probability of value vs. effort of maintenance. If the probability of value is high and the effort of maintenance is low, the desirability is then dependent only upon the Potential Value of Implementation vs. the Effort of Implementation–a decision that a product manager can easily make. If the probability of value is low and the effort of maintenance is high, the only justification for implementation would be a near-infinite Potential Value of Implementation.

This interestingly indicates why small improvements in programming languages and development frameworks result in such enormous changes in the resulting products–because tiny reductions in the Effort of Maintenance can make tremendous changes in the Desirability of Implementation. Features that otherwise would be thrown away by a product manager as impossible become part of the basic design plan. Polishing the UI becomes more desirable, because it requires less effort for both implementation and maintenance.

And finally, I think that this communicates most exactly and truthfully why simplicity is so important–because simplicity is what determines the “coefficient of maintenance” that I talked about above. The effort involved in maintaining simple code increases very slowly over time–sometimes so slowly that you never will have to put any maintenance effort into it in your lifetime.

There’s a lot more that could be said about this equation. What are your thoughts on it? Anybody have any ideas of how Value of Implementation could be numerically calculated, or if it might break down into a set of other numerically-calculable factors? Anything you have to say about it, I’m interested.

-Max


Privacy, Simplified

2011-09-30T07:21:20Z

So, there’s a lot of talk on the Internet about privacy. Some people say that privacy is only desired by those who have something to hide. Some people insist that privacy is a human right that should never be violated without consent.

There’s only one problem with this whole debate: what is privacy, and why would anybody want it? This is rarely defined–most people just seem to assume that “everybody knows” what privacy is, so why would it have to be explained?

Well, I’m not a big fan of “everybody knows.” And in fact, it turns out that privacy actually means two different things, which many people use interchangeably without specifying what they’re actually talking about. So to help clear up some of the debate online, and to hopefully shed some light on how it can all be resolved, here are some clear definitions and discussions of what privacy is, and why people would want it.

Privacy of Space

The first type of privacy is “privacy of space”. This is the ability to control who does and does not enter a particular physical space, probably because you’re in the space and you don’t want certain others in that space. “Enter the space” in that definition includes any method of being able to perceive the space–so, for example, if somebody stands outside the door with their ear pressed to it, they’re violating your privacy. If somebody installs a camera in your room without your consent, they’re violating your privacy.

This form of privacy is not metaphorical. It does not apply to anything other than physical space. It literally means, “I do or do not want you to be perceiving this physical location, and I have the choice and ability to control that.”

The most common reason that we want this form of privacy is that we want to protect somebody or something from harm, most commonly ourselves. This harm can be minor (we don’t want to be annoyed by people walking through our house all the time), it can be purely social (we close the door when we go to the bathroom because we know others don’t want to perceive us going to the bathroom, and we may also not want to be perceived in such a state), or it can be extreme (a man with a mask and a chainsaw should not be in my closet).

One interesting thing about this form of privacy is that we don’t usually consider animals, plants, or material objects to be capable of violating it, even if they enter a space without our permission. It might be annoying if the cat comes in the room when you don’t want it to, but you’re not going to complain that the cat is “violating your privacy”, right?

So, when it comes to computer programs, this is not the form of privacy we’re talking about, since we don’t consider that a computer program being in the same room with us is a violation of our privacy of space. My word processor is not violating my physical privacy of space, even though it’s “in the room” with me, because it does not, itself, perceive. The only exception would be a computer program that was transmitting perceptions (sound or sight) to some location that we didn’t want to send them to–that would be a privacy violation, because someone could perceive our space through it when we didn’t want them to. When it comes to that sort of privacy, violations are pretty cut-and-dry. If a computer program sends perceptions of my space anywhere without my permission, it is absolutely violating my privacy, it’s not useful to me, and it should stop immediately.

But on the Internet, that’s not usually the type of privacy we’re talking about.

Privacy of Information

The second type of privacy is “privacy of information.” This is the ability to control who knows certain things. When we talk about computer programs and the Internet, this is the most common type of privacy we’re talking about.

So why would somebody want privacy of information? Is it just because they’re doing something that they want to hide from others? Is it just for committing crimes or for hiding harmful acts? Well, sometimes it is, yes. There are many people who use the concept of “privacy” to protect themselves from the law or the moral rejection of others. It is probably because of these individuals that the concept of privacy is a muddy subject–as long as it’s unclear quite what “privacy” is, it’s much easier for those who have committed harmful acts to invoke “privacy” as a defense.

But is that the only reason that somebody would want privacy of information? What about a normal person, who isn’t doing anything harmful–would they ever want to keep certain information private?

Well, there is absolutely a rational reason that people would want privacy of information, and interestingly, it’s the same reason that people want privacy of space:

An individual or group desires privacy of information because they believe that other people knowing that information could or would be more harmful than them not knowing it.

Here’s a very straightforward example: I consider that a criminal knowing my credit card number would be harmful–far more harmful than them not knowing it.

In certain countries, the fact that I read a certain website or talked to certain people on the Internet could get me killed or put in jail. So, in that situation, other people knowing my browser history could be very harmful, no question about it.

Of course, if one kept everything private, one could not live. If you pay for a piece of candy with a quarter, the person receiving that quarter now knows that you had a quarter. They may know that you kept it in a wallet, or that you pulled it out of your pants. They probably know what you look like, if you’re not wearing a mask. They most likely also know that you have five fingers, and that you were in their store at a certain time. In short, no matter what you do, in order to live, you must exchange information with other people. The more things you do, the more information you will have to exchange.

In fact, usually, the more information that others know about you, the more helpful they can be. The bank knows all the transactions that I made, so they can help me by creating an online system that shows me my transactions and lets me search them. That information can be seen by bank employees, but I don’t consider that to be potentially harmful enough to outweigh the obvious benefits of the bank having it.

The web browsers that I use know my passwords to certain sites, so they can help me by putting those passwords into the box, saving me some typing. Potentially, somebody could steal that information from my computer, but the chance of that happening is small enough, and the benefit is significant enough, so I consider it acceptable to save my passwords in the browser.

The examples like this go on and on–the appropriate use of information is extremely beneficial. The inappropriate use is what’s harmful.

So who decides what’s an appropriate use and what’s an inappropriate use? What information should be sent and stored, and what information should be kept private? Well, these are the fundamental questions being asked when people debate privacy issues–who gets to choose whether my knowledge becomes somebody else’s knowledge? Should I be asked before my information is sent, or should I just be given the option to opt-out and delete the information? Is there some information that should never be sent? What information is more important to keep private than other information?

Though this is all far less cut-and-dry than “privacy of space” issues, these questions can generally be answered by the “help vs harm” equation. The basic sort of questions one might want to ask would be:

Will sending and storing this information harm any users, immediately or potentially? (Remember, “potentially” is pretty broad–what happens if somebody with bad intentions steals that information from you? What happens if somebody buys your company and decides to use that information in a way that you think is bad?)
Would it help your users more than harm them to take this information?
Taking all the above into account, should sending this information be optional? (This is largely determined by how broadly it could be harmful to collect the information.)
If sending the information is optional, should it be opt-out or opt-in? (That is, should it automatically be on, and people have to turn it off if they don’t want to send the info, or should it be off and people have to choose to turn it on?)
If it’s opt-in, will the feature still be helpful to enough of your users to justify implementing it?
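For the programmers reading, the questions above can be sketched as a small decision function. This is only one possible reading of them: the `Collection` type, the 0-to-1 scores, and every threshold below are hypothetical, and in real life the estimates would come from discussion and data rather than from a few constants.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    """Hypothetical description of one piece of information a program might send."""
    name: str
    benefit: float       # estimated help to users, 0.0 to 1.0
    harm: float          # estimated realistic harm, immediate or potential, 0.0 to 1.0
    harm_breadth: float  # share of users who could realistically be harmed, 0.0 to 1.0
    opt_in_rate: float   # share of users expected to turn it on if it is opt-in


# Illustrative thresholds only; they are assumptions, not values from this post.
OPTIONAL_BREADTH = 0.01   # above this, sending should be optional
OPT_IN_HARM = 0.3         # above this, sending should be off by default
MIN_USEFUL_REACH = 0.1    # below this, an opt-in feature helps too few users


def decide(c: Collection) -> str:
    # Questions 1 and 2: does it help users more than it harms them?
    if c.benefit <= c.harm:
        return "do not collect"
    # Question 3: optional or not, based on how broadly it could be harmful.
    if c.harm_breadth < OPTIONAL_BREADTH:
        return "collect by default"
    # Question 4: opt-out for mild potential harm, opt-in for serious harm.
    if c.harm < OPT_IN_HARM:
        return "optional, opt-out"
    # Question 5: is an opt-in feature still helpful to enough users?
    if c.opt_in_rate < MIN_USEFUL_REACH:
        return "do not implement"
    return "optional, opt-in"


if __name__ == "__main__":
    examples = [
        Collection("IP address sent to a web page", 0.9, 0.05, 0.001, 1.0),
        Collection("anonymous usage statistics", 0.5, 0.1, 0.2, 0.5),
        Collection("saved passwords", 0.6, 0.4, 0.2, 0.5),
    ]
    for c in examples:
        print(f"{c.name}: {decide(c)}")
```

The point of writing it this way is that the order of the checks mirrors the order of the questions: help vs. harm first, then whether sending is optional, then opt-out vs. opt-in, and only then whether an opt-in feature is still worth building.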

There are some people who will claim that no information should ever be sent or stored about the user, that all privacy options should always be opt-in, and that all information is so potentially harmful that no debate about this can be accepted. That is, frankly, a ridiculous proposition. It’s so obviously untrue that there’s almost no way to argue with it, because it’s such a shocking irrationality. Just as it’s true that liquids could somehow harm somebody (which is why you can’t bring liquids on an airplane in the USA), it’s true that there are situations in which almost any piece of information could be dangerous. That doesn’t mean that all information is dangerous, though.

My martial artist friends have frequently joked that they shouldn’t be allowed to bring any object on an airplane, because they could kill somebody with any of them. Similarly, given almost any piece of information, somebody could do something harmful with it, somewhere, at some point. If I know you have a quarter in your pocket, I’m sure there’s some situation in which I could use that information to get you in some serious trouble. But that doesn’t make that information realistically harmful, even potentially.

Even the idea of “every single piece of information should be opt-in” is ridiculous. Do you want the web browser to ask you, “May I send this page your IP address?” every time you load a web page? Well, if you’re a spy in a hostile country, maybe you do. But if you’re like most people, that would probably just annoy you–you’d stop using that web browser and switch to another one. And if you are a spy or a resistance fighter, then you probably know how to use Tor to avoid being tracked.

So when we’re talking about privacy, it’s not an issue of “in some incredibly unlikely situation, this information could be very harmful,” it’s an issue of balancing help vs. harm in real-world situations. Real-world situations can be pretty strange and unexpected, but they at least are real, and can be balanced and talked about. Doing so, you can make good decisions about how to protect your users’ privacy–how much information to take, how you inform them about the information you’re taking, and what you do with that information when you have it.

So no, this is not a casual issue or something whose dangerous implications we should brush off and ignore, but it’s also not an extreme unsolvable situation where we have to decide to keep everything private because we can’t make up our minds about it. Privacy is simply something that we should be able to analyze factually, based on real-world situations and data, and come to some practical and useful decision about.

-Max

The post Privacy, Simplified appeared first on Code Simplicity.

Why Programmers Suck

2009-12-02T01:03:30Z

A long time ago, I wrote an essay called “Why Computers Suck” (it was given the title “Computers” and “What’s Wrong With Computers” in two later revisions, and the original title never saw the light of day). The article was fairly long, but it basically came down to the idea that computers suck because programmers create crazy complicated stuff that nobody else can understand, and complexity builds on complexity until every aspect of a program becomes unmanageable.

What I didn’t know at the time was why programmers did this. It was obvious that they did do it, but why would the software development industry produce so many crazy, complex masses of unreadable code? Why did it keep happening, even when developers should have learned their lesson after their first bad experience? What was it that made programmers not just make bad code, but keep on making bad code?

Well, this was a mystery, but I didn’t worry too much about it at first. Just the revelation that “bad programs are caused entirely by bad programmers”, as simple and obvious as it may seem, was enough to fuel an entire investigation and study into the field of programming, one which had some pretty good results (that’s mostly what I’ve written about on this blog, and it’s also the subject of a book that’s in development). The problem had been defined (bad programmers who create complexity), it seemingly had a solution (describe laws of software design that would prevent this), and that was enough for me.

But it still baffled me that the world’s universities, technical schools, and training programs could turn out such terrible programmers, even with all of the decades of advancement in software development techniques. Sure, a lot of the principles of software design hadn’t been codified, but a lot of good advice was floating around, a lot of it very common. Even if people hadn’t gone to school, didn’t they read any of this advice?

Well, the truth was beyond my imagination, and it took almost five years of working on the Bugzilla Project with a vast number of separate contributors until one day I suddenly realized an appalling fact:

The vast majority (90% or more) of programmers have absolutely no idea what they are doing.

It’s not that they haven’t read about software design (though they likely haven’t). It’s not that the programming languages are too complex (though they are). It’s that the vast majority of programmers don’t have the first clue what they are really doing. They are just mimicking the mistakes of other programmers–copying code and typing more-or-less meaningless incantations at the machine in the hope that it will behave like they want, without any real understanding of the mechanics of the computer, the principles of software design, or the meanings of each individual word and symbol they are typing into the computer.

That is a bold, shocking, and offensive statement, but it has held up in my experience. I have personally reviewed and given feedback on the code of scores of programmers. I have read the code of many others. I have talked to many, many programmers about software development, and I’ve read the writings of hundreds of developers. The programmers who really understand what they are doing make up only about 10% of all the programmers I’ve ever talked to, worked with, or heard about.

In open source, we get the cream of the crop–people who want to program in their spare time. And even then, I’d say only about 20% of open source programmers have a really good handle on what they are doing.

So why is this? What’s the problem? How could there be so many people working in this field who have absolutely no clue what they’re doing?

Well, that sounds a bit like they’re somehow “stupid.” But what is stupidity? People are not stupid simply for not knowing something. There’s a lot of stuff that everybody doesn’t know. That doesn’t make them stupid. That may make them ignorant about certain things, but it doesn’t make them stupid. No, stupidity, real stupidity, is not knowing that you don’t know. Stupid people think they know something when they don’t, or they have no idea that there is something more to know.

This sort of stupidity is something that can be found in nearly every field, and software development is no exception. Many programmers simply don’t know that there could be laws or general guidelines for software development, and so they don’t even go looking for them. At many software companies, there’s no attempt to improve developers’ understanding of the programming language they’re using–perhaps simply because they think that the programmers must “already know it if they were hired to do it”.

Unfortunately, it’s particularly harmful to have this sort of mindset in software development, because there is so much to know if you really want to be good. Anybody who thinks they already know everything (or who has a “blind spot” where they can’t see that there’s more to learn) is having their ability to produce excellent code crippled by a lack of knowledge–knowledge they don’t even know exists and that they don’t even know they lack.

No matter how much you know, there is almost always more to know about any field, and computer programming is no exception. So it’s always wrong to think you know everything.

Sometimes it’s hard to figure out what one should be learning about, though. There’s so much data, where does one start? Well, to help you out, I’ve come up with a few questions you can ask yourself or others to help figure out what areas might need more study:

Do you know as much as possible about every single word and symbol on every page of code you’re writing?
Did you read and completely understand the documentation of every single function you’re using?
Do you have an excellent grasp of the fundamental principles of software development–such a good grasp that you could explain them flawlessly to novice programmers at your organization?
Do you understand how each component of the computer functions, and how they all work together?
Do you understand the history of computers, and where they’re going in the future, so that you can understand how your code will function on the computers that will be built in the future?
Do you know the history of programming languages, so that you can understand how the language you’re using evolved and why it works like it does?
Do you understand other programming languages, other methods of programming, and other types of computers than the one you’re using, so that you know what the actual best tool for each job is?

From top to bottom, those are the most important things for any programmer to know about the code they’re writing. If you can truthfully answer “yes” to all those questions, then you are an excellent programmer.

It may seem like an overwhelming list. “Wow, the documentation for every single function? Reading that is going to take too long!” Well, you know what else takes a long time? Becoming a good programmer if you don’t read the documentation. You know how long it takes? Forever, because it never happens.

You will never become a good programmer simply by copying other people’s code and praying that it works right for you. But even more importantly, investing time into learning is what it takes to become good. Taking the time now will make you a much faster programmer later. If you spend a lot of time reading up on stuff for the first three months that you’re learning a new technology, you’ll probably be 10 times faster with it for the next 10 years than if you’d just dived into it and then never read anything at all.

I do want to put a certain limiter on that, though–you can’t just read for three months and expect to become a good programmer. First of all, that’s just too boring–nobody wants to just study theory for three months and not get any actual practice in. Very few people would keep up with that for long enough to become programmers at all, let alone good programmers. So I want to point out that understanding comes also from practice, not just from study. But without the study, understanding may never come. So it’s important to balance both the study and the practice of programming.

This is not an attack on any programmer that I’ve worked with personally, or even an attack on any individual programmer at all. I admire almost every programmer I’ve ever known, as a person, and I expect I’d admire the rest were I to meet them, as well. Instead, this is an open invitation to all programmers to open your mind to the thought that there might always be more to know, that both knowledge and practice are the key to skill, and that it’s not shameful at all to not know something–as long as you know that you don’t know it, and take the time to learn it when necessary.

-Max

The post Why Programmers Suck appeared first on Code Simplicity.