Trade-Offs: How Much Internal Documentation?

As part of my series on trade-offs in software development, I want to discuss what might impact how much and what internal documentation a team should write.

As with any piece of this series, many things in here might not matter to you and your team, and something that’s a key factor for you might be missing. The world is complex and diverse, but this article will help you get thinking about the trade-offs that matter to you and your context.

The Documentation Process - Nicolas Régnier, c. 1622 - oil on canvas via classicprogrammerpaintings.com

What Are We Even Talking About?

To define the scope of this discussion, two terms need pinning down: what is “internal” and what is “documentation”?

Internal

“Internal” in this context means internal to the team developing a product, library or platform. It’s documentation that customers won’t read, but people on the team will. Your customers might be other developers or teams within the same company. In that case, documentation you write for them is out of scope here.

Documentation

This can be anything that makes it easier to understand and/or operate your code base and its intended behavior. Obvious types of documentation are:

  • API docs

  • Guides

  • Run books

  • Readmes

  • Architecture diagrams

  • Architectural Decision Records (ADRs)

  • RFCs

Other things are less easy to categorize. If you follow BDD (Behavior-Driven Development), you will put some effort towards making the intended user experience very clear. You will write a lot of text that targets humans rather than being compiled for machines.

Well-kept meeting notes can help understand past decisions that seem incomprehensible in hindsight.

I hope these last two examples make clear that documenting doesn’t have to be a distinct activity.

Even if you identify that there is little need for documentation right now, that might change in the future. This means you have to be disciplined about occasionally revisiting that decision and filling in missing documentation. You also have to do this before you forget important details.

What Impacts How Much And What To Document

Frequency and Likelihood of Change

If you are working on a new product or feature that doesn’t have product-market fit yet, there is a good chance you’ll have to change the code. I worked on a product where what we thought was the main page was complicated and got completely changed twice. Then we deleted it, and then we resurrected it because the sales team liked it. You can write fabulous documentation, but if what you document is soon gone or fundamentally changed, it was likely a waste.

If something hardly ever changes, on the other hand, you might want to document it. Otherwise, you run the risk of having to relearn the area every time you come back to it. I recently made a small change to a project I hadn’t touched in 6+ months and had forgotten how it got deployed. After almost an hour I found that I had set up a full pipeline on Google Cloud Build (I usually use GitHub Actions for this). I hadn’t documented it because “It’s all automated! Just push to the prod branch!”. One sentence in the readme could have saved me that time.

Team Attrition

If your team has an issue with frequent departures of team members, this of course is a reason to document more.

Cost of Change

Writing technical documentation upfront and using it as a conversation piece can increase the likelihood of making the right decision. Decisions that are hard to undo (aka architecture) are a good use case for writing architectural proposals/documentation upfront and discussing them. When I worked as part of the Apache Geode open-source community, we sometimes struggled with this. Someone in the community would invest a lot of time making a change. Once the change was ready for a pull request, a debate would start. Sometimes the consensus ended up being that the approach was flawed or the change undesirable. I introduced a lightweight RFC process that front-loaded these discussions and also resulted in decent internal documentation on why something was done a certain way. This can prevent major rework, and you get documentation of what you built and why for posterity. If the cost of change is high, you also know you are less likely to change it and have to rewrite the documentation.

Maintenance

Like everything else, documentation needs maintenance. Worse than no documentation is documentation that’s wrong in non-obvious ways. Writing and updating documentation takes time. You also need to remember to update documentation. This can inform how you choose to document. The further removed the documentation is from the work, the less likely you’ll remember to update it. For this reason I love living documentation. This can be tests that focus on readability or a styleguide that uses your project’s actual HTML and CSS.
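One lightweight form of living documentation, sketched here in Python as an illustration, is a docstring whose examples are executed by the standard-library `doctest` module. The `slugify` helper is hypothetical and only stands in for whatever code you want to document. If the behavior changes and nobody updates the examples, the doctest run fails, so the documentation cannot silently drift.

```python
import re


def slugify(title: str) -> str:
    """Turn a post title into a URL slug (hypothetical helper for illustration).

    The examples below are documentation and tests at the same time:

    >>> slugify("Trade-Offs: How Much Internal Documentation?")
    'trade-offs-how-much-internal-documentation'
    >>> slugify("  Hello,   World!  ")
    'hello-world'
    """
    # Collapse every run of characters that are not lowercase letters or
    # digits into a single hyphen, then trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
```

Assuming this lives in a file such as `slug.py` (a made-up name), running `python -m doctest slug.py` executes the examples and reports any that no longer match.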

Living documentation isn’t always an option. Find other ways to bring the documentation close to the workflow that causes the changes. For many things, excellent commit messages can be a decent solution for documentation. I was skeptical of this until I consulted for a startup that was adamant about good commit messages. I remember my first PR being rejected multiple times solely because the commit message lacked detail. I thought this was lunacy. However, when working on the code base, it was quite useful to be able to git annotate a confusing section of the code and usually get an explanation right there in the IDE. I like that this solution has a direct place in the workflow. A hybrid solution might be linking external architecture documents or similar in git messages.

Team Seniority

If your team has a lot of junior team members, or you expect to hire them soon, documenting more might be a good way to help them ramp up and prevent them getting stuck. You might benefit from documenting things you usually wouldn’t. This can be anything from documenting more of your process to how to run a server for local development to linking to useful resources to understand design decisions. If your team is mostly senior, or even better, has worked together for a long time, much of that might be a waste of time.

Will This Be Used During Stressful Situations?

When your service is down at 2am, any documentation on how to identify and address possible error states is a godsend. Chances are, you might not remember some commands because you are tired and stressed. It’s also better to spend 30 minutes on a Wednesday morning looking up commands and writing down where to find logs, even if nobody ever ends up reading those notes, than to spend 2 minutes looking up a command you need right now, which means an additional 2 minutes of service outage. This is mostly an argument for having a runbook. Don’t put off writing one until you’ve learned the hard way that you need one. By then you also might have forgotten some of the things that should be in there!

Synergy With Other Practices

Work Assignment Strategies

If your team works largely as a collective you might need less documentation than if the approach is highly individualistic.

An extreme example of a collective approach would be that all work gets assigned to the team, and the team picks up work as pairs that rotate frequently and deliberately move between different areas of work. This will result in everyone on your team sharing a high level of familiarity with every part of the code base that was touched recently. It might be sufficient to catch everyone up by just describing how some outstanding issue was solved the day before. Everyone will understand it and the associated trade-offs because they know the area of the code. Even if you haven’t touched an area of the code in a few days, you’ll probably find your way around pretty quickly, since you know the general architecture. For critical architectural decisions, the team should still have discussions that center around a shared document that can later function as documentation.

On the other end of this, we have teams that assign areas of the code base, or even small services, to individuals. The individual isolation is amplified further if there is a culture of long-running feature branches without pull-request reviews of changes on the branch. If somebody else has to touch that area, they’ll likely need guidance to get up to speed. They might miss potential gotchas unless the original author explains them or calls them out in documentation.

Your team’s actual way of working is probably somewhere in the middle, but the more individualist your approach, the more and the sooner you need documentation. This is of course not to say that even the most collectivist team can get away without any documentation in the long run. In the fullness of time, people forget things, new people join and team members leave. You still have production outages, etc.

Readable Tests

I touched a little bit in the beginning on how things like BDD can lead to your tests functioning as living documentation. It doesn’t have to be BDD though. You can make a readable test suite with pretty much any test framework. Even if your tests fail when a bad change is made, it’s sometimes very helpful if the test title or a comment makes clear that the behavior is core to what the test is testing and not an oversight.
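As a rough sketch of what that can look like without any BDD tooling, here is a small example using Python’s built-in `unittest`. The shipping rule, the constants and the `shipping_cost` function are all made up for illustration. The point is that the test names and the one comment tell a future reader that the inclusive threshold is deliberate.

```python
import unittest

# Hypothetical business rule, invented for this example.
FREE_SHIPPING_THRESHOLD = 50.00
FLAT_SHIPPING_FEE = 4.99


def shipping_cost(order_total: float) -> float:
    """Return the shipping fee for an order (hypothetical example function)."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return 0.0
    return FLAT_SHIPPING_FEE


class ShippingCostTest(unittest.TestCase):
    def test_orders_at_exactly_the_threshold_ship_free(self):
        # The threshold is inclusive on purpose. This is the behavior
        # under test, not an off-by-one oversight.
        self.assertEqual(shipping_cost(50.00), 0.0)

    def test_orders_below_the_threshold_pay_the_flat_fee(self):
        self.assertEqual(shipping_cost(49.99), FLAT_SHIPPING_FEE)
```

Running `python -m unittest` in the directory containing the file picks up the tests. When one fails, the test name alone already reads like a sentence from the documentation.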

Comments

Why bother trying to write tests that are descriptive, if I can do something much simpler and just write a comment? I honestly see code comments as a last resort. Unlike tests, they aren’t executable and thus can easily get out of date. This brings us back to the point that wrong documentation can easily be worse than no documentation. That said, they have the benefit of being right there with your code, and if your team is disciplined, they can be great.

Closing

As always, I hope this got you thinking about what trade-offs work best for your particular circumstances. Let me know below what you think about this and if there are any areas you’d like me to discuss next.

All rights reserved by the author.