Do Repeat Yourself
One of the first things we teach aspiring software engineers is the principle Don’t Repeat Yourself. The notion that as software engineers we should never write the same code twice. Duplication makes our code harder to change. So to prevent this, we learn to take all cases of the oh so dreaded duplication away.
Most of the time, the principle applies and will make the codebase better. So, after learning this, it’s quite likely each and every case of duplication becomes one we fight. Before we know it we are waging war against duplication and exterminating it left and right using our favourite weapons - refactoring it into a function, utility class or even a standalone library for others to use.
But is it always worth it? Is abstracting each and every duplication away worthy of your precious time? And if not for time - is it always beneficial for maintainability?
Here, I’ll say it:
In some cases DRY is not worth it.
In some cases DRY does not help, but harms maintainability.
Before burning me at the stake 🔥, please let me explain…
Abstractions
Refactoring something specific, like extracting duplicated lines to a generic function, is an abstraction - albeit on a granular scale.
Refactoring a duplicated piece of code into the right abstraction is very satisfying. Especially when the moment of change inevitably knocks on your door. The business changes, so the code needs to change. Oh, how convenient! We abstracted this code into a function and now we only have to change it in one place instead of having to change it throughout the whole codebase.
It’s the moment of validation for our choices of the past. The indisputable proof we made the right choice. 💪
In this case it was worth the time that was invested into the refactor. It made our lives easier when change was needed.
A bad abstraction
But in other cases creating an abstraction that looks very similar to the abstraction above, can make our lives harder.
Let’s take a look at an example.
Once again we extract some duplication between two classes into a helper function while simultaneously dreaming of the moment of validation - which will surely arrive soon. We wait a while and surely enough, the business changes again. Of course, now we need to update the software.
We refine the change and give it quite a low estimate, as we expect to be able to easily implement the change. This is part of the clean code territory in the project after all! 🎉
But instead of being met with the happy experience of convenience, we notice that it’s now a little harder to add the new business rules than before.
Apparently, only the behaviour of one of the two original classes needs to be changed. As we already invested the time to create this beautiful abstraction, we absolutely can’t revert back to the old ways. That would feel bad, since then we would be admitting we made the wrong choice in the past. 🤔
So we decide to just add a parameter to the extracted function. I mean, just one more function parameter is still readable. We can implement this change pretty quickly. Nothing wrong with just a single parameter, right? And anyway, we might be able to benefit from the abstraction in the future.
And today, a bad abstraction was born.
And the 6 other engineers that touched the code after us? They will surely keep the abstraction in place. Someone spent effort on creating this abstraction, so it should be there for a good reason! Undoing this work and going back to the original duplication sounds like madness, it’d be almost insulting to the person who created it.
So some of them add a few positional parameters, others add parameters with default values and the last one even adds a few keyword arguments:
public void generate(Data data, MailData mailData, sendMail = false, type = Type.A, alsoGenerateY = true) /* etc... */ {
generateX(data);
// Parameters and checks keep getting added.
// Complexity keeps rising over time, at some
// point it becomes unmaintainable.
if (sendMail) {
if (type != Type.B && mailData != null) {
sendMail(mailData);
}
if (alsoGenerateY && type != Type.A && mailData.X == true) {
generateY(data);
}
}
}
The code is DRYer than the Sahara at high noon, but the cycle goes on, and on, and on…
What do you think about this? Let me know in the comments 👇