Scholars of trivia

One of the main things that bothers me about the C++ programming language is the vast amount of trivia you have to know just to be able to read the code written by other programmers. Since I only occasionally dabble in it, I usually curse the show-offs that are making my life “interesting” for no good reason. Here’s a chunk of code that I have recently encountered in the wild:

std::set<std::string> updatedChildUris;

// messy ball of code that is operating on the children array (vector)

std::transform(children.begin(), children.end(),
    std::inserter(updatedChildUris, updatedChildUris.end()),
    [](const DataPoint &dataPoint) {
    return dataPoint.getUri();
});

Given enough eyeballs, all unreadable messes become readable, and eventually you will be able to decipher this code. Apparently the whole purpose is to copy the URIs from a vector into a set in order to get rid of the duplicates.

So far so good, but is there a better way of doing that?

std::set<std::string> updatedChildUris;

// messy ball of code that is operating on the children array (vector)

for (const auto &dataPoint : children) {
    updatedChildUris.insert(dataPoint.getUri());
}

This for loop is doing the exact same thing as the code from the initial example, except that it’s shorter, easier to read and it most likely also compiles faster. Since I don’t use C++ that often and the transform example kept repeating throughout the file, I went to the author in order to learn more about the dark secrets of their favorite programming language. Their response was rather unexpected: “It’s the best practice in modern C++. Once you learn more about these patterns, you will see it’s the right approach.”
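For anyone who would rather check the "exact same thing" claim than take my word for it, here is a minimal, self-contained sketch. The DataPoint class below is a made-up stand-in (the real one surely has more to it), and the function names urisWithTransform, urisWithLoop and checkSameResult are my own inventions for the sake of the comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real DataPoint class; only getUri() matters here.
class DataPoint {
public:
    explicit DataPoint(std::string uri) : uri_(std::move(uri)) {}
    const std::string &getUri() const { return uri_; }

private:
    std::string uri_;
};

// The std::transform variant from the first example.
std::set<std::string> urisWithTransform(const std::vector<DataPoint> &children) {
    std::set<std::string> uris;
    std::transform(children.begin(), children.end(),
        std::inserter(uris, uris.end()),
        [](const DataPoint &dataPoint) { return dataPoint.getUri(); });
    return uris;
}

// The plain range-based for loop from the second example.
std::set<std::string> urisWithLoop(const std::vector<DataPoint> &children) {
    std::set<std::string> uris;
    for (const auto &dataPoint : children) {
        uris.insert(dataPoint.getUri());
    }
    return uris;
}

// Aborts (in a debug build) if the two variants ever disagree.
void checkSameResult(const std::vector<DataPoint> &children) {
    assert(urisWithTransform(children) == urisWithLoop(children));
}
```

Calling checkSameResult with a vector that contains duplicated URIs should pass quietly, since both variants end up calling insert on the same set type with the same values.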

Of course, this justification is complete nonsense. There is absolutely no benefit to using the transform example, except maybe to brag about using “modern” features of your programming language 1. Even if there is some “special optimization” going on behind the scenes, in this specific example it didn’t matter, as the number of elements was low and the execution speed was never an issue.

So, why would anyone repeatedly write code in a much more complicated way instead of sticking with a simple for loop? Just because it’s a best practice according to an unnamed authority is hardly a satisfactory answer.

When somebody calls themselves an engineer, the least I would expect from them is to question what they have read and make their own little experiments to verify the claims. Is the code easier to read if I use this feature? Does it run faster? Is it consuming less memory? If we hire an average programmer, will they be able to understand it and modify it according to our needs or will they have to spend the next few years training in the mountain hideaway with Donald Knuth?
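Such a "little experiment" does not have to be fancy. The sketch below is one possible way to answer the "does it run faster?" question with std::chrono; the helper name averageMicroseconds is hypothetical, and the numbers it produces are only a rough indication, since an optimizing compiler may remove work whose result is never used.

```cpp
#include <chrono>

// Hypothetical helper: calls `fn` exactly `runs` times (runs must be positive)
// and returns the average wall-clock time per call in microseconds.
template <typename Fn>
double averageMicroseconds(Fn &&fn, int runs) {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int i = 0; i < runs; ++i) {
        fn();
    }
    // Converting to a floating-point duration keeps the sub-microsecond part.
    const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start;
    return elapsed.count() / runs;
}
```

You would pass it two lambdas, one wrapping the transform call and one wrapping the for loop, and compare the two averages on data of a realistic size. If the difference is lost in the noise, the speed argument is off the table.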

If the answer to these questions is not satisfactory, perhaps there is no point in using a certain feature of your programming language. Considering that making something much more complicated than it should be is such a widely seen problem in this industry, it might be worth ruminating on this issue for a while.

I can only guess why complexity is so highly sought after, but I think it has something to do with the way software is built. In case you haven’t noticed, most software only ever expands, and you will very rarely be able to remove anything from it. There is always another feature or edge case to handle, thus more code has to be written.

The old code, however, should still be working. You don’t want to break the existing functionality, as that would annoy your users, who might migrate to another project that is not breaking all the time. Imagine a programming language that would break your old code with every release; no one would ever use it for building a long-term project.

Now imagine a programming language that would be 100% backwards compatible. Such a programming language could only add new functionality as any changes to its existing APIs would break the old code. As a consequence of keeping the backwards compatibility promise, the programming language would accumulate all sorts of cruft that developers would have to learn and navigate around. The developers would have to become scholars of trivia, as otherwise they might trip over some badly designed API.

One of the main fears that lingers in a developer’s brain is to be perceived as old and stale. You have to keep up with the times or else you might have trouble finding your next job. The easiest way to keep up with this “progress” 2 is to incorporate it in your day-to-day work.

So you will encounter excited developers that will try to convert existing for loops into little map reduce chains, because that is more, spins wheel, “maintainable.” Not to mention that for loops are dangerous; somebody might accidentally mutate your data during the iteration and you better prevent such accidents from happening.

As much hate as the Go programming language gets for not having certain features, it does a pretty good job of keeping the amount of trivia one has to learn to a minimum. If only more developers subscribed to this idea.

Notes


  1. If you can even call a decade-old feature modern. Somebody not knowing anything about a decade-old feature might sound bizarre to a person who has specialized in that specific programming language, but occasionally you just have to go and fix things that are outside your comfort zone. ↩︎

  2. We can hardly call it progress, as programmers keep rediscovering the concepts and giving them new names: “No that’s not a monolith, it’s a macro service.” ↩︎