DSL is not the answer

Large software projects are notorious for being late and vastly over budget while not doing half of the things that were imagined at their conception. The stakeholders will claim the unwashed programmer masses failed to deliver usable software, while the grunts will complain about the suits having no clue about the domain-related problems that the software was supposed to solve.

In order to overcome this problem, some people too smart for their own good have come up with a new way of defining software requirements. Instead of endless back-and-forth between the stakeholders and the engineers, the stakeholders would simply specify the expected behavior from the user’s perspective in a plain-English-like language. The developers would only have to write and tweak the software until it behaves as specified. Once all the specified scenarios are passing, the project is done. At the same time, the test scenarios also serve as up-to-date documentation.

A catchy term was coined (BDD - Behavior Driven Development), and a bunch of testing frameworks and consultancy shops appeared overnight, all claiming to be the one true way of getting your project in order. Their benefits were proudly displayed on the front page:

  • The domain experts and even non-programmers can easily define software requirements and write tests in a plain-English-like language, the so-called Domain Specific Language (DSL).

  • The collaboration between business and technical people is improved, as the requirements are clearly defined in the form of acceptance tests.

  • The DSL will improve your iteration time since the tests are written in a higher-level language. The people writing them will focus on the problem of the domain and will not have to worry about the underlying details.

  • The cost of the project is lower as you can employ non-programmers for writing your tests.

  • At the end of the test suite you get a nice colorful report in HTML format that reports the current health status of the software.

What’s not to like about this advertisement? It’s great! Developers around the world will get detailed specifications on how the world should behave and all they have to do is tweak the software and make it behave that way.

Let’s say we are simulating an airport. The rules of our airport are so simple that everybody can reason about them:

  • After a successful check-in, a person wearing a red shirt should be pointed to plane A.
  • After a successful check-in, a person wearing a blue hat should be pointed to plane B.
  • After a successful check-in, a person not wearing a hat should be pointed to plane C.

The airport scheduling mastermind will simply describe the expected airport scheduling behavior which will be tested against the actual system.

    Scenario1:
        Given a person wearing a red shirt
        When they checked in successfully
        Then the person should be pointed to plane A

    Scenario2:
        Given a person wearing a blue hat
        When they checked in successfully 
        Then they should be pointed to plane B

    Scenario3:
        Given a person not wearing a hat
        When they checked in successfully
        Then they should be pointed to plane C

Look how easy it is. Anybody can write and understand test cases like that.

Problems with DSLs

If plain-English-like DSLs for writing tests really brought such an incredible advantage over the old testing methodologies, we would see them everywhere. Considering that more than a decade after their introduction that’s still not the case, there must be a reason for that, right?

Requirements are clearly defined in the form of acceptance tests and even non-programmers can write them

Any programmer worth their salt will see the test scenarios mentioned above and realize they are full of holes:

  • What happens if a person is wearing a red shirt and a blue hat? Which one of those rules has precedence over the other?

  • What happens if a person is wearing a red shirt, but is not wearing a hat?

  • What happens if a person is wearing a yellow hat? Does the presence of the hat affect us or is it only the color that matters?

  • What happens if a person is shirtless?

  • HAS ANYBODY EVEN THOUGHT ABOUT THIS PROBLEM FOR MORE THAN 5 SECONDS?
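
The holes can be made concrete by brute force. The sketch below encodes the three airport rules literally, as written, and lists which rules fire for every shirt and hat combination. The enums, the extra colors (green, yellow) and the `matchingPlanes` helper are made up for illustration; the original rules only mention a red shirt, a blue hat and no hat.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the three airport rules, taken literally.
class RuleGaps {
    enum Shirt { NONE, RED, GREEN }
    enum Hat { NONE, BLUE, YELLOW }

    // Returns every plane that some rule assigns to this passenger.
    static List<String> matchingPlanes(Shirt shirt, Hat hat) {
        List<String> planes = new ArrayList<>();
        if (shirt == Shirt.RED) planes.add("A"); // rule 1: red shirt
        if (hat == Hat.BLUE) planes.add("B");    // rule 2: blue hat
        if (hat == Hat.NONE) planes.add("C");    // rule 3: no hat
        return planes;
    }

    public static void main(String[] args) {
        for (Shirt shirt : Shirt.values()) {
            for (Hat hat : Hat.values()) {
                List<String> planes = matchingPlanes(shirt, hat);
                // Exactly one matching rule is the only unambiguous outcome.
                String verdict = planes.size() == 1 ? "ok"
                        : planes.isEmpty() ? "NO RULE" : "CONFLICT";
                System.out.println(shirt + " shirt, " + hat + " hat -> " + planes + " " + verdict);
            }
        }
    }
}
```

Under these assumptions, four of the nine combinations are either claimed by two rules at once (red shirt with no hat, red shirt with a blue hat) or by no rule at all (anything with a yellow hat and no red shirt). Three passing scenarios say nothing about any of them.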

Back in the prehistoric times of computing, it was once believed that programmers would be replaced by UML drawing programs. The architects would simply draw the necessary UML diagrams and the rest of the program would be automatically generated. This thought didn’t age that well once it was rediscovered that the main source of software complexity is the underlying details and the endless edge cases.

If someone is thorough enough to imagine all the possible edge cases, then they are smart enough to learn a proper programming language and go from there. Otherwise, half-assed tests written in a DSL are no better than a half-assed test plan dumped into a text file.

The whole idea of picking whoever comes first to write the tests in order to save the expensive developer’s time is ridiculous. While there are people who are doing just that (e.g. system engineers), they are not any less expensive.

Domain experts can define requirements and write tests

I have a hard time imagining a 50+ year old medical doctor who knows the ins and outs of the human body writing tests for medical software. Except in salesmen’s fairy tales, that just doesn’t happen. SQL was once meant to be simple enough to be used by non-programmers such as accountants. When was the last time you saw one of those writing SQL?

But for the sake of argument, let’s say our project is special and the domain expert really is writing the acceptance tests. Since they are probably not writing them on a daily basis (or maybe it’s even their first time), they will most likely come up with weird structural choices that the programmers will have to fix anyway.

In the end both sides will be pretty bitter about the whole experience. The domain experts will have to get familiar with these new concepts instead of just being there to provide the necessary information about the current state of the art of their field. The developers will have to do all this extra fixing work on top of the actual testing code that has to exist behind the scenes. If that’s the case, the developers might as well write the test cases themselves.

DSL will improve your iteration time

When you are writing tests in a proper programming language, only the testing code has to be written. As soon as you introduce another layer of abstraction (DSL), some schmuck will have to write the plumbing layer between the DSL and the actual testing code that the computer understands. Every little change that happens in the DSL layer also has to be propagated downstream to the testing code that is interacting with your system. It’s annoying, bug-prone and extremely time consuming.

So you don’t get anything for free here. The testing code reuse should be extremely high to justify the expense of writing this additional plumbing layer. It almost never is, but that is never mentioned in the online propaganda materials.
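
To give a feel for what that plumbing layer looks like, here is a minimal sketch of one for Scenario1. It is not the API of any real BDD framework: `StepGlue`, the `STEPS` table, the regular expressions and the one-line stand-in for the system under test are all assumptions made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical glue layer: maps DSL sentences to testing code.
class StepGlue {
    // State shared between the steps of one scenario.
    static String shirt;
    static String assignedPlane;

    static final Map<Pattern, Consumer<Matcher>> STEPS = new LinkedHashMap<>();
    static {
        STEPS.put(Pattern.compile("Given a person wearing a (\\w+) shirt"),
                m -> shirt = m.group(1));
        // Stand-in for calling the real system under test.
        STEPS.put(Pattern.compile("When they checked in successfully"),
                m -> assignedPlane = "red".equals(shirt) ? "A" : "C");
        STEPS.put(Pattern.compile("Then the person should be pointed to plane (\\w+)"),
                m -> {
                    if (!m.group(1).equals(assignedPlane)) {
                        throw new AssertionError(
                                "expected plane " + m.group(1) + ", got " + assignedPlane);
                    }
                });
    }

    // Runs one DSL line. A line that matches no pattern fails here,
    // at run time, because the compiler never sees the scenario text.
    static void run(String line) {
        for (Map.Entry<Pattern, Consumer<Matcher>> step : STEPS.entrySet()) {
            Matcher m = step.getKey().matcher(line.trim());
            if (m.matches()) {
                step.getValue().accept(m);
                return;
            }
        }
        throw new IllegalStateException("No step definition for: " + line);
    }

    public static void main(String[] args) {
        run("Given a person wearing a red shirt");
        run("When they checked in successfully");
        run("Then the person should be pointed to plane A");
        System.out.println("Scenario1 passed");
    }
}
```

Even this toy version needs a sentence parser, shared mutable state between steps, and a second copy of every phrase in the scenario. Reword a step in the scenario file and the matching pattern has to be reworded too, with nothing but a failure at run time to tell you that you forgot.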

The fact that these testing frameworks are needy, and somehow always end up doing everything from starting services to asserting the test results brings in another problem. They are extremely hard to debug, because they don’t have a proper debugger.

Sometimes there is a bug in your business logic. Sometimes you have a flaky test. Sometimes you hit a weird multithreading problem. Now you are in a world of pain. Your test suite is hundreds of lines long, written in this stupid pseudo programming language interfacing with your complicated business logic, and all you have in your arsenal is printf debugging.

What an improvement. This is like going back to the software cavemen days.

More pain

The editor support for these DSLs is usually very poor as they are often treated as regular text. You will hit weird bugs in the underlying testing framework. Exceptions will be silently swallowed. The internal stdio buffer will fill up, which will cause tests to randomly hang [1].

If not a DSL then what?

A proper programming language, the one in which the rest of your software is already written. It has great editor support and it’s easy to write complex scenarios with it. No matter how hairy a problem you encounter, there is a debugger waiting for you. If you are using a statically typed language, the compiler will catch typos and a whole class of other problems before your tests are allowed to run. No more run and pray it works.

The testing code for Scenario1 could be written just as easily:

    // is this really way harder to understand?
    Person person = createPerson().with(Shirt.RED);
    Plane assignedPlane = afterSuccessfulCheckIn(person);
    assertEquals(Plane.A, assignedPlane);

It’s baffling to think that someone capable of wielding Excel formulas would somehow not be capable of writing simple programs like the one above. Instead of wasting all this time on parsing plain text and handling weird edge cases, invest some of your time in building a decent library for your domain. I’ve seen big shops way too often using some crappy open source library and then complaining how it doesn’t quite fit the mold.
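
As a rough idea of how small such a domain library can start out, here is a self-contained sketch that makes the Scenario1 test above runnable. Everything around the three test lines is hypothetical: the enums, the `Person` builder, the local `assertEquals` helper and especially the body of `afterSuccessfulCheckIn`, which stands in for the real system and assumes that the shirt rule wins over the hat rules.

```java
import java.util.Objects;

// Hypothetical domain library behind the Scenario1 test.
class AirportTest {
    enum Shirt { RED, OTHER }
    enum Hat { NONE, BLUE, OTHER }
    enum Plane { A, B, C }

    // Small builder so tests read close to the original sentences.
    static final class Person {
        Shirt shirt = Shirt.OTHER;
        Hat hat = Hat.NONE;
        Person with(Shirt s) { shirt = s; return this; }
        Person with(Hat h) { hat = h; return this; }
    }

    static Person createPerson() { return new Person(); }

    // Stand-in for the system under test.
    // ASSUMPTION: the red shirt rule takes precedence over the hat rules.
    static Plane afterSuccessfulCheckIn(Person p) {
        if (p.shirt == Shirt.RED) return Plane.A;
        if (p.hat == Hat.BLUE) return Plane.B;
        if (p.hat == Hat.NONE) return Plane.C;
        throw new IllegalStateException("No rule covers this passenger");
    }

    // Minimal local replacement for a test framework assertion.
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        Person person = createPerson().with(Shirt.RED);
        Plane assignedPlane = afterSuccessfulCheckIn(person);
        assertEquals(Plane.A, assignedPlane);
        System.out.println("Scenario1 passed");
    }
}
```

Note that writing the stand-in already forced a decision the scenarios never made: a person in a red shirt with no hat matches two rules, so the code had to pick an order. A typo such as `Shirt.REDD` would not even compile, and a failing assertion is one breakpoint away from the cause.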

The enormous costs of large software projects usually do not come from tests that take too much time to write. The large costs come from the stakeholders not entirely understanding the problem that they are trying to solve and a laundry list of ever-changing features that nobody needs.

The DSL is tackling some of the large project problems, but at the cost of introducing a whole other set of problems that completely dwarf the benefits [2]. A grumpy business analyst that will mercilessly cut out the unnecessary features will help. Another hyped up tool in your toolbox will not.

Notes


  1. One of the test cases produced a very large amount of debug logs that were also displayed on the console. The DSL testing framework had an internal buffer that gathered the stdio of the test cases in the suite. Due to the large amount of logs, that buffer filled up and the entire process locked up completely. Debugging this problem took days for no good reason.

  2. Here is an anecdote from the trenches. On a large and complex software project, the client insisted that the tests should be written in one of those testing DSLs. The team that was working on this project consisted of experienced developers that also wrote all the tests for the project.

    The testing process took a tremendous amount of time. In fact, the whole testing debacle took way more time than writing the software itself. The vast majority of time was spent screwing around with this stupid testing framework and its DSL, where we hit every single problem described in this post. The testing scenarios were not easy to read due to poor editor support, and when something went wrong they were impossible to debug.

    For the next phase of the project, we simply wrote the entire scenarios in a proper programming language and the DSL was only used for calling the test cases that returned either 0 (success) or 1 (failure). When a certain test didn’t pass, you could simply put a breakpoint directly in the editor and inspect what was going on without all the DSL framework nonsense.

    If the DSL works for you, that’s okay. You should still be aware of the baggage that the DSL brings to your project.