I’ve now been working for more than 8 months on a project where we’ve been using BDD and creating high-level tests written in Gherkin before actually going on to implement the features themselves. It’s the first time I’ve used BDD, so it’s been very interesting. The main conclusion is that it’s a very useful thing to do: writing a high-level test really helps me determine how things should work, and I think it makes the implementation cleaner. I can’t say I have conclusive evidence that we’ve got fewer regression bugs as a result of our BDD tests, but at least we’ve got comprehensive regression/acceptance tests that are fully automated. Another BDD reflection is that I think it’ll take me some more time before I really feel I understand it; I’m still not sure I get things like how best to structure the testing code so that tests are robust to changes in features. That may be a topic for a future blog post, if and when I start feeling I have some insight into that.
So, BDD is a good thing, as far as I am concerned. This post is not about that; it’s about our experiences of using two different frameworks to execute the Gherkin tests: Lettuce, which is Python-based, and Cucumber-JVM, where we use Java for the step definitions. We evaluated both during our project initiation, and made the decision to go with Lettuce. The primary reason was that we’re using Python for a lot of our automated testing company-wide, and it was felt that it would be easier to move from project to project if we kept to the same language. Cucumber-JVM hadn’t actually been released at the time, but it felt stable (and the code is really well written) and was close to a release, so that didn’t factor heavily in the decision. As essentially a Java-only developer, I felt that was an OK decision. I expected to be more productive in Java than in Python, and thought that Cucumber-JVM integrated better with our Maven-centric builds, but it would be interesting to have a good excuse to learn a new language.
Now, a few months down the line, we’ve made the decision to stop developing new tests using Lettuce and instead use Cucumber-JVM for new features. (More precisely, we will keep doing our big end-to-end tests using Lettuce, since we’ve got some really nice infrastructure code written in Python that helps us manage deployments and stuff, but the single-service tests will be Cucumber-JVM from now on.)
The reasons for this decision are outlined below. Note that, having done about 20 times more Java development than Python, it’s hard for me to be completely objective about the languages as languages. I’ve tried to be explicit about those points that are clearly subjective.
- In my personal opinion, Python as a language lends itself better to ‘small, quick’ types of problems. Throughout the last few months, I’ve repeatedly been impressed by how often ‘just trying something’ in Python leads to the behaviour I was looking for. Also, since it is interpreted, experimenting can be really quick. However, in the case of running BDD tests, you’re actually not working on a ‘small, quick’ type of problem. Frequently, running the tests takes minutes because they require starting/restarting services, setting up databases, etc. That means that mistakes that the Java compiler/IDE would have told you about as you typed them can take a minute or so to discover, and that’s not great at all. I think dynamic languages require quick turnaround to give you any benefits from the reduced amount of typing you have to do to write code in them, and BDD doesn’t feel like a problem that lends itself to quick turnaround times.
- It’s hard to troubleshoot tests that are run via Lettuce. We haven’t found a practical way to step through the Python code in a debugger when it runs under Lettuce, and something (nose or Lettuce itself, I’m not sure) takes over the standard out, which means it’s hard to print out debug output. In our current implementation, we’re getting assertion errors divorced from the debug output, which makes it quite awkward to figure out what the actual issue was.
- We’ve had to spend a lot of effort on putting together the infrastructure that lets us share code between BDD tests for different services, ensure that different environments run the same libraries, avoid installing too many dependencies locally before being able to run a build on a given machine, etc. So we’re using things like pip to download and install dependencies, virtualised buildouts to ensure that we actually have the right versions of stuff without conflicting with the machine-global Python installation, and an internal cheeseshop for distributing our internal test utilities code. This is stuff that basically comes for free, and in a better implementation, when using Maven. When I say ‘better implementation’, I’m talking particularly about the dependency management, where Maven’s versioning of libraries feels more mature and works better than Python eggs do.
- Since it’s possible to run Cucumber-JVM tests via JUnit, they slot directly into all the existing infrastructure around build tools and processes. This means that you can immediately view test results from the BDD tests in the same way as your unit tests, which is very useful (although there are still issues). For our Lettuce tests, on the other hand, we’ve had to hand-craft tools to plug the test execution and result notification into Maven. And while it should be possible, we haven’t gotten around to actually integrating the output of those tools fully into Jenkins, Sonar, etc. This integration again just comes for free with Cucumber-JVM.
- Another advantage of having the tests runnable via JUnit is that it is trivial to run them from inside the IDE, which means you can use a debugger for troubleshooting, easily select which tests to run, etc.
- Finally, although we have no data to confirm this, it feels like running tests with Cucumber-JVM is faster than running via Lettuce. And that’s important as it makes it less painful to run the BDD tests more frequently. (EDIT: see comment below for some actual data to support this point)
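To make the troubleshooting point above a bit more concrete, here is a minimal Python sketch of the kind of workaround one could try when a test runner takes over standard out. It does not use Lettuce or nose at all: the capture is simulated with `contextlib.redirect_stdout`, and the names `make_file_logger`, `check_balance` and `bdd_debug` are made up for the illustration. The idea is simply to send debug output to a log file, which is unaffected by whatever happens to `sys.stdout`, and to put the relevant state into the assertion message itself so that the error and the debug information are no longer divorced from each other.

```python
import io
import logging
import os
import tempfile
from contextlib import redirect_stdout


def make_file_logger(path):
    """Return a logger that writes to `path`, independent of sys.stdout."""
    logger = logging.getLogger("bdd_debug")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records away from any root handlers
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def check_balance(account, expected, log):
    """Stand-in for a step definition: log the state, then assert."""
    log.debug("account state before assertion: %r", account)
    # Carry the state in the assertion message, next to the failure.
    assert account["balance"] == expected, (
        "expected balance %r, got %r (account: %r)"
        % (expected, account["balance"], account)
    )


log_path = os.path.join(tempfile.mkdtemp(), "bdd_debug.log")
log = make_file_logger(log_path)

# Simulate a runner that captures standard out while a step executes.
captured = io.StringIO()
failure_message = None
with redirect_stdout(captured):
    print("this line ends up in the runner's buffer, not on the console")
    try:
        check_balance({"id": 7, "balance": 90}, 100, log)
    except AssertionError as error:
        failure_message = str(error)

# The debug line survives in the log file even though stdout was captured.
with open(log_path) as log_file:
    log_contents = log_file.read()
```

Whether this helps in a given setup depends on how the runner actually captures output, which is an assumption here; the sketch only shows that a file-based logger and a descriptive assertion message do not depend on standard out.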
For me, as an experienced Java hand and far from a polyglot programmer, it’s been very interesting to spend a few months with another language. I think I can safely summarise most of the advantages of Cucumber-JVM over Lettuce as coming from “Java vs Python as a platform” rather than “Java vs Python as a language” – with some implementation choices thrown in there and the fact that BDD as a problem doesn’t lend itself to quick turnaround, which makes static analysis more valuable.
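As a small illustration of why slow turnaround makes static checks more valuable, consider the following Python sketch. Everything in it is hypothetical (the `Service` class, the step function, and the short sleep standing in for a minute of service and database setup). The step definition contains a misspelled method name; Python accepts the definition happily, and the mistake only shows up once the slow setup has finished and the line is actually executed. The Java compiler would reject the equivalent call to a non-existent method before anything ran.

```python
import time


class Service:
    """Hypothetical stand-in for a service under test."""

    def __init__(self):
        self.started = False

    def start(self):
        time.sleep(0.01)  # imagine a minute of starting services here
        self.started = True

    def status(self):
        return "running" if self.started else "stopped"


def step_check_service_is_running(service):
    # Typo: 'staus' instead of 'status'. Defining this function is fine;
    # the error is raised only when this line is executed.
    assert service.staus() == "running"


def run_scenario():
    service = Service()
    service.start()  # the slow part happens before the typo is reached
    try:
        step_check_service_is_running(service)
    except AttributeError as error:
        return "failed after setup: %s" % type(error).__name__
    return "passed"


result = run_scenario()
```

Here `result` records that the scenario failed with an `AttributeError`, but only after the setup cost had already been paid.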
In general, BDD has been a great experience so far, as have Cucumber-JVM and, to a lesser degree, Python. But from my perspective, Lettuce isn’t a great tool for writing BDD tests for Maven-based projects.