Posts Tagged Unit Testing

Don’t use @Test(expected = …)

The one feature of JUnit that I really don’t like is the ability to say that a test is expected to throw a particular exception. Here’s an example:

public class MyClass {

   private final Something something;

   public MyClass(Something something) {
      this.something = something;
   }

   public void doWhatever(String aString) {
      something.useString(aString);
   }
}

...

public class MyClassTest {

  @Test(expected = NullPointerException.class)
  public void shouldThrowNullPointerExceptionWhenStringIsNull() throws Exception {
    MyClass myClass = new MyClass(null);
    myClass.doWhatever(null);
  }
}

This example is not as bad as they come, but it illustrates the point. The method call will indeed throw a NullPointerException, but the reason won’t be that the string parameter is null – the NPE is thrown because the ‘something’ member is null. The general problem with the ‘expected’ parameter to the @Test annotation is that you don’t know which particular case led to the exception, and general exceptions such as NullPointerException (or, worse, RuntimeException, or perhaps some generic exception you use for wrapping all kinds of checked exceptions) can be thrown for many reasons.
I very much prefer code along these lines:


  @Test
  public void throwsExceptionWithMeaningfulInformation() throws Exception {
     IOException expected = new IOException("expected");

     when(collaborator.method("some parameter")).thenThrow(expected);

     try {
        classUnderTest.methodThatCallsCollaborator("some parameter");
        fail("expected exception");
     }
     catch (WrappingException e) {
        assertThat(e.getCause(), equalTo((Throwable) expected));
        assertThat(e.getMessage(), containsString("some parameter"));
        assertThat(e.getMessage(), containsString("something else the code should report"));
     }
  }

EDIT: based on Dominic’s comment, I now normally prefer the idiom below – the rest of the post still applies, but using @Rule and ExpectedException is more concise and readable. The only thing missing is an easily accessible Matcher<Throwable> that would allow you to match the cause as well. I’ll probably create one, for internal use, at least.
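Until then, such a matcher is only a few lines of hamcrest code. Here’s a minimal sketch (a hypothetical version for illustration – not necessarily what I’d end up creating):

import org.hamcrest.Description;
import org.hamcrest.Matcher;
import org.hamcrest.TypeSafeMatcher;

// Hypothetical matcher that delegates to a nested matcher for the cause
// of the caught Throwable, usable with ExpectedException.expect().
public class CauseMatcher extends TypeSafeMatcher<Throwable> {

   private final Matcher<? extends Throwable> causeMatcher;

   public CauseMatcher(Matcher<? extends Throwable> causeMatcher) {
      this.causeMatcher = causeMatcher;
   }

   @Override
   public boolean matchesSafely(Throwable item) {
      return causeMatcher.matches(item.getCause());
   }

   public void describeTo(Description description) {
      description.appendText("exception with cause ");
      description.appendDescriptionOf(causeMatcher);
   }
}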

EDIT: JUnit 4.11-SNAPSHOT now has support for cause matching using ExpectedException. :)


  @Rule
  public ExpectedException thrown = ExpectedException.none();

  @Test
  public void throwsNPEIfSecondParameterIsNull() throws Exception {
     thrown.expect(NullPointerException.class);
     thrown.expectMessage("secondParameter");

     new ClassThatDisallowsNullForTheSecondParameter("hi", null);
  }

Both these idioms give you the ability to verify that you have in fact triggered the error you wanted to trigger, and they also let you ensure that the code reports the problem in a good way, by checking the message and cause of the thrown exception. Sure, it requires a little more typing than just using the @Test annotation, but you get that time back the first time that exception is actually thrown and somebody needs to figure out what went wrong. Both of the two minutes it took to do the extra typing!


Composition vs Inheritance

One of the questions I often ask when interviewing candidate programmers is “what is the difference between inheritance and composition?” Some people don’t know, and they’re obviously out right then and there. Most people explain that “inheritance represents an is-a relationship and composition a has-a”, which is correct but insufficient. Very few people have a good answer to the question that I’m really asking: what impact does the choice of one or the other have on your software design? Over the last few years, there seems to have been a trend towards feeling that inheritance is over-used and composition correspondingly under-used. I agree, and will try to show some concrete reasons why that is the case.

(As a side note, the main trigger for this blog post is that about a year ago, I introduced a class hierarchy as part of a refactoring story, and it was immediately obvious that it was an awkward solution. However, it – together with the other changes – represented a big improvement over what went before, and there wasn’t time to fix it back then. It’s bugged me ever since, and thanks to our gradual refactoring we’re getting closer to the time when we can put something better in place, so I started thinking about it again.)

One guy I interviewed the other day had a pretty good attempt at describing when one should choose one or the other: it depends on which is closest to the real world situation that you’re trying to model. That has the ring of truth to it, but I’m not sure it is enough. As an example, take the classically hierarchical model of the animal kingdom: Humans and Chimpanzees are both Primates, and, skipping a bunch of steps, they are Animals just like Kangaroos and Tigers. This is clearly a set of is-a relationships, so we could model it like so:


public class Animal {}

public class Primate extends Animal {}

public class Human extends Primate {}

public class Chimpanzee extends Primate {}

public class Kangaroo extends Animal {}

public class Tiger extends Animal {}

Now, let’s add some information about their various body types and how they walk. Both Humans and Chimpanzees have two arms and two legs, and Tigers have four legs. So we could add two legs and two arms at the Primate level, and four legs at the Tiger level. But then it turns out that Kangaroos also have two legs and two arms, so this solution doesn’t work without duplication. Perhaps a more general concept involving limbs is needed at the animal level? There would still be duplication of the ‘legs + arms’ classification in both Kangaroo and Primate, not to mention what would happen if we introduce a RattleSnake into our Eden (or an EarthWorm if you prefer an animal that has never had limbs at any stage of its evolutionary history).

A similar problem is shown by introducing a walk() method. Humans do a bipedal walk, Chimpanzees and Tigers walk on all fours, and Kangaroos hop using their rear feet and tail. Where do we locate the logic for movement? All the animals in our sample hierarchy can walk on all fours (although humans tend to do so only when young or inebriated), so having a quadrupedWalk() method at the Animal level that is called by the walk() methods of Tiger and Chimpanzee makes sense. At least until we add a Chicken or Halibut, because they will now have access to a type of locomotion they shouldn’t.
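To make the problem concrete, here is a sketch of that design – note how Halibut inherits a capability it should never have:

public class Animal {
   // shared so that Tiger and Chimpanzee can reuse it...
   protected void quadrupedWalk() { /* move all four limbs in sequence */ }
}

public class Tiger extends Animal {
   public void walk() { quadrupedWalk(); }
}

public class Halibut extends Animal {
   // ...but every subclass gets quadrupedWalk(), whether it makes sense or not
}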

The root problem here is of course that the model is bad even though it seems like a natural one. If you think about it, body types and locomotion are not defined by the various creatures’ positions in the taxonomy – there are both mammals and fish that swim, and fish, lizards and snakes can all be legless. I think it is natural for us to think in terms of templates and properties inherited from a more general class (Douglas Hofstadter said something very similar, I think in Gödel, Escher, Bach). If that kind of thinking is a human tendency, that could explain why it seems so common to initially think that a hierarchical model is the best one. It is often a mistake to assume that the presence of an is-a relationship in the real world means that inheritance is right for your object model.

But even if, unlike the example above, a hierarchical model is a suitable one to describe a certain real-world scenario, there are issues. A big one, in my opinion, is that the classes that make up a hierarchical model are very tightly coupled. You cannot instantiate a class without also instantiating all of its superclasses. This makes them harder to test – you can never ‘mock’ a superclass, so each test of a leaf in the hierarchy requires you to do all the setup for all the classes above. In the example above, you might have to set up the limbs of your RattleSnake and the data needed for the quadrupedWalk() method of the Halibut before you could test them.

With inheritance, the smallest testable unit is larger than it is when composition is used, and you increase code duplication, at least in your tests. In this way, inheritance is similar to the use of static methods – it removes the option to inject different behaviour when testing. The more code you try to reuse from the superclass/es, the more redundant test code you will typically get, and all that redundancy will slow you down when you modify the code in the future. Or, worse, it will make you not test your code at all because it’s such a pain.

There is an alternative model, where each of the beings above has-a BodyType and possibly also has-a Locomotion that knows how to move that body. Probably, each of the animals could be a differently configured instance of the same Animal class.


public interface BodyType {}

public class TwoArmsTwoLegs implements BodyType {}

public class FourLegs implements BodyType {}

public class NoLimbs implements BodyType {}

public interface Locomotion<B extends BodyType> {
  void walk(B body);
}

public class BipedWalk implements Locomotion<TwoArmsTwoLegs> {
  public void walk(TwoArmsTwoLegs body) {}
}

public class Slither implements Locomotion<NoLimbs> {
  public void walk(NoLimbs body) {}
}

public class Animal {
  private final BodyType body;
  private final Locomotion<? extends BodyType> locomotion;

  public Animal(BodyType body, Locomotion<? extends BodyType> locomotion) {
    this.body = body;
    this.locomotion = locomotion;
  }
}

Animal human = new Animal(new TwoArmsTwoLegs(), new BipedWalk());

This way, you can very easily test the bodies, the locomotions, and the animals in isolation of each other. You may or may not want to do some automated functional testing of the way that you wired up your Human to ensure that it behaves like you want it to. I think composition is almost universally preferable to inheritance for code reuse – I’ve tried to think of cases where the only natural model is a hierarchical one and you can’t use composition instead, but it seems to be beyond me.
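As an illustration, an isolated test could look something like the sketch below. It assumes that Animal gets a walk() method that delegates to its Locomotion (not shown in the example above) and that Mockito is used for the mocking:

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.junit.Test;

public class AnimalTest {

   @Test
   @SuppressWarnings("unchecked")
   public void walkingDelegatesToTheConfiguredLocomotion() {
      TwoArmsTwoLegs body = new TwoArmsTwoLegs();
      Locomotion<TwoArmsTwoLegs> locomotion = mock(Locomotion.class);

      // no superclass setup needed – the collaborators are simply injected
      Animal human = new Animal(body, locomotion);

      // assumes a hypothetical walk() method that delegates to locomotion
      human.walk();

      verify(locomotion).walk(body);
   }
}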

In addition to the arguments here, there’s the well-known fact that inheritance breaks encapsulation by exposing subclasses to implementation details in the superclass. This makes the use of subclassing as a means to integrate with a library a fragile solution, or at least one that limits the future evolution of the library.

So, there it is – that summarises what I think are the differences between composition and inheritance, and why one should tend to prefer composition. And if you’re reading this because you googled me before an interview, this is one answer you should get right!


Code Sharing Wrap-up

As I suspected when I planned to write a series of posts about code sharing, I’ve realised that I won’t write them all. The main reason is that I started out with the juiciest bits, where I felt I had something interesting to say, and the remaining subjects feel too dry to make interesting posts individually. So I’ll lump them together and describe briefly what I mean by them in this wrap-up post instead.

The bullet points that I don’t think are ‘big’ enough to warrant individual posts are:

  1. Use JUnit.
  2. Use Hudson.
  3. Manage Dependencies.
  4. Communicate.

Let’s tackle them one by one. The first one, ‘Use JUnit’, is not so much intended to say that JUnit is the only unit testing framework out there (TestNG is as good, in my opinion). It is rather a statement about the importance of good automated tests when sharing code. The obvious motivation is that almost every conflicting change between two teams is a regression error and therefore possible to catch with automated tests. If each team ensures that the use cases they want from a shared library are tested automatically (note that I don’t call the tests unit tests; they are more functional than unit tests) with each build, they can guard their desired functionality from breaking due to changes made by another team. A functional test that is broken intentionally due to a change desired by one team should trigger communication between teams to ensure that it is changed in a way that works for all clients of the library.

I’ve never tried formalising the use of different sets of functional tests owned by different clients of a library as opposed to just having a single comprehensive set of unit tests. But it feels like a potentially quite attractive proposition, so it might be interesting to try. It might require some work in terms of getting it into the build infrastructure in a good way. I’d love to be able to see how that works at some point, but simply having a single comprehensive set of unit tests works really well in terms of guarding functionality, too.

‘Use Hudson’ says that continuous integration (CI) is vital when sharing code. It feels like everybody knows that these days, so I don’t think I need to make the case for CI in general. In the context of sharing libraries, the obvious benefit of CI is that you will detect failures sooner than if you just rely on individual developers’ builds. This is especially true of linkage-type errors. You’ll catch most errors that would break a unit test in the library you’re working on just by running the build locally, but CI servers tend to be better at checking that the library works with the latest snapshots of related libraries and vice versa. Of the CI servers I’ve used (including Continuum and CruiseControl), Hudson has been the best by a wide margin. Hudson’s strength relative to the others is primarily the ease of managing build lines – the way we use it, anybody can and does create and modify builds for some project almost weekly. I haven’t used the others in a couple of years, so this may have changed, but back then, what you can do in 30 seconds with Hudson used to take at least an hour, depending on how well you remembered the tricks to use.

I think that I touched on most of the arguments I wanted to make about ‘Manage Dependencies’ in the post I wrote titled Divide and Conquer. Essentially, the graph of dependencies between shared libraries that you introduce is something that is going to be very hard and expensive to change, so it is well worth spending some time thinking hard about what it should be like before you finalise it. The Divide and Conquer post contains some more detail on what makes it hard to evolve that graph as well as some tips about how to get it right.

The final point is ‘Communicate’. I sometimes think that communication is the hardest thing that two people can try to do, and of course it gets quadratically harder as you add more people. It is interesting to note how much of business hierarchies and processes are aimed at preventing or fixing communication problems. In the particular case of code sharing, the most important communication problems to solve are:

  • Proactive notifications – if one team is going to make a change to a shared library, many problems can easily be avoided if other teams are notified before those changes are made so that they get the opportunity to give feedback about how that change might affect them. At Shopzilla, we’re using a mailing list where each team is obliged to send three kinds of messages:
    • After each sprint planning session, a message saying either “We’re not planning to make any changes to shared code”, or “We’re anticipating making the following changes to shared code: a), b) and c)”. The point of always sending an email is that it is very easy to forget about this type of communication, so always having to do it should mean forgetting it less often.
    • If a need to make changes is detected later than sprint planning (which happens often), a specific notification of that.
    • If changes have been made by another team that led to problems, a description of the changes and problems. This is so that we can continuously improve, not in order to point fingers at people that misbehave.
  • Understanding requirements and determining correct solutions – it is often not obvious from just looking at some code why it has been implemented the way it is. In that scenario, it is important to have an easy way of getting hold of the person/people that have written the code to understand what requirements they were trying to meet when writing it so that one can avoid breaking things when making modifications. This is often made harder by client evolution: shared code may not be modified to remove some feature as clients stop using it, so dead code is relatively common. Again, I think that a mailing list (or one per some sub-category of shared code) is a useful tool.
  • Last but probably most important: a collaborative mindset – this is arguably not ‘just’ a communication problem, but it can definitely be a problem for communication. It is possible to get into a tragedy of the commons-type situation, where the shared code is mismanaged because everybody focuses primarily on their own products’ needs rather than the shared value. This can manifest itself in many ways, from poor implementations of changes in the shared code, to lack of responsiveness when there is a need for discussions and decisions about how to evolve it. To get the benefits of sharing, it is crucial that the teams sharing code want to and are allowed to spend enough time on shared concerns.

So, that concludes the code sharing series. In summary, it’s a great thing to do if done right, but there are a lot of things that can go wrong in ways that you might not expect beforehand – the benefits of sharing code are typically more obvious than the costs.


Unit Tests Are Good For You

Some 5-6 years ago, I became a mother, biologically implausible as it seems. At the time, I was CTO at Jadestone and that’s when I started to think that programmers writing their own unit tests is a paradigm shift of the same magnitude as object-orientation was in the 90s. So I started trying to convert the developers (and others) to the unit testing mentality, with varying success – some became as convinced as myself, others were very reluctant to write any tests at all, and most seemed to end up in some in-between state: “I guess it’s a good thing to do, and I will if there’s time, but…” It seems like a lot of developers end up somewhere in that middle state, with writing unit tests being something a bit analogous to cleaning your room every week. You know it’s a good thing to do, and you do it because Mum (yep, that’s me) is telling you you have to, but you don’t really want to.

Even at Shopzilla, where the builds fail unless a certain percentage of the code is covered by unit tests (a great practice despite the shortcomings of code coverage as a measure of test quality), there are issues with people’s motivation to write unit tests. Far too much of our code is being ‘exercised’ by the unit tests with no checking of expected results, and/or without sufficient branch coverage.

I remember creating slides at Jadestone with what I still think are really good reasons to write lots of unit tests: it improves the code quality by helping you find bugs, making your code testable forces you to design it in a modular way, having great regression tests gives you the courage to do the aggressive refactoring that is required for long-term productivity, and so on. But after a few years of working with it, I’ve come to the conclusion that the main reason really is – cover your ears, I can feel years of nagging coming out all at once:

IT’S MAKING *YOU* MORE PRODUCTIVE. DAMMIT!

People thinking “I will write unit tests if I have the time” or “when I’m done with the code” are just plain wrong. True, just writing the code needed for a feature or bug fix takes less time than also writing unit tests. But checking that the fix/feature works if it is a part of a slightly complex system? Starting or restarting one of the Shopzilla websites takes something like 3-7 minutes. Once you’re up, you can typically test almost anything by navigating to a single URL, but if you didn’t get the fix right, you have to re-fix, re-build and re-start. If the fix is in a shared library, you need to re-build and re-install the library, then re-build and re-start the site. This takes ages. Usually a lot more than the time it takes to verify the fix with a unit test.

And with multiplayer games, for instance, the situation is even harder. To verify a Shopzilla-style website bug, all you need is normally a single URL to reproduce the problem. With more complex clients, you may have to bring a couple of clients and a server into a given error state in order to reproduce/verify a bug. This can be a nightmare. Even with unit tests, you’ll obviously have to do this, but if you do TDD, you’ll have written a test that reproduces the observed error before trying to fix it. Seeing that the unit test used to be broken but isn’t any longer is great for your confidence and usually means you only need to do the work to reproduce the bug “in reality” once. I’m not sure it is necessary to write tests before you write all your code, but I think it is a good idea to try to do so. And you definitely don’t want to write all your unit tests after you’ve written all the feature code.
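As a sketch of that flow (with a hypothetical class and bug, purely for illustration): the test below is written first, reproduces the reported error and fails, and then passes once the fix is in place – leaving a regression guard behind:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PriceFormatterTest {

   // Hypothetical bug report: formatting a zero price used to throw an
   // ArithmeticException. Write this test first, watch it fail, then fix
   // PriceFormatter until it passes.
   @Test
   public void formatsZeroPriceWithoutCrashing() {
      assertEquals("$0.00", PriceFormatter.format(0));
   }
}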

I don’t try to convert people to unit testing as much as I used to, but if you do use unit tests properly, not only will you write code that is of higher quality, more modular, and easier to refactor to support future features and changes – you’ll be adding working features quicker, too. Really, kids, you should clean your rooms every week.
