Archive for November, 2013

Keeping Classes Simple using the SRP

The Single Responsibility Principle – that a class should have one and only one reason to change – is, I think, one of the most important principles to follow in order to make high quality and easily maintainable software. It’s also one of the principles that I see abused the most often (by myself as well as others). This post is cross-post of something I wrote for the internal Spotify blog, and talks about a technique to make it easier to adhere to the SRP and thus create simple classes, in the Rich Hickey sense: ‘easy’ is a subjective term describing the effort it takes some particular person to do something, ‘simple’ is an objective term describing the resulting artefact. Simple software is valuable (and complex software is costly and causes frustration), so it is very useful to have good tools that make it easy to create simple things.

I’ll describe a concrete technique that I think of as composing behaviours; I think it’s close to if not identical to the decorator pattern. I’ll be using some code I recently wrote in a library wrapping DNS lookups at Spotify. I needed to create something that would let us do SRV lookups, and if the results were empty or there was another kind of failure when doing the lookup, it should retain the previous data. This is to reduce the risk of service outages caused only by DNS failures. I also wanted to ensure that we had metrics on the time we spent on DNS lookups, the rate of failures, and how common empty responses were.

The most common reaction when faced with a problem description like that is to create a class, (DnsSrvResolver, let’s say) that implements those features. In a sense, this class does a single thing – DNS SRV lookups that are metered and whose results can be kept in case of problems. But I would argue that that ‘thing’ is too big, and it’s better to create multiple classes doing smaller things. Here’s how.

First, an interface (the code in the post is Java, but of course the technique can be applied to code in any programming language):

public interface DnsSrvResolver {
  List resolve(String fqdn);
}

Then, an implementation class that does the actual lookup of SRV records using a standard DNS library:

class XBillDnsSrvResolver implements DnsSrvResolver {
  private final LookupFactory lookupFactory;

  XBillDnsSrvResolver(LookupFactory lookupFactory) {
    this.lookupFactory = lookupFactory;
  }

  @Override
  public List resolve(final String fqdn) {
    Lookup lookup = lookupFactory.forName(fqdn);
    Record[] queryResult = lookup.run();

    if (lookup.getResult() != Lookup.SUCCESSFUL) {
      throw new DnsException(
        String.format("Lookup of '%s' failed with code: %d - %s ",
           fqdn, lookup.getResult(), lookup.getErrorString()));
    }

    return toHostAndPorts(queryResult);
  }

  private List toHostAndPorts(Record[] queryResult) {
    ImmutableList.Builder builder =
      ImmutableList.builder();

    if (queryResult != null) {
      for (Record record: queryResult) {
        if (record instanceof SRVRecord) {
          SRVRecord srvRecord = (SRVRecord) record;
          builder.add(HostAndPort.fromParts(
                srvRecord.getTarget().toString(),
                srvRecord.getPort())
          );
        }
      }
    }

    return builder.build();
  }
}

So far, nothing out of the ordinary. Here comes the trick: now, add the metrics-tracking and result-storing behaviours in new classes that delegate the actual DNS lookup to some other, unknown class. Metrics can be done like this:

class MeteredDnsSrvResolver implements DnsSrvResolver {
  private final DnsSrvResolver delegate;
  private final Timer timer;
  private final Counter failureCounter;
  private final Counter emptyCounter;

  MeteredDnsSrvResolver(DnsSrvResolver delegate,
                        Timer timer,
                        Counter failureCounter,
                        Counter emptyCounter) {
    this.delegate = delegate;
    this.timer = timer;
    this.failureCounter = failureCounter;
    this.emptyCounter = emptyCounter;
  }

  @Override
  public List<HostAndPort> resolve(String fqdn) {
    final TimerContext context = timer.time();
    boolean success = false;

    try {
      List<HostAndPort> result = delegate.resolve(fqdn);
      if (result.isEmpty()) {
        emptyCounter.inc();
      }
      success = true;
      return result;
    }
    finally {
      context.stop();
      if (!success) {
        failureCounter.inc();
      }
    }
  }
}

And retaining old data like this:

class RetainingDnsSrvResolver implements DnsSrvResolver {
  private final DnsSrvResolver delegate;
  private final Map<String, List<HostAndPort>> cache;

  RetainingDnsSrvResolver(DnsSrvResolver delegate) {
    this.delegate = delegate;
    cache = new ConcurrentHashMap<String, List<HostAndPort>>();
  }

  @Override
  public List<HostAndPort> resolve(final String fqdn) {
    try {
      List<HostAndPort> nodes = delegate.resolve(fqdn);

      if (nodes.isEmpty()) {
        nodes = firstNonNull(cache.get(fqdn), nodes);
      }
      else {
        cache.put(fqdn, nodes);
      }

      return nodes;
    }
    catch (Exception e) {
      if (cache.containsKey(fqdn)) {
        return cache.get(fqdn);
      }

      throw Throwables.propagate(e);
    }
  }
}

Note how small and simple the classes are. That makes them very easy to test – only the top level one requires complex setup, due to the nature of the dnsjava library. The others are easily tested like so:

  @Before
  public void setUp() throws Exception {
    delegate = mock(DnsSrvResolver.class);

    resolver = new RetainingDnsSrvResolver(delegate);

    nodes1 = nodes("noden1", "noden2");
    nodes2 = nodes("noden3", "noden5", "somethingelse");
  }

  @Test
  public void shouldReturnResultsFromDelegate() throws Exception {
    when(delegate.resolve(FQDN)).thenReturn(nodes1);

    assertThat(resolver.resolve(FQDN), equalTo(nodes1));
  }

  @Test
  public void shouldReturnFreshResultsFromDelegate() throws Exception {
    when(delegate.resolve(FQDN))
      .thenReturn(nodes1)
      .thenReturn(nodes2);

    resolver.resolve(FQDN);

    assertThat(resolver.resolve(FQDN), equalTo(nodes2));
  }

  @Test
  public void shouldRetainDataIfNewResultEmpty() throws Exception {
    when(delegate.resolve(FQDN))
      .thenReturn(nodes1)
      .thenReturn(nodes());

    resolver.resolve(FQDN);

    assertThat(resolver.resolve(FQDN), equalTo(nodes1));
  }

  @Test
  public void shouldRetainDataOnFailure() throws Exception {
    when(delegate.resolve(FQDN))
      .thenReturn(nodes1)
      .thenThrow(new DnsException("expected"));

    resolver.resolve(FQDN);

    assertThat(resolver.resolve(FQDN), equalTo(nodes1));
  }
  // etc..

The fact that the functionality is implemented using several different classes is an implementation detail that shouldn’t leak through to users of the library. (You may have noticed that the implementation classes are package-private.) So to make it easy for clients, I like to provide a simple fluent API, something like this perhaps:

    DnsSrvResolver resolver =
        DnsSrvResolvers.newBuilder()
          .cachingLookups(true)
          .metered(true)
          .retainingDataOnFailures(true)
          .build();

In summary, the advantages of splitting up behaviours into different classes are:

  1. Each class becomes really simple, which makes it easy to test them exhaustively. Some of the code above does non-trivial things, but testing it is still easy.
  2. Since each behaviour lives in a single place only, it’s trivial to, for instance, change the way metrics are recorded, or to replace the underlying DNS library without affecting the rest of the chain.
  3. Each class becomes more understandable because it is so small. Since behaviours are loosely coupled, the amount of stuff you need to keep ‘in working memory’ to work with a class is minimised.
  4. Varying behaviour in different situations becomes easier as it becomes a matter of composing different classes rather than having conditional logic in a single class. Basically, only the builder implementation in the API example above needs to worry about the value of ‘metered’. The DnsSrvResolver implementation doesn’t need to do a conditional check on ‘if (metered)’.

Doing this is not completely free, though:

  1. With more classes, you get more code to search through if you want to find out how, for instance, data is retained. It may be hard to understand the more dynamic run-time object graph that is behind the DnsSrvResolver you’re using. If the whole picture is there statically in code, it’s easier to find it.
  2. Instead of having complexity within a class, you have (more) complex interactions between instances of different classes. The complexity doesn’t go away entirely, it is moved.

Things that should get you thinking about whether you should split out behaviours into different classes include having too many (> 3-4 or so) collaborators injected into the constructor, or using the word ‘and’ when describing the ‘single’ responsibility of a class. Describing a class as “doing SRV lookups and retaining data and collecting statistics” is a giveaway that you don’t actually think of it as doing one thing.

In my experience, the benefits of using this technique very often outweigh the costs. It is easier to make stable software from a number of small, simple, composable building blocks than to do it using larger, more complex building blocks. Dealing with complexity in how things are put together is easier than trying to compose complex things. Writing code like this doesn’t take longer than writing it in a single class – the only added difficulty is in learning to see when there might be a need to split things out in order to make them simpler.

,

1 Comment