Share Code Selectively

(This is item 1 in the code sharing cookbook)

Since shared code leads to free features, one might think that more sharing is always better. That is actually not true. Sharing code or technology between products has some very obvious benefits and some much less obvious costs. The sneakiness of the costs leads to underestimating them, which in turn can lead to broken attempts at sharing things. I’ll try to give my picture of what characterises things that are suitable for sharing and how to think about what not to share in this post. Note that the perspective I have is based on an organisation whose products are some kind of service (I’ve mostly been developing consumer-oriented web services for the last few years) as opposed to shrink-wrapped products, so probably a lot of what I say isn’t applicable everywhere.

These are some of the main reasons why you want to share code:

You get features for free – this is almost always the original reason why you end up having some shared code between different products. Product A is out there, and somebody realises that there is an opportunity to create product B which has some similarities with A. The fastest and cheapest way to get B out and try it is to build it on A, so let’s do that.
You get bug fixes for free – of course, if product A and B share code and a bug is fixed for product A, when B starts using the fixed version of the shared code, it is fixed for B as well.
Guaranteed consistent behaviour between products in crucial functional areas. This is typically important for backoffice-type functions, where, for instance, you want to make sure that all your products feed data into the data warehouse in a consistent way so the analysts can actually figure out how the products are doing using the same tools.
Using proven solutions and minimising risk. Freshly baked code is more likely to have bugs in it than stuff that has been around for a while.
Similarity of technology can typically reduce operational costs. The same skill sets and tools can be used to run most or all of your products and you can share expensive environments for performance testing, etc. This also has the effect of making it easier for staff to move between products as there is less new stuff to learn in order to get productive with your second product.

All of those reasons are very powerful and typically valid. But they need to be contrasted against some of the costs that you incur from sharing code:

More communication overhead and slower decision making. To change a piece of code, one needs to talk to many people to ensure that it doesn’t break their planned or existing functionality. Decisions about architecture may require days instead of minutes due to the need to coordinate multiple teams.
More complicated code. Code that needs to support a single product with a single way of doing things can be simpler than code that has to support multiple products with slight variations on how they do things. This complexity tends to increase over time. Also, every change has to be made with backwards compatibility in mind, which adds additional difficulties to working with the code.
More configuration management overhead. Managing different branches and dependencies between different shared libraries is time consuming, as are merges when they have to happen. Similarly, you need to be good at keeping track of which versions of shared libraries are used for a particular build.
More complex projects, especially when certain pieces of shared technology can only be modified by certain people or teams. If there are two product teams (A and B) and a team that delivers shared functionality (let’s call them ‘core’), the core team’s backlog needs to be prioritised based on both the needs of A and B. Also, both team A and B are likely to end up blocked waiting for changes to be made by the core team – during times like that, people don’t typically become idle, they just work on other things than what is really important, leading to reduced productivity and a lack of the ‘we can do anything’ energy that characterises a project that runs really well.
More mistakes – all the above things lead to a larger number of mistakes being made, which costs in terms of frustration and time taken to develop new features.

The problem with the costs is that they are insidious and sneak up on you as you share more and more stuff between more products, whereas the benefits are there from day 1 – especially on day 1, when you release your second product which is ‘almost the same’ as the first one and you want as many free features as you can get.

So sharing code can give you lots of important or even vital benefits, but done wrong, it can also make your organisation into a slow-moving behemoth stuck in quicksand due to the dependencies it creates between the products and the teams that should develop them. The diagram below shows how while shared libraries can be building blocks for constructing products, they also create ties between the products. These ties will need management to prevent them from binding your arms behind your back.

The way I think about it, products represent things that you make money from, so changing your products should lead to making more money. Shared code makes it possible for you to develop your products at a lower cost or with a lower risk, but reduces the freedom to innovate on the product level. So in the diagram, the blue verticals represent a money-making perspective and the red horizontals a cost-saving perspective. I guess there is something smart to be said about when in a product’s or organisation’s lifecycle one is more important than the other, but I don’t really know what – probably more mature organisations or products need more sharing and less freedom to develop?

Anyway, getting back to the core message of this post: I’m saying that even if code is identical between two products, that doesn’t necessarily mean that it should be shared. The reason is that it may well start out identical, but if it is likely to be something that one or both product teams will want to modify in the future to maximise their chances of making money, both products will be slowed down by having to worry about what their changes are doing to the other team. Good candidates for sharing tend to have:

A low rate of change – so the functionality is mature and not something you need to tweak frequently in order to tune your business or add features to your product.
A tight coupling to other parts of the company’s ecosystem – reporting/invoicing systems, etc. This usually means tight integration into business processes that are hard to change.
A high degree of generality – the extreme examples of such general systems are of course things like java.util.Set or log4j. Within a company, you can often find things that are very generic in the context of the business.

Of course, those three factors are related. I have found that simply looking at the first one by checking the average number of commits over some period of time gives a really good indication. If there are many changes, don’t share the code. If there are few, you might want to share it. I think the reason why it works is partly that rate of change is a very good indicator of generality and partly because if you try to share something that changes a lot, you’ll incur the costs of sharing very frequently.

Sharing is great, but it’s definitely not something that should be maximised at all costs. Any sharing of functionality between two products creates dependencies between them in the form of feature interaction that adds to the cost of features and schedule interaction between projects/teams that get blocked by each other due to changes to something that is shared.

It is often useful to think of sharing at different levels: maybe it isn’t a great idea to create a shared library from some code that you believe will be modified frequently by two different projects. As an alternative, you can still gain free features by copying and pasting that code from one product to the other, and then letting the two versions lead their own lives. So share code, but be smart about it and don’t share everything!

Code Sharing

This entry was posted on March 23, 2010, 08:34 and is filed under Code Management. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.

Petter Måhlén's Blog

Leave a comment Cancel reply

Archives

Blogs I read

Friends that blog

Tag Cloud

Email Subscription

Admin

Google profile

Petter Måhlén's Blog