Posts Tagged Code Sharing
As I suspected, when I planned to write a series of posts about code sharing, I’ve realised that I won’t write them all. The main reason is that I started out with the juiciest bits, where I felt I had something interesting to say, and the rest of the subjects feel too dry and I don’t think I can write interesting posts about them individually. So I’ll lump them together and describe briefly what I mean by them in this wrap-up post instead.
The bullet points that I don’t think are ‘big’ enough to warrant individual posts are:
- Use JUnit.
- Use Hudson.
- Manage Dependencies.
Let’s tackle them one by one. The first one, ‘Use JUnit’, is not so much intended to say that JUnit is the only unit testing framework out there (TestNG is as good, in my opinion). It is rather a statement about the importance of good automated tests when sharing code. The obvious motivation is that almost every conflicting change between two teams is a regression error and therefore possible to catch with automated tests. If each team ensures that the use cases they want from a shared library are tested automatically (note that I don’t call the tests unit tests; they are more functional than unit tests) with each build, they can guard their desired functionality from breaking due to changes made by another team. A functional test that is broken intentionally due to a change desired by one team should trigger communication between teams to ensure that it is changed in a way that works for all clients of the library.
I’ve never tried formalising the use of different sets of functional tests owned by different clients of a library as opposed to just having a single comprehensive set of unit tests. But it feels like a potentially quite attractive proposition, so it might be interesting to try. It might require some work in terms of getting it into the build infrastructure in a good way. I’d love to be able to see how that works at some point, but simply having a single comprehensive set of unit tests works really well in terms of guarding functionality, too.
‘Use Hudson’ says that continuous integration (CI) is vital when sharing code. It feels like everybody knows that these days, so I don’t think I need to make the case for CI in general. In the context of sharing libraries, the obvious benefit of CI is that you will detect failures sooner than you would have if you just rely on individual developers’ builds. This is especially true of linkage-type errors. You’ll catch most errors that would break a unit test in the library you’re working on by just running the build locally, but CI servers tend to be better at checking that the library works with the latest snapshots of related libraries and vice versa. Of the CI servers I’ve used (includes Continuum and Cruise Control), Hudson has been by a wide margin the best. Hudson’s strength relative to the others is primarily in the ease of managing build lines – the way we use it, anybody can and does create and modify builds for some project almost weekly. I haven’t used the others in a couple of years, so it may have changed, but earlier what you can do in 30 seconds with Hudson used to take at least an hour or more depending on how well you remember the tricks to use with them.
I think that I touched on most of the arguments I wanted to make about ‘Manage Dependencies’ in the post I wrote titled Divide and Conquer. Essentially, the graph of dependencies between shared libraries that you introduce is something that is going to be very hard and expensive to change, so it is well worth spending some time thinking hard about what it should be like before you finalise it. The Divide and Conquer post contains some more detail on what makes it hard to evolve that graph as well as some tips about how to get it right.
The final point is ‘Communicate’. I sometimes think that communication is the hardest thing that two people can try to do, and of course it gets quadratically harder as you add more people. It is interesting to note how much of business hierarchies and processes are aimed at preventing or fixing communication problems. In the particular case of code sharing, the most important communication problems to solve are:
- Proactive notifications – if one team is going to make a change to a shared library, many problems can easily be avoided if other teams are notified before those changes are made so that they get the opportunity to give feedback about how that change might affect them. At Shopzilla, we’re using a mailing list where each team is obliged to send three kinds of messages:
- After each sprint planning session, a message saying either “We’re not planning to make any changes to shared code”, or “We’re anticipating making the following changes to shared code: a), b) and c)”. The point of always sending an email is that it is very easy to forget about this type of communication, so always having to do it should mean forgetting it less often.
- If a need to make changes is detected later than sprint planning (which happens often), a specific notification of that.
- If changes have been made by another team that led to problems, a description of the changes and problems. This is so that we can continuously improve, not in order to point fingers at people that misbehave.
- Understanding requirements and determining correct solutions – it is often not obvious from just looking at some code why it has been implemented the way it is. In that scenario, it is important to have an easy way of getting hold of the person/people that have written the code to understand what requirements they were trying to meet when writing it so that one can avoid breaking things when making modifications. This is often made harder by client evolution: shared code may not be modified to remove some feature as clients stop using it, so dead code is relatively common. Again, I think that a mailing list (or one per some sub-category of shared code) is a useful tool.
- Last but probably most important: a collaborative mindset – this is arguably not ‘just’ a communication problem, but it can definitely be a problem for communication. It is possible to get into a tragedy of the commons-type situation, where the shared code is mismanaged because everybody focuses primarily on their own products’ needs rather than the shared value. This can manifest itself in many ways, from poor implementations of changes in the shared code, to lack of responsiveness when there is a need for discussions and decisions about how to evolve it. To get the benefits of sharing, it is crucial that the teams sharing code want to and are allowed to spend enough time on shared concerns.
So, that concludes the code sharing series. In summary, it’s a great thing to do if done right, but there’s a lot of things that can go wrong in ways that you might not expect beforehand – the benefits of sharing code are typically more obvious than the costs.
When I was at Jadestone, one of the objectives that the CEO explicitly gave me was to come up with a great way to share code between our products. I spent a fair amount of time thinking about and working on how to do that, but I left the company before most of the ideas came to be fully used throughout the company. Just over a year later, I joined Shopzilla, and found that to a very large extent the same ideas that I had introduced as concepts at Jadestone were being used or recommended in practice there. So a lot of the articles I’ve written about code sharing describe ideas and practices from Jadestone and Shopzilla, although focusing specifically on things I personally find important. This one will cover some areas were I think we still have a bit of work to do to really nail things at Shopzilla, although we probably have the beginnings of many of the practices down.
You can’t include all your code into every top-level product you build. This means you’ll need to break your ecosystem down into some sort of sub-structures and pick and choose the parts you want to use in each product. I think it makes sense to have three kinds of overlapping structures: libraries, layers and services. Libraries are the atoms of sharing and since you use Maven, one library corresponds to a single artifact. A library will build on functionality provided by other libraries, and to manage this, they should be organised into layers, where libraries in one layer are more general than libraries in layers above. A lot of the time, you’ll want to have a coarser structure than libraries, for both code management, architectural and operational reasons, and that’s when you create a service that provides some sort of function that will be used in the top-level products.
I’ve found that a lot of the time, code sharing isn’t planned, but emerges. You start with one product, and then there is an opportunity to create a new product that is similar to the first, so you build it on parts of the first. This means that you normally haven’t got a carefully planned set of libraries with well-defined and thought out dependencies between each other and can give rise to some problems:
- Over-large and incoherent libraries. Typical indications of this are a high rate of change and associated high frequency of conflicting changes, forced inclusion of code that you don’t really need into certain builds because stuff you need is packaged with unrelated and irrelevant other code, and difficulties figuring out what dependencies you should have and where to add code for some new feature.
- Shared libraries that contain code that isn’t really a good candidate for sharing. Trying to include that code in different products typically leads to a snarled code in the library with lots of specific conditions or APIs that haven’t made a decision on what their clients actually should use them for. Sometimes, the library will contain some code that is perfect for sharing, and some that definitely isn’t.
- A poor dependency structure: a library that is perfect for sharing might have a dependency on one that you would prefer to leave as product-specific, or there might be circular dependencies between libraries.
Once a library structure is in place and used by multiple teams, it is hard to change because a) it involves making many backwards-incompatible changes, b) it is work that is boring and difficult for developers, c) it gives no short-term value for business owners, and d) it requires cross-team schedule coordination. However, it’s one of those things where the longer you leave it, the higher the accumulated costs in terms of confusion and lack of synergies, so if you’re reasonably sure that the code in question will live for a long time, it’s is likely to eventually be a worthwhile investment to make. It is possible to do at least some parts of a library restructuring incrementally, although there are in my experiences some cases where you’ll bring a few people to a total standstill for a month or so while cleaning up some particular mess. That sort of situation is of course particularly hard to get resolved. It requires discipline, coordination, and above all, a clear understanding of the reasons for making the change – those reasons cannot be as simple as ‘the code needs to be clean’, there should be a cost-benefit analysis of some kind if you want to be really professional about it. If you invest a man-month in cleaning up your library structure, how long will it take before you’ve recouped that man-month?
Given that it is possible to go wrong with your shared library structure, what are the characteristics of a library structure that has gone right? Here’s a couple more bullet points and a diagram:
- Libraries are coherent and of the right size. There are some opposing forces that affect what is “the right library size”: smaller libraries are awkward to work with from a source management, information management and documentation perspective since the smaller the libraries are, the more of them you need. This means you’ll have more or more complicated IDE projects and build files, and you’ll have a larger set of things to search in order to find out where a particular feature is implemented. On the other hand, larger libraries suffer from a lack of purpose and coherence, which makes conflicts between teams sharing them more likely, increases their rate of change, and makes it harder to describe what they exist to do. All these things make them less suitable for sharing. I think you want to have libraries that are as fine-grained as you can make them, without making it too hard to get an overview over which libraries are available, what they are used for and where to find the code you want to make a change to.
- There’s a clear definition of what type of information and logic goes where in the layered structure. At the bottom, you’ll find super-generic things like logging and monitoring code. Slightly higher up, you’ll typically have things that are very central to the business: normally anything that relates to money or customers, where you’ll want to ensure that all products work the same way. Further upwards, you can find things that are shared within a given product category, and if you have higher level libraries, they are quite likely to be product-specific, so maybe not candidates for sharing at all.
- In addition to the horizontal structure defined by the layers, there is a coarse-grained vertical structure provided by services. Services group together related functions and are usually primarily introduced for operational reasons – for instance, the need to scale a certain set of features independently of some other set. But they also add to architectural clarity in that they provide a simpler view on their function set, allowing products to share implementations of certain features without being linked using the same code. Services also simplify code sharing in the way that they provide isolation: you can have separate teams develop services and clients in parallel as long as you have a sufficiently well-defined service API.
Structuring your code in a way that is conducive to sharing is a good thing, but it is also hard. I particularly struggle with the fact that it is very hard to be agile about it: you can’t easily “inspect and adapt”, because of the difficulty of changing a library structure once it is in place. The best opportunity to put a good structure in place is when you start sharing some code between two products, but at that time it is very hard to foresee future developments (how is product number three going to be similar to or different from products 1 and 2?). Defining a coarse layered structure based on expected ‘genericness’ and making libraries small enough to be coherent is probably the best way to get the structure approximately right.
Maven’s slow progress towards becoming the most accepted Java build tool seems to continue, although a lot of people are still annoyed enough with its numerous warts to prefer Ant or something else. My personal opinion is that Maven is the best build solution for Java programs that is out there, and as somebody said – I’ve been trying to find the quote, but I can’t seem to locate it – when an Ant build is complicated, you blame yourself for writing a bad build.xml, but when it is hard to get Maven to behave, you blame Maven. With Ant, you program it, so any problems are clearly due to a poorly structured program. With Maven you don’t tell it how to do things, you try to tell it what should be done, so any problems feel like the fault of the tool. The thing is, though, that Maven tries to take much more responsibility for some of the issues that lead to complex build scripts than something like Ant does.
I’ve certainly spent a lot of time cursing poorly written build scripts for Ant and other tools, and I’ve also spent a lot of time cursing Maven when it doesn’t do what I want it to. But the latter time is decreasing as Maven keeps improving as a build tool. There’s been lots of attempts to create other tools that are supposed to make builds easier than Maven, but from what I have seen, nothing has yet really succeeded to provide a clearly better option (I’ve looked at Buildr and Raven, for instance). I think the truth is simply that the build process for a large system is a complex problem to solve, so one cannot expect it to be free of hassles. Maven is the best tool out there for the moment, but will surely be replaced by something better at some point.
So, using Maven isn’t going to be problem-free. But it can help with a lot of things, particularly in the context of sharing code between multiple teams. The obvious thing it helps with is the single benefit that most people agree that Maven has – its way of managing dependencies and the massive repository infrastructure and dependency database that is just available out there. On top of that, building Maven projects in Hudson is dead easy, and there’s a whole slew of really nice tools that come with Maven plugins that you can use that enable you to get all kinds of reports and metadata about your code. My current favourite is Sonar, which is great if you want to keep track of how your code base evolves from some kind of aggregated perspective.
Here are some things you’ll want to do if you decide to use Maven for the various projects that make up your system:
- Use Nexus as an internal repository for build artifacts.
- Use the Maven Release plugin to create releases of internal artifacts.
- Create a shared POM for the whole code base where you can define shared settings for your builds.
The word ‘repository’ is a little overloaded in Maven, so it may be confusing. Here’s a diagram that explains the concept and shows some of the things that a repository manager like Nexus can help you with:
The setup includes a Git server (because you use Git) for source control, a Hudson server (or set of) that does continuous integration, a Nexus-managed artifact repository and a developer machine. The Nexus server has three repositories in it: internal releases, internal snapshots and a cache of external repositories. The latter is only there as a performance improvement. The other two are the way that you distribute Maven artifacts within your organisation. When a Maven build runs on the Hudson or developer machines, Maven will use artifacts from the local repository on the machine – by default located in a folder under the user’s home directory. If a released version of an artifact isn’t present in the local repository, it will be downloaded from Nexus, and snapshot versions will periodically be refreshed, even if present locally. In the example setup, new snapshots are typically deployed to the Nexus repository by the Hudson server, and released versions are typically deployed by the developer producing the release. Note that both Hudson and developers are likely to install snapshots to the local repository.
I’ve tried a couple of other repository managers (Archiva, Artifactory and Maven-Proxy), but Nexus has been by a pretty wide margin the best – robust, easy to use and easy to understand. It’s been a year or two since I looked at the other ones, so they may have improved since.
Having an internal repository opens up for code sharing by providing a uniform mechanism for distributing updated versions of internal libraries using the standard Maven deploy command. Maven has two types of artifact versions: releases and snapshots. Releases are assumed to be immutable and snapshots mutable, so updating a snapshot in the internal repository will affect any build that downloads the updated snapshot, whereas releases are supposed to be deployed to the internal repository once only – any subsequent deployments should deploy something that is identical. Snapshots are tricky, especially when branching. If you create two branches of the same library and fail to ensure that the two branches have different snapshot versions, the two branches will interfere,
There is interference between the two branches because they both create updates to the same artifact in the Maven repositories. Depending on the ordering of these updates, builds may succeed or fail seemingly at random. At Shopzilla, we typically solve this problem in two ways: for some shared projects, where we have long-lived/permanent team-specific branches, the team name is included in the version number of the artifact, and for short-lived user story branches, the story ID is included in the version number. So if I need to create a branch off of version 2.3-SNAPSHOT for story S3765, I’ll typically label the branch S3765 and change the version of the Maven artifact to 2.3-S3765-SNAPSHOT. The Maven release plugin has a command that simplifies branching, but for whatever reason, I never seem to use it. Either way, being careful about managing branches and Maven versions is necessary.
A situation where I do use the maven release plugin a lot is when making releases of shared libraries. I advocate a workflow where you make a new release of your top-level project every time you make a live update, and because you want to make live updates frequently and you use scrum, that means a new Maven release with every iteration. To make a Maven release of a project, you have to eliminate all snapshot dependencies – this is a necessary requirement for immutability – so releasing the top level project means make release versions of all its updated dependencies. Doing this frequently reduces the risk of interference between teams by shortening the ‘checkout, modify, checkin’ cycle.
See the pom file example below for some hands-on pom.xml settings that are needed to enable using the release plugin.
The final tip for code sharing using Maven that I wanted to give is to use a shared parent POM that contains settings that should be shared between projects. The main reason is of course to reduce code duplication – any build file is code, of course, and Maven build files are not as easy to understand as one would like, so simplifying them is very valuable. Here’s some stuff that I think should go into a shared pom.xml file:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>com.mycompany</groupId> <artifactId>shared-pom</artifactId> <name>Company Shared pom</name> <version>1.0-SNAPSHOT</version> <packaging>pom</packaging> <!-- One of the things that is necessary in order to be able to use the release plugin is to specify the scm/developerConnection element. I usually also specify the plain connection, although I think that is only used for generating project documentation, a Maven feature I don't find particularly useful personally. A section like this needs to be present in every project for which you want to be able to use the release plugin, with the project- specific Git URL. --> <scm> <connection>scm:git:git://GITHOST/GITPROJECT</connection> <developerConnection>scm:git:git://GITHOST/GITPROJECT</developerConnection> </scm> <build> <!-- Use the plugins section to define Maven plugin configurations that you want to share between all projects. --> <plugins> <!-- Compiler settings that are typically going to be identical in all projects. With a name like Måhlén, you get particularly sensitive to using the only useful character encoding there is.. ;) --> <plugin> <artifactId>maven-compiler-plugin</artifactId> <configuration> <source>1.6</source> <target>1.6</target> <encoding>UTF-8</encoding> </configuration> </plugin> <!-- Tell Maven to create a source bundle artifact during the package phase. This is extremely useful when sharing code, as the act of sharing means you'll want to create a relatively large number of smallish artifacts, so creating IDE projects that refer directly to the source code is unmanageable. But the Maven integration of a good IDE will fetch the Maven source bundle if available, so if you navigate to a class that is included via Maven from your top-level project, you'll still see the source version - and even the right source version, because you'll get what corresponds to the binary that has been linked. --> <plugin> <artifactId>maven-source-plugin</artifactId> <executions> <execution> <phase>package</phase> <goals> <goal>jar</goal> </goals> </execution> </executions> </plugin> <!-- Ensure that a javadoc jar is being generated and deployed. This is useful for similar reasons as source bundle generation, although to a lesser degree in my opinion. Javadoc is great, but the source is always up to date. --> <plugin> <artifactId>maven-javadoc-plugin</artifactId> <executions> <execution> <phase>package</phase> <goals> <goal>jar</goal> </goals> </execution> </executions> </plugin> <!-- The below configuration information was necessary to ensure that you can use the maven release plugin with Git as a version control system. The exact version numbers that you want to use are likely to have changed since then, and it may even be that Git support is more closely integrated nowadays, so less explicit configuration is needed - I haven't tested that since maybe March 2009. --> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-release-plugin</artifactId> <dependencies> <dependency> <groupId>org.apache.maven.scm</groupId> <artifactId>maven-scm-provider-gitexe</artifactId> <version>1.1</version> </dependency> <dependency> <groupId>org.codehaus.plexus</groupId> <artifactId>plexus-utils</artifactId> <version>1.5.7</version> </dependency> </dependencies> </plugin> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-scm-plugin</artifactId> <version>1.1</version> <dependencies> <dependency> <groupId>org.apache.maven.scm</groupId> <artifactId>maven-scm-provider-gitexe</artifactId> <version>1.1</version> </dependency> <dependency> <groupId>org.codehaus.plexus</groupId> <artifactId>plexus-utils</artifactId> <version>1.5.7</version> </dependency> </dependencies> </plugin> </plugins> </build> <!-- Configuration of internal repositories so that the sub-projects know where to download internally created artifacts from. Note that due to a bootstrapping issue, this configuration needs to be duplicated in individual projects. This file, the shared POM, is available from the Nexus repo, but if the project POM doesn't contain the repo config, the project build won't know where to download the shared POM. --> <repositories> <!-- internal Nexus repository for released artifacts --> <repository> <id>internal-releases</id> <url>http://NEXUSHOST/nexus/content/repositories/internal-releases</url> <releases><enabled>true</enabled></releases> <snapshots><enabled>false</enabled></snapshots> </repository> <!-- internal Nexus repository for SNAPSHOT artifacts --> <repository> <id>internal-snapshots</id> <url>http://NEXUSHOST/nexus/content/repositories/internal-snapshots</url> <releases><enabled>false</enabled></releases> <snapshots><enabled>true</enabled></snapshots> </repository> <!-- Nexus repository cache for third party repositories such as ibiblio. This is not necessary, but is likely to be a performance improvement for your builds. --> <repository> <id>3rd party</id> <url>http://NEXUSHOST/nexus/content/repositories/thirdparty/</url> <releases><enabled>true</enabled></releases> <snapshots><enabled>false</enabled></snapshots> </repository> </repositories> <distributionManagement> <!-- Defines where to deploy released artifacts to --> <repository> <id>internal-repository-releases</id> <name>Internal release repository</name> <url>URL TO NEXUS RELEASES REPOSITORY</url> </repository> <!-- Defines where to deploy artifact snapshot to --> <snapshotRepository> <id>internal-repository-snapshot</id> <name>Internal snapshot repository</name> <url>URL TO NEXUS SNAPSHOTS REPOSITORY</url> </snapshotRepository> </distributionManagement> </project>
The less pleasant part of using Maven is that you’ll need to learn more about Maven’s internals than you’d probably like, and you’ll most likely stop trying to fix your builds not when you’ve understood the problem and solved it in the way you know is correct, but when you’ve arrived at a configuration that works through trial and error (as you can see from my comments in the example pom.xml above). The benefits you’ll get in terms of simplifying the management of build artifacts across teams and actually also simplifying the builds themselves outweigh the costs of the occasional hiccup, though. A typical top-level project at Shopzilla links in around 70 internal artifacts through various transitive dependencies – managing that number of dependencies is not easy unless you have a good tool to support you, and dependency management is where Maven shines.
(This is item 3 in the code sharing cookbook)
Today, no CV comes along without “Scrum” and “Agile” on it and Scrum has gained acceptance so quickly that it should definitely set off any hype-warning systems. I personally think there’s a lot more to Scrum than just a hype – if you do it right, it’s extremely useful for productivity. There are some aspects of Scrum done right that are particularly valuable for sharing code:
- Empowering teams to deliver end-to-end functionality.
- (Relatively) short iterations.
- Delivering potentially shippable results with every sprint.
The biggest one is empowering teams to deliver fully functioning features. The way I think that affects code sharing is that rather than having your organisation focus on developing technology horizontals, you aim at developing feature verticals.
As much as feasible, Scrum tells you to have teams whose focus is to deliver the blue verticals in the above diagram. The red horizontals will be developed and maintained as a consequence of needs driven by the products. At the other end of the spectrum, your teams are aligned along the libraries or technology components, with each team responsible for one or more services or libraries. Obviously, you can vary your focus from totally red to totally blue, and you get different advantages and disadvantages depending on where you are. At the ‘red’ end of the spectrum, with teams very much aligned along technical component lines, you get the following advantages and disadvantages:
- Teams get a very deep knowledge of their components leading to solid, strong technology.
- Teams get a strong feeling of ownership of components, and can take a long view when developing them.
- You need to align product roadmaps for different products so that they can take advantage of features and changes made in the underlying libraries.
- You will get queues and blockages in the product development, where one team is waiting for the result of another team’s work, or where one team has finished its work and the result “sits on a shelf” until the downstream team is ready to pick it up. (This is what Lean tells you to avoid.)
- Most teams are not customer-facing, meaning that their priorities tend to shift from what is important to the business to what is important to their component. This in turn increases the risk of developing technology for its own sake rather than due to a business need.
At the ‘blue’ end, on the other hand, shared code is collectively owned by multiple teams and you get a situation where:
- Teams rarely need to wait for others in order to get their features finished and launched. This is great for innovation.
- The fact that there is never any waiting and the teams are typically in control of their own destiny is energising: nobody else is to blame for failures or gets credit for successes. Work done leads to something visible. This makes it more rewarding, fun and efficient.
- Features and changes that are developed are typically well aligned with business priorities.
- There is a real risk of under-investing in shared technology. Larger restructuring tasks may never happen because no single team owns the responsibility for technology components.
- There is no roadmap for individual components, which can lead to bloat and sprawl.
- The lack of continuity increases the risk that it is unclear how some feature was intended to work and the reasons why it was implemented in a certain way are more likely to be forgotten.
- Developers (by which I don’t mean just programmers, but all team members) need to be jacks-of-all-trades and risk being masters of none.
I think that the best solution in general is to get redder the deeper you go in the technology stack, because that’s where you have more complex and general technologies that a) need deep knowledge to develop and b) typically don’t affect end-user functionality very directly. I also think that most organisations I’ve seen have been too red. Having product-specific teams that are allowed to make modifications to much of the shared codebase allows you to develop your business-driving products quickly and based on their individual needs. So there should be more focus on products and teams that can develop features end-to-end, just as Scrum tells us!
The next thing that makes Scrum great for sharing code is the combination of time-limited sprints and the focus on delivering finished code at each iteration. The ‘deliver potentially shippable code’ bit seems to be one of the hardest things about Scrum, even though it is actually quite trivial as a concept: if you don’t feel like you could launch the code demoed at the end of the sprint the day after, don’t call the story done, and don’t grant yourself any story points for it. That way, you’ll not be able to let anything out of your sights until it is shippable and your velocity will be reduced until you’re great at getting things really ready – which is exactly right! If it is difficult for your team to take things all the way to potentially shippable because of environmental or process problems, then fix those issues until it is easy.
Actually finishing things is great for productivity in general, but it is even better in a code-sharing context. To illustrate how, I’ll use one of my favourite diagrams:
Assume you have three features to complete, each of which requires three tasks to be done. Each of the tasks takes one day, and you can only do one task at a time. If, as above, you start each feature as early as possible, all the code that is touched by feature A is in a ‘being modified’ state for 7 days (from the start of A1 to the end of A3), and the same applies to features B and C. In the second version below, the corresponding time is 3 days per feature, meaning that the risk that another team will need to make modifications concurrently is less than half of the first version.
Also, having any updates to shared code completed quickly means that the time between merge opportunities is decreased, which decreases the risks of isolation. You really want to shorten the branch – modify – merge back cycle as much as possible, and Scrum’s insistence on getting things production-ready within the time period of a single sprint (which is supposed to be less than 4 weeks – I’ve found that 2 or 3 weeks seems to work even better in most situations) is a great support in pushing for a short such cycle.
I think Scrum is great in general, and done right, it also helps you with sharing code. Jeff Sutherland said in a presentation I attended that the requirement to produce potentially shippable code with each iteration is the hardest requirement in the Nokia test. I can reluctantly understand why that’s the case. It’s not conceptually hard and unlike some classes of technical problem, it doesn’t require any exceptional talent to succeed with. What makes it hard is that it requires discipline and an environment where it is OK to flag up problems without management seeing that as being obstructive. It’s worth doing, so don’t allow yourself any shortcuts, and fix every process or environment obstacle that stands in the way of producing shippable code with each sprint. Combine that with teams that are empowered to own and modify pretty much all the code that makes up their product, and you’ve come a long way towards a great environment for code sharing.
(This is item 2 in the code sharing cookbook)
Joel Spolsky used what seems to be his last blog post to talk about Git and Mercurial. I like his description of their main benefit as being that they track changes rather than revisions, and like him, I don’t particularly like the classification of them as distributed version control systems. As I’ve mentioned before, the ‘distributed’ bit isn’t what makes them great. In this post, I’ll try to explain why I think that Git is a great VCS, especially for sharing code between multiple teams – I’ve never used Mercurial, so I can’t have any opinions on it. I will use SVN as the counter-example of an older version control system, but I think that in most of the places where I mention SVN, it could be replaced by any other ‘centralised’ VCS.
The by far biggest reason to use Git when sharing code is its support for branching and merging. The main issue at work here is the conflict between two needs: teams need to have complete control of their code and environments in order to be effective in developing their features, and the overall need to detect and resolve conflicting changes as quickly as possible. I’ll probably have to explain a little more clearly what I mean by that.
Assume that Team Red and Team Blue are both working on the same shared library. If they push their changes to the exact same central location, they are likely to interfere with each other. Builds will break, bugs will be introduced in parts of the code supposedly not touched, larger changes may be impossible to make and there will be schedule conflicts – what if Team Blue commits a large and broken change the day before Team Red is going to release? So you clearly want to isolate teams from each other.
On the other hand, the longer the two teams’ changes are isolated, the harder it is to find and resolve conflicting changes. Both volume of change and calendar time are important here. If the volume of changes made in isolation is large and the code doesn’t work after a merge, the volume of code to search in order to figure out the problem is large. This of course makes it a lot harder to figure out where the problem is and how to solve it. On top of that, if a large volume of code has been changed since the last merge, the risk that a lot of code has been built on top of a faulty foundation is higher, which means that you may have wasted a lot of effort on something you’ll need to rewrite.
To explain how long calendar time periods between merges are a problem, imagine that it takes maybe a couple of months before a conflict between changes is detected. At this time, the persons who were making the conflicting changes may no longer remember exactly how the features were supposed to work, so resolving the conflicts will be more complicated. In some cases, they may be working in a totally different team or even have left the company. If the code is complicated, the time when you want to detect and fix the problem is right when you’re in the middle of making the change, not even a week or two afterwards. Branches represent risk and untestable potential errors.
So there is a spectrum between zero isolation and total isolation, and it is clear that the extremes are not where you want to be. That’s very normal and means you have a curve looking something like this:
You have a cost due to team interference that is high with no isolation and is reduced by introducing isolation, and you have a corresponding cost due to the isolation itself that goes up as you isolate teams more. Obviously the exact shape of the curves is different in different situations, but in general you want to be at some point between the extremes, close to the optimum, where teams are isolated enough for comfort, yet merges happen soon enough to not allow the conflict troubles to grow too large.
So how does all that relate to Git? Well, Git enables you to fine-tune your processes on the X axis in this diagram by making merges so cheap that you can do them as often as you like, and through its various features that make it easier to deal with multiple branches (cherry-picking, the ability to identify whether or not a particular commit has gone into a branch, etc.). With SVN, for instance, the costs incurred by frequent merges are prohibitive, partly because making a single merge is harder than with Git, but probably even more because SVN can only tell that there is a difference between two branches, not where the difference comes from. This means that you cannot easily do intermediate merges, where you update a story branch with changes made on the more stable master branch in order to reduce the time and volume of change between merges.
At every SVN merge, you have to go through all the differences between the branches, whereas Git’s commit history for each branch allows you to remember choices you made about certain changes, greatly simplifying each merge. So during the second merge, at commit number 5 in the master branch, you’ll only need to figure out how to deal with (non-conflicting) changes in commits 4 and 5, and during the final merge, you only need to worry about commits 6 and 7. In all, this means that with SVN, you’re forced closer to the ‘total isolation’ extreme than you would probably want to be.
Working with Git has actually totally changed the way I think about branches – I used to say something along the lines of ‘only branch in extreme situations’. Now I think having branches is a good, normal state of being. But the old fears about branching are not entirely invalidated by Git. You still need to be very disciplined about how you use branches, and for me, the main reason is that you want to be able to quickly detect conflicts between them. So I think that branches should be short-lived, and if that isn’t feasible, that relatively frequent intermediate merges should be done. At Shopzilla, we’ve evolved a de facto branching policy over a year of using Git, and it seems to work quite well:
- Shared code with a low rate of change: a single master branch. Changes to these libraries and services are rare enough that two teams almost never make them at the same time. When they do, the second team that needs to make changes to a new release of the library creates a story branch and the two teams coordinate about how to handle merging and releasing.
- Shared code with a high rate of change: semi-permanent team-specific branches and one team has the task of coordinating releases. The teams that work on their different stories/features merge their code with the latest ‘master’ version and tell the release team which commits to pick up. The release team does the merge and update of the release branch and both teams do regression QA on the final code before release. This happens every week for our biggest site.
- Team-specific code: the practice in each team varies but I believe most teams follow similar processes. In my team, we have two permanent branches that interleave frequently: release and master, and more short-lived branches that we create on an ad-hoc basis. We do almost all of our work on the master branch. When we’re starting to prepare a release (typically every 2-3 weeks or so), we split off the release branch and do the final work on the stories to be released there. Work on stories that didn’t make it into the release goes onto the master branch as usual. It is common that we have stories that we put on story-specific branches, when we don’t believe that they will make it into the next planned release and thus shouldn’t be on master.
The diagram above shows a pretty typical state of branches for our team. Starting from the left, the work has been done on the master branch. We then split off the release branch and finalise a release there. The build that goes live will be the last one before merging release back into master. In the mean time, we started some new work on the master branch, plus two stories that we know or believe we won’t be able to finish before the next release, so they live in separate branches. For Story A, we wanted to update it with changes made on the release and master branch, so we merged them into Story A shortly before it was finished. At the time the snapshot is taken, we’ve started preparing the next release and the Story A branch has been deleted as it has been merged back into master and is no longer in use. This means that we only have three branches pointing to commits as indicated by the blueish markers.
This blog post is now far longer than I had anticipated, so I’m going to have to cut the next two advantages of Git shorter than I had planned. Maybe I’ll get back to them later. For now, suffice it to say that Git allows you to do great magic in order to fix mistakes that you make and even extracting and combining code from different repositories with full history. I remember watching Linus Torvalds’ Tech Talk about Git and that he said that the performance of Git was such that it led to a quantum change in how he worked. For me working with Git has also led to a radical shift in how I work and how I look at code management, but it’s not actually the performance that is the main thing, it is the whole conceptual model with tracking commits that makes branching and merging so easy that has led to the shift for me. That Git is also a thousand times (that may not be strictly true…) faster than SVN is of course not a bad thing either.
(This is item 1 in the code sharing cookbook)
Since shared code leads to free features, one might think that more sharing is always better. That is actually not true. Sharing code or technology between products has some very obvious benefits and some much less obvious costs. The sneakiness of the costs leads to underestimating them, which in turn can lead to broken attempts at sharing things. I’ll try to give my picture of what characterises things that are suitable for sharing and how to think about what not to share in this post. Note that the perspective I have is based on an organisation whose products are some kind of service (I’ve mostly been developing consumer-oriented web services for the last few years) as opposed to shrink-wrapped products, so probably a lot of what I say isn’t applicable everywhere.
These are some of the main reasons why you want to share code:
- You get features for free – this is almost always the original reason why you end up having some shared code between different products. Product A is out there, and somebody realises that there is an opportunity to create product B which has some similarities with A. The fastest and cheapest way to get B out and try it is to build it on A, so let’s do that.
- You get bug fixes for free – of course, if product A and B share code and a bug is fixed for product A, when B starts using the fixed version of the shared code, it is fixed for B as well.
- Guaranteed consistent behaviour between products in crucial functional areas. This is typically important for backoffice-type functions, where, for instance, you want to make sure that all your products feed data into the data warehouse in a consistent way so the analysts can actually figure out how the products are doing using the same tools.
- Using proven solutions and minimising risk. Freshly baked code is more likely to have bugs in it than stuff that has been around for a while.
- Similarity of technology can typically reduce operational costs. The same skill sets and tools can be used to run most or all of your products and you can share expensive environments for performance testing, etc. This also has the effect of making it easier for staff to move between products as there is less new stuff to learn in order to get productive with your second product.
All of those reasons are very powerful and typically valid. But they need to be contrasted against some of the costs that you incur from sharing code:
- More communication overhead and slower decision making. To change a piece of code, one needs to talk to many people to ensure that it doesn’t break their planned or existing functionality. Decisions about architecture may require days instead of minutes due to the need to coordinate multiple teams.
- More complicated code. Code that needs to support a single product with a single way of doing things can be simpler than code that has to support multiple products with slight variations on how they do things. This complexity tends to increase over time. Also, every change has to be made with backwards compatibility in mind, which adds additional difficulties to working with the code.
- More configuration management overhead. Managing different branches and dependencies between different shared libraries is time consuming, as are merges when they have to happen. Similarly, you need to be good at keeping track of which versions of shared libraries are used for a particular build.
- More complex projects, especially when certain pieces of shared technology can only be modified by certain people or teams. If there are two product teams (A and B) and a team that delivers shared functionality (let’s call them ‘core’), the core team’s backlog needs to be prioritised based on both the needs of A and B. Also, both team A and B are likely to end up blocked waiting for changes to be made by the core team – during times like that, people don’t typically become idle, they just work on other things than what is really important, leading to reduced productivity and a lack of the ‘we can do anything’ energy that characterises a project that runs really well.
- More mistakes – all the above things lead to a larger number of mistakes being made, which costs in terms of frustration and time taken to develop new features.
The problem with the costs is that they are insidious and sneak up on you as you share more and more stuff between more products, whereas the benefits are there from day 1 – especially on day 1, when you release your second product which is ‘almost the same’ as the first one and you want as many free features as you can get.
So sharing code can give you lots of important or even vital benefits, but done wrong, it can also make your organisation into a slow-moving behemoth stuck in quicksand due to the dependencies it creates between the products and the teams that should develop them. The diagram below shows how while shared libraries can be building blocks for constructing products, they also create ties between the products. These ties will need management to prevent them from binding your arms behind your back.
The way I think about it, products represent things that you make money from, so changing your products should lead to making more money. Shared code makes it possible for you to develop your products at a lower cost or with a lower risk, but reduces the freedom to innovate on the product level. So in the diagram, the blue verticals represent a money-making perspective and the red horizontals a cost-saving perspective. I guess there is something smart to be said about when in a product’s or organisation’s lifecycle one is more important than the other, but I don’t really know what – probably more mature organisations or products need more sharing and less freedom to develop?
Anyway, getting back to the core message of this post: I’m saying that even if code is identical between two products, that doesn’t necessarily mean that it should be shared. The reason is that it may well start out identical, but if it is likely to be something that one or both product teams will want to modify in the future to maximise their chances of making money, both products will be slowed down by having to worry about what their changes are doing to the other team. Good candidates for sharing tend to have:
- A low rate of change – so the functionality is mature and not something you need to tweak frequently in order to tune your business or add features to your product.
- A tight coupling to other parts of the company’s ecosystem – reporting/invoicing systems, etc. This usually means tight integration into business processes that are hard to change.
- A high degree of generality – the extreme examples of such general systems are of course things like java.util.Set or log4j. Within a company, you can often find things that are very generic in the context of the business.
Of course, those three factors are related. I have found that simply looking at the first one by checking the average number of commits over some period of time gives a really good indication. If there are many changes, don’t share the code. If there are few, you might want to share it. I think the reason why it works is partly that rate of change is a very good indicator of generality and partly because if you try to share something that changes a lot, you’ll incur the costs of sharing very frequently.
Sharing is great, but it’s definitely not something that should be maximised at all costs. Any sharing of functionality between two products creates dependencies between them in the form of feature interaction that adds to the cost of features and schedule interaction between projects/teams that get blocked by each other due to changes to something that is shared.
It is often useful to think of sharing at different levels: maybe it isn’t a great idea to create a shared library from some code that you believe will be modified frequently by two different projects. As an alternative, you can still gain free features by copying and pasting that code from one product to the other, and then letting the two versions lead their own lives. So share code, but be smart about it and don’t share everything!
If you’re in an organisation that grows or whose business is changing, you’ll soon want to add another product to the one or ones you’ve already got. Frequently, the new product idea has a lot of similarity to existing ones (because you tend to both come up with ideas in the space where you work, and because you’ll tend to want to play to your existing strengths), so there is a strong desire to reuse technology. First, to get the new product out, at least as a prototype, and later, assuming it is successful, continuing to share code in order to not have to reinvent the wheel.
Code sharing makes a lot of sense, but it is in fact a lot harder than it seems on the face of it. In what is quite possibly a more ambitious project than I will have the tenacity to complete, I’m going to try to set out some ideas on how to do code sharing in the kind of organisation that I have recent experience of: around 30 developers working on around 5 different products. The first one will be a bit theoretical, but the rest should be quite concrete with hands-on tips about how to do things.
Here’s the list of topics I’ve got planned:
- Share Code Selectively.
- Use Git.
- Use Scrum.
- Use Maven
- Use JUnit.
- Use Hudson.
- Divide and Conquer.
- Manage Dependencies.
Over the next few weeks or months, I’ll try to write something more detailed about each of them. I would be surprised if I don’t have to go back to this post and update it based on the fact that my thinking around this will probably change as I write the posts.