Code sharing: Use Maven

Maven’s slow progress towards becoming the most accepted Java build tool seems to continue, although a lot of people are still annoyed enough with its numerous warts to prefer Ant or something else. My personal opinion is that Maven is the best build solution for Java programs that is out there, and as somebody said – I’ve been trying to find the quote, but I can’t seem to locate it – when an Ant build is complicated, you blame yourself for writing a bad build.xml, but when it is hard to get Maven to behave, you blame Maven. With Ant, you program it, so any problems are clearly due to a poorly structured program. With Maven you don’t tell it how to do things, you try to tell it what should be done, so any problems feel like the fault of the tool. The thing is, though, that Maven tries to take much more responsibility for some of the issues that lead to complex build scripts than something like Ant does.

I’ve certainly spent a lot of time cursing poorly written build scripts for Ant and other tools, and I’ve also spent a lot of time cursing Maven when it doesn’t do what I want it to. But the latter time is decreasing as Maven keeps improving as a build tool. There’s been lots of attempts to create other tools that are supposed to make builds easier than Maven, but from what I have seen, nothing has yet really succeeded to provide a clearly better option (I’ve looked at Buildr and Raven, for instance). I think the truth is simply that the build process for a large system is a complex problem to solve, so one cannot expect it to be free of hassles. Maven is the best tool out there for the moment, but will surely be replaced by something better at some point.

So, using Maven isn’t going to be problem-free. But it can help with a lot of things, particularly in the context of sharing code between multiple teams. The obvious thing it helps with is the single benefit that most people agree that Maven has – its way of managing dependencies and the massive repository infrastructure and dependency database that is just available out there. On top of that, building Maven projects in Hudson is dead easy, and there’s a whole slew of really nice tools that come with Maven plugins that you can use that enable you to get all kinds of reports and metadata about your code. My current favourite is Sonar, which is great if you want to keep track of how your code base evolves from some kind of aggregated perspective.

Here are some things you’ll want to do if you decide to use Maven for the various projects that make up your system:

Use Nexus as an internal repository for build artifacts.
Use the Maven Release plugin to create releases of internal artifacts.
Create a shared POM for the whole code base where you can define shared settings for your builds.

The word ‘repository’ is a little overloaded in Maven, so it may be confusing. Here’s a diagram that explains the concept and shows some of the things that a repository manager like Nexus can help you with:

The setup includes a Git server (because you use Git) for source control, a Hudson server (or set of) that does continuous integration, a Nexus-managed artifact repository and a developer machine. The Nexus server has three repositories in it: internal releases, internal snapshots and a cache of external repositories. The latter is only there as a performance improvement. The other two are the way that you distribute Maven artifacts within your organisation. When a Maven build runs on the Hudson or developer machines, Maven will use artifacts from the local repository on the machine – by default located in a folder under the user’s home directory. If a released version of an artifact isn’t present in the local repository, it will be downloaded from Nexus, and snapshot versions will periodically be refreshed, even if present locally. In the example setup, new snapshots are typically deployed to the Nexus repository by the Hudson server, and released versions are typically deployed by the developer producing the release. Note that both Hudson and developers are likely to install snapshots to the local repository.

I’ve tried a couple of other repository managers (Archiva, Artifactory and Maven-Proxy), but Nexus has been by a pretty wide margin the best – robust, easy to use and easy to understand. It’s been a year or two since I looked at the other ones, so they may have improved since.

Having an internal repository opens up for code sharing by providing a uniform mechanism for distributing updated versions of internal libraries using the standard Maven deploy command. Maven has two types of artifact versions: releases and snapshots. Releases are assumed to be immutable and snapshots mutable, so updating a snapshot in the internal repository will affect any build that downloads the updated snapshot, whereas releases are supposed to be deployed to the internal repository once only – any subsequent deployments should deploy something that is identical. Snapshots are tricky, especially when branching. If you create two branches of the same library and fail to ensure that the two branches have different snapshot versions, the two branches will interfere,

There is interference between the two branches because they both create updates to the same artifact in the Maven repositories. Depending on the ordering of these updates, builds may succeed or fail seemingly at random. At Shopzilla, we typically solve this problem in two ways: for some shared projects, where we have long-lived/permanent team-specific branches, the team name is included in the version number of the artifact, and for short-lived user story branches, the story ID is included in the version number. So if I need to create a branch off of version 2.3-SNAPSHOT for story S3765, I’ll typically label the branch S3765 and change the version of the Maven artifact to 2.3-S3765-SNAPSHOT. The Maven release plugin has a command that simplifies branching, but for whatever reason, I never seem to use it. Either way, being careful about managing branches and Maven versions is necessary.

A situation where I do use the maven release plugin a lot is when making releases of shared libraries. I advocate a workflow where you make a new release of your top-level project every time you make a live update, and because you want to make live updates frequently and you use scrum, that means a new Maven release with every iteration. To make a Maven release of a project, you have to eliminate all snapshot dependencies – this is a necessary requirement for immutability – so releasing the top level project means make release versions of all its updated dependencies. Doing this frequently reduces the risk of interference between teams by shortening the ‘checkout, modify, checkin’ cycle.

See the pom file example below for some hands-on pom.xml settings that are needed to enable using the release plugin.

The final tip for code sharing using Maven that I wanted to give is to use a shared parent POM that contains settings that should be shared between projects. The main reason is of course to reduce code duplication – any build file is code, of course, and Maven build files are not as easy to understand as one would like, so simplifying them is very valuable. Here’s some stuff that I think should go into a shared pom.xml file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <groupId>com.mycompany</groupId>
   <artifactId>shared-pom</artifactId>
   <name>Company Shared pom</name>
   <version>1.0-SNAPSHOT</version>
   <packaging>pom</packaging>

   <!--
      One of the things that is necessary in order to be able to use the
      release plugin is to specify the scm/developerConnection element.
      I usually also specify the plain connection, although
      I think that is only used for generating project
      documentation, a Maven feature I don't find particularly useful
      personally.

      A section like this needs to be present in every project for which
      you want to be able to use the release plugin, with the project-
      specific Git URL.
     -->
   <scm>
     <connection>scm:git:git://GITHOST/GITPROJECT</connection>
     <developerConnection>scm:git:git://GITHOST/GITPROJECT</developerConnection>
   </scm>

   <build>
     <!--
        Use the plugins section to define Maven plugin configurations that
        you want to share between all projects.
       -->
     <plugins>
       <!--
          Compiler settings that are typically going to be identical in all
          projects. With a name like Måhlén, you get particularly sensitive
          to using the only useful character encoding there is.. ;)
         -->
       <plugin>
         <artifactId>maven-compiler-plugin</artifactId>
         <configuration>
           <source>1.6</source>
           <target>1.6</target>
           <encoding>UTF-8</encoding>
         </configuration>
       </plugin>

       <!--
         Tell Maven to create a source bundle artifact during the package
         phase. This is extremely useful when sharing code, as the act of
         sharing means you'll want to create a relatively large number of
         smallish artifacts, so creating IDE projects that refer directly
         to the source code is unmanageable. But the Maven integration of
         a good IDE will fetch the Maven source bundle if available, so if
         you navigate to a class that is included via Maven from your
         top-level project, you'll still see the source version - and even
         the right source version, because you'll get what corresponds
         to the binary that has been linked.
         -->
       <plugin>
         <artifactId>maven-source-plugin</artifactId>
         <executions>
           <execution>
             <phase>package</phase>
             <goals>
               <goal>jar</goal>
             </goals>
           </execution>
         </executions>
       </plugin>

       <!--
         Ensure that a javadoc jar is being generated and deployed. This
         is useful for similar reasons as source bundle generation,
         although to a lesser degree in my opinion. Javadoc is great, but
         the source is always up to date.
         -->
      <plugin>
        <artifactId>maven-javadoc-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
         </executions>
      </plugin>

      <!--
        The below configuration information was necessary to ensure that
        you can use the maven release plugin with Git as a version control
        system. The exact version numbers that you want to use are likely
        to have changed since then, and it may even be that Git support is
        more closely integrated nowadays, so less explicit configuration
        is needed - I haven't tested that since maybe March 2009.
       -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-release-plugin</artifactId>
        <dependencies>
          <dependency>
            <groupId>org.apache.maven.scm</groupId>
            <artifactId>maven-scm-provider-gitexe</artifactId>
            <version>1.1</version>
          </dependency>
          <dependency>
            <groupId>org.codehaus.plexus</groupId>
            <artifactId>plexus-utils</artifactId>
            <version>1.5.7</version>
          </dependency>
        </dependencies>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-scm-plugin</artifactId>
          <version>1.1</version>
          <dependencies>
            <dependency>
              <groupId>org.apache.maven.scm</groupId>
              <artifactId>maven-scm-provider-gitexe</artifactId>
              <version>1.1</version>
            </dependency>
            <dependency>
              <groupId>org.codehaus.plexus</groupId>
              <artifactId>plexus-utils</artifactId>
              <version>1.5.7</version>
            </dependency>
          </dependencies>
        </plugin>
      </plugins>
    </build>

    <!--
       Configuration of internal repositories so that the sub-projects
       know where to download internally created artifacts from. Note
       that due to a bootstrapping issue, this configuration needs to
       be duplicated in individual projects. This file, the shared POM,
       is available from the Nexus repo, but if the project POM doesn't
       contain the repo config, the project build won't know where to
       download the shared POM.
      -->
    <repositories>
      <!-- internal Nexus repository for released artifacts -->
      <repository>
        <id>internal-releases</id>
        <url>http://NEXUSHOST/nexus/content/repositories/internal-releases</url>
        <releases><enabled>true</enabled></releases>
        <snapshots><enabled>false</enabled></snapshots>
      </repository>
      <!-- internal Nexus repository for SNAPSHOT artifacts -->
      <repository>
        <id>internal-snapshots</id>
        <url>http://NEXUSHOST/nexus/content/repositories/internal-snapshots</url>
        <releases><enabled>false</enabled></releases>
        <snapshots><enabled>true</enabled></snapshots>
      </repository>

      <!--
        Nexus repository cache for third party repositories such as
        ibiblio. This is not necessary, but is likely to be a
        performance improvement for your builds.
        -->
      <repository>
        <id>3rd party</id>
        <url>http://NEXUSHOST/nexus/content/repositories/thirdparty/</url>
        <releases><enabled>true</enabled></releases>
        <snapshots><enabled>false</enabled></snapshots>
      </repository>

   </repositories>

   <distributionManagement>

      <!-- Defines where to deploy released artifacts to -->
      <repository>
        <id>internal-repository-releases</id>
        <name>Internal release repository</name>
        <url>URL TO NEXUS RELEASES REPOSITORY</url>
      </repository>

      <!-- Defines where to deploy artifact snapshot to -->
      <snapshotRepository>
        <id>internal-repository-snapshot</id>
        <name>Internal snapshot repository</name>
        <url>URL TO NEXUS SNAPSHOTS REPOSITORY</url>
     </snapshotRepository>

   </distributionManagement>

</project>

The less pleasant part of using Maven is that you’ll need to learn more about Maven’s internals than you’d probably like, and you’ll most likely stop trying to fix your builds not when you’ve understood the problem and solved it in the way you know is correct, but when you’ve arrived at a configuration that works through trial and error (as you can see from my comments in the example pom.xml above). The benefits you’ll get in terms of simplifying the management of build artifacts across teams and actually also simplifying the builds themselves outweigh the costs of the occasional hiccup, though. A typical top-level project at Shopzilla links in around 70 internal artifacts through various transitive dependencies – managing that number of dependencies is not easy unless you have a good tool to support you, and dependency management is where Maven shines.

Code Sharing, Maven

This entry was posted on May 1, 2010, 19:12 and is filed under Code Management. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.

#1 by Dominic Mitchell on May 1, 2010 - 22:05

One of the nice things about maven as a build tool is that it’s been around a long time. This means it’s built up tools that work with it. For example, m2eclipse, as well as the above mentioned nexus. I don’t see much in the way of tooling support for ivy, for instance.

m2eclipse makes it incredibly to make use of those internal libraries that you have developed. I also found it significantly useful for understanding how to author good POMs.

I also can’t begin to state how wonderful I found the release plugin after using it for a while. It’s really critical to getting a good, consistent release process going. Like you, I haven’t used the branching functionality much. Development on shared code like that tends to happen in isolated spurts, so I’ve not found much need for a release branch.

I do have one question though. With using story numbers as part of the snapshot revision, don’t you get quite a lot of merge conflicts?

- #2 by Petter Måhlén on May 3, 2010 - 07:48
  
  You’re right, we do get quite a lot of conflicts in the pom.xml files, but they tend to be easy to resolve. Merging a story branch into the master means that you’re removing the story name from the version number, and we keep story branches short-lived, so we don’t do a lot of merges back and forth between them.
  
  Good point about the tools support as well – that is an important thing for productivity. I’ve not used Eclipse much in the last 2 years, but IntelliJ IDEA has great Maven support built in as well.
  
#3 by Dean Moses on February 5, 2011 - 14:49

Excellent post! Love the well-documented pom.xml. Clarified how best to hold shared Maven configuration.

#4 by Alexey Yudichev on January 23, 2012 - 15:00

Hi, I’ve noticed you use separate repositories for internal releases and snapshots. Doesn’t this prevent you from using “Remove Snapshots From Repository / remove if released” in Nexus as snapsot repository never knows about releases?

Branch merge conflicts is also an annoying thing for us. We use mvn release:update-versions after almost every merge before committing.

- #5 by Petter Måhlén on January 23, 2012 - 15:35
  
  Hi,
  
  We are in fact using the ‘remove if released’ feature. I haven’t spent any serious time trying to figure out whether it works or not, but my interpretation of the Nexus documentation is that it should work even if you separate release and snapshot repositories: “This tells the task to ignore the above settings if an artifact with the same group/artifact/version is detected in a release repository”. A quick glance in our Nexus repo seems to bear that out as well – snapshots relating to released versions appear to be deleted for the one project I checked.

Cookbook for Code Sharing « Petter's Blog

Petter Måhlén's Blog

Leave a comment Cancel reply

Archives

Blogs I read

Friends that blog

Tag Cloud

Email Subscription

Admin

Google profile

Petter Måhlén's Blog