Maven’s slow progress towards becoming the most accepted Java build tool seems to continue, although a lot of people are still annoyed enough with its numerous warts to prefer Ant or something else. My personal opinion is that Maven is the best build solution for Java programs that is out there, and as somebody said – I’ve been trying to find the quote, but I can’t seem to locate it – when an Ant build is complicated, you blame yourself for writing a bad build.xml, but when it is hard to get Maven to behave, you blame Maven. With Ant, you program it, so any problems are clearly due to a poorly structured program. With Maven you don’t tell it how to do things, you try to tell it what should be done, so any problems feel like the fault of the tool. The thing is, though, that Maven tries to take much more responsibility for some of the issues that lead to complex build scripts than something like Ant does.
I’ve certainly spent a lot of time cursing poorly written build scripts for Ant and other tools, and I’ve also spent a lot of time cursing Maven when it doesn’t do what I want it to. But the latter time is decreasing as Maven keeps improving as a build tool. There have been lots of attempts to create other tools that are supposed to make builds easier than Maven does, but from what I have seen, nothing has yet really succeeded in providing a clearly better option (I’ve looked at Buildr and Raven, for instance). I think the truth is simply that the build process for a large system is a complex problem to solve, so one cannot expect it to be free of hassles. Maven is the best tool out there for the moment, but will surely be replaced by something better at some point.
So, using Maven isn’t going to be problem-free. But it can help with a lot of things, particularly in the context of sharing code between multiple teams. The obvious thing it helps with is the single benefit that most people agree that Maven has – its way of managing dependencies, together with the massive repository infrastructure and dependency database that is just available out there. On top of that, building Maven projects in Hudson is dead easy, and there’s a whole slew of really nice tools with Maven plugins that give you all kinds of reports and metadata about your code. My current favourite is Sonar, which is great if you want to keep track of how your code base evolves from an aggregated perspective.
Here are some things you’ll want to do if you decide to use Maven for the various projects that make up your system:
- Use Nexus as an internal repository for build artifacts.
- Use the Maven Release plugin to create releases of internal artifacts.
- Create a shared POM for the whole code base where you can define shared settings for your builds.
The word ‘repository’ is a little overloaded in Maven, so it may be confusing. Here’s a diagram that explains the concept and shows some of the things that a repository manager like Nexus can help you with:
The setup includes a Git server (because you use Git) for source control, a Hudson server (or a set of them) that does continuous integration, a Nexus-managed artifact repository and a developer machine. The Nexus server has three repositories in it: internal releases, internal snapshots and a cache of external repositories. The latter is only there as a performance improvement. The other two are the way that you distribute Maven artifacts within your organisation. When a Maven build runs on a Hudson or developer machine, Maven will use artifacts from the local repository on that machine – by default located in a folder under the user’s home directory. If a released version of an artifact isn’t present in the local repository, it will be downloaded from Nexus, and snapshot versions will periodically be refreshed, even if present locally. In the example setup, new snapshots are typically deployed to the Nexus repository by the Hudson server, and released versions are typically deployed by the developer producing the release. Note that both Hudson and developers are likely to install snapshots into the local repository.
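To make the periodic snapshot refresh a bit more concrete: how often Maven re-checks a repository for newer snapshots is controlled per repository by the updatePolicy element. The following is only a sketch, reusing the internal-snapshots id and the NEXUSHOST placeholder from the shared POM further down; daily is Maven's default, and always, never and interval:X (in minutes) are the other options.

```xml
<repository>
  <id>internal-snapshots</id>
  <url>http://NEXUSHOST/nexus/content/repositories/internal-snapshots</url>
  <releases><enabled>false</enabled></releases>
  <snapshots>
    <enabled>true</enabled>
    <!-- "daily" is the default; spelled out here for illustration only -->
    <updatePolicy>daily</updatePolicy>
  </snapshots>
</repository>
```

If you need a fresh snapshot right away, running the build with the -U flag forces Maven to check the remote repositories regardless of the policy.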
I’ve tried a couple of other repository managers (Archiva, Artifactory and Maven-Proxy), but Nexus has been by a pretty wide margin the best – robust, easy to use and easy to understand. It’s been a year or two since I looked at the other ones, so they may have improved since.
Having an internal repository opens the door to code sharing by providing a uniform mechanism for distributing updated versions of internal libraries using the standard Maven deploy command. Maven has two types of artifact versions: releases and snapshots. Releases are assumed to be immutable and snapshots mutable, so updating a snapshot in the internal repository will affect any build that downloads the updated snapshot, whereas releases are supposed to be deployed to the internal repository once only – any subsequent deployments should deploy something that is identical. Snapshots are tricky, especially when branching. If you create two branches of the same library and fail to ensure that the two branches have different snapshot versions, the two branches will interfere with each other.
The interference arises because both branches create updates to the same artifact in the Maven repositories. Depending on the ordering of these updates, builds may succeed or fail seemingly at random. At Shopzilla, we typically solve this problem in two ways: for some shared projects, where we have long-lived/permanent team-specific branches, the team name is included in the version number of the artifact, and for short-lived user story branches, the story ID is included in the version number. So if I need to create a branch off of version 2.3-SNAPSHOT for story S3765, I’ll typically label the branch S3765 and change the version of the Maven artifact to 2.3-S3765-SNAPSHOT. The Maven Release plugin has a command that simplifies branching, but for whatever reason, I never seem to use it. Either way, being careful about managing branches and Maven versions is necessary.
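As a small sketch of what that looks like in the branch's pom.xml (the artifactId is a made-up name for illustration; the group ID is the placeholder used elsewhere in this post), the only thing that changes on the story branch is the version element:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany</groupId>
  <!-- hypothetical artifact name, for illustration only -->
  <artifactId>shared-library</artifactId>
  <!-- on the main line this would be 2.3-SNAPSHOT; the story ID keeps the
       branch's snapshots separate from everybody else's in the repository -->
  <version>2.3-S3765-SNAPSHOT</version>
</project>
```

Any project that should build against the branch then has to depend on 2.3-S3765-SNAPSHOT explicitly, which also makes it visible in the POM that it is using branch code.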
A situation where I do use the Maven Release plugin a lot is when making releases of shared libraries. I advocate a workflow where you make a new release of your top-level project every time you make a live update, and because you want to make live updates frequently and you use scrum, that means a new Maven release with every iteration. To make a Maven release of a project, you have to eliminate all snapshot dependencies – this is a requirement for immutability – so releasing the top-level project means making release versions of all its updated dependencies. Doing this frequently reduces the risk of interference between teams by shortening the ‘checkout, modify, checkin’ cycle.
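To illustrate the snapshot rule, here is roughly what a dependency on an internal library has to look like at release time. This is a sketch only; shared-library is a hypothetical artifact and 2.3 a made-up version.

```xml
<dependencies>
  <dependency>
    <groupId>com.mycompany</groupId>
    <!-- hypothetical internal library -->
    <artifactId>shared-library</artifactId>
    <!-- during the iteration this might read 2.3-SNAPSHOT; it has to point
         at a released version before the top-level project can be released -->
    <version>2.3</version>
  </dependency>
</dependencies>
```

The release plugin's release:prepare goal checks for remaining snapshot dependencies and will stop and ask about them, so in practice you release the libraries first and work your way up to the top-level project.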
See the pom file example below for some hands-on pom.xml settings that are needed to enable using the release plugin.
The final tip for code sharing using Maven that I wanted to give is to use a shared parent POM that contains settings that should be shared between projects. The main reason is of course to reduce code duplication – any build file is code, and Maven build files are not as easy to understand as one would like, so simplifying them is very valuable. Here’s some stuff that I think should go into a shared pom.xml file:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.mycompany</groupId>
  <artifactId>shared-pom</artifactId>
  <name>Company Shared pom</name>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <!-- One of the things that is necessary in order to be able to use the
       release plugin is to specify the scm/developerConnection element.
       I usually also specify the plain connection, although I think that
       is only used for generating project documentation, a Maven feature
       I don't find particularly useful personally. A section like this
       needs to be present in every project for which you want to be able
       to use the release plugin, with the project-specific Git URL. -->
  <scm>
    <connection>scm:git:git://GITHOST/GITPROJECT</connection>
    <developerConnection>scm:git:git://GITHOST/GITPROJECT</developerConnection>
  </scm>

  <build>
    <!-- Use the plugins section to define Maven plugin configurations
         that you want to share between all projects. -->
    <plugins>

      <!-- Compiler settings that are typically going to be identical in
           all projects. With a name like Måhlén, you get particularly
           sensitive to using the only useful character encoding there
           is.. ;) -->
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
          <encoding>UTF-8</encoding>
        </configuration>
      </plugin>

      <!-- Tell Maven to create a source bundle artifact during the
           package phase. This is extremely useful when sharing code, as
           the act of sharing means you'll want to create a relatively
           large number of smallish artifacts, so creating IDE projects
           that refer directly to the source code is unmanageable. But
           the Maven integration of a good IDE will fetch the Maven
           source bundle if available, so if you navigate to a class that
           is included via Maven from your top-level project, you'll
           still see the source version - and even the right source
           version, because you'll get what corresponds to the binary
           that has been linked. -->
      <plugin>
        <artifactId>maven-source-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>

      <!-- Ensure that a javadoc jar is being generated and deployed.
           This is useful for similar reasons as source bundle
           generation, although to a lesser degree in my opinion. Javadoc
           is great, but the source is always up to date. -->
      <plugin>
        <artifactId>maven-javadoc-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>

      <!-- The below configuration information was necessary to ensure
           that you can use the maven release plugin with Git as a
           version control system. The exact version numbers that you
           want to use are likely to have changed since then, and it may
           even be that Git support is more closely integrated nowadays,
           so less explicit configuration is needed - I haven't tested
           that since maybe March 2009. -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-release-plugin</artifactId>
        <dependencies>
          <dependency>
            <groupId>org.apache.maven.scm</groupId>
            <artifactId>maven-scm-provider-gitexe</artifactId>
            <version>1.1</version>
          </dependency>
          <dependency>
            <groupId>org.codehaus.plexus</groupId>
            <artifactId>plexus-utils</artifactId>
            <version>1.5.7</version>
          </dependency>
        </dependencies>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-scm-plugin</artifactId>
        <version>1.1</version>
        <dependencies>
          <dependency>
            <groupId>org.apache.maven.scm</groupId>
            <artifactId>maven-scm-provider-gitexe</artifactId>
            <version>1.1</version>
          </dependency>
          <dependency>
            <groupId>org.codehaus.plexus</groupId>
            <artifactId>plexus-utils</artifactId>
            <version>1.5.7</version>
          </dependency>
        </dependencies>
      </plugin>

    </plugins>
  </build>

  <!-- Configuration of internal repositories so that the sub-projects
       know where to download internally created artifacts from. Note
       that due to a bootstrapping issue, this configuration needs to be
       duplicated in individual projects. This file, the shared POM, is
       available from the Nexus repo, but if the project POM doesn't
       contain the repo config, the project build won't know where to
       download the shared POM. -->
  <repositories>
    <!-- internal Nexus repository for released artifacts -->
    <repository>
      <id>internal-releases</id>
      <url>http://NEXUSHOST/nexus/content/repositories/internal-releases</url>
      <releases><enabled>true</enabled></releases>
      <snapshots><enabled>false</enabled></snapshots>
    </repository>

    <!-- internal Nexus repository for SNAPSHOT artifacts -->
    <repository>
      <id>internal-snapshots</id>
      <url>http://NEXUSHOST/nexus/content/repositories/internal-snapshots</url>
      <releases><enabled>false</enabled></releases>
      <snapshots><enabled>true</enabled></snapshots>
    </repository>

    <!-- Nexus repository cache for third party repositories such as
         ibiblio. This is not necessary, but is likely to be a
         performance improvement for your builds. -->
    <repository>
      <id>3rd party</id>
      <url>http://NEXUSHOST/nexus/content/repositories/thirdparty/</url>
      <releases><enabled>true</enabled></releases>
      <snapshots><enabled>false</enabled></snapshots>
    </repository>
  </repositories>

  <distributionManagement>
    <!-- Defines where to deploy released artifacts to -->
    <repository>
      <id>internal-repository-releases</id>
      <name>Internal release repository</name>
      <url>URL TO NEXUS RELEASES REPOSITORY</url>
    </repository>

    <!-- Defines where to deploy artifact snapshot to -->
    <snapshotRepository>
      <id>internal-repository-snapshot</id>
      <name>Internal snapshot repository</name>
      <url>URL TO NEXUS SNAPSHOTS REPOSITORY</url>
    </snapshotRepository>
  </distributionManagement>
</project>
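For completeness, here is a rough sketch of what an individual project's pom.xml might look like when it inherits from the shared POM. The artifactId, version and GITHOST/MYPROJECT URL are placeholders I've made up for illustration; the parent coordinates and repository settings are copied from the shared POM above, including the duplicated repository configuration needed because of the bootstrapping issue.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Inherit the shared settings. A SNAPSHOT parent is fine day to day,
       but it has to be a released version before this project can itself
       be released. -->
  <parent>
    <groupId>com.mycompany</groupId>
    <artifactId>shared-pom</artifactId>
    <version>1.0-SNAPSHOT</version>
  </parent>

  <!-- hypothetical project coordinates; the groupId is inherited -->
  <artifactId>my-project</artifactId>
  <version>2.3-SNAPSHOT</version>

  <!-- project-specific Git URL, needed by the release plugin -->
  <scm>
    <connection>scm:git:git://GITHOST/MYPROJECT</connection>
    <developerConnection>scm:git:git://GITHOST/MYPROJECT</developerConnection>
  </scm>

  <!-- duplicated from the shared POM so that the parent can be found -->
  <repositories>
    <repository>
      <id>internal-releases</id>
      <url>http://NEXUSHOST/nexus/content/repositories/internal-releases</url>
      <releases><enabled>true</enabled></releases>
      <snapshots><enabled>false</enabled></snapshots>
    </repository>
    <repository>
      <id>internal-snapshots</id>
      <url>http://NEXUSHOST/nexus/content/repositories/internal-snapshots</url>
      <releases><enabled>false</enabled></releases>
      <snapshots><enabled>true</enabled></snapshots>
    </repository>
  </repositories>
</project>
```

Everything else (compiler settings, source and javadoc bundles, release plugin configuration and deployment targets) comes from the parent, which is the whole point of the exercise.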
The less pleasant part of using Maven is that you’ll need to learn more about Maven’s internals than you’d probably like, and you’ll most likely stop trying to fix your builds not when you’ve understood the problem and solved it in the way you know is correct, but when you’ve arrived at a configuration that works through trial and error (as you can see from my comments in the example pom.xml above). The benefits you’ll get in terms of simplifying the management of build artifacts across teams and actually also simplifying the builds themselves outweigh the costs of the occasional hiccup, though. A typical top-level project at Shopzilla links in around 70 internal artifacts through various transitive dependencies – managing that number of dependencies is not easy unless you have a good tool to support you, and dependency management is where Maven shines.