Objectives for A Better Maven

My friend Josh Slack made me aware of this post, by a guy (Kent Spillner) who is totally against Maven in almost every way. As I’ve mentioned before, I think Maven is the best tool out there for Java builds, so of course I like it better than Kent does. Still, there’s no doubt he has some points that you can’t help agreeing with. Reading his post made me think (once again) about what is great and not so great about Maven, and also of some ideas about how to fix the problems whilst retaining the great stuff (edit: I’ve started outlining these ideas here, with more to follow).

First, some of the things that are great:

  1. Dependency Management – I would go so far as to argue that Maven has done more to enable code reuse than anything else that is touted as a ‘reusability paradigm’ (such as OO itself). Before Maven and its repositories, you had to manually add every single dependency and their transitive requirements into each project, typically even into your source repository. The amount of manual effort needed to upgrade a library and its transitive dependencies from one version to the next means the optimal size of a library is quite large, which makes libraries unfocused and bloated. What’s more, it also means that library designers have a strong need to reduce the number of things they allow themselves to depend on, which reduces the scope for code reuse. With Maven, libraries can be more focused as it is effortless to have a deep dependency tree. At Shopzilla, our top-level builds typically include 50-200 dependencies. Imagine adding these to your source repository and keeping them up to date with every change – completely impossible!
  2. Build standardisation. The first sentence in Kent Spillner’s post is “The best build tool is the one you write yourself”. That’s probably true from the perspective of a single project, but with a larger set of projects that are collaboratively owned by multiple teams of developers, that idea breaks quickly. Again, I’ll use Shopzilla as an example – we have more than 100 Git repositories with Java code that are co-owned by 5-6 different teams. This means we must have standardised builds, or we would waste lots of time due to having to learn about custom builds for each project. Any open source project exists in an even larger ecosystem; essentially a global one. So unless you know that the number of developers who will be building your project is always going to be small, and that these developers will only have to work with a small number of projects, your build should be “mostly declarative” and as standardised as you can make it.
  3. The wealth of plugins that allow you to do almost any build-related task. This is thanks to the focus on a plugin-based architecture right from the get-go.
  4. The close integration with IDEs that makes it easier (though not quite painless) to work with it.

Any tool that would improve on Maven has to at least do equally well on those four counts.

To get a picture of the opportunities for improvement, here’s my list of Maven major pain points:

  1. Troubleshooting is usually hard to extremely hard. When something breaks, you get very little help from Maven to figure out what it is. Enabling debug level logging on the build makes it verbose to the point of obscuring the error. If you, like me, like to use the source code to find out what is happening, the relevant code is difficult to find because you have to jump from plugin to plugin, and most plugins have different source repositories.
  2. Even though there is a wealth of plugins that allow you to do almost anything, it is usually unreasonably hard to a) find the right plugin and b) figure out how to use it. Understanding what Maven and its plugins do is really hard, and the documentation is frequently sub-standard.
  3. A common complaint is the verbose XML configuration. That is definitely an issue: succinctness improves readability and ease of use.
  4. The main drawback of the transitive dependency management is the risk of getting incompatible versions of the same library, or even worse, class. There is very little built-in support for managing this problem in Maven (there is some in the dependency plugin, but actually using that is laborious). This means that it is not uncommon to have large numbers of ‘exclusion’ tags for some dependencies polluting your build files, and that you anyway tend to have lots of stuff that you never use in your builds.
  5. Maven is slow; there’s no doubt about that. It takes time to create various JVMs, to download files/check for updates, etc. Also, every build runs through the same build steps even if some of them are not needed.
  6. Builds can succeed or fail on different machines for unobvious reasons – typically, the problem is due to differing versions of SNAPSHOT dependencies being installed in the local repository cache. It can also be due to using different versions of Maven plugins.
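
To make pain point 4 concrete, here is a minimal sketch in Java of the kind of check a build tool could run on a flattened transitive dependency tree: group the resolved coordinates by groupId and artifactId, and report every artifact that shows up in more than one version. The class `VersionConflicts`, its `find` method and the sample coordinates are all hypothetical and made up for illustration; nothing here is part of Maven or its plugins.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not a Maven API): reports artifacts requested in several versions. */
final class VersionConflicts {

    /**
     * Takes coordinates of the form "groupId:artifactId:version", as they might be
     * collected from a flattened transitive dependency tree, and returns every
     * "groupId:artifactId" key that occurs with more than one distinct version.
     */
    static Map<String, Set<String>> find(List<String> coordinates) {
        Map<String, Set<String>> versionsByArtifact = new LinkedHashMap<>();
        for (String coordinate : coordinates) {
            String[] parts = coordinate.split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException(
                        "Expected groupId:artifactId:version, got: " + coordinate);
            }
            String key = parts[0] + ":" + parts[1];
            versionsByArtifact.computeIfAbsent(key, k -> new TreeSet<>()).add(parts[2]);
        }
        // Keep only the artifacts that were requested in more than one version.
        versionsByArtifact.values().removeIf(versions -> versions.size() < 2);
        return versionsByArtifact;
    }

    public static void main(String[] args) {
        // Sample input, invented for this example.
        List<String> resolved = Arrays.asList(
                "log4j:log4j:1.2.14",
                "junit:junit:4.8.2",
                "log4j:log4j:1.2.16");
        for (Map.Entry<String, Set<String>> conflict : find(resolved).entrySet()) {
            System.out.println(conflict.getKey() + " requested in versions " + conflict.getValue());
        }
    }
}
```

A check like this only detects that two versions were requested; it says nothing about whether they are actually compatible, so the decision about which one to keep would still have to be made by a person.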

There’s actually quite a lot more that could be improved, but those are probably my main gripes. When listing them like this, I’m surprised to note that despite all these issues, I still think Maven is the best Java build tool out there. I really do think it is the best, but there’s no doubt that there’s plenty of room to improve things. So I’ve found myself thinking about how I would go about building a better Maven. I am not sure if I’ll be able to actually find the time to implement it, but it is fascinating enough that I can’t let go of the idea. Here’s what I would consider a useful set of objectives for an improved Maven, in order of importance:

  1. Perfect interoperability with the existing Maven artifact management repository infrastructure. There is so much power and value in being able to get access, with near-zero effort, to pretty much any open source project in the Java world that it is vital to be able to tap into that. Note that the value isn’t just in managing dependencies in a similar way to how Maven does it, but actually reusing the currently available artifacts and repositories.
  2. Simplified troubleshooting. More and better consistency checks of various kinds and at earlier stages of the build. Better and more to the point reporting of problems. Great frameworks tend to make this a key part of the architecture from the get-go rather than add it on as an afterthought.
  3. A pluggable architecture that makes it easy to add custom build actions. This is one of Maven’s great success points so a new framework has to be at least as good. I think it could and should be even easier than Maven makes it.
  4. Encouraging but not enforcing standardised builds. This means sticking to the idea of “convention over configuration”. It also means that defining your build should be a “mostly declarative” thing, not an imperative thing. You should say “I want a JAR”, not “zip up the class files in directory X”. Programming your builds is sometimes a necessary evil so it must be possible, but it should be discouraged as it is a slippery slope that leads to non-standardised builds, which in turn means making it harder for anybody coming new to a project to get going with it.
  5. Great integration with IDEs, or at least support for the ability to create great integration with IDEs. This is a necessary part of giving programmers a workflow that never forces them out of the zone.
  6. Less verbose configuration. Not a show-stopper in my opinion, but definitely a good thing to improve.
  7. EDIT: While writing further posts on this topic, I’ve realised that there is one more thing that I consider very important: improving performance. Waiting for builds is a productivity-killing drag.
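
As an illustration of objective 4, the difference between “I want a JAR” and “zip up the class files in directory X” can be sketched in Java roughly as follows. Everything in this sketch is an assumption made for the example: `DeclarativeBuild`, `Project`, `Packaging` and `plan` are hypothetical names, and the three steps are a deliberately tiny stand-in for a real build lifecycle. The only thing borrowed from Maven is the conventional `src/main/java` source directory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a "mostly declarative" build description; not a real tool's API. */
final class DeclarativeBuild {

    enum Packaging { JAR, WAR }

    /** What the build author declares: the desired outcome, not the steps. */
    static final class Project {
        final String name;
        final Packaging packaging;
        final String sourceDir;

        /** Convention: sources live in src/main/java unless stated otherwise. */
        Project(String name, Packaging packaging) {
            this(name, packaging, "src/main/java");
        }

        /** Configuration: the convention can be overridden, but only when needed. */
        Project(String name, Packaging packaging, String sourceDir) {
            this.name = name;
            this.packaging = packaging;
            this.sourceDir = sourceDir;
        }
    }

    /** The tool, not the build author, turns the declaration into imperative steps. */
    static List<String> plan(Project project) {
        String extension = project.packaging.name().toLowerCase(Locale.ROOT);
        List<String> steps = new ArrayList<>();
        steps.add("compile " + project.sourceDir);
        steps.add("test");
        steps.add("package " + project.name + "." + extension);
        return Collections.unmodifiableList(steps);
    }

    public static void main(String[] args) {
        // The whole "build file" is one declaration: a JAR called my-library.
        Project project = new Project("my-library", Packaging.JAR);
        for (String step : plan(project)) {
            System.out.println(step);
        }
    }
}
```

The point of the design is that two projects with the same declaration get the same steps, so anybody who knows the conventions can build either of them without reading a custom script.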

It’s good to specify what you want to do, but in some ways, it’s even better to specify things you’re not trying to achieve either because they’re not relevant or because you find them counter-productive. That gives a different kind of clarity. So here’s a couple of non-objectives:

  1. Using the same artifact management mechanism for build tool plugins as for project dependencies, the way Maven does. While there is some elegance to this idea, it also comes with a host of difficulties – unreproducible builds being the main one. Earlier versions of Maven actively updated plugin versions most or all the time, whereas Maven 3 now issues warnings if you haven’t specified the plugin versions for your build.
  2. Reimplementing all the features provided by Maven plugins. Obviously, trying to out-feature something so feature-rich as Maven would be impossible and limit the likelihood of success hugely. So one thing to do is to select a subset of build steps that represent the most common and/or most different things that are typically done in a build and then see how well the framework deals with that.
  3. Being compatible with Maven plugins. In a way, it would be great for a new build tool to be able to use any existing Maven plugin. But being able to do that would limit the architectural options and increase the complexity of the new architecture to the point of making it unlikely to succeed.
  4. Reproducing the ‘project information’ as a core part of the new tool. Producing project information was one of the core goals of Maven when it was first created. I personally find that less than useful, and therefore not worth making into a core part of a better Maven. It should of course be easy to create a plugin that does this, but it doesn’t have to be a core feature.

I’ve got some ideas for how to build a build tool that meets or is likely to meet most of those objectives. But this post is already more than long enough, and I’d anyway like to stop here to ask for some feedback. Any opinions on the strengths, weaknesses and objectives outlined here?

  1. #1 by Tim Morrow on January 21, 2011 - 11:55

    Great post, Petter.

    IMO one of the biggest issues is low-quality POMs for transitive dependencies and a lack of canonical naming for certain libraries, leading to dependency bloat as you note in point 4. The latter may be a holdover from the initial migration from Maven 1 to Maven 2 when the naming conventions changed, but finding that the same library is included N times through different groupIds and artifactIds is frustrating. And finding libraries that are either part of the JDK now (like Xalan and Xerces) or part of the runtime environment (like the servlet API) is frustrating.

    So in your improved build tool, acknowledging these issues and permitting some kind of convenient mechanism for globally eradicating certain dependencies would be great.

    As you note in point 6, the lack of determinism is problematic. Any time you have to suggest to a developer who is having build issues to “try it again”, something is pretty broken.

    Do you have any experience with some of the build tools listed here?: http://www.streamhead.com/maven-alternatives/ buildr looks especially compelling and might meet some of your objectives.

    Tim

    • #2 by Petter Måhlén on January 21, 2011 - 13:10

      I have previously looked at buildr, rake and raven and not found them to be compelling alternatives to Maven. But it’s been a couple of years since I checked them out, and things may have happened. I guess one problem for me personally is Ruby – I never quite subscribed to the idea of “polyglot programmer” that was popular a couple of years back because I feel that programming languages actually take a lot of time and investment to learn. So I never got round to learning Ruby, it didn’t and doesn’t feel like a language for me. I’m working on the next post which describes some of my ideas for how to fix things and I’ve got one or two more things to say on this topic then. Either way, it is hard to get a proper understanding of the pros and cons of a build tool without working with it for an extended period of time. Maybe I should do that with one or a few of those tools.

      I definitely agree with your comments about the frustration of duplicated library inclusions – it’s not an easy thing to “fix”, but alleviating the problem should ideally be much easier than it is today using Maven.

  2. #3 by Andreas Holmén on January 21, 2011 - 16:40

    I am a fan of maven. If you have a project that depends on more than zero classes that are not in rt.jar or if you do intend to publish your product as anything other than a single .class file I would recommend maven.

    There are some problems of course; for me the lack of usable documentation is probably the worst, but plugin configuration itself is less than fun even if you find some doc about how to do it. Performance issues and the unpredictable builds are bad problems. I have a few practices somewhat counter to maven core concepts like “never depend on a snapshot” and “don’t let maven download stuff”. But in spite of them I choose to use maven whenever I can. Also I really agree with the convention over configuration policy. I will gladly adhere to a directory structure convention if it means that I get my .war packaged without having to know how to package a .war.

    We have about the same opinions regarding what is great with maven. In my list of things that are good about maven I would not have mentioned all the plugins (3) because the docs are so hopeless I usually give up when I try to find a plugin that could help me in any specific case. I just use a tiny subset but it’s enough for me.

    My list of things that need improving in a build tool would be somewhat different. I do not feel that troubleshooting errors in maven is much of a problem, typically my build will fail due to errors in my tests or my dependencies and that information is displayed well enough for me. Sub par docs are not really a problem with maven but rather with the plugin developers who don’t bother to write usable docs. Maven is slow, yeah maybe so, if a new tool was faster and equally good I would use it.

    Number three, verbose configs, is something I would like to see improved. Usable docs would mitigate some of the problem but perhaps plugins could be configured in separate files imported by a main “pom” to keep things apart and small enough to take in easily. Perhaps there can be a golden convention for how plugins are configured and a way to make it easy to document plugins.

    The fourth point (and its close relative number six) would be a real USP. If you can resolve dependency versions in some way and provide deterministic build results I could quit with my policies of “never depend on a snapshot” and “don’t let maven download stuff”.

    My worry would be that it is an impossible task. Two 3rd party tools could both introduce transitive dependencies on some common tool but different versions. Each of them may or may not be compatible with the other version but how would you know? At best you could announce a dependency conflict and force the developer to make an explicit decision (but hey that’s a great idea!). If you have stated that you want to use a snapshot version of something you have said “I will accept that my build may fail due to something someone else does whenever he wants to” and that’s not going to change.

    I see two other major issues, the first being your first objective. Interoperability with existing maven repos is probably required which means you have to parse and understand maven poms for transitive dep resolution. Also you will have to generate valid maven poms for your own components. Unless you intend to publish things into the maven repos that are not maven compatible (lots of bad karma there) you would have to generate maven poms that can actually build your project (not just resolve dependencies) using standard maven. This only really counts if you publish the source but a build tool should not exclude the open source option.

    The other thing would be IDE integration. Perhaps that can be left up to a community of plugin devs, but before I will consider making a switch, the project must load in my eclipse at least as well as my maven project does.

    Finally I have one important feature that I think needs to be on the list of objectives, which is an SCM connection with branching and releasing capabilities in line with what maven can do.

    • #4 by Petter Måhlén on January 22, 2011 - 11:36

      Thanks for the feedback – that comment is nearly long enough to be a blog post of its own. :)

      A couple of quick replies: I think that the way that Maven handles transitive dependencies isn’t great – that’s what you and Tim are both saying as well. We’re also all saying that it’s a hard or impossible problem to fix completely. Hm. More thought needed.

      About interoperability with Maven repos: my idea would be to be able to generate essentially just the dependencies part of the pom and include that in something that can be deployed to a Maven repo, but not the rest of the build configuration. You’re saying that a pom that can build the project is needed – why is that?

      Your comment about IDE integration is good. I started out thinking that enabling great IDE integration is vital, but it could actually be that implementing it is in fact a core feature.

  3. #5 by Jodi M on January 23, 2011 - 21:56

    One thing you didn’t mention in your post: repository managers. IMO Maven without a repository manager is practically unusable, and conversely repository management solves a great deal of the problems of using Maven standalone. Repository managers should be considered a necessary (core) aspect of Maven.

    http://maven.apache.org/repository-management.html

    (we use Nexus)

    • #6 by Petter Måhlén on January 24, 2011 - 10:59

      Absolutely, I agree completely. When I mentioned “Perfect interoperability with the existing Maven artifact management repository infrastructure”, I meant that to include reusing existing tools such as repository managers as well as reusing the services such as ibiblio, etc., that are available on the internet. So that would mean that any attempt at improving Maven would have to be capable of at least generating POMs that can be deployed to repositories, even if those POMs may or may not be used in the build configuration itself. Thanks for clarifying that!
