One of the questions I often ask when interviewing candidate programmers is “what is the difference between inheritance and composition?” Some people don’t know, and they’re obviously out right then and there. Most people explain that “inheritance represents an is-a relationship and composition a has-a”, which is correct but insufficient. Very few people have a good answer to the question that I’m really asking: what impact does the choice of one or the other have on your software design? Over the last few years there seems to have been a growing feeling that inheritance is over-used and composition correspondingly under-used. I agree, and will try to show some concrete reasons why that is the case. (As a side note, the main trigger for this blog post is that about a year ago, I introduced a class hierarchy as a part of a refactoring story, and it was immediately obvious that it was an awkward solution. However, it (and the other changes) represented a big improvement over what went before, and there wasn’t time to fix it back then. It’s bugged me ever since, and thanks to our gradual refactoring, we’re getting closer to the time when we can put something better in place, so I started thinking about it again.)
One guy I interviewed the other day had a pretty good attempt at describing when one should choose one or the other: it depends on which is closest to the real world situation that you’re trying to model. That has the ring of truth to it, but I’m not sure it is enough. As an example, take the classically hierarchical model of the animal kingdom: Humans and Chimpanzees are both Primates, and, skipping a bunch of steps, they are Animals just like Kangaroos and Tigers. This is clearly a set of is-a relationships, so we could model it like so:
public class Animal {}
public class Primate extends Animal {}
public class Human extends Primate {}
public class Chimpanzee extends Primate {}
public class Kangaroo extends Animal {}
public class Tiger extends Animal {}
Now, let’s add some information about their various body types and how they walk. Both Humans and Chimpanzees have two arms and two legs, and Tigers have four legs. So we could add two legs and two arms at the Primate level, and four legs at the Tiger level. But then it turns out that Kangaroos also have two legs and two arms, so this solution doesn’t work without duplication. Perhaps a more general concept involving limbs is needed at the animal level? There would still be duplication of the ‘legs + arms’ classification in both Kangaroo and Primate, not to mention what would happen if we introduce a RattleSnake into our Eden (or an EarthWorm if you prefer an animal that has never had limbs at any stage of its evolutionary history).
A similar problem is shown by introducing a walk() method. Humans do a bipedal walk, Chimpanzees and Tigers walk on all fours, and Kangaroos hop using their rear feet and tail. Where do we locate the logic for movement? All the animals in our sample hierarchy can walk on all fours (although humans tend to do so only when young or inebriated), so having a quadrupedWalk() method at the Animal level that is called by the walk() methods of Tiger and Chimpanzee makes sense. At least until we add a Chicken or Halibut, because they will now have access to a type of locomotion they shouldn’t.
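As a rough sketch of that problem, assume the hierarchy looks something like the following. The method bodies and return values are purely illustrative, not taken from any real code base.

```java
// Sketch of the inheritance-based model; all method bodies are illustrative.
abstract class Animal {
    // Placed here so that Tiger and Chimpanzee can share it.
    protected String quadrupedWalk() {
        return "walks on all fours";
    }

    abstract String walk();
}

class Tiger extends Animal {
    @Override
    String walk() {
        return quadrupedWalk();
    }
}

class Halibut extends Animal {
    @Override
    String walk() {
        return "swims";
    }
}

class HierarchyDemo {
    public static void main(String[] args) {
        Halibut halibut = new Halibut();
        System.out.println(halibut.walk());          // swims
        // Nothing stops a Halibut from using the helper it inherited:
        System.out.println(halibut.quadrupedWalk()); // walks on all fours
    }
}
```

The compiler is perfectly happy with a four-legged Halibut, because the helper lives in a class that every animal must extend.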
The root problem here is of course that the model is bad even though it seems like a natural one. If you think about it, body types and locomotion are not defined by the various creatures’ positions in the taxonomy: there are both mammals and fish that swim, and fish, lizards and snakes can all be legless. I think it is natural for us to think in terms of templates and inherited properties from a more general class (Douglas Hofstadter said something very similar, I think in Gödel, Escher, Bach). If that kind of thinking is a human tendency, that could explain why it seems so common to initially think that a hierarchical model is the best one. It is often a mistake to assume that the presence of an is-a relationship in the real world means that inheritance is right for your object model.
But even if, unlike the example above, a hierarchical model is a suitable one to describe a certain real-world scenario, there are issues. A big one, in my opinion, is that the classes that make up a hierarchical model are very tightly coupled. You cannot instantiate a class without also initialising all of its superclasses, because their constructors always run. This makes them harder to test: you can never ‘mock’ a superclass, so each test of a leaf in the hierarchy requires you to do all the setup for all the classes above. In the example above, you might have to set up the limbs of your RattleSnake and the data needed for the quadrupedWalk() method of the Halibut before you could test them.
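To make the coupling concrete, here is a minimal sketch. It assumes a hypothetical Animal base class whose constructor requires limb data; the constructor signature, the limbCount() and rattle() methods, and their return values are all invented for illustration.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical base class whose constructor insists on limb data.
class Animal {
    private final List<String> limbs;

    Animal(List<String> limbs) {
        if (limbs == null) {
            throw new IllegalArgumentException("limbs must not be null");
        }
        this.limbs = limbs;
    }

    int limbCount() {
        return limbs.size();
    }
}

class RattleSnake extends Animal {
    RattleSnake(List<String> limbs) {
        super(limbs); // cannot be skipped or swapped for a test double
    }

    String rattle() {
        return "tss";
    }
}

class RattleSnakeTestSketch {
    public static void main(String[] args) {
        // The test only cares about rattle(), yet it must supply limb data
        // just to get past the superclass constructor.
        RattleSnake snake = new RattleSnake(Collections.emptyList());
        System.out.println(snake.rattle());
    }
}
```

Every test of RattleSnake has to satisfy Animal's constructor first, and that setup grows with each level added to the hierarchy.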
With inheritance, the smallest testable unit is larger than it is when composition is used, and at the very least you get more duplicated code in your tests. In this way, inheritance is similar to the use of static methods: it removes the option to inject different behaviour when testing. The more code you try to reuse from the superclass(es), the more redundant test code you will typically get, and all that redundancy will slow you down when you modify the code in the future. Or, worse, it will make you not test your code at all because it’s such a pain.
There is an alternative model, where each of the beings above has-a BodyType and possibly also has-a Locomotion that knows how to move that body. Most likely, all of the animals could then be differently configured instances of the same Animal class.
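public interface BodyType {}
public class TwoArmsTwoLegs implements BodyType {}
public class FourLegs implements BodyType {}
public class NoLimbs implements BodyType {}
// A Locomotion knows how to move one particular kind of body.
public interface Locomotion<B extends BodyType> {
    void walk(B body);
}
public class BipedWalk implements Locomotion<TwoArmsTwoLegs> {
    public void walk(TwoArmsTwoLegs body) {}
}
public class Slither implements Locomotion<NoLimbs> {
    public void walk(NoLimbs body) {}
}
// The type parameter ensures that the locomotion matches the body.
public class Animal<B extends BodyType> {
    private final B body;
    private final Locomotion<B> locomotion;
    public Animal(B body, Locomotion<B> locomotion) {
        this.body = body;
        this.locomotion = locomotion;
    }
}
// Wiring up a human (this line belongs inside a method):
Animal<TwoArmsTwoLegs> human = new Animal<>(new TwoArmsTwoLegs(), new BipedWalk());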
This way, you can very easily test the bodies, the locomotions, and the animals in isolation from each other. You may or may not want to do some automated functional testing of the way that you wired up your Human to ensure that it behaves like you want it to. I think composition is almost universally preferable to inheritance for code reuse – I’ve tried to think of cases where the only natural model is a hierarchical one and you can’t use composition instead, but it seems to be beyond me.
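A test of Animal in isolation might look roughly like the sketch below. It assumes that Animal takes its body and locomotion through a constructor and exposes a walk() method that delegates to the Locomotion; neither is spelled out above. RecordingLocomotion is a hypothetical hand-written test double.

```java
interface BodyType {}

class TwoArmsTwoLegs implements BodyType {}

interface Locomotion<B extends BodyType> {
    void walk(B body);
}

// Assumed shape of Animal: constructor injection plus a delegating walk().
class Animal<B extends BodyType> {
    private final B body;
    private final Locomotion<B> locomotion;

    Animal(B body, Locomotion<B> locomotion) {
        this.body = body;
        this.locomotion = locomotion;
    }

    void walk() {
        locomotion.walk(body);
    }
}

// Hypothetical test double that records how it was called.
class RecordingLocomotion<B extends BodyType> implements Locomotion<B> {
    int calls = 0;
    B lastBody = null;

    @Override
    public void walk(B body) {
        calls++;
        lastBody = body;
    }
}

class AnimalTestSketch {
    public static void main(String[] args) {
        TwoArmsTwoLegs body = new TwoArmsTwoLegs();
        RecordingLocomotion<TwoArmsTwoLegs> fake = new RecordingLocomotion<>();

        new Animal<>(body, fake).walk();

        // Animal should hand its own body to the locomotion exactly once.
        if (fake.calls != 1 || fake.lastBody != body) {
            throw new AssertionError("Animal did not delegate to its Locomotion");
        }
        System.out.println("Animal delegated walk() to its Locomotion");
    }
}
```

No real walking logic is involved, so the test exercises only the wiring inside Animal. The equivalent substitution is not possible for a superclass.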
In addition to the arguments here, there’s the well-known fact that inheritance breaks encapsulation by exposing subclasses to implementation details in the superclass. This makes the use of subclassing as a means to integrate with a library a fragile solution, or at least one that limits the future evolution of the library.
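A widely cited illustration of this (a version of it appears in Joshua Bloch's Effective Java) is a set that counts insertions. The CountingSet class below is a hypothetical sketch. Its result depends on an implementation detail of the JDK collection classes, namely that the addAll() inherited by HashSet is implemented by calling add() for each element.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical subclass that tries to count how many elements were added.
class CountingSet<E> extends HashSet<E> {
    private int addCount = 0;

    @Override
    public boolean add(E e) {
        addCount++;
        return super.add(e);
    }

    @Override
    public boolean addAll(Collection<? extends E> c) {
        addCount += c.size();
        // The inherited addAll() calls add() for each element,
        // so every element ends up being counted twice.
        return super.addAll(c);
    }

    int getAddCount() {
        return addCount;
    }
}

class CountingSetDemo {
    public static void main(String[] args) {
        CountingSet<String> set = new CountingSet<>();
        set.addAll(Arrays.asList("human", "chimpanzee", "tiger"));
        System.out.println(set.getAddCount()); // 6 on OpenJDK, not 3
    }
}
```

The subclass is only correct if it knows how the superclass implements addAll() internally, and it would silently change behaviour if that implementation ever changed. A wrapper that holds a Set and forwards calls to it does not have this problem.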
So, there it is – that summarises what I think are the differences between composition and inheritance, and why one should tend to prefer composition over inheritance. So, if you’re reading this because you googled me before an interview, this is one answer you should get right!
#1 by Zbigniew Lukasiak on September 23, 2010 - 14:05
I don’t have an interview with you – I just googled for inheritance versus composition – but this is interesting. I’ve heard similar advice for a long time, but programmers still use inheritance very extensively. One explanation could be the compatibility you mention with how our brains work, but I think the dynamic process of development is also at play here. It is easier to get something working if you can take something similar and tweak it than when you are limited to building something new from ready-made parts. As Agile argues, it is also an economical way to start: get the code working, even if it is badly structured, and refactor it later.
#2 by Smahlatz on July 27, 2012 - 13:19
hmm – you say “One guy I interviewed the other day had a pretty good attempt at describing when one should choose one or the other: it depends on which is closest to the real world situation that you’re trying to model.”
Then give an example which is not based on a real world situation, which you state yourself – “The root problem here is of course that the model is bad”.
I generally agree with your view re Inheritance v Composition, but I think this particular example is a bad one.
#3 by Petter Måhlén on July 28, 2012 - 15:49
It looks like maybe a part of your comment didn’t make it – the point of the example was to show how easy it is to incorrectly use a model with inheritance using a concept I think is universally known. If you have some more ideas about how the example could be improved or why it is bad, go ahead and post them!
#4 by Manne Fagerlind on January 4, 2013 - 09:23
Funny, I used this very example in a presentation the other day. I think it’s a good way to demonstrate that inheritance is over-used and composition is generally more useful.
I’d say inheritance doesn’t really exist in the animal kingdom; it’s just a theoretical construct. We are what we are because of our constituent parts (molecules, cells, body parts), not because we have been categorized as mammals or primates. The same goes for most other concepts, especially physical objects.
#5 by Petter Måhlén on January 4, 2013 - 10:45
I’m not sure I’d go so far as to say that inheritance doesn’t exist in the animal kingdom (the ghost of Linné might come to haunt me if I did!). I was trying to make a slight variation of that point: even though inheritance does exist, it’s not necessarily the best perspective to have when you are creating a domain model for some specific problem. Humans and Chimpanzees are both Primates because we share the same ancestors. So there is an is-a relationship. But that relationship doesn’t have to be expressed through classes inheriting from each other, and I think doing so is usually a bad idea. If you do need that relationship in the code, maybe the best choice is a simple tree of objects rather than a hierarchy of classes.
In the animal kingdom, parallel evolution means you cannot say that common traits (like a bipedal walk, or having eyes) are necessarily inherited. They can appear spontaneously on separate branches/nodes of the inheritance tree. The same is often true of other domains – you may want both your button and your text to be clickable, but I think it generally makes the code awkward if you add the ‘clickable’ behaviour into a parent class of both of those (especially if, say, you don’t want read-only text to be clickable, or something). And there is much less strength in the argument that buttons and texts are both ‘ClickableWidgets’ than that humans and chimps are primates. The ‘ClickableWidget’ parent has probably been added because the mistake of using a class hierarchy had already been made.
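The ‘simple tree of objects’ idea could be sketched roughly as follows. Taxon and its isA() method are hypothetical names; the point is only that the taxonomy becomes data rather than a set of classes.

```java
// Hypothetical node in a taxonomy tree: the is-a relationship is data.
class Taxon {
    private final String name;
    private final Taxon parent; // null for the root of the tree

    Taxon(String name, Taxon parent) {
        this.name = name;
        this.parent = parent;
    }

    String name() {
        return name;
    }

    // Walks up the tree to see whether 'ancestor' is this node or above it.
    boolean isA(Taxon ancestor) {
        for (Taxon t = this; t != null; t = t.parent) {
            if (t == ancestor) {
                return true;
            }
        }
        return false;
    }
}

class TaxonomyDemo {
    public static void main(String[] args) {
        Taxon animal = new Taxon("Animal", null);
        Taxon primate = new Taxon("Primate", animal);
        Taxon human = new Taxon("Human", primate);
        Taxon tiger = new Taxon("Tiger", animal);

        System.out.println(human.isA(primate)); // true
        System.out.println(tiger.isA(primate)); // false
        System.out.println(tiger.isA(animal));  // true
    }
}
```

With this shape, adding a new species means adding an object, and traits such as body type or locomotion can be attached to each animal independently of where it sits in the tree.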
#6 by Manne Fagerlind on January 4, 2013 - 11:06
Yes, we have common ancestors, but what defines us as individuals isn’t the history of our species (inheritance) but the properties of our bodies (composition). As you rightly point out, the relationship between biological history and common traits is very shaky. The inheritance concept is based on Aristotelian logic and really too simplistic when applied to most real-world problems. In the area of animals, it is only relevant as a historical concept.
I think we agree about the main thing: favouring composition over inheritance. It will do all developers a world of good.
#7 by Petter Måhlén on January 4, 2013 - 11:29
Yes, we do agree. :)