DCI Better with DI?
Posted by Petter Måhlén in Software Development on October 2, 2010
I recently posted some thoughts about DCI and although I mostly thought it was great, I had two points of criticism in how it was presented: first, that using static class composition was broken from a layering perspective and second, that it seemed like class composition in general could be replaced by dependency injection. Despite getting a fair amount of feedback on the post, I never felt that those claims were properly refuted, which I took as an indication that I was probably right. But, although I felt and still feel confident that I am right about the claim about static class composition and layering, I was and am less sure about the second one. There was this distinction being made between class-oriented and object-oriented programming that I wasn’t sure I understood. So I decided to follow James Coplien’s advice that I should read up on Trygve Reenskaug’s work in addition to his own. Maybe that way I could understand if there was a more subtle distinction between objects and classes than the one I called trivial in my post.
Having done that, my conclusion is that I had already understood objects, and that the distinction is indeed the one I called trivial. So no epiphany there. But what was great was that I did understand two new things. The first was something that Jodi Moran, a former colleague, mentioned more or less in passing. She said something like “DI (Dependency Injection) is of course useful as it separates the way you wire up your objects from the logic that they implement”. I had to file that away for future reference as I only understood it partially at the time – it sounded right, but what, exactly, were the benefits of separating out the object graph configuration from the business logic? Now, two years down the line, and thanks to DCI, I think I fully get what she meant, and I even think I can explain it. The second new thing I understood was that there is a benefit to injecting algorithms into objects (DCI proper) as opposed to injecting objects into algorithms (DCI/DI). Let’s start with the point about separating your wiring code from your business logic.
An Example
Explanations are always easier to follow if they are concrete, so here’s an example. Suppose we’re working with a web site that is a shop of some kind, and that we’re running this site in different locations across the world. The European site delivers stuff all over Europe, and the one based in Wisconsin (to take some random US state) sells stuff all over the US, but not outside it. And let’s say that when a customer has selected some items, there’s a step before displaying the order for confirmation where we calculate the final cost:
// omitting interface details for brevity
public interface VatCalculator { }

public interface Shipper { }

public class OrderProcessor {
    // vatCalculator and shipper are collaborators of this class;
    // their declarations are omitted in this first sketch

    public void finaliseOrder(Order order) {
        vatCalculator.addVat(order);
        shipper.addShippingCosts(order);
        // ... probably other stuff as well
    }
}
Let’s add some requirements about VAT calculations:
- In the US, VAT is based on the state where you send stuff from, and Wisconsin applies the same flat VAT rate to any item that is sold.
- In the EU, you need to apply the VAT rules in the country of the buyer, and:
  - In Sweden, there are two different VAT rates: one for books, and another for anything else.
  - In Poland, some items incur a special “luxury goods VAT” in addition to the regular VAT. These two taxes need to be tracked in different accounts in the bookkeeping, so must be different posts in the order.
VAT calculations in the above countries may or may not work a little bit like that, but that’s not very relevant. The point is just to introduce some realistic complexity into the business logic.
Here are a few classes that sketch implementations of the above rules:
// sketch only: arithmetic operators on Money are shorthand, and the
// rate constants and sumOf... helpers are omitted
public class WisconsinVAT implements VAT {
    public void addVat(Order order) {
        order.addVat(RATE * order.getTotalAmount(), "VAT");
    }
}

public class SwedenVAT implements VAT {
    public void addVat(Order order) {
        Money bookAmount = sumOfBookAmounts(order.getItems());
        Money nonBookAmount = order.getTotalAmount() - bookAmount;
        order.addVat(BOOK_RATE * bookAmount + RATE * nonBookAmount, "Moms");
    }
}

public class PolandVAT implements VAT {
    public void addVat(Order order) {
        Money luxuryAmount = sumOfLuxuryGoodsAmounts(order.getItems());
        // Two VAT lines on this order
        order.addVat(RATE * order.getTotalAmount(), "Podatek VAT");
        order.addVat(LUXURY_RATE * luxuryAmount, "Podatek akcyzowy");
    }
}
Right – armed with this example, let’s see how we can implement it without DI, with traditional DI and with DCI/DI.
VAT Calculation Before DI
This is a possible implementation of the OrderProcessor and VAT calculator without using dependency injection:
import java.util.HashMap;
import java.util.Map;

public class OrderProcessor {
    private VatCalculator vatCalculator = new VatCalculatorImpl();
    private Shipper shipper = new ShipperImpl();

    public void finaliseOrder(Order order) {
        vatCalculator.addVat(order);
        shipper.addShippingCosts(order);
        // ... probably other stuff as well
    }
}

public class VatCalculatorImpl implements VatCalculator {
    private WisconsinVAT wisconsinVAT = new WisconsinVAT();
    private Map<Country, VAT> euVATs = new HashMap<>();

    public VatCalculatorImpl() {
        euVATs.put(SWEDEN, new SwedenVAT());
        euVATs.put(POLAND, new PolandVAT());
    }

    public void addVat(Order order) {
        switch (GlobalConfig.getSiteLocation()) {
            case US:
                wisconsinVAT.addVat(order);
                break;
            case EUROPE:
                // pick the rules of the buyer's country
                VAT actualVAT = euVATs.get(order.getCustomerCountry());
                actualVAT.addVat(order);
                break;
        }
    }
}
The same classes that implement the business logic also instantiate their collaborators, and the VatCalculatorImpl accesses a singleton implemented using a public static method (GlobalConfig).
The main problems with this approach are:
- Only leaf nodes (yes, the use of the term ‘leaf’ is sloppy when talking about a directed graph – ‘sinks’ is probably more correct) are unit testable in practice. So while it’s easy to instantiate and test the PolandVAT class, instantiating a VatCalculatorImpl forces the instantiation of four other classes: all the VAT implementations plus the GlobalConfig, which makes testing awkward. Nobody describes these problems better than Miško Hevery, see for instance this post. Oh, and unit testing is essential not primarily as a quality improvement measure, but for productivity and as a way to enable many teams to work on the same code.
- As Trygve Reenskaug describes, it is in practice impossible to look at the code and figure out how objects are interconnected. Nowhere in the OrderProcessor is there any indication that it not only will eventually access the GlobalConfig singleton, but also needs the getSiteLocation() method to return a useful value, and so on.
- There is no flexibility to use polymorphism and swap implementations depending on the situation, making the code less reusable. The OrderProcessor algorithm is actually general enough that it doesn’t care exactly how VAT is calculated, but this doesn’t matter since the wiring up of the object graph is intermixed with the business logic. So there is no easy way to change the wiring without also risking inadvertent changes to the business logic, and if we wanted to launch sites in, for instance, African or Asian countries with different rules, we might be in trouble.
- A weaker argument, or at least one I am less sure of, is: because the object graph contains all the possible execution paths in the application, automated functional testing is nearly impossible. Even for this small, simple example, most of the graph isn’t in fact touched by a particular use case.
Countering those problems is a large part of the rationale for using Dependency Injection.
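To make the point about hidden connections concrete, here is a minimal sketch of how the global dependency bites. It is deliberately simplified: the setter on GlobalConfig, the SiteLocation enum and the describeVat() method are assumptions made for illustration, and the calculator returns a description instead of adding VAT to an order.

```java
enum SiteLocation { US, EUROPE }

// Hypothetical stand-in for the GlobalConfig singleton; unset (null) until configured.
final class GlobalConfig {
    private static SiteLocation siteLocation;

    private GlobalConfig() { }

    static void setSiteLocation(SiteLocation location) { siteLocation = location; }

    static SiteLocation getSiteLocation() { return siteLocation; }
}

class VatCalculatorImpl {
    // Simplification: returns a description instead of adding VAT to an Order.
    String describeVat() {
        switch (GlobalConfig.getSiteLocation()) { // NullPointerException if unset
            case US: return "flat Wisconsin rate";
            case EUROPE: return "rate of the buyer's country";
            default: throw new IllegalStateException("unknown site location");
        }
    }
}

class OrderProcessor {
    // Nothing in this class mentions GlobalConfig, yet it cannot work without it.
    private final VatCalculatorImpl vatCalculator = new VatCalculatorImpl();

    String finaliseOrder() { return vatCalculator.describeVat(); }
}

class HiddenDependencyDemo {
    // A caller that only reads OrderProcessor has no way of knowing this fails.
    static boolean failsWithoutGlobalSetup() {
        GlobalConfig.setSiteLocation(null);
        try {
            new OrderProcessor().finaliseOrder();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The same call succeeds once somebody, somewhere, has set the global.
    static String worksAfterGlobalSetup() {
        GlobalConfig.setSiteLocation(SiteLocation.US);
        return new OrderProcessor().finaliseOrder();
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutGlobalSetup()); // true
        System.out.println(worksAfterGlobalSetup());   // flat Wisconsin rate
    }
}
```

Any test of OrderProcessor written this way has to know about, set up and reset a global that the class under test never mentions.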
VAT Calculation with DI
Here’s what an implementation could look like if Dependency Injection is used.
import java.util.HashMap;
import java.util.Map;
import com.google.common.collect.ImmutableMap;

public class OrderProcessor {
    private VatCalculator vatCalculator;
    private Shipper shipper;

    public OrderProcessor(VatCalculator vatCalculator, Shipper shipper) {
        this.vatCalculator = vatCalculator;
        this.shipper = shipper;
    }

    public void finaliseOrder(Order order) {
        vatCalculator.addVat(order);
        shipper.addShippingCosts(order);
        // ... probably other stuff as well
    }
}

public class FlatRateVatCalculator implements VatCalculator {
    private VAT vat;

    public FlatRateVatCalculator(VAT vat) {
        this.vat = vat;
    }

    public void addVat(Order order) {
        vat.addVat(order);
    }
}

public class TargetCountryVatCalculator implements VatCalculator {
    private Map<Country, VAT> vatsForCountry;

    public TargetCountryVatCalculator(Map<Country, VAT> vatsForCountry) {
        // defensive copy, assigned to the field rather than the parameter
        this.vatsForCountry = ImmutableMap.copyOf(vatsForCountry);
    }

    public void addVat(Order order) {
        VAT actualVAT = vatsForCountry.get(order.getCustomerCountry());
        actualVAT.addVat(order);
    }
}

// actual wiring is better done using a framework like Spring or Guice, but
// here's what it could look like if done manually
public OrderProcessor wireForUS() {
    VatCalculator vatCalculator = new FlatRateVatCalculator(new WisconsinVAT());
    Shipper shipper = new WisconsinShipper();
    return new OrderProcessor(vatCalculator, shipper);
}

public OrderProcessor wireForEU() {
    Map<Country, VAT> countryVats = new HashMap<>();
    countryVats.put(SWEDEN, new SwedenVAT());
    countryVats.put(POLAND, new PolandVAT());
    VatCalculator vatCalculator = new TargetCountryVatCalculator(countryVats);
    Shipper shipper = new EuShipper();
    return new OrderProcessor(vatCalculator, shipper);
}
This gives the following benefits compared to the version without DI:
- Since you can now instantiate single nodes in the object graph, and inject mock objects as collaborators, every node (class) is testable in isolation.
- Since the wiring logic is separated from the business logic, the business logic classes get a clearer purpose (business logic only, no wiring) and are therefore simpler and more reusable.
- Since the wiring logic is separated from the business logic, wiring is defined in a single place. You can look in the wiring code or configuration and see what your runtime object graph will look like.
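As a rough illustration of the first benefit, here is a sketch of testing OrderProcessor on its own, with stubs in place of the real collaborators. The Order class below is a hypothetical stand-in that only records which cost lines were added (addLine is not part of the example above), and the stubs are plain lambdas rather than a mocking library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Order: just records the cost lines added to it.
class Order {
    final List<String> lines = new ArrayList<>();

    void addLine(String label) { lines.add(label); }
}

interface VatCalculator { void addVat(Order order); }

interface Shipper { void addShippingCosts(Order order); }

class OrderProcessor {
    private final VatCalculator vatCalculator;
    private final Shipper shipper;

    OrderProcessor(VatCalculator vatCalculator, Shipper shipper) {
        this.vatCalculator = vatCalculator;
        this.shipper = shipper;
    }

    void finaliseOrder(Order order) {
        vatCalculator.addVat(order);
        shipper.addShippingCosts(order);
    }
}

class OrderProcessorIsolationDemo {
    // Builds only the node under test; no real VAT rules, shippers or globals.
    static List<String> finaliseWithStubs() {
        VatCalculator stubVat = order -> order.addLine("vat");
        Shipper stubShipper = order -> order.addLine("shipping");
        OrderProcessor processor = new OrderProcessor(stubVat, stubShipper);

        Order order = new Order();
        processor.finaliseOrder(order);
        return order.lines;
    }

    public static void main(String[] args) {
        System.out.println(finaliseWithStubs()); // [vat, shipping]
    }
}
```

The test pins down exactly what OrderProcessor is responsible for (calling its collaborators in order) without instantiating anything else in the graph.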
However, there are still a couple of problems:
- At the application level, there is a single object graph, wired together in one fixed way, that needs to handle every use case. Since object interactions are frozen at startup time, objects need conditional logic (not well described in this example, I’m afraid) to deal with variations. This complicates the code and means that any single use case will most likely touch only a small fragment of the execution paths in the graph.
- Since in a real system, the single object graph to rule them all will be very large, functional testing – testing of the wiring and of object interactions in that particular configuration – is still hard or impossible.
- There’s too much indirection – note that the VatCalculator and VAT interfaces define what is essentially the same method.
This is where the use-case specific context idea from DCI comes to the rescue.
DCI+DI version
In DCI, there is a Context that knows how to configure a specific object graph – defining which objects should be playing which roles – for every use case. Something like this:
import java.util.Map;

public class OrderProcessor {
    private VAT vat;
    private Shipper shipper;

    public OrderProcessor(VAT vat, Shipper shipper /* , ... */) {
        this.vat = vat;
        this.shipper = shipper;
    }

    public void finaliseOrder(Order order) {
        vat.addVat(order);
        shipper.addShippingCosts(order);
        // ... probably other stuff as well
    }
}

public class UsOrderProcessingContext implements OrderProcessingContext {
    private WisconsinVAT wisconsinVat; // injected
    private Shipper shipper; // injected

    public OrderProcessor setup(Order order) {
        // in real life, would most likely do some other things here
        // to figure out other use-case-specific wiring
        return new OrderProcessor(wisconsinVat, shipper);
    }
}

public class EuOrderProcessingContext implements OrderProcessingContext {
    private Map<Country, VAT> vatsForCountry; // injected
    private Shipper shipper; // injected

    public OrderProcessor setup(Order order) {
        // in real life, would most likely do some other things here
        // to figure out other use-case-specific wiring
        VAT vat = vatsForCountry.get(order.getCustomerCountry());
        return new OrderProcessor(vat, shipper);
    }
}
Note that dependency injection is being used to instantiate the contexts as well, and that it is possible to mix ‘normal’ static object graphs with dynamic, DCI-style graphs. In fact, I’m pretty sure the contexts should be part of a static graph of objects wired using traditional DI.
Compared to normal DI, we get the following advantages:
- Less indirection in the business logic because we don’t need to make up our minds about which path to take – note that the VatCalculator interface and implementation are gone; their only reason for existing was to select the correct implementation of VAT. Simpler and clearer business logic is great.
- The object graph is tight: every object in the graph is actually used in the use case.
- Since the object graphs for each use case contain subsets of all the objects in the application, it should be easier to create automated functional tests that actually cover a large part of the object graph.
The main disadvantage I can see, without having tested this in practice, is that the wiring logic is now complex, which is normally a no-no (see for instance DIY-DI). There might also be quite a lot of contexts, depending on how many and how different the use cases are. On the other hand, it’s not more complex than any other code you write, and it is simple to increase your confidence in it through unit testing, which is something that is hard with, for instance, wiring done using traditional Spring config files. So maybe that’s not such a big deal.
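Here is a sketch of what unit testing such wiring logic might look like. It is a cut-down version of the EU context under a few assumptions: the Shipper is left out, the map is passed through a constructor, Order only records a VAT label (the vatLabel field is hypothetical), and the check for a missing country is my addition rather than part of the example above.

```java
import java.util.EnumMap;
import java.util.Map;

enum Country { SWEDEN, POLAND }

// Hypothetical minimal Order: knows the buyer's country, records the VAT label.
class Order {
    private final Country customerCountry;
    String vatLabel;

    Order(Country customerCountry) { this.customerCountry = customerCountry; }

    Country getCustomerCountry() { return customerCountry; }
}

interface VAT { void addVat(Order order); }

class OrderProcessor {
    private final VAT vat;

    OrderProcessor(VAT vat) { this.vat = vat; }

    void finaliseOrder(Order order) { vat.addVat(order); }
}

class EuOrderProcessingContext {
    private final Map<Country, VAT> vatsForCountry;

    EuOrderProcessingContext(Map<Country, VAT> vatsForCountry) {
        this.vatsForCountry = vatsForCountry;
    }

    // The wiring decision under test: which VAT plays the role for this order.
    OrderProcessor setup(Order order) {
        VAT vat = vatsForCountry.get(order.getCustomerCountry());
        if (vat == null) { // assumption: fail fast on an unconfigured country
            throw new IllegalArgumentException("No VAT rule for " + order.getCustomerCountry());
        }
        return new OrderProcessor(vat);
    }
}

class ContextWiringDemo {
    // Runs an order through the context with stub VAT rules and reports which one was used.
    static String labelFor(Country country) {
        Map<Country, VAT> vats = new EnumMap<>(Country.class);
        vats.put(Country.SWEDEN, order -> order.vatLabel = "Moms");
        vats.put(Country.POLAND, order -> order.vatLabel = "Podatek VAT");
        EuOrderProcessingContext context = new EuOrderProcessingContext(vats);

        Order order = new Order(country);
        context.setup(order).finaliseOrder(order);
        return order.vatLabel;
    }

    public static void main(String[] args) {
        System.out.println(labelFor(Country.SWEDEN)); // Moms
        System.out.println(labelFor(Country.POLAND)); // Podatek VAT
    }
}
```

Because the context is ordinary code with injected collaborators, the wiring decision can be checked with the same kind of test as any business logic class.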
So separating out the logic for wiring up your application from the business logic has the advantage of simplifying the business logic by making it more targeted, and of making the business logic more reusable by removing use-case-specific conditional logic and/or indirection from it. It also clarifies the program structure by making it specific to a use case and explicitly defined, either dynamically in contexts or statically in DI configuration files or code. Good stuff!
A Point of Injecting Code into Objects
The second thing I understood came from watching Trygve Reenskaug’s talk at Øredev last year. He demoed the Interactions perspective in BabyIDE, and showed how the roles in the arrows example interact. That view was pretty eye-opening for me (it’s about 1h 10mins into the talk), because it showed how you could extract the code that objects run to execute a distributed algorithm and look at only that code in isolation from other code in the objects. So the roles defined in a particular context are tightly tied in with each other, making up a specific distributed algorithm. Looking at that code separately from the actual objects that will execute it at runtime means you can highlight the distributed algorithm and make it readable.
So, clearly, if you want to separate an algorithm into pieces that should be executed by different objects without a central controlling object, injecting algorithms into objects gives you an advantage over injecting objects into an algorithm. Of course, the example algorithm from the talk is pretty trivial and could equally well be implemented with a central controlling object that draws arrows between the shapes that play the roles. Such an algorithm would also be even easier to overview than the fragmented one in the example, but that observation may not always be true. The difficult thing about architecture is that you have to see how it works in a large system before you know if it is good or not – small systems always look neat.
So with DCI proper, you can do things you can’t with DCI/DI – it is more powerful. But without an IDE that can extract related roles and highlight their interactions, I think that understanding a system with many different contexts and sets of interacting roles could get very complex. I guess I’m not entirely sure that the freedom to more easily distribute algorithm pieces to different objects is an unequivocally good thing in terms of writing code that is simple. And simple is incredibly important.
Conclusion
I still really like the thought of doing DCI but using DI to inject objects into algorithms instead of algorithms into objects. I think it can help simplify the business logic as well as make the system’s runtime structure easier to understand. I think DCI/DI does at least as well as DCI proper in addressing a point that Trygve Reenskaug comes back to – the GOF statement that “code won’t reveal everything about how a system will work”. Compared to DCI proper, DCI/DI may be weaker in terms of flexibility, but it has a couple of advantages, too:
- You can do it even in a language that doesn’t allow dynamically adding/removing code to objects.
- Dynamically changing which code an object has access to feels like it may be an area that contains certain pitfalls, not least from a concurrency perspective. We’re having enough trouble thinking about concurrency as it is, and adding “which code can I execute right now” to an object’s mutable state seems like it could potentially open a can of worms.
I am still not completely sure that I am right about DI being a worthy substitute for class composition in DCI, but I have increased my confidence that it is. And anyway, it probably doesn’t matter too much to me personally, since using dynamic, context-based DI looks like an extremely promising technique compared to the static DI that I am using today. I really feel like trying DCI/DI out in a larger context to see if it keeps its promises, but I am less comfortable about DCI proper due to the technical and conceptual risks involved in dynamically injecting code into objects.