| ---- |
| Tapestry IoC Overview |
| ---- |
| |
| Tapestry IoC Overview |
| |
| Even today, with the overwhelming success of {{{http://www.springframework.org}Spring}} and the rise of |
| smaller, simpler approaches to building application that stand in sharp contrast to the ultra-heavyweight |
| EJB approach, many people still have trouble wrapping their heads around Inversion of Control. |
| |
| Really understanding IoC is a new step for many developers. If you can remember back to when you made the transition |
| from procedural programming (in C, or BASIC) to object oriented programming, you might remember the point where you "got it". The point |
| where it made sense to have methods on objects, and data inside objects. |
| |
| Inversion of Control builds upon those ideas. The goal is to make coding more robust (that is, with fewer errors), more reusable and |
| to make code much easier to test. |
| |
| Most developers are used to a more <monolithic> design, they have a few core objects and a <<<main()>>> method somewhere |
| that starts the ball rolling. <<<main()>>> instantiates the first couple of classes, and those classes |
| end up instantiating and using all the other classes in the system. |
| |
| That's an <unmanaged> system. Most desktop applications are unmanaged, so it's a very familiar pattern, and easy to get your head around. |
| |
| By contrast, web applications are a <managed> environment. You don't write a main(), you don't control startup. You <configure> |
| the Servlet API to tell it about your servlet classes to be instantiated, and their lifecycle is totally controlled by |
| the servlet container. |
| |
| Inversion of Control is just a more general application of this approach. The container is ultimately responsible for |
| instantiating and configuring the objects you tell it about, and running their entire lifecycle of those objects. |
| |
| Building web applications are more complicated than monolithic applications, largely because of <multithreading>. |
| Your code will be servicing many different users simultaneously across many different threads. This tends to complicate the |
| code you write, since some fundamental aspects of object oriented development get called into question: in particular, the use |
| of <internal state>, values stored inside instance variables, since in a multi-threaded environment, that's no longer the safe |
| place it is in traditional development. Shared objects plus internal state plus multiple threads equals an broken, unpredictable application. |
| |
| Frameworks such as Tapestry -- both the IoC container, and the web framework itself -- exist to help. |
| |
| When thinking in terms of IoC, <<small is beautiful>>. What does that mean? It means small classes and small methods |
| are easier to code than large ones. At one extreme, we have servlets circa 1997 (and Visual Basic before that) with methods a thousand lines long, |
| and no distinction between business logic and view logic. Everything mixed together into an untestable jumble. |
| |
| At the other extreme is IoC: small objects, each with a specific purpose, collaborating with other small objects. |
| |
| Using unit tests, in collaboration with tools such as {{{http://easymock.org/}EasyMock}}, you can have a code base that is easy to maintain, |
| easy to extend, and easy to test. And by factoring out a lot of <plumbing> code, your code base will not only be easier to work with, it will be smaller. |
| |
| Living on the Frontier |
| |
| Coding applications the traditional way is like being a homesteader on the American frontier in the 1800's. You're responsible for |
| every aspect of your house: every board, every nail, every stick of furniture is something you personally created. There <is> a great |
| comfort in total self reliance. Even if your house is small, the windows are a bit drafty or the floorboards creak a little, you know exactly <why> |
| things are not-quite perfect. |
| |
| Flash forward to modern cities or modern suburbia and it's a whole different story. Houses are built to specification from design plans, made |
| from common materials, by many specializing tradespeople. Construction codes dictate how plumbing, wiring and framing should be performed. |
| A home-owner may not even know how to drive a nail, but can still take comfort |
| in draft-free windows, solid floors and working plumbing. |
| |
| To extend the metaphor, a house in a town is not alone and self-reliant the way a frontier house is. The town house |
| is situated on a street, in a neighborhood, within a town. The town provides services |
| (utilities, police, fire control, streets and sewers) to houses in a uniform way. Each house just needs to connect up to those services. |
| |
| The World of the Container |
| |
| So the IoC container is the "town" and in the world of the IoC container, everything has a name, a place, and a relationship |
| to everything else in the container. Tapestry calls this world "The Registry". |
| |
| [images/ioc-overview.png] IoC Overview |
| |
| Here we're seeing a few services from the built-in Tapestry IoC module, and a few of the services from the Tapestry web framework module. |
| In fact, there are over 100 services, all interrelated, in the Registry ... and that's before you add your own to the mix. The IoC Registry |
| treats all the services uniformly, regardless of whether they are part of Tapestry, or part of your application, or part of an add-on library. |
| |
| Tapestry IoC's job is to make all of these services available to each other, and to the outside world. The outside world could |
| be a standalone application, or it could be an application built on top of the Tapestry web framework. |
| |
| Service Lifecycle |
| |
| |
| Tapestry services are <lazy>, which means they are not fully instantiated until they are absolutely needed. Often, what looks like a service |
| is really a proxy object ... the first time any method of the proxy is invoked, |
| the actual service is instantiated and initialized (Tapestry uses the term <realized> for this process). Of course, this is all absolutely |
| thread-safe. |
| |
| Initially a service is <defined>, meaning some module has defined the service. Later, the service will be <virtual>, meaning a proxy |
| has been created. This occurs most often because some other service <depends> on it, but hasn't gotten around to invoking methods on it. Finally, a service |
| that is ready to use is <realized>. What's nice is that your code neither knows nor cares about the lifecycle of the service, because of the magic of the proxy. |
| |
| In fact, when a Tapestry web application starts up, before it services its first request, only about 20% of the services have been realized; the remainder |
| are defined or virtual. |
| |
| Class vs. Service |
| |
| A Tapestry service is more than just a class. First of all, it is a combination of an <interface> that defines the operations of the service, |
| and an <implementation class> that implements the interface. |
| |
| Why this extra division? Having a service interface is what lets Tapestry create proxies and perform other operations. It's also a very good practice to |
| code to an interface, rather than a specific implementation. You'll often be surprised at the kinds of things you can accomplish by substituting |
| one implementation for another. |
| |
| Tapestry is also very aware that a service will have dependencies on other services. It may also have other needs ... for example, in Tapestry IoC, the |
| container provides services with access to Loggers. |
| |
| Tapestry IoC also has support for other configuration that may be provided to services when they are realized. |
| |
| Dependency Injection |
| |
| Inversion of Control refers to the fact that the container, here Tapestry IoC's Registry, instantiates your classes. It decides on when the classes |
| get instantiated. |
| |
| Dependency Injection is a key part of <realization>: this is how a service is provided with the other services it needs to operate. For example, |
| a Data Access Object service may be injected with a ConnectionPool service. |
| |
| In Tapestry, injection occurs exclusively through constructors. Other frameworks support a mix of constructor injection, property injection (i.e., invoking setter methods) |
| and method injection (invoking arbitrary methods to pass in dependencies). Tapestry focuses exclusively on constructor injection, and emphasizes |
| that dependencies should be stored in <<final>> variables. This is the best approach towards ensuring thread safety. |
| |
| In any case, injection "just happens". Tapestry finds the constructor of your class and analyzes the parameters to determine what to pass in. In some cases, |
| it uses just the parameter type to find a match, in other cases, annotations on the parameters may also be used. |
| |
| Why can't I just use <<<new>>>? |
| |
| I've had this question asked me many a time. All these new concepts seem alien. All that XML (in the Spring or HiveMind IoC containers; Tapestry IoC uses no XML) is a burden. |
| What's wrong with <<<new>>>? |
| |
| The problem with new is that it rigidly connects one implementation to another implementation. Let's follow a progression that reflects how a lot of projects |
| get written. It will show that in the real world, <<<new>>> is not as simple as it first seems. |
| |
| This example is built around some work I've done recently involving a Java Messaging Service queue, part of an application performance monitoring |
| subsystem for a large application. Code inside each server collects performance data of various types and sends it, via a shared JMS queue, |
| to a central server for collection and reporting. |
| |
| This code is for a metric that periodically counts the number of rows in a key database table. Other implementations of MetricProducer |
| will be responsible for measuring CPU utilization, available disk space, number of requests per second, and so forth. |
| |
| +----+ |
| public class TableMetricProducer implements MetricProducer |
| { |
| . . . |
| |
| public void execute() |
| { |
| |
| int rowCount = . . .; |
| Metric metric = new Metric("app/clients", System.currentTimeMillis(), rowCount); |
| |
| new QueueWriter().sendMetric(metric); |
| } |
| } |
| +----+ |
| |
| I've elided some of the details (this code will need a database URL or a connection pool to operate), |
| so as to focus on the one method and it's relationship to the QueueWriter class. |
| |
| Obviously, this code has a problem ... we're creating a new QueueWriter for each metric we write into the queue, and the QueueWriter presumably is going to |
| open the JMS queue fresh each time, an expensive operation. Thus: |
| |
| +----+ |
| public class TableMetricProducer implements MetricProducer |
| { |
| . . . |
| |
| private final QueueWriter queueWriter = new QueueWriter(); |
| |
| public void execute() |
| { |
| int rowCount = . . .; |
| Metric metric = new Metric("app/clients", System.currentTimeMillis(), rowCount); |
| |
| queueWriter.sendMetric(metric); |
| } |
| +-----+ |
| |
| That's better. It's not perfect ... a proper system might know when the application was being shutdown and would shut down the JMS Connection |
| inside the QueueWriter as well. |
| |
| Here's a more immediate problem: JMS connections are really meant to be shared, and we'll have lots of little classes collecting different metrics. So we need |
| to make the QueueWriter shareable: |
| |
| +----+ |
| private final QueueWriter queueWriter = QueueWriter.getInstance(); |
| +----+ |
| |
| ... and inside class QueueWriter: |
| |
| +----+ |
| public class QueueWriter |
| { |
| private static QueueWriter instance; |
| |
| private QueueWriter() |
| { |
| ... |
| } |
| |
| public static getInstance() |
| { |
| if (instance == null) |
| instance = new QueueWriter(); |
| |
| return instance; |
| } |
| } |
| +-----+ |
| |
| Much better! Now all the metric producers running inside all the threads can share a single QueueWriter. Oh wait ... |
| |
| +-----+ |
| public synchronized static getInstance() |
| { |
| if (instance == null) |
| instance = new QueueWriter(); |
| |
| return instance; |
| } |
| +----+ |
| |
| Is that necessary? Yes. Will the code work without it? Yes -- <<99.9% of the time>>. In fact, this is a very common error |
| in systems that manually code a lot of these construction patterns: forgetting to properly synchronize access. These things often work in development and testing, |
| but fail (with infuriating infrequency) in production, as it takes two or more threads running simultaneously to reveal the |
| coding error. |
| |
| Wow, we're a long way from a simple <<<new>>> already, and we're talking about just one service. But let's detour into <testing>. |
| |
| How would you test TableMetricProducer? One way would be to let it run and try to find the message or messages it writes |
| in the queue, but that seems fraught with difficulties. It's more of an integration test, and is certainly something |
| that you'd want to execute at some stage of your development, but not as part of a quick-running unit test suite. |
| |
| Instead, let's split QueueWriter in two: a QueueWriter interface, and a QueueWriterImpl implementation class. This will allow |
| us to run TableMetricProducer against a <mock implementation> of QueueWriter, rather than the real thing. This is one |
| of the immediate benefits of <coding to an interface> rather than <coding to an implementation>. |
| |
| We'll need to change |
| TableMetricProducer to take the QueueWriter as a constructor parameter. |
| |
| +----+ |
| public class TableMetricProducer implements MetricProducer |
| { |
| private final QueueWriter queueWriter; |
| |
| /** |
| * The normal constructor. |
| * |
| */ |
| public TableMetricProducer(. . .) |
| { |
| this(QueueWriterImpl.getInstance(), . . .); |
| } |
| |
| /** |
| * Constructor used for testing. |
| * |
| */ |
| TableMetricProducer(QueueWriter queueWriter, . . .) |
| { |
| queueWriter = queueWriter; |
| . . . |
| } |
| |
| public void execute() |
| { |
| int rowCount = . . .; |
| Metric metric = new Metric("app/clients", System.currentTimeMillis(), rowCount); |
| |
| queueWriter.sendMetric(metric); |
| } |
| } |
| +----+ |
| |
| This still isn't ideal, as we still have an explicit linkage between TableMetricProducer and QueueWriterImpl. |
| |
| What we're seeing here is that there are multple <concerns> inside the little bit of code in this example. TableMetricProducer has an unwanted |
| <construction concern> about which implementation of QueueWriter to instantiate (this shows up as two constructors, |
| rather than just one). QueueWriterImpl has an additional <lifecycle concern>, in terms |
| of managing the singleton. |
| |
| These extra concerns, combined with the use of static variables and methods, are a <bad design smell>. It's not yet very stinky, because |
| this example is so small, but these problems tend to multiply as an application grows larger and more complex, especially as services |
| start to truly collaborate in earnest. |
| |
| For comparison, lets see what the Tapestry IoC implementation would look like: |
| |
| +----+ |
| public class MonitorModule |
| { |
| public static void bind(ServiceBinder binder) |
| { |
| binder.bind(QueueWriter.class, QueueWriterImpl.class); |
| binder.bind(MetricScheduler.class, MetricSchedulerImpl.class); |
| } |
| |
| public void contributeMetricScheduler(Configuration<MetricProducer> configuration, QueueWriter queueWriter, . . .) |
| { |
| configuration.add(new TableMetricProducer(queueWriter, . . .)) |
| } |
| } |
| +----+ |
| |
| Again, I've elided out a few details related to the database the TableMetricProducer will point at (in fact, Tapestry IoC |
| provides a lot of support for configuration of this type as well, which is yet another concern). |
| |
| The MonitorModule class is a Tapestry IoC module: a class that defines and configures services. |
| |
| The bind() method is the principle way that services are made known to the Registry: here we're binding |
| a service interface to a service implementation. QueueWriter we've discussed already, |
| and MetricScheduler is a service that is responsible for determining when MetricProducer instances |
| run. |
| |
| The contributeMetricScheduler() method allows the module to <contribute> into the MetricProducer service's <configuration>. More testability: |
| the MetricProducer isn't tied to a pre-set list of producers, instead it will have a Collection\<MetricProducer\> injected into its |
| constructor. Thus, when we're coding the MetricProducerImpl class, we can test it against mock implementations of MetricProducer. |
| |
| The QueueWriter service in injected into the contributeMetricScheduler() method. Since there's only one QueueWriter service, |
| Tapestry IoC is able to "find" the correct service based entirely on type. If, eventually, there's more than one QueueWriter service |
| (perhaps pointing at different JMS queues), you would use an annotation on the parameter to help Tapestry connect the parameter to the appropriate service. |
| |
| Presumably, there'd be a couple of other parameters to the contributeMetricScheduler() method, to inject in a database URL or connection pool |
| (that would, in turn, be passed to TableMetricProducer). |
| |
| A new TableMetricProducer instance is created and contributed in. We could contribute as many producers as we like here. Other modules could also |
| define a contributeMetricScheduler() method and contribute their own MetricProducer instances. |
| |
| Meanwhile, the QueueWriterImpl class no longer needs the _instance variable or getInstance() method, and the TableMetricProducer |
| only needs a single constructor. |
| |
| Advantages of IoC: Summary |
| |
| It would be ludicrous for us to claim that applications built without an IoC container are doomed to failure. There is overwhelming evidence |
| that applications have been built without containers and have been perfectly successful. |
| |
| What we are saying is that IoC techniques and discipline will lead to applications that are: |
| |
| * More testable -- smaller, simpler classes; coding to interfaces allows use of mock implementations |
| |
| * More robust -- smaller, simpler classes; use of final variables; thread safety baked in |
| |
| * More scalable -- thread safety baked in |
| |
| * Easier to maintain -- less code, simpler classes |
| |
| * Easier to extend -- new features are often additions (new services, new contributions) rather than changes to existing classes |
| |
| [] |
| |
| What we're saying is that an IoC container allows you to work faster and smarter. |
| |
| |
| Many of these traits work together; for example, a more testable application is inherently more robust. Having a test suite |
| makes it easier to maintain and extend your code, because its much easier to see if new features break existing ones. |
| Simpler code plus tests also lowers the cost of entry for new developers |
| coming on board, which allows for more developers to work efficiently on the same code base. The clean separation between |
| interface and implementation also allows multiple developers to work on different aspects of the same code |
| base with a lowered risk of interference and conflict. |
| |
| By contrast, traditional applications, which we term <monolithic> applications, are often very difficult to test, because |
| there are fewer classes, and each class has multiple concerns. A lack of tests makes it more difficult to |
| add new features without breaking existing features. Further, the monolithic approach |
| more often leads to implementations being linked to other implementations, yet another hurdle standing in the way of testing. |
| |
| Let's end with a metaphor. |
| |
| Over a decade ago, when Java first came on the scene, it was the first mainstream language to support garbage collection. |
| This was very controversial: the garbage collector was seen as unnecessary, and a waste of resources. Among |
| C and C++ developers, the attitude was "Why do I need a garbage collector? If I call malloc() I can call free()." |
| |
| I don't know about you, but I don't think I could ever go back to a non-garbage collected environment. Having the GC |
| around makes it much easier to code in a way I find natural: many small related objects working together. It turns out |
| that knowing when to call free() is more difficult than it sounds. The Objective-C language tried to solve this with retain |
| counts on objects and that still lead to memory leaks when it was applied to object <graphs> rather than object <trees>. |
| |
| Roll the clock forward a decade and the common consensus has shifted considerably. Objective-C 2.0 features |
| true garbage collection and GC libraries are available for C and C++. All scripting languages, including Ruby and Python, feature |
| garbage collection as well. A new language <without> garbage collection is now considered an anomaly. |
| |
| The point is, the lifecycle of objects turns out to be far more complicated than it looks at first glance. We've come to accept that our |
| own applications lack the ability to police their objects as they are no longer needed (they literally lack the ability to determine |
| <when> an object is no longer needed) and the garbage collector, a kind of higher authority, takes over that job very effetively. The end result? |
| Less code and fewer bugs. And a careful study shows that the Java memory allocator and garbage collector (the two are |
| quite intimately tied together) is actually |
| <<more>> efficient that malloc() and free(). |
| |
| So we've come to accept that the <death concern> is better handled outside of our own code. The use of Inversion of Control |
| is simply the flip side of that: the <lifecycle and construction concerns> are also better handled by an outside authority as well: the IoC container. |
| These concerns govern when a service is <realized> and how its dependencies and configuration are injected. As with |
| the garbage collector, ceding these chores to the container |
| results in less code and fewer bugs, and lets you concentrate on the things that should matter to you: your business logic, your application -- and not |
| a whole bunch of boilerplate plumbing! |
| |
| |
| |
| |
| |
| |
| |
| |
| |