tapestry/src/site/xdoc/usersguide/state.xml - tapestry4 - Git at Google

 <?xml version="1.0" encoding="UTF-8"?>
 <!--
     Copyright 2004, 2005 The Apache Software Foundation

     Licensed under the Apache License, Version 2.0 (the "License");
     you may not use this file except in compliance with the License.
     You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
 -->
 <document>
     <properties>
         <title>Managing server-side state</title>
     </properties>
     <body>

         <section name="Managing server-side state">

             <p>
                 Server-side state is any information that exists on the server, and persists between
                 requests. This can be anything from a single flag all the way up to a large database
                 result set. In a typical application, server-side state is the identity of the user
                 (once the user logs in) and, perhaps, a few important domain objects (or, at the
                 very least, primary keys for those objects).
             </p>

             <p>
                 In an ordinary servlet application, managing server-side state is entirely the
                 application's responsibility. The Servlet API provides just the HttpSession, which
                 acts like a Map, relating keys to arbitrary objects. It is the application's
                 responsibility to obtain values from the session, and to update values into the
                 session when they change.
             </p>

             <p>
                 Tapestry takes a different tack; it defines server side state in terms of either
                 entire objects (
                 <a href="state.html#state.aso">application state object</a>
                 s) or by allowing specific page or component properties to be persistent.
             </p>

             <section name="Understanding servlet state">


                 <p>
                     Managing server-side state is one of the most complicated and error-prone
                     aspects of web application design, and one of the areas where Tapestry provides
                     the most benefit. Generally speaking, Tapestry applications which are functional
                     within a single server will be functional within a cluster with no additional
                     effort. This doesn't mean planning for clustering, and testing of clustering, is
                     not necessary; it just means that, when using Tapestry, it is possible to narrow
                     the design and testing focus. The Tapestry framework embraces the correct
                     best-practices for managing client state on either a single server or a cluster
                     of services.
                 </p>

                 <p>
                     The point of server-side state is to ensure that information about the user
                     acquired during the session is available later in the same session. The
                     canonical example is an application that requires some form of login to access
                     some or all of its content; the identify of the user must be collected at some
                     point (in a login page) and be generally available to other pages.
                 </p>

                 <p>
                     The other aspect of server-side state concerns failover. Failover is an aspect
                     of highly-available computing where the processing of the application is spread
                     across many servers. A group of servers used in this way is referred to as a
                     <em>cluster</em>
                     . Generally speaking (and this may vary significantly between vendor's
                     implementations) requests from a particular client will be routed to the same
                     server within the cluster.
                 </p>

                 <p>
                     In the event that the particular server in question fails (crashes unexpectedly,
                     or otherwise brought out of service), future requests from the client will be
                     routed to a different, surviving server within the cluster. This failover event
                     should occur in such a way that the client is unaware that anything exceptional
                     has occured with the web application; and this means that any server-side state
                     gathered by the original server must be available to the backup server.
                 </p>

                 <p>
                     The main mechanism for handling this using the Java Servlet API is the
                     HttpSession. The session can store
                     <em>attributes</em>
                     , much like a Map. Attributes are object values referenced with a string key. In
                     the event of a failover, all such attributes are expected to be available on the
                     new, backup server, to which the client's requests are routed.
                 </p>

                 <p>
                     Different application servers implement HttpSession replication and failover in
                     different ways; the servlet API specification is delibrately non-specific on how
                     this implementation should take place. Tapestry follows the conventions of the
                     most limited interpretation of the servlet specification; it assumes that
                     attribute replication only occurs when the HttpSession
                     <code>setAttribute()</code>
                     method is invoked.
                 </p>

                 <span class="info">
                     <strong>Note:</strong>
                     <p>
                     This is the replication strategy employed by BEA's WebLogic server.
                     </p>
                 </span>

                 <p>
                     Attribute replication was envisioned as a way to replicate simple, immutable
                     objects such as String or Integer. Attempting to store mutable objects, such as
                     List, Map or some user-defined class, can be problematic. For example, modifying
                     an attribute value after it has been stored into the HttpSession may cause a
                     failover error. Effectively, the backup server sees a snapshot of the object at
                     the time that
                     <code>setAttribute()</code>
                     was invoked; any later change to the object's internal state is
                     <em>not</em>
                     replicated to the other servers in the cluster! This can result in strange and
                     unpredictable behavior following a failover.
                 </p>

                 <p>
                     Tapestry attempts to sort out the issues involving server-side state in such a
                     way that they are invisible to the developer. Most applications will not need to
                     explicitly access the HttpSession at all, but may still have significant amounts
                     of server-side state. The following sections go into more detail about how
                     Tapestry approaches these issues.
                 </p>


             </section><!-- state.general -->

             <section name="Persistent page properties">


                 <p>
                     Servlets, and by extension, JavaServer Pages, are inherently stateless. That is,
                     they will be used simultaneously by many threads and clients. Because of this,
                     they must not store (in instance variables) any properties or values that are
                     specific to any single client.
                 </p>

                 <p>
                     This creates a frustration for developers, because ordinary programming
                     techniques must be avoided. Instead, client-specific state and data must be
                     stored in the HttpSession or as HttpServletRequest attributes. This is an
                     awkward and limiting way to handle both
                     <em>transient</em>
                     state (state that is only needed during the actual processing of the request)
                     and
                     <em>persistent</em>
                     state (state that should be available during the processing of this and
                     subsequent requests).
                 </p>

                 <p>
                     Tapestry bypasses most of these issues by
                     <em>not</em>
                     sharing objects between threads and clients. Tapestry uses an
                     <em>object pool</em>
                     to store constructed page instances. As a page is needed, it is removed from the
                     page pool. If there are no available pages in the pool, a fresh page instance is
                     constructed.
                 </p>

                 <p>
                     For the duration of a request, a page and all components within the page are
                     reserved to the single request. There is no chance of conflicts because only the
                     single thread processing the request will have access to the page. At the end of
                     the request cycle, the page is reset back to a pristine state and returned to
                     the shared pool, ready for reuse by the same client, or by a different client.
                 </p>

                 <p>
                     In fact, even in a high-volume Tapestry application, there will rarely be more
                     than a few instances of any particular page in the page pool.
                 </p>

                 <p>
                     For this scheme to work it is important that, at the end of the request cycle,
                     the page be returned to its
                     <em>pristine state</em>
                     . The prisitine state is equivalent to a freshly created instance of the page.
                     In other words, any properties of the page that changed during the processing of
                     the request must be returned to their initial values.
                 </p>

                 <p>
                     The page is then returned to the page pool, where it will wait to be used in a
                     future request. That request may be for the same end user, or for another user
                     entirely.
                 </p>

                 <span class="warn">
                     <strong>Warning:</strong>
                     <p>
                     Imagine a page containing a form in which a user enters their address and credit
                     card information. When the form is submitted, properties of the page will be
                     updated with the values supplied by the user. Those values must be cleared out
                     before the page is stored into the page pool ... if not, then the
                     <em>next</em>
                     user who accesses the page will see the previous user's address and credit card
                     information as default values for the form fields!
                     </p>
                 </span>

                 <p>
                     Tapestry separates the persistent state of a page from any instance of the page.
                     This is very important, because from one request cycle to another, a different
                     instance of the page may be used ... even when clustering is not used. Tapestry
                     has many copies of any page in a pool, and pulls an arbitrary instance out of
                     the pool for each request.
                 </p>

                 <p>
                     In Tapestry, a page may have many properties and may have many components, each
                     with many properties, but only a tiny number of all those properties needs to
                     persist between request cycles. On a later request, the same or different page
                     instance may be used. With a little assistance from the developer, the Tapestry
                     framework can create the illusion that the same page instance is being used in a
                     later request, even though the request may use a different page instance (from
                     the page pool) ... or (in a clustering environment) may be handled by a
                     completely different server.
                 </p>

                 <p>
                     Tapestry is flexible about how these properties are ultimately stored. Tapestry
                     3.0 and earlier were less so ... persistent page properties were always stored
                     in the HttpSession. Starting in release 4.0, persistent page properties may be
                     stored in the session, or on the client.
                 </p>

                 <subsection name="Using Persistent Page Properties">


                     <p>
                         Persistent properties make use of a
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element in the page or component specification. Tapestry does something
                         special when a component contains any such elements; it dynamically
                         fabricates a subclass that provides the desired fields, methods and whatever
                         extra initialization or cleanup is required.
                     </p>

                     <p>
                         You may also, optionally, make your page or component class abstract, and
                         define abstract accessor methods that will be filled in by Tapestry in the
                         fabricated subclass. This allows you to read and update properties inside
                         your Java code, such as inside listener methods.
                     </p>

                     <span class="info">
                         <strong>Note:</strong>
                         <p>
                         You only need to define abstract accessor methods if you are going to invoke
                         those accesor methods in your code, such as in a
                         <a href="listenermethods.html">listener method</a>
                         . Tapestry will create an enhanced subclass that contains the new field, a
                         getter method and a setter method, plus any necessary initialization
                         methods. If you are only going to access the property using OGNL
                         expressions, then there's no need to define either accessor method.
                         </p>
                     </span>

                     <span class="info">
                         <strong>Note:</strong>
                         <p>
                         Properties defined this way may be either transient or persistent. For
                         transient properties (properties which do not persist between requests), it
                         is only necessary to specify the property (with a
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element) if it has an initial value. Tapestry scans the component class
                         looking for abstract properties that don't match up against component
                         parameters or
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         elements; for each of these unclaimed properties, a concrete property is
                         created. The property is a transient property, exactly as if a
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element
                         <em>did</em>
                         exist for it.
                         </p>
                     </span>

                     <p>A page class that uses a specified property:</p>
                     <source xml:space="preserve">
 package mypackage;

 import org.apache.tapestry.html.BasePage;

 public abstract class MyPage extends <a
                             href="../apidocs/org/apache/tapestry/html/BasePage.html">BasePage</a>
 {
     abstract public int getItemsPerPage();

     abstract public void setItemsPerPage(int itemsPerPage);
 }
 </source>

                     <p>
                         This is combined with a
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element in the page's specification:
                     </p>

                     <source xml:space="preserve">

 &lt;?xml version="1.0" encoding="UTF-8"?&gt;
 &lt;!DOCTYPE page-specification PUBLIC
   "-//Apache Software Foundation//Tapestry Specification 4.0//EN"
   "http://tapestry.apache.org/dtd/Tapestry_4_0.dtd"&gt;

 &lt;page-specification class="mypackage.MyPage"&gt;

   &lt;property
     name="itemsPerPage"
     persist="session"
     initial-value="10"/&gt;

 &lt;/page-specification&gt;

 </source>

                     <p>
                         Again, making the class abstract, and defining abstract accessors is
                         <em>optional</em>
                         . It is only useful when a method within the class will need to read or
                         update the property. It is also valid to just implement one of the two
                         accessors. The enhanced subclass will always include both a getter and a
                         setter.
                     </p>

                     <span class="info">
                         <strong>Note:</strong>
                         <p>
                         In Tapestry 3.0, many users were frustrated that they had to specify the
                         type of property in both the Java code and in the page specification. This
                         is a violation of the
                         <a href="http://c2.com/cgi/wiki?DontRepeatYourself">Dont Repeat Yourself</a>
                         principal -- requiring coordination is just an invitation for the two sides
                         to get out of synchronization. Starting with Tapestry 4.0, there is no type
                         attribute on the
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element; instead Tapestry matches the type to the property type of any
                         existing accessor methods, and simply uses Object when there are no accessor
                         methods. In this example, the persistent itemsPerPage property will be type
                         int, because of the abstract accessor methods.
                         </p>
                     </span>

                     <p>
                         This exact same technique can be used with components as well as pages, the
                         component specification also supports the
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element.
                     </p>

                     <p>
                         When an initial-value is provided, it is evaluated once inside
                         <code>finishLoad()</code>
                         , but then again every time the page is detached (returned to pristine
                         state) before being stored into the page pool for later re-use. By default,
                         it is an
                         <a href="http://www.ognl.org">OGNL</a>
                         expression, but this can be overriden by providing a
                         <a href="bindings.html">binding reference</a>
                         prefix.
                     </p>


                     <p>
                         This means that you may perform initialization for the property inside
                         <code>finishLoad()</code>
                         (instead of providing an initial-value). However, don't attempt to update
                         the property from
                         <code>initialize()</code>
                         ... the order of operations when the page detaches is not defined and is
                         subject to change.
                     </p>

                     <span class="warn">
                         <strong>Warning:</strong>
                         <p>
                         If the values stored as a persistent property is
                         <em>mutable</em>
                         (for example, a List, or a Map, or some custom Java class), you should
                         <em>not</em>
                         modify its internal state once it has been stored into the persistent page
                         property. Changes to internal state of a persistent property, after it has
                         been stored, may
                         <em>not</em>
                         be propogated to other servers in the cluster (this is really a function of
                         how session replication is implemented by the application server, it has
                         very little to do with Tapestry). Similar problems concerning mis-matched
                         state can occur with client persistence.
                         </p>
                     </span>

                 </subsection><!-- state.page-properties.using -->

                 <subsection name="Persistence Types">


                     <p>
                         Tapestry defines two basic types of property persistence. The type of
                         persistence (internally known as the
                         <em>property persistence strategy</em>
                         ) is defined by the value of the persist attribute (in the
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element). Omitting the persist attribute, or not providing a
                         <a href="spec.html#spec.property">&lt;property&gt;</a>
                         element, indicates a
                         <em>transient</em>
                         page property, one which does not persist from request to request.
                     </p>

                     <dl>
                         <dt>client</dt>
                         <dd>
                             Client properties are stored on the client, in the form of query
                             parameters. All persistent properties for each page are encoded into a
                             single query parameter, named
                             <code>state:</code>
                             <em>PageName</em>
                             . The query parameter value is a MIME encoded byte stream. This can get
                             quite long if there are many client persistent properties on the page
                             ... which may quickly run into limitations on the maximum size of a URL
                             (approximately 4000 characters is a good guideline). This is less a
                             problem for forms.
                         </dd>
                         <dt>session</dt>
                         <dd>
                             The traditional style of property persistence (and the only kind
                             available in Tapestry 3.0 and earlier). Each persistent property is
                             mapped to a HttpSession attribute.
                         </dd>
                     </dl>

                     <p>
                         More such stategies are expected; these will give more control over the
                         lifecycle of the page property.
                     </p>

                     <span class="info">
                         <strong>Fixme:</strong>
                         <p>
                         Mind Bridge has added (or changed) these, adding a concept of "scope" for
                         how long a property will persist. This documentation needs to be updated.
                         </p>
                     </span>
                 </subsection><!-- state.page-properties.types -->

             </section><!-- state.page-properties -->

             <section name="Application state objects">


                 <p>
                     What happens when you have objects that are needed by multiple pages? For that,
                     you need an
                     <em>application state object</em>
                     . ASO's are global objects that can be
                     <a href="#state.aso.access">accessed</a>
                     from any page or component in the application. Each ASO has a unique name (the
                     two default ASO's are called "visit" and "global" for historical reasons). An
                     ASO is created when first referenced by any page. ASO's with session scope are
                     stored into the HttpSession at the end of the request (and
                     <a href="#state.stateless">may force the creation of the session</a>
                     ). ASOs in the application scope are available to any and all users.
                 </p>

                 <span class="info">
                     <strong>Note:</strong>
                     <p>
                     Tapestry 3.0 had a more limited form of ASO: The Visit object and the Global
                     object. The Visit object is session scope, the Global object is application
                     scope. This concept has been extended in Tapestry 4.0 to allow any number of
                     ASOs with any desired scope, and lots of flexibility on how ASOs get created.
                     The Visit and Global still exist in 4.0, as default ASOs you can use or
                     override.
                     </p>
                 </span>

                 <subsection name="Defining new Application State Objects">


                     <p>
                         To create a new ASO, you must update your
                         <a href="hivemind.html">HiveMind module deployment descriptor</a>
                         and add a contribution to the
                         <a
                             href="../tapestry-framework/hivedoc/config/tapestry.state.ApplicationObjects.html">
                             tapestry.state.ApplicationObjects
                         </a>
                         configuration point:
                     </p>

                     <source xml:space="preserve">
 &lt;contribution configuration-id="tapestry.state.ApplicationObjects"&gt;
   &lt;state-object name="registration-data" scope="session"&gt;
     &lt;create-instance class="org.example.registration.RegistrationData"/&gt;
   &lt;/state-object&gt;
 &lt;/contribution&gt;
 </source>

                     <p>
                         This defines an ASO named registration-data that is session scoped (stored
                         in the HttpSession once created). The first time it is referenced, an
                         instance of RegistrationData is created and stored into the session.
                     </p>

                     <p>
                         If your data object can't be created using a simple constructor, then you
                         can supply an &lt;invoke-factory&gt; element instead of
                         &lt;create-instance&gt;. &lt;invoke-factory&gt; allows you to reference an
                         object or service that implements the
                         <a
                             href="../apidocs/org/apache/tapestry/engine/state/StateObjectFactory.html">
                             StateObjectFactory
                         </a>
                         interface.
                     </p>

                     <span class="info">
                         <strong>Note:</strong>
                         <p>
                         The two default ASOs, "visit" and "global", are defined in the
                         hivemind.state.FactoryObjects configuration point. Definitions in the
                         ApplicationObjects configuration point override definitions in
                         FactoryObjects with the same name.
                         </p>
                     </span>

                 </subsection><!-- state.aso.defining -->

                 <subsection name="Accessing Application State Objects">


                     <p>
                         Tapestry provides an
                         <a href="spec.html#spec.inject">&lt;inject&gt;</a>
                         element to support access to the application state objects. This element can
                         be used in any page or component specification to create a new property.
                         Reading the property will obtain the corresponding state object (which will
                         be created if necessary). The property may be updated, which will store a
                         new application state object, overwriting the automatically created one.
                     </p>

                     <p>For example:</p>

                     <source xml:space="preserve">
 &lt;inject property="registration" type="state" object="registration-data"/&gt;
 </source>

                     <p>
                         This will create a
                         <code>registration</code>
                         property, which can be wired to components. Your class may define accessors
                         for this property, in which case you should be sure that the application
                         state object is assignable.
                     </p>

                     <span class="warn">
                         <strong>Warning:</strong>
                         <p>
                         The
                         <a href="../apidocs/org/apache/tapestry/IPage.html">
                             IPage
                         </a>
                         interface defines two read-only properties:
                         <code>visit</code>
                         and
                         <code>global</code>
                         . These are both type Object. This is a holdover from Tapestry 3.0, which
                         only supported these two application state objects. If you want to access
                         the visit or the global application state objects without needing casts, you
                         will have to inject as a differently named property (say
                         <code>appVisit</code>
                         or
                         <code>visitObject</code>
                         ).
                         </p>
                     </span>


                 </subsection><!-- state.aso.access -->

                 <subsection name="Optimizing Storage">


                     <p>
                         Normally, Tapestry has no way of knowing when the internal state of an ASO
                         has changed. On any request where the ASO is accessed, Tapestry assumes that
                         its internal state has changed. The ASO is re-stored at the end of the
                         request. For session-scoped ASOs in a cluster, this is critical to ensure
                         that information is properly distributed around the cluster.
                     </p>

                     <p>
                         However, it can also be expensive. Assuming that your application mostly
                         <em>reads</em>
                         information out of the ASO, that means a lot of wasted resources needlessly
                         copying the ASO around the cluster.
                     </p>

                     <p>
                         To control this, ASOs may
                         <em>optionally</em>
                         implement the
                         <a
                             href="../apidocs/org/apache/tapestry/SessionStoreOptimized.html">
                             SessionStoreOptimized
                         </a>
                         interface. The method isStoreToSessionNeeded() will be checked; if it
                         returns false, the object will
                         <em>not</em>
                         be stored.
                     </p>

                     <p>
                         Typically, the ASO will store a dirty flag, and will set the dirty flag on
                         any change to internal state. The flag will be returned by
                         isStoreToSessionNeeded(). The ASO will also implement the
                         HttpSessionBindingListener interface, and clear the flag in valueBound().
                     </p>

                     <p>
                         A base class,
                         <a
                             href="../apidocs/org/apache/tapestry/BaseSessionStoreOptimized.html">
                             BaseSessionStoreOptimized
                         </a>
                         , implements this behavior.
                     </p>

                 </subsection>

             </section><!-- state.aso -->

             <section name="Stateless applications">


                 <p>
                     In a Tapestry application, the framework acts as a buffer between the
                     application code and the Servlet API ... in particular, it manages how data is
                     stored into the HttpSession. In fact, the framework controls
                     <em>when</em>
                     the session is first created.
                 </p>

                 <p>
                     This is important and powerful, because an application that runs, even just
                     initially, without a session consumes far less resources than a stateful
                     application. This is even more important in a clustered environment with
                     multiple servers; any data stored into the HttpSession will have to be
                     replicated to other servers in the cluster, which can be expensive in terms of
                     resources (CPU time, network bandwidth, and so forth). Using less resources
                     means better throughput and more concurrent clients, always a good thing in a
                     web application.
                 </p>

                 <p>
                     Tapestry defers creation of the HttpSession until one of two things happens:
                     When a session-scoped
                     <a href="state.html#state.aso">application state object</a>
                     is first created, or when the first persistent page property is recorded. At
                     this point, Tapestry will create the HttpSession to hold the object or property.
                 </p>


                 <p>
                     For the most part, your application will be unaware of when it is stateful or
                     stateless; statefulness just happens on its own. Ideally, at least the first, or
                     <code>Home</code>
                     page, should be stateless (it should be organized in such a way that the Visit
                     object is not created, and no persistent state is stored). This will help speed
                     the initial display of the application, since no processing time will be used in
                     creating the HttpSession.
                 </p>

                 <p>
                     The
                     <code>state:</code>
                     <a href="bindings.html">binding reference</a>
                     combined with the
                     <a href="../components/general/if.html">If</a>
                     component makes it easy for you to skip portions of a page if a particular ASO
                     does not already exist; this allows you to avoid accidentally forcing its
                     creation on first reference.
                 </p>

                 <p>
                     The application may be
                     <em>stateless</em>
                     even when it has persistent page properties, if those properties use the
                     <em>client</em>
                     persistence strategy (which encodes pesistent page data into URLs as query
                     parameters). This can be a very powerful approach, though it introduces its own
                     problems:
                 </p>

                 <ul>
                     <li>
                         The query parameters are an encoding of Java objects, and could be decoded
                         to expose privileged information.
                     </li>
                     <li>
                         The encoding of page state can result in very long strings included as part
                         of URLs, possibly extending beyond the 3000 to 4000 character effective
                         maximum URL length.
                     </li>
                 </ul>
             </section><!-- state.stateless -->

         </section>

     </body>
 </document>