<!-- $Id$ -->
<!--
Copyright 2004 The Apache Software Foundation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<chapter id="state">
<title>Managing server-side state</title>
<para>
Server-side state is any information that exists on the server, and persists between requests.
This can be anything from a single flag all the way up to a large database result set. In a typical
application, server-side state is the identity of the user (once the user logs in) and, perhaps,
a few important domain objects (or, at the very least, primary keys for those objects).
</para>
<para>
In an ordinary servlet application, managing server-side state is
entirely the application's responsibility. The Servlet API provides just the &HttpSession;, which
acts like a &Map;, relating keys to arbitrary objects. It is the application's responsibility
to obtain values from the session, and to store updated values back into the session when they change.
</para>
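<para>
As a rough illustration of that bookkeeping, the following sketch shows a plain servlet that counts
the requests made within a session. The servlet name and the <literal>hitCount</literal> attribute key
are invented for this example; only the standard Servlet API is assumed.
</para>
<example>
<title>Managing session state by hand (illustrative sketch)</title>
<programlisting>
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class CounterServlet extends HttpServlet
{
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
        // Obtains the session, creating it if it does not yet exist
        HttpSession session = request.getSession();

        // The application reads the current value ...
        Integer count = (Integer)session.getAttribute("hitCount");
        int next = (count == null) ? 1 : count.intValue() + 1;

        // ... and must remember to store the new value back
        session.setAttribute("hitCount", Integer.valueOf(next));

        response.setContentType("text/plain");
        response.getWriter().println("Requests in this session: " + next);
    }
}
</programlisting>
</example>
<para>
Every piece of state handled this way needs the same read, cast, update and store sequence, written
by hand wherever the value is used.
</para>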
<para>
Tapestry takes a different tack; it defines server-side state in terms of the Engine,
the Visit object, and persistent page properties.
</para>
<section id="state.general">
<title>Understanding servlet state</title>
<para>
Managing server-side state is one of the most complicated and error-prone aspects of web
application design, and one of the areas where Tapestry provides the most benefit. Generally
speaking, Tapestry applications which are functional within a single server will be functional within
a cluster with no additional effort. This doesn't mean that planning and testing for
clustering are unnecessary; it just means that, when using Tapestry, it is possible to narrow
the design and testing focus.
</para>
<para>
The point of server-side state is to ensure that information about the user acquired during the session
is available later in the same session. The canonical example is an application that requires some form of
login to access some or all of its content; the identity of the user must be collected at some point (in
a login page) and be generally available to other pages.
</para>
<para>
The other aspect of server-side state concerns failover. Failover is an aspect of highly-available computing
where the processing of the application is spread across many servers. A group of servers used
in this way is referred to as a <emphasis>cluster</emphasis>.
Generally speaking (and this may vary significantly between vendors' implementations),
requests from a particular client will be routed to the same server within the cluster.
</para>
<para>
In the event that the particular server in question fails (crashes unexpectedly, or is otherwise taken
out of service), future requests from the client will be routed to a different, surviving server
within the cluster. This failover event should occur in such a way that the client is unaware that
anything exceptional has occurred with the web application; this means that any server-side state
gathered by the original server must be available to the backup server.
</para>
<para>
The main mechanism for handling this using the Java Servlet API is the &HttpSession;. The session can store
<emphasis>attributes</emphasis>, much like a &Map;. Attributes are object values referenced with
a string key. In the event of a failover, all such attributes are expected to be available on the
new, backup server, to which the client's requests are routed.
</para>
<para>
Different application servers implement &HttpSession; replication and failover in different ways; the servlet API
specification is deliberately non-specific on how this implementation should take place. Tapestry
follows the conventions of the most limited interpretation of the servlet specification; it assumes
that attribute replication only occurs when the &HttpSession; <function>setAttribute()</function>
method is invoked
<footnote>
<para>
This is the replication strategy employed by BEA's WebLogic server.
</para>
</footnote>.
</para>
<para>
Attribute replication was envisioned as a way to replicate simple, immutable objects
such as &String; or &Integer;. Attempting to store mutable objects, such as &List;, &Map; or some user-defined
class, can be problematic. For example, modifying an attribute value after it has been stored into the
&HttpSession; may cause a failover error. Effectively, the backup server sees a snapshot of the object
at the time that <function>setAttribute()</function> was invoked; any later change to the object's
internal state is <emphasis>not</emphasis> replicated to the other servers in
the cluster! This
can result in strange and unpredictable behavior following a failover.
</para>
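<para>
The snapshot behavior can be imitated in a few lines of plain Java. The class below is a hypothetical
stand-in, not part of the Servlet API or of any application server; it merely assumes that
"replication" means serializing the attribute value at the moment <function>setAttribute()</function>
is invoked.
</para>
<example>
<title>Simulating snapshot replication (illustrative sketch)</title>
<programlisting>
<![CDATA[
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for a clustered session: the "backup server" receives
 * a serialized copy of an attribute only when setAttribute() is invoked.
 */
public class SnapshotSession
{
    private final Map<String, Object> _primary = new HashMap<String, Object>();
    private final Map<String, byte[]> _backup = new HashMap<String, byte[]>();

    public void setAttribute(String name, Object value) throws IOException
    {
        _primary.put(name, value);

        // Take the snapshot that the backup server will hold
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(value);
        out.close();

        _backup.put(name, bytes.toByteArray());
    }

    public Object getAttribute(String name)
    {
        return _primary.get(name);
    }

    /** Returns what the backup server would see after a failover. */
    public Object getAttributeAfterFailover(String name)
        throws IOException, ClassNotFoundException
    {
        byte[] data = _backup.get(name);

        if (data == null)
            return null;

        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));

        try
        {
            return in.readObject();
        }
        finally
        {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception
    {
        SnapshotSession session = new SnapshotSession();

        List<String> cart = new ArrayList<String>();
        cart.add("book");
        session.setAttribute("cart", cart);

        // Changing the stored object does not invoke setAttribute()
        cart.add("lamp");

        System.out.println(session.getAttribute("cart"));              // [book, lamp]
        System.out.println(session.getAttributeAfterFailover("cart")); // [book]

        // Storing the attribute again publishes the change
        session.setAttribute("cart", cart);
        System.out.println(session.getAttributeAfterFailover("cart")); // [book, lamp]
    }
}
]]>
</programlisting>
</example>
<para>
The original server sees both items, but the simulated backup sees only the first until the
attribute is stored a second time.
</para>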
<para>
Tapestry attempts to sort out the issues involving server-side state in such a way that they
are invisible to the developer. Most applications will not need to explicitly access the &HttpSession; at
all, but may still have significant amounts of server-side state. The following
sections go into more detail about how Tapestry approaches these issues.
</para>
</section> <!-- state.general -->
<section id="state.engine">
<title>Engine</title>
<para>
The engine, a class which implements the interface &IEngine;, is the central object that is
responsible for managing server-side state (among its many other
responsibilities). The engine is itself stored as an &HttpSession; attribute.
</para>
<para>
Because the internal state of the engine can change, the framework will re-store
the engine into the &HttpSession; at the end of most requests. This ensures that
any changes to the
<link linkend="state.visit">Visit object</link> are properly replicated.
</para>
<para>
The simplest way to replicate server-side state is simply not to have any. With some
care, Tapestry applications can
<link linkend="state.stateless">run stateless</link>, at least until some
actual server-side state is necessary.
</para>
</section> <!-- state.engine -->
<section id="state.visit">
<title>Visit object</title>
<para>
The Visit object is an application-defined object that may be obtained from the engine (via the
<varname>visit</varname> property of the &IEngine; or &IPage;). By convention, the class is usually named <classname>Visit</classname>, but it can be
any class whatsoever, even &Map;.
</para>
<para>
The name, "Visit", was selected to emphasize that whatever data is stored in the Visit
concerns just a single visit to the web application.
<footnote>
<para>
Another good name would have been "session", but that name is heavily overloaded throughout Java
and J2EE.
</para>
</footnote>
</para>
<para>
Tapestry doesn't mandate anything about the Visit object's class. The type of the
<literal>visit</literal> property is &Object;. In Java code, accessing the Visit object
involves a cast from &Object; to an application-specific class.
The following example demonstrates how a &listener-method;
may access the visit object.
</para>
<example>
<title>Accessing the Visit object</title>
<programlisting>
public void formSubmit(&IRequestCycle; cycle)
{
MyVisit visit = (MyVisit)getPage().getVisit();
visit.<emphasis>doSomething()</emphasis>;
}
</programlisting>
</example>
<para>
In most cases, listener methods, such as <function>formSubmit()</function>, are implemented directly
within the page. In that case, the first line can be abbreviated to:
<informalexample>
<programlisting>
MyVisit visit = (MyVisit)getVisit();
</programlisting>
</informalexample>
</para>
<para>
The Visit object is instantiated lazily, the first time it is needed. Method
<function>createVisit()</function> of &AbstractEngine; is responsible for this.
</para>
<para>
In most cases, the Visit object is an ordinary JavaBean, and therefore, has a no-arguments
constructor. In this case, the complete class name of the
Visit is specified as
&configuration-property;
<literal>org.apache.tapestry.visit-class</literal>.
</para>
<para>
Typically, the Visit class is defined in the application specification, or
as an <sgmltag class="starttag">init-param</sgmltag> in the web deployment descriptor (web.xml).
</para>
<example>
<title>Defining the Visit class</title>
<programlisting>
<![CDATA[
<application name="My Application">
<property name="org.apache.tapestry.visit-class" value="mypackage.MyVisit"/>
...
]]>
</programlisting>
</example>
<para>
In cases where the Visit object does not have a no-arguments constructor, or
has other special initialization requirements, the method
<function>createVisit()</function> of &AbstractEngine; can be overridden.
</para>
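<para>
A minimal sketch of such an override follows. It assumes that the application's engine extends
<classname>BaseEngine</classname> and that <function>createVisit()</function> receives the current
request cycle; the <classname>MyEngine</classname> name and the <classname>MyVisit</classname>
constructor argument are hypothetical. Check the method signature against the &AbstractEngine; API
documentation for your release before relying on it.
</para>
<example>
<title>Overriding <function>createVisit()</function> (illustrative sketch)</title>
<programlisting>
package mypackage;

import org.apache.tapestry.IRequestCycle;
import org.apache.tapestry.engine.BaseEngine;

public class MyEngine extends BaseEngine
{
    protected Object createVisit(&IRequestCycle; cycle)
    {
        // Hypothetical constructor: MyVisit records when the visit started
        return new MyVisit(System.currentTimeMillis());
    }
}
</programlisting>
</example>
<para>
The application must also be configured to use the engine subclass; in a Tapestry 3.0 application
specification this is normally done with the <literal>engine-class</literal> attribute of the
<sgmltag class="starttag">application</sgmltag> element.
</para>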
<para>
There is a crucial difference between accessing the visit via the
<varname>visit</varname> property of &IPage; and the
<varname>visit</varname> property of &IEngine;. In the former case, accessing the visit
via the page, the visit <emphasis>will</emphasis> be created if it does not already exist.
</para>
<para>
Accessing the <varname>visit</varname> property of the &IEngine; is different: the visit will <emphasis>not</emphasis>
be created if it does not already exist.
</para>
<para>
Carefully crafted applications will take heed of this difference and try to avoid
creating the visit unnecessarily. It is not just the creation of this one object that is
to be avoided ... creating the visit will likely force the entire application
to go stateful (create an &HttpSession;), and applications are more efficient
while <link linkend="state.stateless">stateless</link>.
</para>
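<para>
One way to respect this difference is sketched below: a page that only needs to know whether
a user has logged in reads the visit through the engine, so that a missing visit is reported
as <literal>null</literal> instead of being created. The <classname>Home</classname> class and
the <function>isLoggedIn()</function> method of <classname>MyVisit</classname> are hypothetical.
</para>
<example>
<title>Checking the visit without creating it (illustrative sketch)</title>
<programlisting>
package mypackage;

import org.apache.tapestry.html.BasePage;

public class Home extends &BasePage;
{
    public boolean isLoggedIn()
    {
        // Reading the visit from the engine does not create it
        MyVisit visit = (MyVisit)getEngine().getVisit();

        if (visit == null)
            return false;

        return visit.isLoggedIn();
    }
}
</programlisting>
</example>
<para>
Had the method invoked <function>getVisit()</function> on the page instead, simply rendering
the page would have created the visit.
</para>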
</section> <!-- state.visit -->
<section id="state.global">
<title>Global object</title>
<para>
The Global object is very similar to the Visit object with some key differences.
The Global object is shared by all instances of the application engine; ultimately, it is stored
as a &ServletContext; attribute. The Global object is therefore not persistent in any way.
The Global object is specific to an individual server within a cluster; each server will have its own instance
of the Global object.
In a failover,
the engine will connect to a new instance of the Global object within the new server.
</para>
<para>
The Global object may be accessed using the <varname>global</varname> property of either the
page or the engine (unlike the <varname>visit</varname> property, they are completely equivalent).
</para>
<para>
Care should be taken that the Global object is thread-safe, since many engines (from many sessions,
in many threads) will access it simultaneously. The default Global object is a synchronized
&HashMap;. This can be overridden with
&configuration-property;
<literal>org.apache.tapestry.global-class</literal>.
</para>
<para>
The most typical use of the Global object is to interface to J2EE resources such as EJB home and remote
interfaces or JDBC data sources. The shared Global object can cache home and remote interfaces that
are efficiently shared by all engine instances.
</para>
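<para>
A custom Global class along these lines might look like the following sketch. The class name and
its two methods are hypothetical, and the package declaration is omitted for brevity; the only
assumptions are that the class has a public no-arguments constructor and guards its cache
against concurrent access.
</para>
<example>
<title>A thread-safe Global class (illustrative sketch)</title>
<programlisting>
<![CDATA[
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical Global class: a thread-safe cache of resources that are
 * expensive to obtain, shared by every engine on this server.
 */
public class MyGlobal
{
    private final Map<String, Object> _resources = new HashMap<String, Object>();

    /** Returns the cached resource, or null if nothing is cached under the name. */
    public synchronized Object getResource(String name)
    {
        return _resources.get(name);
    }

    /** Caches a resource, such as an EJB home interface or a JDBC data source. */
    public synchronized void putResource(String name, Object resource)
    {
        _resources.put(name, resource);
    }
}
]]>
</programlisting>
</example>
<para>
Because each server in a cluster has its own instance, callers should treat a
<literal>null</literal> result as a signal to look the resource up again and cache it.
</para>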
</section> <!-- state.global -->
<section id="state.page-properties">
<title>Persistent page properties</title>
<para>
Servlets, and by extension, JavaServer Pages, are inherently stateless. That is, they will be used
simultaneously by many threads and clients. Because of this, they must not store (in instance variables)
any properties or values that are specific to any single client.
</para>
<para>
This creates a frustration for developers, because ordinary programming techniques must be avoided.
Instead, client-specific state and data must be stored in the &HttpSession; or as &HttpServletRequest; attributes.
This is an awkward and limiting way to handle both <emphasis>transient</emphasis> state (state that is only needed
during the actual processing of the request) and
<emphasis>persistent</emphasis> state (state that should be available during the processing of this
and subsequent requests).
</para>
<para>
Tapestry bypasses most of these issues by <emphasis>not</emphasis> sharing objects between threads and clients.
Tapestry uses an object pool to store constructed page instances. As a page is needed, it is removed from the page pool.
If there are no available pages in the pool, a fresh page instance is constructed.
</para>
<para>
For the duration of a request, a page and all components within the page are reserved to the single request.
There is no chance of conflicts because only the single thread processing the request will have access
to the page. At the end of the request cycle, the page is reset back to a pristine state and
returned to the shared pool,
ready for reuse by the same client, or by a different client.
</para>
<para>
In fact, even in a high-volume Tapestry application, there will rarely be more than a few instances of any
particular page in the page pool.
</para>
<para>
For this scheme to work, it is important that the page return to its pristine state
at the end of the request cycle. The pristine state is equivalent to a freshly created instance of the page. In other words, any
properties of the page that changed during the processing of the request must be returned to their initial values.
</para>
<para>
The page is then returned to the page pool, where it will wait to be used in a future request. That request may be for
the same end user, or for another user entirely.
</para>
<note>
<title>Importance of resetting properties</title>
<para>
Imagine a page containing a form in which a user enters their address and credit card information. When
the form is submitted, properties of the page will be updated with the values supplied by the user.
Those values must be cleared out before the page is stored into the page pool ... if not, then the <emphasis>next</emphasis>
user who accesses the page will see the previous user's address and credit card information as default
values for the form fields!
</para>
</note>
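<para>
The check-out, reset and check-in cycle can be sketched in plain Java. The classes below are
hypothetical stand-ins written for this illustration; they are not Tapestry's page pool
implementation.
</para>
<example>
<title>Resetting a pooled object before reuse (illustrative sketch)</title>
<programlisting>
<![CDATA[
import java.util.ArrayList;
import java.util.List;

public class PagePoolSketch
{
    /** Stand-in for a page with a single transient property. */
    public static class OrderPage
    {
        private String _cardNumber;

        public String getCardNumber()
        {
            return _cardNumber;
        }

        public void setCardNumber(String cardNumber)
        {
            _cardNumber = cardNumber;
        }

        /** Returns the page to its pristine state. */
        public void reset()
        {
            _cardNumber = null;
        }
    }

    private final List<OrderPage> _pool = new ArrayList<OrderPage>();

    /** Removes a page from the pool, constructing a fresh one if the pool is empty. */
    public synchronized OrderPage checkOut()
    {
        if (_pool.isEmpty())
            return new OrderPage();

        return _pool.remove(_pool.size() - 1);
    }

    /** Resets the page, then makes it available to later requests. */
    public synchronized void checkIn(OrderPage page)
    {
        page.reset();
        _pool.add(page);
    }

    public static void main(String[] args)
    {
        PagePoolSketch pool = new PagePoolSketch();

        // First request: one user enters a card number
        OrderPage first = pool.checkOut();
        first.setCardNumber("4111-XXXX");
        pool.checkIn(first);

        // Second request: possibly a different user, same page instance
        OrderPage second = pool.checkOut();
        System.out.println(second == first);        // true
        System.out.println(second.getCardNumber()); // null
    }
}
]]>
</programlisting>
</example>
<para>
If <function>checkIn()</function> skipped the call to <function>reset()</function>, the second
request would see the first user's card number.
</para>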
<para>
Tapestry separates the persistent state of a page from any instance of the page.
This is very important, because
from one request cycle to another, a different instance of the page may be used ... even when clustering is
not used. Tapestry may keep several instances of a page in the pool, and pulls an arbitrary instance out of the pool
for each request.
</para>
<para>
In Tapestry, a page may have many properties
and may have many components, each with many properties, but only a tiny number of all those
properties needs to persist between request cycles.
On a later request, the same or different page instance may be used. With a little
assistance from the developer,
the Tapestry framework can create the illusion that the same page instance is being used in
a later request, even though the request may use a different page instance (from the page pool) ... or
(in a clustering environment) may be handled by a completely different server.
</para>
<para>
Each persistent page property is stored individually as an &HttpSession; attribute. A call
to the static method <function>Tapestry.fireObservedChange()</function> must be added
to the setter method for the property (as we'll see shortly, Tapestry can
write this method for you, which is the best approach). When the property is changed, its value is stored
as a session attribute.
Like the Servlet API, persistent properties work best with immutable objects
such as &String; and &Integer;. For mutable objects (including &List; and &Map;), you must
be careful <emphasis>not</emphasis> to change the internal state of a persistent property value after invoking the
setter method.
</para>
<para>
Persistent properties make use of a &spec.property-specification; element in the
page or component specification. Tapestry does something special when a component
contains any such elements; it dynamically fabricates a subclass that provides the desired fields,
methods and whatever extra initialization or cleanup is required.
</para>
<para>
You may also, optionally, make your class abstract, and define abstract accessor methods that will
be filled in by Tapestry in the fabricated subclass. This allows you to read and update properties inside
your class, inside listener methods.
</para>
<tip>
<title>Define only what you need</title>
<para>
You only need to define abstract accessor methods if you are going to invoke those accessor methods
in your code, such as in a &listener-method;. Tapestry will create an enhanced subclass that contains
the new field, a getter method and a setter method, plus any necessary initialization methods.
If you are only going to access the property using OGNL expressions, then there's no need to define either
accessor
method.
</para>
</tip>
<note>
<title>Transient or persistent?</title>
<para>
Properties defined this way may be either transient or persistent. It is useful to define
even transient
properties using the &spec.property-specification; element because doing so ensures that
the property will be properly reset at the end of the request (before the page
is returned to the pool for later reuse).
</para>
</note>
<example>
<title>Persistent page property: Java class</title>
<programlisting>
package mypackage;
import org.apache.tapestry.html.BasePage;
public abstract class MyPage extends &BasePage;
{
abstract public int getItemsPerPage();
abstract public void setItemsPerPage(int itemsPerPage);
}
</programlisting>
</example>
<example>
<title>Persistent page property: page specification</title>
<programlisting>
<![CDATA[
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE page-specification PUBLIC
"-//Apache Software Foundation//Tapestry Specification 3.0//EN"
"http://jakarta.apache.org/tapestry/dtd/Tapestry_3_0.dtd">
<page-specification class="mypackage.MyPage">
<property-specification
name="itemsPerPage"
persistent="yes"
type="int" initial-value="10"/>
</page-specification>
]]>
</programlisting>
</example>
<para>
Again, making the class abstract, and defining abstract accessors is <emphasis>optional</emphasis>.
It is only useful when a method within the class will need to read or update the property. It
is also valid to just implement one of the two accessors. The enhanced subclass will
always include both a getter and a setter.
</para>
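<para>
As a sketch of a case where the abstract accessors are needed, the page class from the earlier
example could be given a listener method that updates the property. The
<function>showMore()</function> listener is hypothetical; the accessors are the ones declared above.
</para>
<example>
<title>Updating a persistent property from a listener method (illustrative sketch)</title>
<programlisting>
package mypackage;

import org.apache.tapestry.IRequestCycle;
import org.apache.tapestry.html.BasePage;

public abstract class MyPage extends &BasePage;
{
    abstract public int getItemsPerPage();

    abstract public void setItemsPerPage(int itemsPerPage);

    /**
     * Hypothetical listener method: shows ten more items per page.
     * The setter is implemented by the enhanced subclass.
     */
    public void showMore(&IRequestCycle; cycle)
    {
        setItemsPerPage(getItemsPerPage() + 10);
    }
}
</programlisting>
</example>
<para>
Because the property is declared persistent in the page specification, the new value is
expected to be available on the user's later requests, whichever page instance handles them.
</para>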
<para>
This exact same technique can be used with components as well as pages.
</para>
<para>
A last note about initialization. After Tapestry invokes the <function>finishLoad()</function>
method, it processes the initial value provided in the specification. If
the <literal>initial-value</literal> attribute is omitted or blank, no change takes place.
Tapestry then takes a snapshot of the property value, which it retains
and uses at the end of each request cycle
to reset the property back to its "pristine" state.
</para>
<warning>
<para>
The previous paragraph may not be accurate; I believe Mindbridge may have changed this behavior
recently.
</para>
</warning>
<para>
This means that you may perform initialization for the property inside
<function>finishLoad()</function> (instead of providing an <literal>initial-value</literal>). However,
don't attempt to update the property from <function>initialize()</function> ... the order of operations
when the page detaches is not defined and is subject to change.
</para>
</section> <!-- state.page-properties -->
<section id="state.manual-page-properties ">
<title>Implementing persistent page properties manually</title>
<warning>
<para>
There is very little reason to implement persistent page properties manually. Using
the &spec.property-specification; element is much easier.
</para>
</warning>
<para>
The preferred way to
implement persistent page properties without using
the &spec.property-specification; element is to implement the method <function>initialize()</function> on your page. This method is invoked
once when the page is first created; it is invoked again at the end of each request cycle. An empty implementation
of this method is provided by &AbstractPage;.
</para>
<para>
The first example demonstrates how to properly implement a transient property. It is simply
a normal JavaBean property implementation, with a little extra to reset the property
back to its pristine value (<literal>null</literal>) at the end of the request.
</para>
<example>
<title>Use of <function>initialize()</function> method</title>
<programlisting>
package mypackage;
import org.apache.tapestry.html.BasePage;
public class MyPage extends &BasePage;
{
private String _message;
public String getMessage()
{
return _message;
}
public void setMessage(String message)
{
_message = message;
}
protected void initialize()
{
_message = null;
}
}
</programlisting>
</example>
<para>
If your page has additional attributes, they should also be reset inside
the <function>initialize()</function> method.
</para>
<para>
Now that we've shown how to manually implement <emphasis>transient</emphasis> state, we'll
show how to handle <emphasis>persistent</emphasis> state.
</para>
<para>
For a property to be persistent, all that's necessary is that the accessor method notify
the framework of changes. Tapestry will record the changes (using an &IPageRecorder;)
and, in later request cycles, will restore the property
using the recorded value and whichever page instance is taken out of the page pool.
</para>
<para>
This notification takes the form of an invocation of the static method
<function>fireObservedChange()</function> in the &Tapestry; class.
This method is overloaded for all the scalar types, and for &Object;.
</para>
<example>
<title>Manual persistent page property</title>
<programlisting>
package mypackage;
import org.apache.tapestry.Tapestry;
import org.apache.tapestry.html.BasePage;
public class MyPage extends &BasePage;
{
private int _itemsPerPage;
public int getItemsPerPage()
{
return _itemsPerPage;
}
public void setItemsPerPage(int itemsPerPage)
{
_itemsPerPage = itemsPerPage;
Tapestry.fireObservedChange(this, "itemsPerPage", itemsPerPage);
}
protected void initialize()
{
_itemsPerPage = 10;
}
}
</programlisting>
</example>
<para>
This sets up a property, <varname>itemsPerPage</varname>, with a default value of 10. If
the value is changed (perhaps by a form or a &listener-method;),
the changed value will "stick" with the user who changed it, for the duration of their
session.
</para>
</section> <!-- state.manual-page-properties -->
<section id="state.manual-component-properties">
<title>Manual persistent component properties</title>
<warning>
<para>
There is very little reason to implement persistent component properties manually. Using
the &spec.property-specification; element is much easier.
</para>
</warning>
<para>
Tapestry uses the same mechanism for persistent component properties as it does for
persisting page properties (remember that pages are, in fact, specialized components).
Implementing transient and persistent properties inside components involves more work than with
pages as the initialization of
the component is more complicated.
</para>
<para>
Components do not have the equivalent of the <function>initialize()</function> method. Instead,
they must register for an event notification to tell them when the page is being <emphasis>detached</emphasis>
from the engine (prior to being stored back into the page pool). This event is generated by the page itself.
</para>
<para>
The Java interface &PageDetachListener; is the event listener interface for this purpose.
When a component implements this interface, Tapestry registers the component as a listener and ensures that
it receives event notifications at the right time (this works for the other
page event interfaces, such as &PageRenderListener;, as well; simply
implement the interface and leave the rest to the framework).
</para>
<para>
The component must also establish its initial state when it is first loaded.
Tapestry provides a method, <function>finishLoad()</function>, for just this purpose: late initialization.
</para>
<example>
<title>Manual Persistent Component Properties</title>
<programlisting>
package mypackage;
import org.apache.tapestry.Tapestry;
import org.apache.tapestry.BaseComponent;
import org.apache.tapestry.event.PageDetachListener;
import org.apache.tapestry.event.PageEvent;
public class MyComponent extends &BaseComponent; implements &PageDetachListener;
{
private String _myProperty;
public void setMyProperty(String myProperty)
{
_myProperty = myProperty;
Tapestry.fireObservedChange(this, "myProperty", myProperty);
}
public String getMyProperty()
{
return _myProperty;
}
protected void initialize()
{
_myProperty = "<emphasis>a default value</emphasis>";
}
protected void finishLoad()
{
initialize();
}
/**
* The method specified by &PageDetachListener;.
*
*/
public void pageDetached(PageEvent event)
{
initialize();
}
}
</programlisting>
</example>
<para>
Again, there is no particular need to do all this; using the
&spec.property-specification; element is far, far simpler.
</para>
</section> <!-- state.manual-component-properties -->
<section id="state.stateless">
<title>Stateless applications</title>
<para>
In a Tapestry application, the framework acts as a buffer between the application code and
the Servlet API ... in particular, it manages how data is stored into the &HttpSession;.
In fact, the framework controls <emphasis>when</emphasis> the session is first created.
</para>
<para>
This is important and powerful, because an application that runs, even just initially, without
a session consumes far fewer resources than a stateful application. This is even more important
in a clustered environment with multiple servers; any data stored into the &HttpSession; will
have to be replicated to other servers in the cluster, which can be expensive in terms of resources (CPU time,
network bandwidth, and so forth). Using
fewer resources means better throughput and more concurrent clients, always a good thing
in a web application.
</para>
<para>
Tapestry defers creation of the &HttpSession; until one of two things happens: when
the Visit object is created, or when the first persistent page property is recorded. At this point,
Tapestry will create the &HttpSession; and store the engine into it.
</para>
<para>
Earlier, we said that the &IEngine; instance is stored in the &HttpSession;, but this is not always the case.
Tapestry maintains an object pool of &IEngine; instances that are used for stateless requests. An instance
is checked out of the pool and used to process a single request, then checked back into the pool for
reuse in a later request, by the same or different client.
</para>
<para>
For the most part, your application will be unaware of when it is stateful or stateless; statefulness
just happens on its own. Ideally, at least the first, or "Home" page, should be stateless (it should be
organized in such a way that the visit is not created, and no persistent state is stored). This will help
speed the initial display of the application, since no processing time will be used in creating the session.
</para>
</section> <!-- state.stateless -->
</chapter>