<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "../dtd/document-v10.dtd">
<document>
<header>
<title>Write a Custom Generator</title>
<version>0.3</version>
<authors>
<person name="Geoff Howard" email="javageoff@yahoo.com" />
</authors>
</header>
<body>
<s1 title="Introduction">
<p>This tutorial describes the steps necessary to write a basic Cocoon
generator. It starts with a quick "Hello World" example and
progresses to slightly more involved ones, which should give a good
start to those whose applications call for extending Cocoon with a
custom generator.</p>
<p>The intention is to provide:</p>
<ul>
<li>the basics of creating SAX events in a C2 generator</li>
<li>a little understanding of the Avalon container contract as it
relates to C2 generators</li>
<li>a little understanding of the factors that would influence
the decision about which xxxGenerator to extend</li>
</ul>
<s2 title="Purpose">
<p>The flexibility to extend the basic "Out of the box"
functionality of Cocoon will be an important feature for Cocoon's
viability as a broadly used application framework. Though the
documentation on
<link href="../developing/extending.html">"Extending Cocoon"</link>
(at least at this writing) seems to have a hard time imagining
applications for custom generators outside of the bizarre, I
imagine several scenarios which could call for it:</p>
<ul>
<li>A datasource as yet undeveloped in Cocoon (e.g. event
logs)</li>
<li>Database driven applications for which XSP is either too
awkward or holds too many performance questions. The need for
high scalability will drive some (such as myself) to seek
optimization in custom generators that just do not seem
reasonable to expect out of the auto-generated code that XSPs
produce. The current
<link href="../performancetips.html">Performance Tips</link>
documentation seems to lead in this direction.</li>
<li>Customized control over the caching behaviour if not
provided for by other means.</li>
</ul>
</s2>
<s2 title="Important">
<p>There are other options that should be considered before
settling on a new generator. One notable consideration is the
option of writing a Source that would fit your needs. See
<link href="http://marc.theaimsgroup.com/?t=102571404500001&amp;r=1&amp;w=2">this discussion</link>
from the mailing list for an introduction to the idea. Of course,
XSP should be considered - I have not seen any performance
comparisons that quantify the benefit that can be had from a
custom generator. Finally, be sure you understand the purpose and
capabilities of all current standard Generators, as well as those
in the scratchpad (for instance, there is a
<code>TextParserGenerator</code> in the scratchpad at the moment
which may be configurable enough to process the event log need
mentioned above). Cocoon is a rapidly developing technology that
may have anticipated your need. Because the documentation lags
behind development, you may find more by examining the source
directory and searching the
<link href="http://cocoon.apache.org/community/mail-archives.html">mail archives</link>
for applicable projects.</p>
</s2>
<s2 title="Intended Audience">
<p>This tutorial is aimed at users who have developed an
understanding of the basics of Cocoon and need to begin
extending it for their own purposes, or who want a deeper
understanding of what goes on under the hood.</p>
</s2>
<s2 title="Prerequisites">
<p>Generator developers should have:</p>
<ul>
<li>Read
<link href="../userdocs/concepts/index.html">Cocoon Concepts</link>
, as well as
<link href="../developing/extending.html">Extending Cocoon</link>
, and the broad overview of
<link href="../developing/avalon.html">Avalon</link>
, the framework upon which Cocoon is built.</li>
<li>An installed version of Cocoon if you want to follow the
examples yourself (obviously).</li>
<li>A good understanding of Java.</li>
<li>An installed Java SDK (1.2 or later).</li>
</ul>
</s2>
</s1>
<s1 title="Diving In">
<p>Let us start with a simple "Hello World" example:</p>
<s2 title="Simple Example">
<p>Our goal will be to build the following document (or, more to
the point, the SAX events that would correspond to this document).
</p>
<source xml:space="preserve">
<![CDATA[<example>Hello World!</example>]]>
</source>
<p>An example of code that will send the correct SAX events down
the pipeline:</p>
<source xml:space="preserve">
<![CDATA[
import org.apache.cocoon.generation.AbstractGenerator;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.SAXException;
public class HelloWorldGenerator extends AbstractGenerator
{
AttributesImpl emptyAttr = new AttributesImpl();
/**
* Override the generate() method from AbstractGenerator.
* It simply generates SAX events using SAX methods.
* I haven't done the comparison myself, but this
* has to be faster than parsing them from a string.
*/
public void generate() throws SAXException
{
// the org.xml.sax.ContentHandler is inherited
// through org.apache.cocoon.xml.AbstractXMLProducer
contentHandler.startDocument();
contentHandler.startElement("", "example", "example", emptyAttr);
contentHandler.characters("Hello World!".toCharArray(),0,
"Hello World!".length());
contentHandler.endElement("","example", "example");
contentHandler.endDocument();
}
}
]]></source>
<p>So, the basic points are that we extend
<code>AbstractGenerator</code>, override its <code>generate()</code> method,
and call the relevant SAX methods on the <code>contentHandler</code> (inherited
from <code>AbstractGenerator</code>) to start, fill and end the
document. For information on the SAX API, see
<link href="http://www.saxproject.org/">www.saxproject.org</link>.
</p>
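<p>To see what these events amount to without deploying anything, the
same calls can be pointed at a serializing handler from the JDK. The
following is only a sketch under that assumption: the class name
<code>HelloWorldEvents</code> and its <code>toXml()</code> method are
made up for illustration, and a JAXP <code>TransformerHandler</code>
stands in for Cocoon's pipeline and xml serializer.</p>
<source xml:space="preserve">
<![CDATA[
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

public class HelloWorldEvents
{
    // Same event sequence as HelloWorldGenerator.generate(), but sent to
    // whatever ContentHandler the caller supplies.
    static void sendEvents(ContentHandler handler) throws SAXException
    {
        AttributesImpl emptyAttr = new AttributesImpl();
        char[] text = "Hello World!".toCharArray();
        handler.startDocument();
        handler.startElement("", "example", "example", emptyAttr);
        handler.characters(text, 0, text.length);
        handler.endElement("", "example", "example");
        handler.endDocument();
    }

    // Hypothetical helper: serialize the events to a String with JAXP.
    static String toXml()
        throws SAXException, TransformerConfigurationException
    {
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        // Leave out the XML declaration so only the element is written.
        serializer.getTransformer()
                  .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));
        sendEvents(serializer);
        return out.toString();
    }

    public static void main(String[] args) throws Exception
    {
        System.out.println(toXml());
    }
}
]]></source>
<p>Running <code>main</code> should print the goal document from above.
Inside Cocoon the generator never does this itself; the serializer at
the end of the pipeline plays the role of the
<code>TransformerHandler</code> here.</p>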
<note>A performance tip might be to keep an empty instance of
<code>AttributesImpl</code> around to reuse for each element
with no attributes. Also, <code>characters(char[] chars, int start,
int length)</code> begs to be overloaded with a version like
<code>characters(String justPutTheWholeThingIn)</code>
that handles the conversion to a character array and assumes you
want the whole string from beginning to end, as is done in
<code>org.apache.cocoon.generation.AbstractServerPage</code>.
If you are not using namespaces, it is easy to imagine overloaded
convenience implementations of the other SAX methods as well.
You will probably want to set up a convenient BaseGenerator with
helpers like this and extend it for your real Generators.</note>
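<p>As a rough sketch of what such helpers could look like, the class
below wraps the namespace-free cases. The name <code>SaxHelper</code>
and all of its method names are invented for this illustration; in a
real base generator they would more likely be instance methods that use
the inherited <code>contentHandler</code> rather than taking the handler
as an argument.</p>
<source xml:space="preserve">
<![CDATA[
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

public class SaxHelper
{
    // One shared, never-modified instance for elements without attributes.
    private static final AttributesImpl EMPTY_ATTR = new AttributesImpl();

    private SaxHelper() {}

    // characters(String): converts to char[] and sends the whole string.
    public static void characters(ContentHandler handler, String text)
        throws SAXException
    {
        char[] chars = text.toCharArray();
        handler.characters(chars, 0, chars.length);
    }

    // Start an element that has no namespace and no attributes.
    public static void startElement(ContentHandler handler, String name)
        throws SAXException
    {
        handler.startElement("", name, name, EMPTY_ATTR);
    }

    // End an element that has no namespace.
    public static void endElement(ContentHandler handler, String name)
        throws SAXException
    {
        handler.endElement("", name, name);
    }

    // Convenience for the common <name>text</name> case.
    public static void element(ContentHandler handler, String name,
                               String text)
        throws SAXException
    {
        startElement(handler, name);
        characters(handler, text);
        endElement(handler, name);
    }
}
]]></source>
<p>With these in place, the body of the Hello World example between
<code>startDocument()</code> and <code>endDocument()</code> shrinks to
<code>SaxHelper.element(contentHandler, "example", "Hello World!")</code>.</p>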
<s3 title="What to Extend?">
<p>How did we choose to extend <code>AbstractGenerator</code>?
Generators are defined by the
<code>org.apache.cocoon.generation.Generator</code> interface.
The only direct implementation of interest to us is
<code>AbstractGenerator</code>, which gives a basic level of
functionality. Another option would have been
<code>ComposerGenerator</code>, which adds an implementation of
the Avalon interface <code>Composable</code>. That interface
signals the container that handles all the
components, including our generator, to give us a handle to
the <code>ComponentManager</code>
during the startup of the container. If we needed to look up a
pooled database connection, or some other standard or custom
Cocoon component, this is what we would use. Most of the out
of the box Generators extend <code>ComposerGenerator</code>.
Other abstract Generators you may choose to extend include the
poorly named (IMHO) <code>ServletGenerator</code>
and <code>AbstractServerPage</code>.
While these both introduce functionality specific to their
eventual purpose (the JSP and XSP generators), they do make a
convenient starting place for many other Generators.</p>
</s3>
<s3 title="Running The Sample">
<p>In order to run this sample, you will need to compile the code,
deploy it into the cocoon webapp, and modify the sitemap to
declare our generator and allow access to it via a pipeline.</p>
<s4 title="Compile">
<p>Save this source as <code>HelloWorldGenerator.java</code>
and compile it using</p>
<source>javac -classpath %PATH_TO_JARS%\cocoon.jar;%PATH_TO_JARS%\xml-apis.jar
HelloWorldGenerator.java</source>
<p>Unfortunately for me, the exact name of your cocoon and
xml-apis jars may vary with exactly which distribution,
or CVS version you are using, since the community has taken
to appending dates or versions at the end of the jar name
to avoid confusion. Be sure to find the correct name on
your system and substitute it in the classpath. Also, you
have several options on where to find jars. If you have a
source version that you built yourself, you may want to
point to <code>lib\core\</code> for them. If you have only
the binary version, you can find them in
<code>WEB-INF\lib\</code></p>
</s4>
<s4 title="Deploy">
<p>Simply copy the class file into the
<code>%TOMCAT_HOME%\webapps\cocoon\WEB-INF\classes</code>
directory</p>
<note>If memory serves me, there have been occasional
classloading problems in the past. If your compiled classes
are not recognized in the classes directory, try
<code>jar</code>-ing them up and placing them in
<code>WEB-INF\lib\</code> instead. That is probably where
your real generators would go anyway, with a whole package
of all your custom classes in one jar.</note>
</s4>
<s4 title="Sitemap Modifications">
<p>You need to do two things: in the
<code>map:generators</code>
section, add an element for your class:</p>
<source><![CDATA[<map:generator name="helloWorld" src="HelloWorldGenerator"/>]]></source>
<p>Then add a pipeline to sitemap.xmap which uses it:</p>
<source xml:space="preserve">
<![CDATA[...
<map:match pattern="heyThere.xml">
<map:generate type="helloWorld"/>
<map:serialize type="xml"/>
</map:match>
...]]>
</source>
<p>And finally, our creation should be available at
<code>http://localhost:8080/cocoon/heyThere.xml</code>
</p>
<p>Depending on your exact setup, you may need to restart
Tomcat (or whatever your servlet container is) to get
there.</p>
<note>Notice that the
<code>
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>]]>
</code>
declaration was added for us by the xml serializer at the
beginning. If you need to modify this, the generator is not
the appropriate place. The default encoding of UTF-8 could
be overridden with iso-8859-1 for example by specifying an
<code>
<![CDATA[<encoding>iso-8859-1</encoding>]]>
</code>
child parameter inside the declaration for the xml
serializer in your sitemap.</note>
</s4>
</s3>
</s2>
<s2 title="A Less Trivial Example">
<p>Moving on to a less trivial example, we will take some
information out of the Request, and construct a slightly more
involved document. This time, our goal will be the following
document:</p>
<source xml:space="preserve">
<![CDATA[<doc>
<uri>...</uri>
<params>
<param value="...">...</param>
...
</params>
<date>..</date>
</doc>]]>
</source>
<p>The values of course will be filled in from the request, and
will depend on choices we make later.</p>
<source xml:space="preserve"><![CDATA[import org.apache.cocoon.generation.AbstractGenerator;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.SAXException;
// for the setup() method
import org.apache.cocoon.environment.SourceResolver;
import java.util.Map;
import org.apache.avalon.framework.parameters.Parameters;
import org.apache.cocoon.ProcessingException;
import java.io.IOException;
// used to deal with the request parameters.
import org.apache.cocoon.environment.ObjectModelHelper;
import org.apache.cocoon.environment.Request;
import java.util.Enumeration;
import java.util.Date;
public class RequestExampleGenerator extends AbstractGenerator
{
// Will be initialized in the setup() method and used in generate()
Request request = null;
Enumeration paramNames = null;
String uri = null;
// We will use attributes this time.
AttributesImpl myAttr = new AttributesImpl();
AttributesImpl emptyAttr = new AttributesImpl();
public void setup(SourceResolver resolver, Map objectModel,
String src, Parameters par)
throws ProcessingException, SAXException, IOException
{
super.setup(resolver, objectModel, src, par);
request = ObjectModelHelper.getRequest(objectModel);
paramNames = request.getParameterNames();
uri = request.getRequestURI();
}
/**
* Implement the generate() method from AbstractGenerator.
*/
public void generate() throws SAXException
{
contentHandler.startDocument();
contentHandler.startElement("", "doc", "doc", emptyAttr);
// <uri> and all following elements will be nested inside the doc element
contentHandler.startElement("", "uri", "uri", emptyAttr);
contentHandler.characters(uri.toCharArray(),0,uri.length());
contentHandler.endElement("", "uri", "uri");
contentHandler.startElement("", "params", "params", emptyAttr);
while (paramNames.hasMoreElements())
{
// Get the name of this request parameter.
String param = (String)paramNames.nextElement();
String paramValue = request.getParameter(param);
// Since we've chosen to reuse one AttributesImpl instance,
// we need to call its clear() method before each use. We
// use the request.getParameter() method to look up the value
// associated with the current request parameter.
myAttr.clear();
myAttr.addAttribute("","value","value","",paramValue);
// Each <param> will be nested inside the containing <params> element.
contentHandler.startElement("", "param", "param", myAttr);
contentHandler.characters(param.toCharArray(),0,param.length());
contentHandler.endElement("","param", "param");
}
contentHandler.endElement("","params", "params");
contentHandler.startElement("", "date", "date", emptyAttr);
String dateString = (new Date()).toString();
contentHandler.characters(dateString.toCharArray(),0,dateString.length());
contentHandler.endElement("", "date", "date");
contentHandler.endElement("","doc", "doc");
contentHandler.endDocument();
}
public void recycle() {
super.recycle();
this.request = null;
this.paramNames = null;
this.uri = null;
}
}]]></source>
<s3 title="Compile and Test">
<p>Save this code as
<code>RequestExampleGenerator.java</code>
and compile as before. You will need to add both
<code>avalon-framework.jar</code> and
<code>avalon-excalibur.jar</code> to your classpath
this time. Besides finding the exact name of each jar
as described above, you may now also have to ensure
that you have the version of excalibur targeted to your
JVM version: there is currently one version for JDK 1.4
and one for 1.2/1.3.</p>
<p>For your sitemap, you will need to add a definition
for this generator like
<code><![CDATA[<map:generator name="requestExample" src="RequestExampleGenerator"/>]]></code>
and you will need a sitemap pipeline like:</p>
<source xml:space="preserve"><![CDATA[<map:match pattern="howYouDoin.xml">
<map:generate type="requestExample"/>
<map:serialize type="xml"/>
</map:match>]]>
</source>
<p>At this point, you should be able to access the
example at
<code>http://localhost:8080/cocoon/howYouDoin.xml?anyParam=OK&amp;more=better</code></p>
</s3>
<s3 title="New Concepts">
<s4 title="Lifecycle">
<p>First, notice that we now override the
<code>setup(...)</code> and <code>recycle()</code> methods
defined in <code>AbstractGenerator</code>.
The <code>ComponentManager</code>, which handles the lifecycle of
all components in Cocoon, calls
<code>setup(...)</code> before each new call to
<code>generate()</code> to give the Generator information
about the current request and its environment, and calls
<code>recycle()</code> when it is done so that the Generator can
clean up resources as appropriate. Our example uses only the
<code>objectModel</code>, which abstracts the Request,
Response, and Context. We get a reference to the Request
wrapper, and obtain an <code>Enumeration</code> of all the
GET/POST parameters available.</p>
<p>The <code>src</code> and <code>SourceResolver</code> are
provided to enable us to look up and use whatever source is
specified in the pipeline setup. Had we specified
<code><![CDATA[<map:generate type="helloWorld" src="someSourceString"/>]]></code>
we would have used the <code>SourceResolver</code> to work
with "someSourceString", whether it be a file, or url, etc.</p>
<p>We are also given a
<code>Parameters</code> reference which we would use to obtain
any parameter names and values which are children elements of
our <code>map:generate</code> element in the pipeline.</p>
<note>It may be good practice to abstract the source of your parameters so
that they do not have to come from the Request object. For instance, the
following code would allow us to abstract the origin of two parameters, param1
and param2:</note>
<source xml:space="preserve"><![CDATA[In RequestExampleGenerator.java,
...
String param1 = null;
String param2 = null;
...
public void setup(SourceResolver resolver, Map objectModel,
String src, Parameters par)
throws ProcessingException, SAXException, IOException
{
...
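// The two-argument form returns the given default (here null) when the
// parameter is missing, instead of throwing a ParameterException.
param1 = par.getParameter("param1", null);
param2 = par.getParameter("param2", null);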
}
and in sitemap.xmap,
...
<map:match pattern="abstractedParameters.xml">
<map:act type="request">
<map:parameter name="parameters" value="true"/>
<map:generate type="requestExample">
<map:parameter name="param1" value="{visibleName1}"/>
<map:parameter name="param2" value="{visibleName2}"/>
</map:generate>
</map:act>
</map:match>
...]]></source>
<p>As you can see, we have also hidden the internal
names from the outside world, which will use
<code>?visibleName1=foo&amp;visibleName2=bar</code>.
</p>
</s4>
<s4 title="Nested Elements">
<p>In this example, nested elements are created simply
by nesting complete
<code>startElement()</code>/<code>endElement()</code>
pairs within each other. If we had a logic failure in our code and
sent non-well-formed XML events down the pipeline, nothing in our
process would complain (try it!). Of course, any transformers later
in the pipeline would behave in an unpredictable manner.</p>
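<p>Since nothing complains by default, it can help during development
to put a small checking handler between the generator and the real
<code>contentHandler</code>. The class below is one possible sketch,
not something shipped with Cocoon: the name <code>NestingChecker</code>
is made up, and it only verifies that start and end events pair up by
qualified name.</p>
<source xml:space="preserve">
<![CDATA[
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

public class NestingChecker extends XMLFilterImpl
{
    // Qualified names of the elements that are currently open.
    private final Deque<String> open = new ArrayDeque<String>();

    public void startDocument() throws SAXException
    {
        open.clear();
        super.startDocument();
    }

    public void startElement(String uri, String localName, String qName,
                             Attributes attrs)
        throws SAXException
    {
        open.push(qName);
        super.startElement(uri, localName, qName, attrs);
    }

    public void endElement(String uri, String localName, String qName)
        throws SAXException
    {
        if (open.isEmpty()) {
            throw new SAXException("end of '" + qName
                + "' but no element is open");
        }
        String expected = open.pop();
        if (!expected.equals(qName)) {
            throw new SAXException("end of '" + qName
                + "' but '" + expected + "' is still open");
        }
        super.endElement(uri, localName, qName);
    }

    public void endDocument() throws SAXException
    {
        if (!open.isEmpty()) {
            throw new SAXException("document ended with '"
                + open.peek() + "' still open");
        }
        super.endDocument();
    }
}
]]></source>
<p>To use it in a generator, one would create a
<code>NestingChecker</code>, call its
<code>setContentHandler(contentHandler)</code> method, and send the
events to the checker instead of directly to
<code>contentHandler</code>. Events that pass the check are forwarded
unchanged.</p>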
</s4>
<s4 title="Attributes">
<p>Finally, we've introduced the use of attributes.
We chose to
employ one <code>AttributesImpl</code>, clearing it before each
element. Multiple attributes for an element would simply be added
by repeated calls to <code>addAttribute()</code>.</p>
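<p>A minimal sketch of the multiple-attribute case follows. The class
name <code>MultipleAttributes</code> and the second attribute,
<code>index</code>, are hypothetical and only serve the illustration.
The sketch also passes <code>"CDATA"</code> as the attribute type, which
is the value SAX defines for attributes whose type is not known; the
examples in this tutorial pass an empty string there.</p>
<source xml:space="preserve">
<![CDATA[
import org.xml.sax.helpers.AttributesImpl;

public class MultipleAttributes
{
    // Reused for every element, as in RequestExampleGenerator.
    private static final AttributesImpl myAttr = new AttributesImpl();

    // Fill the shared instance with two attributes for one <param>.
    static AttributesImpl paramAttributes(String value, String index)
    {
        // Drop whatever the previous element left behind.
        myAttr.clear();
        // Arguments: namespace URI, local name, qualified name, type, value.
        myAttr.addAttribute("", "value", "value", "CDATA", value);
        myAttr.addAttribute("", "index", "index", "CDATA", index);
        return myAttr;
    }

    public static void main(String[] args)
    {
        AttributesImpl attr = paramAttributes("OK", "1");
        for (int i = 0; i < attr.getLength(); i++) {
            System.out.println(attr.getQName(i) + "=" + attr.getValue(i));
        }
    }
}
]]></source>
<p>The returned instance would be passed as the last argument to
<code>contentHandler.startElement()</code>, exactly as
<code>myAttr</code> is in the example above.</p>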
</s4>
</s3>
<s3 title="A Lesson">
<p>Before moving on, it is worth noting that
after all this work, there is already a generator provided with
Cocoon which does much of what we have accomplished here
- <code>org.apache.cocoon.generation.RequestGenerator</code>
which in the default configuration is probably available at
<code>http://localhost:8080/cocoon/request</code></p>
</s3>
</s2>
<s2 title="Moving On">
<p>From here, we will move on to cover handling ugly pseudo-xml
(like real world html) with CDATA blocks, employing some of the
Avalon lifecycle method callbacks (Composable/Disposable), Database
access, and Caching.</p>
<s3 title="The Employee SQL Example Reworked">
<p>In the samples included with Cocoon, there is an example of a SQL
query using XSP and ESQL. We will recreate part of that example
below using the same HSQL database, which should be automatically
configured and populated with data in the default build. If you
find that you do not have that database set up, see the ESQL XSP
sample for instructions on setting the datasource up. Do note that
this specific task is handled in the ESQL XSP example in just a few
lines of code. If your task is really this simple, there may be no
need to create your own generator.</p>
<source xml:space="preserve"><![CDATA[import org.apache.cocoon.generation.ComposerGenerator;
import org.apache.avalon.framework.component.ComponentManager;
import org.apache.avalon.framework.component.ComponentException;
import org.apache.avalon.framework.component.ComponentSelector;
import org.apache.avalon.excalibur.datasource.DataSourceComponent;
import org.apache.cocoon.environment.SourceResolver;
import org.apache.avalon.framework.parameters.Parameters;
import org.apache.cocoon.environment.ObjectModelHelper;
import org.apache.cocoon.environment.Request;
import org.apache.cocoon.caching.Cacheable;
import org.apache.cocoon.caching.CacheValidity;
import org.apache.cocoon.ProcessingException;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import java.sql.*;
import java.util.Map;
import java.util.Date;
import org.apache.avalon.framework.activity.Disposable;
public class EmployeeGeneratorExample extends ComposerGenerator
implements Cacheable, Disposable
{
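public void dispose() {
// Release the datasource while we still hold the manager;
// ComposerGenerator.dispose() is expected to drop that reference.
manager.release(datasource);
datasource = null;
super.dispose();
}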
public void recycle() {
myAttr.clear();
super.recycle();
}
public void setup(SourceResolver resolver, Map objectModel,
String src, Parameters par) {
// Not needed for this example, but you would get request
// and/or sitemap parameters here.
}
public void compose(ComponentManager manager)
throws ComponentException{
super.compose(manager);
ComponentSelector selector = (ComponentSelector)
manager.lookup(DataSourceComponent.ROLE + "Selector");
this.datasource = (DataSourceComponent) selector.select("personnel");
}
public void generate()
throws SAXException, ProcessingException {
try {
Connection conn = this.datasource.getConnection();
Statement stmt = conn.createStatement();
ResultSet res = stmt.executeQuery(EMPLOYEE_QUERY);
//open the SAX event stream
contentHandler.startDocument();
myAttr.addAttribute("","date","date","",
(new Date()).toString());
//open root element
contentHandler.startElement("","content",
"content",myAttr);
String currentDept = "";
boolean isFirstRow = true;
boolean moreRowsExist = res.next();
while (moreRowsExist) {
String thisDept = attrFromDB(res, "name");
if (!thisDept.equals(currentDept)) {
newDept(res,thisDept,isFirstRow);
currentDept = thisDept;
}
addEmployee(res,attrFromDB(res,"id"),
attrFromDB(res,"empName"));
isFirstRow = false;
if (!res.next()) {
endDept();
moreRowsExist = false;
}
}
//close root element
contentHandler.endElement("","content","content");
//close the SAX event stream
contentHandler.endDocument();
res.close();
stmt.close();
conn.close();
} catch (SQLException e) {
throw new ProcessingException(e);
}
}
public long generateKey()
{
// Default non-caching behaviour. We will implement this later.
return 0;
}
public CacheValidity generateValidity()
{
// Default non-caching behaviour. We will implement this later.
return null;
}
private DataSourceComponent datasource;
private AttributesImpl myAttr = new AttributesImpl();
private String EMPLOYEE_QUERY =
"SELECT department.name, employee.id, employee.name as empName " +
"FROM department, employee " +
"WHERE department.id = employee.department_id ORDER BY department.name";
private void endDept() throws SAXException {
contentHandler.endElement("","dept","dept");
}
private void newDept(ResultSet res, String dept, boolean isFirstRow)
throws SAXException {
if (!isFirstRow) {
endDept();
}
myAttr.clear();
myAttr.addAttribute("","name","name","",dept);
contentHandler.startElement("","dept","dept",myAttr);
}
private void addEmployee(ResultSet res, String id, String name)
throws SAXException {
myAttr.clear();
myAttr.addAttribute("","id","id","",id);
contentHandler.startElement("","employee","employee",myAttr);
contentHandler.characters(name.toCharArray(),0,name.length());
contentHandler.endElement("","employee","employee");
}
private String attrFromDB(ResultSet res, String column)
throws SQLException {
String value = res.getString(column);
return (res.wasNull())?"":value;
}
}]]></source></s3>
<s3 title="Compile and Test">
<p>To compile this, you will now need the following on your classpath:
<code>avalon-excalibur.jar, avalon-framework.jar, cocoon.jar,
xml-apis.jar</code> (using whatever names they have in your
distribution). When you compile this, you may receive some
deprecation warnings. Do not worry about them - we will discuss
that later.</p>
<p>To test it, copy it over to your <code>WEB-INF\classes\</code>
directory as before and add something like the following to your
<code>sitemap.xmap</code> ...</p>
<source xml:space="preserve"><![CDATA[...
<map:generator name="employee" src="EmployeeGeneratorExample"/>
...
<map:match pattern="employee.xml">
<map:generate type="employee"/>
<map:serialize type="xml"/>
</map:match>
...]]></source>
</s3>
<s3 title="New Concepts">
<s4 title="Composable and Disposable">
<p>We've implemented the Avalon lifecycle interfaces <code>Composable</code> and
<code>Disposable</code>. When Cocoon starts up (which happens when the servlet
container starts up) the <code>ComponentManager</code> will call
<code>compose(ComponentManager m)</code> for our component as it works
its way through all the components declared in the sitemap. The handle
to <code>ComponentManager</code> is used to look up any other Avalon
components that we need. Lookups happen in an abstracted way using a
ROLE, which enables us to change out implementations of each component
without affecting previously written code. Our generator's ROLE, by the
way, was defined in the <code>Generator</code> interface.</p>
<p>Similarly, when this instance of our generator is disposed of by the
container, it will call the <code>dispose()</code> method to allow us to
clean up any resources we held on to between invocations. Note that
components can be pooled by the container. If we thought that our employee
generator was going to see a lot of traffic, we might change its definition
at the top of sitemap.xmap to include attributes like <code>pool-grow="2"
pool-max="16" pool-min="2"</code> so that multiple overlapping requests
could be serviced without a log jam.</p>
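<p>With those attributes added, the generator declaration used for
testing this example might look like the following sketch. The numbers
are simply the illustrative values mentioned above, not tuned
recommendations.</p>
<source xml:space="preserve"><![CDATA[<map:generator name="employee" src="EmployeeGeneratorExample"
    pool-grow="2" pool-max="16" pool-min="2"/>]]></source>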
</s4>
<s4 title="Datasource">
<p>We look up our HSQL database here by its name given in cocoon.xconf.
If we had multiple datasources (say a backup development database and
a live one), we could determine which one to use based on a simple
configuration parameter in sitemap.xmap. We could get at configuration
parameters using the Avalon interface <code>Configurable</code>.</p>
<note>Notice that we wait until generate() to request our connection
from the pool - as we should. The problem is that we lose the benefit
of using prepared statements since they would be destroyed when we
returned the instance to the pool. At present, the implementation of
org.apache.avalon.excalibur.datasource.DataSourceComponent does not
support the pooling of statements.</note>
</s4>
<s4 title="Caching">
<fixme author="open">Need more content here, or links to other docs.</fixme>
<note>FIXME: This is still coming.</note>
<p>Introduce new code to implement Caching, discuss basic logic, and
deprecation/move to Avalon. I could use some help here from Carsten,
or someone who can quickly give an overview of the changes and plan.
</p>
</s4>
</s3>
</s2>
</s1>
</body>
</document>