<?xml version="1.0"?>
<html>
<head>
<title>Cocoon concepts</title>
</head>
<body>
<h1>Overview</h1>
<p>
This document gives a brief overview of the most important
concepts used in Cocoon.
</p>
<h1>Separation of Concerns (SoC)</h1>
<p>We believe the single most important Cocoon innovation is SoC-based design.</p>
<p>SoC is something that you've always been aware of: not everybody is equal,
not everybody performs the same job with the same ability.</p>
<p>Experience shows that grouping people with common skills into separate
working groups increases productivity and reduces management costs, but only
if the groups do not overlap and have clear "contracts" that define how they
operate and what their concerns are.</p>
<p>For a web publishing system, the Cocoon project uses what we call the
<em>pyramid of contracts</em>, with four major concern areas and five
contracts between them:
</p>
<img alt="The Cocoon Pyramid Model of Contracts" src="files/pyramid-model.gif"/>
<p>
Cocoon is <em>engineered</em> to give you a way to isolate these four
concern areas using just those five contracts, removing the contract
between style and logic that has been bugging web site development since
the beginning of the Web.
</p>
<p>
Let's look at an example:
</p>
<pre class="code">
&lt;page&gt;
  &lt;content&gt;
    &lt;para&gt;Today is &lt;dynamic:today/&gt;&lt;/para&gt;
  &lt;/content&gt;
&lt;/page&gt;
</pre>
<p>
Such a page is written by the content writers, and you give them the
"contract" stating that the &lt;dynamic:today/&gt; tag prints out the time of day
when included in the page. Content writers don't care (nor
should they) about what language has been used for that, nor can they
interfere with the programming logic that generates the
content, since it's stored in another part of the system that they
don't have access to.
</p>
<p>
So &lt;dynamic:today/&gt; is the "logic - content" contract.
</p>
<p>
At the same time, the structure of the page is given as a contract to
the graphic designers, who have to come up with the transformation rules
that transform this structure into a language that the browser can
understand (HTML, for example).
</p>
<p>
So, the page structure is the "content - style" contract.
</p>
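<p>
As a minimal sketch of what the designers' transformation rules might look
like, assuming XSLT is the transformation language: the stylesheet below
depends only on the page structure shown above, and the HTML layout it
produces is purely illustrative, not something Cocoon prescribes.
</p>
<pre class="code">
&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;!-- The page structure is the only thing this stylesheet relies on --&gt;
  &lt;xsl:template match="/page"&gt;
    &lt;html&gt;
      &lt;body&gt;
        &lt;xsl:apply-templates select="content"/&gt;
      &lt;/body&gt;
    &lt;/html&gt;
  &lt;/xsl:template&gt;

  &lt;!-- Each paragraph of content becomes an HTML paragraph --&gt;
  &lt;xsl:template match="para"&gt;
    &lt;p&gt;&lt;xsl:apply-templates/&gt;&lt;/p&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</pre>
<p>
Nothing in this sketch knows how the date was computed, which is the point:
the designers work against the structure only.
</p>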
<p>
As long as these contracts don't change, the three areas can work
completely in parallel without overwhelming the people who
manage them: costs decrease because time to market is reduced, and
maintenance costs decrease because errors do not propagate out of
the concern areas.
</p>
<p>
For example, you can tell your designers to come up with a "Xmas look"
for your web site without even telling anyone else: just switch to
the Xmas transformation rules on Xmas morning and you're done. Just
imagine how painful it would be to do this on your web site today.
</p>
<p>
With the Cocoon architecture all this is a couple of line changes away.
</p>
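<p>
As a rough sketch of such a change, using the sitemap syntax described
later in this document: the stylesheet names
<em>xslt/regular-look.xsl</em> and <em>xslt/xmas-look.xsl</em> and the
<em>content/</em> directory are hypothetical, chosen only for illustration.
</p>
<pre class="code">
&lt;map:match pattern="*.html"&gt;
  &lt;map:generate src="content/{1}.xml"/&gt;
  &lt;!-- On Xmas morning, point src at xslt/xmas-look.xsl instead --&gt;
  &lt;map:transform src="xslt/regular-look.xsl"/&gt;
  &lt;map:serialize type="html"/&gt;
&lt;/map:match&gt;
</pre>
<p>
Under these assumptions, neither the content files nor the logic that
produces dynamic content would need to change.
</p>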
<h1>Semantic markup</h1>
<p>
Although it is not a Cocoon invention, <em>semantic markup</em> is
very important for working efficiently with Cocoon.
</p>
<p>
By <em>semantic markup</em> we mean a way of building XML documents
and models which preserves semantic information (metadata) as much as possible,
by keeping the data in structured format and moving to presentation
formats as late as possible in the document transformation process.
</p>
<p>
This document, for example:
<pre class="code">
&lt;page&gt;
  &lt;author id="3243"&gt;Will Coyote&lt;/author&gt;
  &lt;created&gt;2005-03-12&lt;/created&gt;
  &lt;revision&gt;1.42&lt;/revision&gt;
  &lt;content&gt;
    Once upon a time, John made a &lt;b&gt;bold&lt;/b&gt; move...
  &lt;/content&gt;
&lt;/page&gt;
</pre>
contains structured information which can be filtered and selected
at will to generate different presentations.
</p>
<p>
As you can see, there's nothing very sophisticated about semantic
markup: the basic idea is to keep <em>everything that is known</em>
about a given document or piece of information in a structured format.
This makes it possible to precisely select the information elements
that must be published, and to format them in as many ways as needed.
</p>
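<p>
To make the idea of selecting information elements concrete, here is a
small XSLT sketch for the page document shown above. It is only one
possible presentation: the <em>byline</em> class name is an assumption
made for this example.
</p>
<pre class="code">
&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;!-- Publish only the author and creation date; revision and content are left out --&gt;
  &lt;xsl:template match="/page"&gt;
    &lt;div class="byline"&gt;
      &lt;xsl:text&gt;By &lt;/xsl:text&gt;
      &lt;xsl:value-of select="author"/&gt;
      &lt;xsl:text&gt;, &lt;/xsl:text&gt;
      &lt;xsl:value-of select="created"/&gt;
    &lt;/div&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</pre>
<p>
Applied to the page above, this would yield a byline along the lines of
<em>By Will Coyote, 2005-03-12</em>. A different stylesheet could just as
well publish the revision number or the full content from the same source.
</p>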
<h1>The sitemap</h1>
<p>
The <em>sitemap</em> is a central component of any Cocoon application,
acting as a very powerful <em>request dispatcher</em>. It was the single most
important innovation of Cocoon 2, and it is not going away soon: we <em>love</em> its power!
</p>
<p>
The main role of the sitemap is to trigger the execution of <em>pipelines</em>
to process client requests. The sitemap uses <em>matchers</em> to select
which pipeline to execute: each matcher tests some attribute of the incoming
request (often the request path, but many other attributes can be used), and
the first pipeline whose matcher succeeds is activated.
</p>
<p>
Here are a few examples of matchers:
<pre class="code">
&lt;map:match pattern="*.html"&gt;
  ...pipeline definition goes here
&lt;/map:match&gt;
</pre>
This pipeline would be activated for any request whose path is a single
filename ending in <em>.html</em>.
<pre class="code">
&lt;map:match pattern="**/resources/css/*.css"&gt;
  ...pipeline definition goes here
&lt;/map:match&gt;
</pre>
This pipeline would be activated for requests having paths like
<em>something/resources/css/mystyles.css</em>
or
<em>a/b/c/d/resources/css/mystyles.css</em>
</p>
<p>
Sitemap variables give access to the matched parts of the request,
for example to apply a common XSLT transform to all XML files found
in a given directory:
<pre class="code">
&lt;map:match pattern="**/*.html"&gt;
  &lt;map:generate src="content/{1}/{2}.xml"/&gt;
  &lt;map:transform src="xslt/my-transform.xsl"/&gt;
  &lt;map:serialize type="html"/&gt;
&lt;/map:match&gt;
</pre>
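For the request <em>docs/planets/mars.html</em>, <em>{1}</em> holds the part
matched by the double star (<em>docs/planets</em>) and <em>{2}</em> the part
matched by the single star (<em>mars</em>). This pipeline would therefore
use the XML file <em>content/docs/planets/mars.xml</em> for input,
process it with the <em>xslt/my-transform.xsl</em> XSLT transform,
and send the result as an HTML document to the client. The double
star (**) in the matcher pattern matches a whole path, slashes included,
while a single star (*) matches a single path segment such as a filename.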
</p>
<p>
This last example also introduces the usual components of a pipeline:
the <em>Generator</em> produces XML data, zero or more <em>Transformers</em>
process the XML and at the end a <em>Serializer</em> converts the XML
into the appropriate format and defines the <em>Content-Type</em> of the
output.
</p>
<note>
The XML data passes as <em>SAX events</em> from one component to the
next in the pipeline.
</note>
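<p>
To illustrate a pipeline with more than one Transformer, here is a sketch
in which two XSLT transforms are chained. The request pattern and both
stylesheet names are hypothetical and serve only as an example.
</p>
<pre class="code">
&lt;map:match pattern="reports/*.html"&gt;
  &lt;!-- Generator: reads the XML source and emits it as SAX events --&gt;
  &lt;map:generate src="content/reports/{1}.xml"/&gt;
  &lt;!-- First transformer: keeps only the elements to be published --&gt;
  &lt;map:transform src="xslt/select-published.xsl"/&gt;
  &lt;!-- Second transformer: turns the remaining structure into HTML --&gt;
  &lt;map:transform src="xslt/report-to-html.xsl"/&gt;
  &lt;!-- Serializer: writes the final SAX events out as an HTML response --&gt;
  &lt;map:serialize type="html"/&gt;
&lt;/map:match&gt;
</pre>
<p>
Each transformer receives the output of the component before it, so
selection and presentation can be kept in separate stylesheets.
</p>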
<p>
There's much more to the sitemap: it can contain component definitions,
define <em>views</em> to make the various stages of the pipeline available
for debugging, define <em>flowscripts</em> to act as the glue between
pages, and several other things that we'll discuss at a later point. Sitemaps
can also be "mounted" to create hierarchies of sitemaps and modularize
applications.
</p>
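<p>
As a sketch of mounting, assume a sub-application that lives in a
<em>docs</em> directory with its own <em>sitemap.xmap</em> (both names are
hypothetical). The parent sitemap could then delegate requests roughly
like this:
</p>
<pre class="code">
&lt;map:match pattern="docs/**"&gt;
  &lt;!-- Hand every request under docs/ to the sitemap in the docs directory --&gt;
  &lt;map:mount uri-prefix="docs" src="docs/sitemap.xmap" check-reload="yes"/&gt;
&lt;/map:match&gt;
</pre>
<p>
The <em>uri-prefix</em> is the part of the request path that is stripped
before the mounted sitemap does its own matching, so the sub-application
does not need to know where it has been mounted.
</p>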
<h1>Flow</h1>
<fixme author="BD">
TODO: a brief description of this might be good here to
have all the critical concepts in a single document.
</fixme>
<h1>Cocoon Forms</h1>
<fixme author="BD">
TODO: a brief description of this might be good here to
have all the critical concepts in a single document.
</fixme>
<h1>Business logic</h1>
<fixme author="BD">
TODO: a brief description of this might be good here to
have all the critical concepts in a single document.
<p>
Talk about the integration of Java code, REST backends, etc.
</p>
</fixme>
</body>
</html>