blob: b20174fc0eea40689849098195a90c066d0b1a6a [file] [log] [blame]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>HSSF Serializer</title>
<link href="http://purl.org/DC/elements/1.0/" rel="schema.DC">
<meta content="Andrew C. Oliver" name="DC.Creator">
<meta content="Marcus W. Johnson" name="DC.Creator">
<meta content="J&ouml;rg Heinicke" name="DC.Creator">
<meta content="This document describes the HSSF serializer of Cocoon." name="DC.Description">
</head>
<body>
<h1>HSSF Serializer</h1>
<p>The HSSF serializer catches SAX events and creates a spreadsheet in the
XLS format used by Microsoft Excel (but the output looks just dandy in
<a class="external" href="http://www.gnome.org/gnumeric">Gnumeric</a> or
<a class="external" href="http://www.openoffice.org">OpenOffice.org</a> as well).
</p>
<p>The HSSF Serializer supports most of the functionality supplied by the
<a class="external" href="http://jakarta.apache.org/poi/hssf">HSSF API</a> which is
part of the
<a class="external" href="http://jakarta.apache.org/poi">Jakarta POI project</a>.
</p>
<ul>
<li>Name: xls</li>
<li>Class: org.apache.cocoon.serialization.HSSFSerializer</li>
<li>Cacheable: no</li>
</ul>
<h1>Usage</h1>
<p>Using the HSSF Serializer is fairly simple. You'll need a sitemap of
course. Once you have that, well, you're half there. Add </p>
<pre class="code">
&lt;map:serializer name="xls"
src="org.apache.cocoon.serialization.HSSFSerializer"
locale="us"/&gt;
</pre>
<p>into the <span class="codefrag">&lt;map:serializers/&gt;</span> section of your sitemap.
The locale is optional and is used only to validate numbers. Please note
that numbers not in US-default format may not be compatible with Gnumeric
(it's less cosmopolitan then the HSSF Serializer ;-) ). Setting the locale
lets you use default number formats from other locales. Set this to a two
letter lowercase country code. See java.util.Locale for details.
</p>
<p>Next, set up an entry for each URL or set of URLs (via matching rules)
resembling this:
</p>
<pre class="code">
&lt;map:match pattern="hello.xls"&gt;
&lt;map:generate src="docs/samples/hello-page.xml"/&gt;
&lt;map:transform src="stylesheets/page/simple-page2xls.xsl"/&gt;
&lt;map:serialize type="xls"/&gt;
&lt;/map:match&gt;
</pre>
<p>The most important question is now, which what XML the serializer is fed.
You will get it answered in the next paragraph.</p>
<h1>Required XML Format</h1>
<p>The HSSF Serializer expects data in the same XML language as the popular
spreadsheet progam Gnumeric. You have different possibilities to get such
a document:</p>
<ul>
<li>You can create and save a spreadsheet using Gnumeric.</li>
<li>You can write it yourself using the
<a class="external" href="http://cvs.gnome.org/lxr/source/gnumeric/gnumeric.xsd">
Schemas</a> or DTDs available at the
<a class="external" href="http://www.gnome.org/gnumeric">Gnumeric's website</a>
or in Gnumeric's CVS repository.
</li>
<li>You can take one of the samples delivered with Cocoon and start from
there.</li>
<li>Or you use the empty workbook from appendix A in
<a class="external" href="http://www.superlinksoftware.com/gnumeric-xml.pdf">
The Gnumeric File Format PDF</a> (all further references to 'PDF'
mean this file), which can further simplified to the following:<br>
</li>
</ul>
<pre class="code">
&lt;gmr:Workbook xmlns:gmr="http://www.gnome.org/gnumeric/v7"&gt;
&lt;gmr:SheetNameIndex&gt;
&lt;gmr:SheetName&gt;Sheet1&lt;/gmr:SheetName&gt;
&lt;/gmr:SheetNameIndex&gt;
&lt;gmr:Sheets&gt;
&lt;gmr:Sheet&gt;
&lt;gmr:Name&gt;Sheet1&lt;/gmr:Name&gt;
&lt;gmr:MaxCol&gt;-1&lt;/gmr:MaxCol&gt;
&lt;gmr:MaxRow&gt;-1&lt;/gmr:MaxRow&gt;
&lt;gmr:Cells&gt;
&lt;!-- add your cells here --&gt;
&lt;/gmr:Cells&gt;
&lt;/gmr:Sheet&gt;
&lt;/gmr:Sheets&gt;
&lt;/gmr:Workbook&gt;
</pre>
<p>While HSSFSerializer ignores the bulk of the elements, it is suggested
you provide at a minimum the basic elements as in the list below. As a
general rule, if Gnumeric in the versions 0.7 - 1.04 will load the XML
(provided it is tar'd and gzipped as expected), then the HSSFSerializer
should be able to handle it.</p>
<p>As a general guideline the following elements are supported in this
release. For the nesting have a look into the sample files or the PDF.
</p>
<ul>
<li>
<span class="codefrag">gmr:Workbook</span> - Required. Basically the root element.
</li>
<li>
<span class="codefrag">gmr:Sheets</span> - Required. Container for the spreadsheets.
</li>
<li>
<span class="codefrag">gmr:Sheet</span> - Required for each sheet. For the attributes
have a look at the example above or into the PDF.
</li>
<li>
<span class="codefrag">gmr:Name</span> - Required? Defines the sheet's name as it
appears on the little tabs under the workbook in your favorite GUI
spreadsheet application.
</li>
<li>
<span class="codefrag">gmr:MaxRow</span>, <span class="codefrag">gmr:MaxCol</span> - Used to set the
dimensions for the sheet. This can be wrong and your spreadsheet
application may not care, but some other ports depend upon this, so
we set it to be compatible.
</li>
<li>
<span class="codefrag">gmr:Rows</span>, <span class="codefrag">gmr:Cols</span> - Used to determine the
default row or column width in points via the attribute
<span class="codefrag">DefaultSizePts</span>.
</li>
<li>
<span class="codefrag">gmr:RowInfo</span>, <span class="codefrag">gmr:ColInfo</span> - Used to
determine the row height/column width for a specific row/column in
points.<br>
Attributes:
<ul>
<li>
<span class="codefrag">No</span> - row/column number</li>
<li>
<span class="codefrag">Unit</span> - row/column height</li>
</ul>
The count of the rows/columns starts with 0.
</li>
<li>
<span class="codefrag">gmr:Cells</span> - Required. Container for all cells.
</li>
<li>
<span class="codefrag">gmr:Cell</span> - Defines the actual column and row number as
well as the data type.<br>
Attributes:
<ul>
<li>
<span class="codefrag">Row</span> - row number</li>
<li>
<span class="codefrag">Col</span> - col number</li>
<li>
<span class="codefrag">ValueType</span> - the data type<br>
If you don't specify the data type, the cell content will not
be shown! The type is determined by a numerical key, where
the following are known: 10 - empty, 20 - boolean,
30 - integer, 40 - float, 50 - error, 60 - string,
70 - cell range, 80 - array
</li>
</ul>
</li>
<li>
<span class="codefrag">gmr:Content</span> - Defines the start of the value contained
in the cell. This is obsolete as of Gnumeric 1.03. It's not
recommended to use it, because it may not be supported in future
versions. With POI release 1.5.1 I didn't use <span class="codefrag">gmr:Content</span>,
but I had to specify '10' as <span class="codefrag">ValueType</span> on empty cells.
Otherwise I got strange output.
</li>
<li>
<span class="codefrag">gmr:Styles</span> - Required if you want to use styles.
Container for <span class="codefrag">gmr:StyleRegion's</span>.
</li>
<li>
<span class="codefrag">gmr:StyleRegion</span> - Defines the region that the style
applies to.<br>
Attributes:
<ul>
<li>
<span class="codefrag">startRow</span> - self-explanatory</li>
<li>
<span class="codefrag">startCol</span> - self-explanatory</li>
<li>
<span class="codefrag">endRow</span> - self-explanatory</li>
<li>
<span class="codefrag">endCol</span> - self-explanatory</li>
</ul>
Again: The count of the rows/columns starts with 0.
</li>
<li>
<span class="codefrag">gmr:Style</span> - Specifies the style for a StyleRegion.<br>
Attributes:
<ul>
<li>
<span class="codefrag">HAlign</span> - specifies the horizontal alignment.<br>
Possible values: 1 - general, 2 - left, 4 - right, 8 - center,
16 - fill, 32 - justify, 64 - center across selection
</li>
<li>
<span class="codefrag">VAlign</span> - specifies the vertical alignment.<br>
Possible values: 1 - top, 2 - bottom, 4 - center, 8 - justify
</li>
<li>
<span class="codefrag">WrapText</span> - specifies whether to wrap text around
or not<br>
Possible values: 0 - don't wrap, 1 - do wrap
</li>
<li>
<span class="codefrag">Shade</span> - kind a stupid flag<br>
If you're setting a background color and want it filled ...
use Shade="1".
</li>
<li>
<span class="codefrag">Format</span> - number format to use.<br>
Generally, Excel and Gnumeric have the same formats.
</li>
</ul>
</li>
<li>
<span class="codefrag">gmr:Font</span> - Defines the font used for the style region.<br>
Attributes:
<ul>
<li>
<span class="codefrag">Bold</span> - self-explanatory</li>
<li>
<span class="codefrag">Italic</span> - self-explanatory</li>
<li>
<span class="codefrag">Underline</span> - self-explanatory</li>
<li>
<span class="codefrag">StrikeThrough</span> - self-explanatory</li>
</ul>
Set the values of the attributes to 0 or 1 to disable or enable a
specific font style.
</li>
<li>
<span class="codefrag">gmr:StyleBorder</span> - Defines the borders that are used for
a style region. It contains one element for each possible border
specifying the style and the color of the border.
</li>
</ul>
<p>For more specific information on the Gnumeric file format, especially on
some more interesting attributes or attribute values or the nesting of
the elements, I only can recommend you to read the PDF or to have a look
at the sample files. If you want it more complicated, you can also get
the information from the Schema file (look above for the link).</p>
<h1>Automatic Excel Spreadsheet Generation</h1>
<p>Hmm, I don't want to say to much on this. I showed you the XML the
serializer wants to have. Now it's up to you to generate this XML
dynamically. The best way to do this is using a XSLT stylesheet. You need
some information on this? Hmm, have a look at the
<a class="external" href="http://www.w3.org/Style/XSL/">W3C's XSL page</a>. From
there you get many links to tutorials, articals, the surprisingly
readable XSLT spec and so on.
</p>
<h1>Future Features</h1>
<p>So HSSF Serializer is well on its way to being darn near everything you
need to create fancy smancy reports in Excel or OpenOffice. (And you can
just serialize the output from your stylesheets as XML for Gnumeric
version).</p>
<ul>
<li>Add support for formulas. (not yet supported by HSSF)</li>
<li>Add support for custom data formats. (not yet supported by HSSF)</li>
</ul>
</body>
</html>