| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <html> |
| <head> |
| <META http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <title>HSSF Serializer</title> |
| <link href="http://purl.org/DC/elements/1.0/" rel="schema.DC"> |
| <meta content="Andrew C. Oliver" name="DC.Creator"> |
| <meta content="Marcus W. Johnson" name="DC.Creator"> |
| <meta content="Jörg Heinicke" name="DC.Creator"> |
| <meta content="This document describes the HSSF serializer of Cocoon." name="DC.Description"> |
| </head> |
| <body> |
| |
| <h1>HSSF Serializer</h1> |
| |
| <p>The HSSF serializer catches SAX events and creates a spreadsheet in the |
| XLS format used by Microsoft Excel (but the output looks just dandy in |
| <a class="external" href="http://www.gnome.org/gnumeric">Gnumeric</a> or |
| <a class="external" href="http://www.openoffice.org">OpenOffice.org</a> as well). |
| </p> |
| |
| <p>The HSSF Serializer supports most of the functionality supplied by the |
| <a class="external" href="http://jakarta.apache.org/poi/hssf">HSSF API</a> which is |
| part of the |
| <a class="external" href="http://jakarta.apache.org/poi">Jakarta POI project</a>. |
| </p> |
| |
| <ul> |
| |
| <li>Name: xls</li> |
| |
| <li>Class: org.apache.cocoon.serialization.HSSFSerializer</li> |
| |
| <li>Cacheable: no</li> |
| |
| </ul> |
| |
| |
| <h1>Usage</h1> |
| |
| <p>Using the HSSF Serializer is fairly simple. You'll need a sitemap of |
| course. Once you have that, well, you're half there. Add </p> |
| |
| <pre class="code"> |
| <map:serializer name="xls" |
| src="org.apache.cocoon.serialization.HSSFSerializer" |
| locale="us"/> |
| </pre> |
| |
| <p>into the <span class="codefrag"><map:serializers/></span> section of your sitemap. |
| The locale is optional and is used only to validate numbers. Please note |
| that numbers not in US-default format may not be compatible with Gnumeric |
| (it's less cosmopolitan then the HSSF Serializer ;-) ). Setting the locale |
| lets you use default number formats from other locales. Set this to a two |
| letter lowercase country code. See java.util.Locale for details. |
| </p> |
| |
| <p>Next, set up an entry for each URL or set of URLs (via matching rules) |
| resembling this: |
| </p> |
| |
| <pre class="code"> |
| <map:match pattern="hello.xls"> |
| <map:generate src="docs/samples/hello-page.xml"/> |
| <map:transform src="stylesheets/page/simple-page2xls.xsl"/> |
| <map:serialize type="xls"/> |
| </map:match> |
| </pre> |
| |
| <p>The most important question is now, which what XML the serializer is fed. |
| You will get it answered in the next paragraph.</p> |
| |
| |
| <h1>Required XML Format</h1> |
| |
| <p>The HSSF Serializer expects data in the same XML language as the popular |
| spreadsheet progam Gnumeric. You have different possibilities to get such |
| a document:</p> |
| |
| <ul> |
| |
| <li>You can create and save a spreadsheet using Gnumeric.</li> |
| |
| <li>You can write it yourself using the |
| <a class="external" href="http://cvs.gnome.org/lxr/source/gnumeric/gnumeric.xsd"> |
| Schemas</a> or DTDs available at the |
| <a class="external" href="http://www.gnome.org/gnumeric">Gnumeric's website</a> |
| or in Gnumeric's CVS repository. |
| </li> |
| |
| <li>You can take one of the samples delivered with Cocoon and start from |
| there.</li> |
| |
| <li>Or you use the empty workbook from appendix A in |
| <a class="external" href="http://www.superlinksoftware.com/gnumeric-xml.pdf"> |
| The Gnumeric File Format PDF</a> (all further references to 'PDF' |
| mean this file), which can further simplified to the following:<br> |
| |
| </li> |
| |
| </ul> |
| |
| <pre class="code"> |
| <gmr:Workbook xmlns:gmr="http://www.gnome.org/gnumeric/v7"> |
| <gmr:SheetNameIndex> |
| <gmr:SheetName>Sheet1</gmr:SheetName> |
| </gmr:SheetNameIndex> |
| <gmr:Sheets> |
| <gmr:Sheet> |
| <gmr:Name>Sheet1</gmr:Name> |
| <gmr:MaxCol>-1</gmr:MaxCol> |
| <gmr:MaxRow>-1</gmr:MaxRow> |
| <gmr:Cells> |
| <!-- add your cells here --> |
| </gmr:Cells> |
| </gmr:Sheet> |
| </gmr:Sheets> |
| </gmr:Workbook> |
| </pre> |
| |
| <p>While HSSFSerializer ignores the bulk of the elements, it is suggested |
| you provide at a minimum the basic elements as in the list below. As a |
| general rule, if Gnumeric in the versions 0.7 - 1.04 will load the XML |
| (provided it is tar'd and gzipped as expected), then the HSSFSerializer |
| should be able to handle it.</p> |
| |
| <p>As a general guideline the following elements are supported in this |
| release. For the nesting have a look into the sample files or the PDF. |
| </p> |
| |
| <ul> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Workbook</span> - Required. Basically the root element. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Sheets</span> - Required. Container for the spreadsheets. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Sheet</span> - Required for each sheet. For the attributes |
| have a look at the example above or into the PDF. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Name</span> - Required? Defines the sheet's name as it |
| appears on the little tabs under the workbook in your favorite GUI |
| spreadsheet application. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:MaxRow</span>, <span class="codefrag">gmr:MaxCol</span> - Used to set the |
| dimensions for the sheet. This can be wrong and your spreadsheet |
| application may not care, but some other ports depend upon this, so |
| we set it to be compatible. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Rows</span>, <span class="codefrag">gmr:Cols</span> - Used to determine the |
| default row or column width in points via the attribute |
| <span class="codefrag">DefaultSizePts</span>. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:RowInfo</span>, <span class="codefrag">gmr:ColInfo</span> - Used to |
| determine the row height/column width for a specific row/column in |
| points.<br> |
| Attributes: |
| <ul> |
| |
| <li> |
| <span class="codefrag">No</span> - row/column number</li> |
| |
| <li> |
| <span class="codefrag">Unit</span> - row/column height</li> |
| |
| </ul> |
| The count of the rows/columns starts with 0. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Cells</span> - Required. Container for all cells. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Cell</span> - Defines the actual column and row number as |
| well as the data type.<br> |
| Attributes: |
| <ul> |
| |
| <li> |
| <span class="codefrag">Row</span> - row number</li> |
| |
| <li> |
| <span class="codefrag">Col</span> - col number</li> |
| |
| <li> |
| |
| <span class="codefrag">ValueType</span> - the data type<br> |
| If you don't specify the data type, the cell content will not |
| be shown! The type is determined by a numerical key, where |
| the following are known: 10 - empty, 20 - boolean, |
| 30 - integer, 40 - float, 50 - error, 60 - string, |
| 70 - cell range, 80 - array |
| </li> |
| |
| </ul> |
| |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Content</span> - Defines the start of the value contained |
| in the cell. This is obsolete as of Gnumeric 1.03. It's not |
| recommended to use it, because it may not be supported in future |
| versions. With POI release 1.5.1 I didn't use <span class="codefrag">gmr:Content</span>, |
| but I had to specify '10' as <span class="codefrag">ValueType</span> on empty cells. |
| Otherwise I got strange output. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Styles</span> - Required if you want to use styles. |
| Container for <span class="codefrag">gmr:StyleRegion's</span>. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:StyleRegion</span> - Defines the region that the style |
| applies to.<br> |
| Attributes: |
| <ul> |
| |
| <li> |
| <span class="codefrag">startRow</span> - self-explanatory</li> |
| |
| <li> |
| <span class="codefrag">startCol</span> - self-explanatory</li> |
| |
| <li> |
| <span class="codefrag">endRow</span> - self-explanatory</li> |
| |
| <li> |
| <span class="codefrag">endCol</span> - self-explanatory</li> |
| |
| </ul> |
| Again: The count of the rows/columns starts with 0. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Style</span> - Specifies the style for a StyleRegion.<br> |
| Attributes: |
| <ul> |
| |
| <li> |
| |
| <span class="codefrag">HAlign</span> - specifies the horizontal alignment.<br> |
| Possible values: 1 - general, 2 - left, 4 - right, 8 - center, |
| 16 - fill, 32 - justify, 64 - center across selection |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">VAlign</span> - specifies the vertical alignment.<br> |
| Possible values: 1 - top, 2 - bottom, 4 - center, 8 - justify |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">WrapText</span> - specifies whether to wrap text around |
| or not<br> |
| Possible values: 0 - don't wrap, 1 - do wrap |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">Shade</span> - kind a stupid flag<br> |
| If you're setting a background color and want it filled ... |
| use Shade="1". |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">Format</span> - number format to use.<br> |
| Generally, Excel and Gnumeric have the same formats. |
| </li> |
| |
| </ul> |
| |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:Font</span> - Defines the font used for the style region.<br> |
| Attributes: |
| <ul> |
| |
| <li> |
| <span class="codefrag">Bold</span> - self-explanatory</li> |
| |
| <li> |
| <span class="codefrag">Italic</span> - self-explanatory</li> |
| |
| <li> |
| <span class="codefrag">Underline</span> - self-explanatory</li> |
| |
| <li> |
| <span class="codefrag">StrikeThrough</span> - self-explanatory</li> |
| |
| </ul> |
| Set the values of the attributes to 0 or 1 to disable or enable a |
| specific font style. |
| </li> |
| |
| <li> |
| |
| <span class="codefrag">gmr:StyleBorder</span> - Defines the borders that are used for |
| a style region. It contains one element for each possible border |
| specifying the style and the color of the border. |
| </li> |
| |
| </ul> |
| |
| <p>For more specific information on the Gnumeric file format, especially on |
| some more interesting attributes or attribute values or the nesting of |
| the elements, I only can recommend you to read the PDF or to have a look |
| at the sample files. If you want it more complicated, you can also get |
| the information from the Schema file (look above for the link).</p> |
| |
| |
| <h1>Automatic Excel Spreadsheet Generation</h1> |
| |
| <p>Hmm, I don't want to say to much on this. I showed you the XML the |
| serializer wants to have. Now it's up to you to generate this XML |
| dynamically. The best way to do this is using a XSLT stylesheet. You need |
| some information on this? Hmm, have a look at the |
| <a class="external" href="http://www.w3.org/Style/XSL/">W3C's XSL page</a>. From |
| there you get many links to tutorials, articals, the surprisingly |
| readable XSLT spec and so on. |
| </p> |
| |
| |
| <h1>Future Features</h1> |
| |
| <p>So HSSF Serializer is well on its way to being darn near everything you |
| need to create fancy smancy reports in Excel or OpenOffice. (And you can |
| just serialize the output from your stylesheets as XML for Gnumeric |
| version).</p> |
| |
| <ul> |
| |
| <li>Add support for formulas. (not yet supported by HSSF)</li> |
| |
| <li>Add support for custom data formats. (not yet supported by HSSF)</li> |
| |
| </ul> |
| |
| |
| </body> |
| </html> |