| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd"> |
| |
| <document> |
| <header> |
| <title>The New Halloween Document</title> |
| <authors> |
| <person email="acoliver2@users.sourceforge.net" name="Andrew C. Oliver" id="AO"/> |
| <person email="glens@apache.org" name="Glen Stampoultzis" id="GJS"/> |
| </authors> |
| </header> |
| <body> |
| <section title="How to use the HSSF prototype API"> |
| |
| <section title="Capabilities"> |
| <p>This release of the how-to outlines functionality for the 1.5 release. |
| Those looking for information on previous releases should |
| look in the documentation distributed with that release.</p> |
| <p> |
| This release allows numeric and string cell values to be written to |
| or read from an XLS file as well as reading and writing dates. Also |
| in this release is row and column sizing, cell styling (bold, |
| italics, borders,etc), and support for built-in data formats. New |
| to this release is an event-based API for reading XLS files. |
| It differs greatly from the read/write API |
| and is intended for intermediate developers who need a smaller |
| memory footprint. It will also serve as the basis for the HSSF |
| Generator.</p> |
| </section> |
| <section title="General Use"> |
| <section title="User API"> |
| <section title="Writing a new one"> |
| |
| <p>The high level API (package: org.apache.poi.hssf.usermodel) |
| is what most people should use. Usage is very simple. |
| </p> |
| <p>Workbooks are created by creating an instance of |
| org.apache.poi.hssf.usermodel.HSSFWorkbook. |
| </p> |
| <p>Sheets are created by calling createSheet() from an existing |
| instance of HSSFWorkbook, the created sheet is automatically added in |
| sequence to the workbook. Sheets do not in themselves have a sheet |
| name (the tab at the bottom); you set |
| the name associated with a sheet by calling |
| HSSFWorkbook.setSheetName(sheetindex,"SheetName").</p> |
| <p>Rows are created by calling createRow(rowNumber) from an existing |
| instance of HSSFSheet. Only rows that have cell values should be |
| added to the sheet. To set the row's height, you just call |
| setRowHeight(height) on the row object. The height must be given in |
| twips, or 1/20th of a point. If you prefer, there is also a |
| setRowHeightInPoints method. |
| </p> |
| <p>Cells are created by calling createCell(column, type) from an |
| existing HSSFRow. Only cells that have values should be added to the |
| row. Cells should have their cell type set to either |
| HSSFCell.CELL_TYPE_NUMERIC or HSSFCell.CELL_TYPE_STRING depending on |
| whether they contain a numeric or textual value. Cells must also have |
| a value set. Set the value by calling setCellValue with either a |
| String or double as a parameter. Individual cells do not have a |
| width; you must call setColumnWidth(colindex, width) (use units of |
| 1/256th of a character) on the HSSFSheet object. (You can't do it on |
| an individual basis in the GUI either).</p> |
| <p>Cells are styled with HSSFCellStyle objects which in turn contain |
| a reference to an HSSFFont object. These are created via the |
| HSSFWorkbook object by calling createCellStyle() and createFont(). |
| Once you create the object you must set its parameters (colors, |
| borders, etc). To set a font for an HSSFCellStyle call |
| setFont(fontobj). |
| </p> |
| <p>Once you have generated your workbook, you can write it out by |
| calling write(outputStream) from your instance of Workbook, passing |
| it an OutputStream (for instance, a FileOutputStream or |
| ServletOutputStream). You must close the OutputStream yourself. HSSF |
| does not close it for you. |
| </p> |
| <p>Here is some example code (excerpted and adapted from |
| org.apache.poi.hssf.dev.HSSF test class):</p> |
| <source><![CDATA[ short rownum; |
| |
| // create a new file |
| FileOutputStream out = new FileOutputStream("workbook.xls"); |
| // create a new workbook |
| HSSFWorkbook wb = new HSSFWorkbook(); |
| // create a new sheet |
| HSSFSheet s = wb.createSheet(); |
| // declare a row object reference |
| HSSFRow r = null; |
| // declare a cell object reference |
| HSSFCell c = null; |
| // create 3 cell styles |
| HSSFCellStyle cs = wb.createCellStyle(); |
| HSSFCellStyle cs2 = wb.createCellStyle(); |
| HSSFCellStyle cs3 = wb.createCellStyle(); |
| // create 2 fonts objects |
| HSSFFont f = wb.createFont(); |
| HSSFFont f2 = wb.createFont(); |
| |
| //set font 1 to 12 point type |
| f.setFontHeightInPoints((short) 12); |
| //make it red |
| f.setColor((short) HSSFCellStyle.RED); |
| // make it bold |
| //arial is the default font |
| f.setBoldweight(f.BOLDWEIGHT_BOLD); |
| |
| //set font 2 to 10 point type |
| f2.setFontHeightInPoints((short) 10); |
| //make it the color at palette index 0xf (white) |
| f2.setColor((short) HSSFCellStyle.WHITE); |
| //make it bold |
| f2.setBoldweight(f2.BOLDWEIGHT_BOLD); |
| |
| //set cell stlye |
| cs.setFont(f); |
| //set the cell format see HSSFDataFromat for a full list |
| cs.setDataFormat(HSSFDataFormat.getFormat("($#,##0_);[Red]($#,##0)")); |
| |
| //set a thin border |
| cs2.setBorderBottom(cs2.BORDER_THIN); |
| //fill w fg fill color |
| cs2.setFillPattern((short) HSSFCellStyle.SOLID_FOREGROUND); |
| // set foreground fill to red |
| cs2.setFillForegroundColor((short) HSSFCellStyle.RED); |
| |
| // set the font |
| cs2.setFont(f2); |
| |
| // set the sheet name to HSSF Test |
| wb.setSheetName(0, "HSSF Test"); |
| // create a sheet with 300 rows (0-299) |
| for (rownum = (short) 0; rownum < 300; rownum++) |
| { |
| // create a row |
| r = s.createRow(rownum); |
| // on every other row |
| if ((rownum % 2) == 0) |
| { |
| // make the row height bigger (in twips - 1/20 of a point) |
| r.setHeight((short) 0x249); |
| } |
| |
| //r.setRowNum(( short ) rownum); |
| // create 50 cells (0-49) (the += 2 becomes apparent later |
| for (short cellnum = (short) 0; cellnum < 50; cellnum += 2) |
| { |
| // create a numeric cell |
| c = r.createCell(cellnum); |
| // do some goofy math to demonstrate decimals |
| c.setCellValue(rownum * 10000 + cellnum |
| + (((double) rownum / 1000) |
| + ((double) cellnum / 10000))); |
| |
| // on every other row |
| if ((rownum % 2) == 0) |
| { |
| // set this cell to the first cell style we defined |
| c.setCellStyle(cs); |
| } |
| |
| // create a string cell (see why += 2 in the |
| c = r.createCell((short) (cellnum + 1)); |
| |
| // set the cell's string value to "TEST" |
| c.setCellValue("TEST"); |
| // make this column a bit wider |
| s.setColumnWidth((short) (cellnum + 1), (short) ((50 * 8) / ((double) 1 / 20))); |
| |
| // on every other row |
| if ((rownum % 2) == 0) |
| { |
| // set this to the white on red cell style |
| // we defined above |
| c.setCellStyle(cs2); |
| } |
| |
| } |
| } |
| |
| //draw a thick black border on the row at the bottom using BLANKS |
| // advance 2 rows |
| rownum++; |
| rownum++; |
| |
| r = s.createRow(rownum); |
| |
| // define the third style to be the default |
| // except with a thick black border at the bottom |
| cs3.setBorderBottom(cs3.BORDER_THICK); |
| |
| //create 50 cells |
| for (short cellnum = (short) 0; cellnum < 50; cellnum++) |
| { |
| //create a blank type cell (no value) |
| c = r.createCell(cellnum); |
| // set it to the thick black border style |
| c.setCellStyle(cs3); |
| } |
| |
| //end draw thick black border |
| |
| |
| // demonstrate adding/naming and deleting a sheet |
| // create a sheet, set its title then delete it |
| s = wb.createSheet(); |
| wb.setSheetName(1, "DeletedSheet"); |
| wb.removeSheetAt(1); |
| //end deleted sheet |
| |
| // write the workbook to the output stream |
| // close our file (don't blow out our file handles |
| wb.write(out); |
| out.close(); |
| ]]></source> |
| </section> |
| <section title="Reading or modifying an existing file"> |
| |
| <p>Reading in a file is equally simple. To read in a file, create a |
| new instance of org.apache.poi.poifs.Filesystem, passing in an open InputStream, such as a FileInputStream |
| for your XLS, to the constructor. Construct a new instance of |
| org.apache.poi.hssf.usermodel.HSSFWorkbook passing the |
| Filesystem instance to the constructor. From there you have access to |
| all of the high level model objects through their assessor methods |
| (workbook.getSheet(sheetNum), sheet.getRow(rownum), etc). |
| </p> |
| <p>Modifying the file you have read in is simple. You retrieve the |
| object via an assessor method, remove it via a parent object's remove |
| method (sheet.removeRow(hssfrow)) and create objects just as you |
| would if creating a new xls. When you are done modifying cells just |
| call workbook.write(outputstream) just as you did above.</p> |
| <p>An example of this can be seen in |
| <link href="http://cvs.apache.org/viewcvs/~checkout~/jakarta-poi/src/java/org/apache/poi/hssf/dev/HSSF.java?rev=1.1">org.apache.poi.hssf.dev.HSSF</link>.</p> |
| </section> |
| </section> |
| <section title="Event API"> |
| |
| <p>The event API is brand new. It is intended for intermediate |
| developers who are willing to learn a little bit of the low level API |
| structures. Its relatively simple to use, but requires a basic |
| understanding of the parts of an Excel file (or willingness to |
| learn). The advantage provided is that you can read an XLS with a |
| relatively small memory footprint. |
| </p> |
| <p>To use this API you construct an instance of |
| org.apache.poi.hssf.eventmodel.HSSFRequest. Register a class you |
| create that supports the |
| org.apache.poi.hssf.eventmodel.HSSFListener interface using the |
| HSSFRequest.addListener(yourlistener, recordsid). The recordsid |
| should be a static reference number (such as BOFRecord.sid) contained |
| in the classes in org.apache.poi.hssf.record. The trick is you |
| have to know what these records are. Alternatively you can call |
| HSSFRequest.addListenerForAllRecords(mylistener). In order to learn |
| about these records you can either read all of the javadoc in the |
| org.apache.poi.hssf.record package or you can just hack up a |
| copy of org.apache.poi.hssf.dev.EFHSSF and adapt it to your |
| needs. TODO: better documentation on records.</p> |
| <p>Once you've registered your listeners in the HSSFRequest object |
| you can construct an instance of |
| org.apache.poi.poifs.filesystem.FileSystem (see POIFS howto) and |
| pass it your XLS file inputstream. You can either pass this, along |
| with the request you constructed, to an instance of HSSFEventFactory |
| via the HSSFEventFactory.processWorkbookEvents(request, Filesystem) |
| method, or you can get an instance of DocumentInputStream from |
| Filesystem.createDocumentInputStream("Workbook") and pass |
| it to HSSFEventFactory.processEvents(request, inputStream). Once you |
| make this call, the listeners that you constructed receive calls to |
| their processRecord(Record) methods with each Record they are |
| registered to listen for until the file has been completely read. |
| </p> |
| <p>A code excerpt from org.apache.poi.hssf.dev.EFHSSF (which is |
| in CVS or the source distribution) is reprinted below with excessive |
| comments:</p> |
| <source><![CDATA[ |
| /** |
| * This example shows how to use the event API for reading a file. |
| */ |
| public class EventExample |
| implements HSSFListener |
| { |
| private SSTRecord sstrec; |
| |
| /** |
| * This method listens for incoming records and handles them as required. |
| * @param record The record that was found while reading. |
| */ |
| public void processRecord(Record record) |
| { |
| switch (record.getSid()) |
| { |
| // the BOFRecord can represent either the beginning of a sheet or the workbook |
| case BOFRecord.sid: |
| BOFRecord bof = (BOFRecord) record; |
| if (bof.getType() == bof.TYPE_WORKBOOK) |
| { |
| System.out.println("Encountered workbook"); |
| // assigned to the class level member |
| } else if (bof.getType() == bof.TYPE_WORKSHEET) |
| { |
| System.out.println("Encountered sheet reference"); |
| } |
| break; |
| case BoundSheetRecord.sid: |
| BoundSheetRecord bsr = (BoundSheetRecord) record; |
| System.out.println("New sheet named: " + bsr.getSheetname()); |
| break; |
| case RowRecord.sid: |
| RowRecord rowrec = (RowRecord) record; |
| System.out.println("Row found, first column at " |
| + rowrec.getFirstCol() + " last column at " + rowrec.getLastCol()); |
| break; |
| case NumberRecord.sid: |
| NumberRecord numrec = (NumberRecord) record; |
| System.out.println("Cell found with value " + numrec.getValue() |
| + " at row " + numrec.getRow() + " and column " + numrec.getColumn()); |
| break; |
| // SSTRecords store a array of unique strings used in Excel. |
| case SSTRecord.sid: |
| sstrec = (SSTRecord) record; |
| for (int k = 0; k < sstrec.getNumUniqueStrings(); k++) |
| { |
| System.out.println("String table value " + k + " = " + sstrec.getString(k)); |
| } |
| break; |
| case LabelSSTRecord.sid: |
| LabelSSTRecord lrec = (LabelSSTRecord) record; |
| System.out.println("String cell found with value " |
| + sstrec.getString(lrec.getSSTIndex())); |
| break; |
| } |
| } |
| |
| /** |
| * Read an excel file and spit out what we find. |
| * |
| * @param args Expect one argument that is the file to read. |
| * @throws IOException When there is an error processing the file. |
| */ |
| public static void main(String[] args) throws IOException |
| { |
| // create a new file input stream with the input file specified |
| // at the command line |
| FileInputStream fin = new FileInputStream(args[0]); |
| // create a new org.apache.poi.poifs.filesystem.Filesystem |
| POIFSFileSystem poifs = new POIFSFileSystem(fin); |
| // get the Workbook (excel part) stream in a InputStream |
| InputStream din = poifs.createDocumentInputStream("Workbook"); |
| // construct out HSSFRequest object |
| HSSFRequest req = new HSSFRequest(); |
| // lazy listen for ALL records with the listener shown above |
| req.addListenerForAllRecords(new EventExample()); |
| // create our event factory |
| HSSFEventFactory factory = new HSSFEventFactory(); |
| // process our events based on the document input stream |
| factory.processEvents(req, din); |
| // once all the events are processed close our file input stream |
| fin.close(); |
| // and our document input stream (don't want to leak these!) |
| din.close(); |
| System.out.println("done."); |
| } |
| } |
| ]]></source> |
| </section> |
| <section title="Low Level APIs"> |
| |
| <p>The low level API is not much to look at. It consists of lots of |
| "Records" in the org.apache.poi.hssf.record.* package, |
| and set of helper classes in org.apache.poi.hssf.model.*. The |
| record classes are consistent with the low level binary structures |
| inside a BIFF8 file (which is embedded in a POIFS file system). You |
| probably need the book: "Microsoft Excel 97 Developer's Kit" |
| from Microsoft Press in order to understand how these fit together |
| (out of print but easily obtainable from Amazon's used books). In |
| order to gain a good understanding of how to use the low level APIs |
| should view the source in org.apache.poi.hssf.usermodel.* and |
| the classes in org.apache.poi.hssf.model.*. You should read the |
| documentation for the POIFS libraries as well.</p> |
| </section> |
| <section title="HSSF Class/Test Application"> |
| |
| <p>The HSSF application is nothing more than a test for the high |
| level API (and indirectly the low level support). The main body of |
| its code is repeated above. To run it: |
| </p> |
| <ul> |
| <li>download the poi-alpha build and untar it (tar xvzf |
| tarball.tar.gz) |
| </li> |
| <li>set up your classpath as follows: |
| <code>export HSSFDIR={wherever you put HSSF's jar files} |
| export LOG4JDIR={wherever you put LOG4J's jar files} |
| export CLASSPATH=$CLASSPATH:$HSSFDIR/hssf.jar:$HSSFDIR/poi-poifs.jar:$HSSFDIR/poi-util.jar:$LOG4JDIR/jog4j.jar</code> |
| </li><li>type: |
| <code>java org.apache.poi.hssf.dev.HSSF ~/myxls.xls write</code></li> |
| </ul> |
| <p></p> |
| <p>This should generate a test sheet in your home directory called <code>"myxls.xls"</code>. </p> |
| <ul> |
| <li>Type: |
| <code>java org.apache.poi.hssf.dev.HSSF ~/input.xls output.xls</code> |
| <br/> |
| <br/> |
| This is the read/write/modify test. It reads in the spreadsheet, modifies a cell, and writes it back out. |
| Failing this test is not necessarily a bad thing. If HSSF tries to modify a non-existant sheet then this will |
| most likely fail. No big deal. </li> |
| </ul> |
| </section> |
| <section title="HSSF Logging Facility"> |
| <p> |
| POI has a small amount of logging code embedded within it. Defining the system property |
| poi.logging will enable logging to standard out. |
| </p> |
| </section> |
| <section title="HSSF Developer's Tools"> |
| |
| <p>HSSF has a number of tools useful for developers to debug/develop |
| stuff using HSSF (and more generally XLS files). We've already |
| discussed the app for testing HSSF read/write/modify capabilities; |
| now we'll talk a bit about BiffViewer. Early on in the development of |
| HSSF, it was decided that knowing what was in a record, what was |
| wrong with it, etc. was virtually impossible with the available |
| tools. So we developed BiffViewer. You can find it at |
| org.apache.poi.hssf.dev.BiffViewer. It performs two basic |
| functions and a derivative. |
| </p> |
| <p>The first is "biffview". To do this you run it (assumes |
| you have everything setup in your classpath and that you know what |
| you're doing enough to be thinking about this) with an xls file as a |
| parameter. It will give you a listing of all understood records with |
| their data and a list of not-yet-understood records with no data |
| (because it doesn't know how to interpret them). This listing is |
| useful for several things. First, you can look at the values and SEE |
| what is wrong in quasi-English. Second, you can send the output to a |
| file and compare it. |
| </p> |
| <p>The second function is "big freakin dump", just pass a |
| file and a second argument matching "bfd" exactly. This |
| will just make a big hexdump of the file. |
| </p> |
| <p>Lastly, there is "mixed" mode which does the same as |
| regular biffview, only it includes hex dumps of certain records |
| intertwined. To use that just pass a file with a second argument |
| matching "on" exactly.</p> |
| <p>In the next release cycle we'll also have something called a |
| FormulaViewer. The class is already there, but its not very useful |
| yet. When it does something, we'll document it.</p> |
| |
| </section> |
| <section title="What's Next?"> |
| |
| <p>This release contains code that supports "internationalization" |
| or more accurately non-US/UK languages; however, it has not been |
| tested with the new API changes (please help us with this). We've |
| shifted focus a bit for this release in recognition of the |
| international support we've gotten. We're going to focus on western |
| European languages for our first beta. We're more than happy to |
| accept help in supporting non-Western European languages if someone |
| who knows what they're doing in this area is willing to pitch in! |
| (There is next to no documentation on what is necessary to support |
| such a move and its really hard to support a language when you don't even |
| know the alphabet).</p> |
| <p>This release of HSSF does not yet support Formulas. I've been |
| focusing on the requests I've gotten in. That being said, if we get |
| more user feedback on what is most useful first we'll aim for that. |
| As a general principal, HSSF's goal is to support HSSF-Serializer |
| (meaning an emphasis on write). We would like to hear from you! How |
| are you using HSSF/POIFS? How would you like to use it? What features |
| are most important first? |
| </p> |
| </section> |
| |
| </section> |
| |
| </section> |
| </body> |
| </document> |