| <!doctype html public "-//W3C//DTD HTML 4.0//EN//"> |
| <!-- |
| ==================================================================== |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| ==================================================================== |
| --> |
| <html> |
| <head> |
| <title>HPSF</title> |
| </head> |
| |
| <body> |
| <div> |
| <p>Processes streams in the Horrible Property Set Format (HPSF) in POI |
| filesystems. Microsoft Office documents, i.e. POI filesystems, usually |
| contain meta data like author, title, last saving time etc. These items |
| are called <strong>properties</strong> and stored in |
| <strong>property set streams</strong> along with the document itself. These |
| streams are commonly named <tt>\005SummaryInformation</tt> and |
| <tt>\005DocumentSummaryInformation</tt>. However, a POI filesystem may |
| contain further property sets of other names or types.</p> |
| |
| <p>In order to extract the properties from a POI filesystem, a property set |
| stream's contents must be parsed into a {@link |
| org.apache.poi.hpsf.PropertySet} instance. Its subclasses {@link |
| org.apache.poi.hpsf.SummaryInformation} and {@link |
| org.apache.poi.hpsf.DocumentSummaryInformation} deal with the well-known |
| property set streams <tt>\005SummaryInformation</tt> and |
| <tt>\005DocumentSummaryInformation</tt>. (However, the streams' names are |
| irrelevant. What counts is the property set's first section's format ID - |
| see below.)</p> |
| |
| <p>The factory method {@link org.apache.poi.hpsf.PropertySetFactory#create} |
| creates a {@link org.apache.poi.hpsf.PropertySet} instance. This method |
| always returns the <strong>most specific property set</strong>: If it |
| identifies the stream data as a Summary Information or as a Document |
| Summary Information it returns an instance of the corresponding class, else |
| the general {@link org.apache.poi.hpsf.PropertySet}.</p> |
| |
| <p>A {@link org.apache.poi.hpsf.PropertySet} contains a list of {@link |
| org.apache.poi.hpsf.Section}s which can be retrieved with {@link |
| org.apache.poi.hpsf.PropertySet#getSections}. Each {@link |
| org.apache.poi.hpsf.Section} contains a {@link |
| org.apache.poi.hpsf.Property} array which can be retrieved with {@link |
| org.apache.poi.hpsf.Section#getProperties}. Since the vast majority of |
| {@link org.apache.poi.hpsf.PropertySet}s contains only a single {@link |
| org.apache.poi.hpsf.Section}, the convenience method {@link |
| org.apache.poi.hpsf.PropertySet#getProperties} returns the properties of a |
| {@link org.apache.poi.hpsf.PropertySet}'s {@link |
| org.apache.poi.hpsf.Section} (throwing a {@link |
| org.apache.poi.hpsf.NoSingleSectionException} if the {@link |
| org.apache.poi.hpsf.PropertySet} contains more (or less) than exactly one |
| {@link org.apache.poi.hpsf.Section}).</p> |
| |
| <p>Each {@link org.apache.poi.hpsf.Property} has an <strong>ID</strong>, a |
| <strong>type</strong>, and a <strong>value</strong> which can be retrieved |
| with {@link org.apache.poi.hpsf.Property#getID}, {@link |
| org.apache.poi.hpsf.Property#getType}, and {@link |
| org.apache.poi.hpsf.Property#getValue}, respectively. The value's class |
| depends on the property's type. <!-- FIXME: --> The current implementation |
| does not yet support all property types and restricts the values' classes |
| to {@link java.lang.String}, {@link java.lang.Integer} and {@link |
| java.util.Date}. A value of a yet unknown type is returned as a byte array |
| containing the value's origin bytes from the property set stream.</p> |
| |
| <p>To retrieve the value of a specific {@link org.apache.poi.hpsf.Property}, |
| use {@link org.apache.poi.hpsf.Section#getProperty} or {@link |
| org.apache.poi.hpsf.Section#getPropertyIntValue}.</p> |
| |
| <p>The {@link org.apache.poi.hpsf.SummaryInformation} and {@link |
| org.apache.poi.hpsf.DocumentSummaryInformation} classes provide convenience |
| methods for retrieving well-known properties. For example, an application |
| that wants to retrieve a document's title string just calls {@link |
| org.apache.poi.hpsf.SummaryInformation#getTitle} instead of going through |
| the hassle of first finding out what the title's property ID is and then |
| using this ID to get the property's value.</p> |
| |
| <p>Writing properties can be done with the classes |
| {@link org.apache.poi.hpsf.MutablePropertySet}, {@link |
| org.apache.poi.hpsf.MutableSection}, and {@link |
| org.apache.poi.hpsf.MutableProperty}.</p> |
| |
| <p>Public documentation from Microsoft can be found in the <a |
| href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/stg/stg/properties_and_property_sets.asp" |
| target="_blank">appropriate section of the MSDN Library</a>.</p> |
| |
| <div> |
| <h2>History</h2> |
| |
| <dl> |
| <dt>2003-09-11:</dt> |
| |
| <dd> |
| <p>{@link org.apache.poi.hpsf.PropertySetFactory#create(InputStream)} no |
| longer throws an |
| {@link org.apache.poi.hpsf.UnexpectedPropertySetTypeException}.</p></dd> |
| </dl> |
| </div> |
| |
| |
| <div> |
| <h2>To Do</h2> |
| |
| <p>The following is still left to be implemented. Sponsering could foster |
| these issues considerably.</p> |
| |
| <ul> |
| |
| <li> |
| <p>Convenience methods for setting summary information and document |
| summary information properties</p> |
| </li> |
| |
| <li> |
| <p>Better codepage support</p> |
| </li> |
| |
| <li> |
| <p>Support for more property (variant) types</p> |
| </li> |
| |
| </ul> |
| |
| </div> |
| |
| <p> |
| @author Rainer Klute (klute@rainer-klute.de) |
| @version $Id$ |
| @since 2002-02-09 |
| </p> |
| </div> |
| |
| </body> |
| </html> |
| |
| <!-- Keep this comment at the end of the file |
| Local variables: |
| sgml-default-dtd-file:"HTML_4.0_Strict.ced" |
| mode: html |
| sgml-omittag:t |
| sgml-shorttag:nil |
| sgml-namecase-general:t |
| sgml-general-insert-case:lower |
| sgml-minimize-attributes:nil |
| sgml-always-quote-attributes:t |
| sgml-indent-step:1 |
| sgml-indent-data:t |
| sgml-parent-document:nil |
| sgml-exposed-tags:nil |
| sgml-local-catalogs:nil |
| sgml-local-ecat-files:nil |
| End: |
| --> |