<!doctype html public "-//W3C//DTD HTML 4.0//EN">
<!--
/* ====================================================================
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==================================================================== */
-->
<html>
<head>
<title></title>
</head>
<body>
<div>
<p>The <strong>POI Browser</strong> is a very simple Swing GUI tool that
displays the internal structure of a Microsoft Office file. It concentrates
on streams in the <em>Horrible Property Set Format (HPSF)</em>. In order to
access these streams the POI Browser uses the package
<tt>org.apache.poi.hpsf</tt>.</p>
<p>A file in Microsoft's Office format can be seen as a filesystem within a
file. For example, a Word document like <var>sample.doc</var> is just a
simple file from the operating system's point of view. However, internally
it is organized into various directories and files. For example,
<var>sample.doc</var> might consist of the three internal files (or
"streams", as Microsoft calls them) <tt>\001CompObj</tt>,
<tt>\005SummaryInformation</tt>, and <tt>WordDocument</tt>. (In these names
\001 and \005 denote the unprintable characters with the character codes 1
and 5, respectively.) A more complicated Word file typically contains a
directory named <tt>ObjectPool</tt> with more directories and files nested
within it.</p>
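<p>The <tt>\001</tt> and <tt>\005</tt> notation used above can be reproduced
with a few lines of Java. The following is only a sketch: the class name
<tt>StreamNames</tt> and the method <tt>printable</tt> are hypothetical and
not part of POI, and the sketch assumes the three digits are meant as an octal
character code (for the codes 1 and 5 this makes no difference).</p>

```java
class StreamNames {
    // Hypothetical helper, not part of POI: writes every control character
    // as a backslash followed by its three-digit octal character code.
    static String printable(String name) {
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (Character.isISOControl(c)) {
                out.append('\\').append(String.format("%03o", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stream names taken from the text above.
        System.out.println(printable("\u0001CompObj"));
        System.out.println(printable("\u0005SummaryInformation"));
        System.out.println(printable("WordDocument"));
    }
}
```

<p>Run as is, the sketch prints <tt>\001CompObj</tt>,
<tt>\005SummaryInformation</tt> and <tt>WordDocument</tt>, one per line.</p>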
<p>The POI Browser makes these internal structures visible. It takes one or
more Microsoft files as input on the command line and shows directories and
files in a tree-like structure. At the top level, the POI Browser displays the
(operating system) filenames. An internal file (i.e. a "stream" or a
"document") is shown with its name, its size and a hexadecimal dump of its
first bytes.</p>
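<p>A hexadecimal dump of the first bytes of a stream can be sketched as
follows. This is an illustration only and not the POI Browser's own code: the
class <tt>HexDumpSketch</tt>, the method <tt>hexDump</tt>, the output layout
(two uppercase hex digits per byte, separated by spaces) and the sample bytes
are all assumptions.</p>

```java
class HexDumpSketch {
    // Hypothetical helper: formats at most maxBytes leading bytes of data
    // as two-digit uppercase hex values separated by single spaces.
    static String hexDump(byte[] data, int maxBytes) {
        int n = Math.min(data.length, maxBytes);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i != n; i++) {
            if (i != 0) {
                out.append(' ');
            }
            // %02X formats a negative Byte as unsigned, so -1 becomes FF.
            out.append(String.format("%02X", data[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Arbitrary sample bytes standing in for the start of a stream.
        byte[] sample = {0x01, 0x00, (byte) 0xFE, (byte) 0xFF, 0x7F};
        System.out.println(hexDump(sample, 4));
    }
}
```

<p>With the sample bytes above, the first four bytes are shown as
<tt>01 00 FE FF</tt>.</p>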
</div>
<div>
<h3>Property Set Streams</h3>
<p>The POI Browser pays special attention to property set streams. For
example, the <tt>\005SummaryInformation</tt> stream contains information
like title and author of the document. The POI Browser opens every stream
in a POI filesystem. If it encounters a property set stream, it does not just
display the first bytes but analyses the whole stream and displays its
contents in a more or less readable manner.</p>
</div>
<div>
<h3>Running POI Browser</h3>
<p>Running the POI Browser requires you to start a Java Virtual Machine
(JVM) and to set up a valid classpath so that the JVM can find all the Java
classes it needs. These are the main POI classes and the "contrib" POI
classes.</p>
<p>The following instructions assume that you have set up your Java
environment variables properly, i.e. the variable JAVA_HOME contains the
name of your Java installation directory and the variable PATH includes the
<var>bin</var> subdirectory of the Java installation directory. At the time
of this writing the current POI version was 2.5.1-final, dating from August
4th, 2004. The example commands reflect this version number and date. Change
the commands accordingly if you are running a later or earlier version of the
POI Browser.</p>
<div>
<h4>Running POI Browser on Unix</h4>
<p>Suppose you have unpacked the POI&nbsp;2.5.1 release in the
<var>/opt/local/poi</var> directory of your Unix box. Then the following
command starts the POI Browser and displays the structure of the files
<var>MyWord.doc</var>, <var>MyExcel.xls</var> and
<var>MyPowerpoint.ppt</var>:</p>
<pre>java -classpath /opt/local/poi/poi-2.5.1-final-20040804.jar:/opt/local/poi/poi-contrib-2.5.1-final-20040804.jar org.apache.poi.contrib.poibrowser.POIBrowser MyWord.doc MyExcel.xls MyPowerpoint.ppt</pre>
</div>
<div>
<h4>Running POI Browser on Windows</h4>
<p>Suppose you have unpacked the POI&nbsp;2.5.1 release in the
<var>C:\Programs\POI</var> directory of your Windows box. Then the following
command starts the POI Browser and displays the structure of the files
<var>MyWord.doc</var>, <var>MyExcel.xls</var> and
<var>MyPowerpoint.ppt</var>:</p>
<pre>java -classpath C:\Programs\POI\poi-2.5.1-final-20040804.jar;C:\Programs\POI\poi-contrib-2.5.1-final-20040804.jar org.apache.poi.contrib.poibrowser.POIBrowser MyWord.doc MyExcel.xls MyPowerpoint.ppt</pre>
</div>
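<p>Apart from the installation directory, the Unix and Windows commands differ
only in the classpath separator: a colon on Unix and a semicolon on Windows.
Java exposes the separator of the current platform as
<tt>java.io.File.pathSeparator</tt>. The sketch below builds a classpath
string from a list of JAR files. The class <tt>ClasspathSketch</tt> and the
method <tt>joinClasspath</tt> are hypothetical names, and the JAR paths are
the Unix example paths from above.</p>

```java
import java.io.File;

class ClasspathSketch {
    // Hypothetical helper: joins JAR paths with the given separator.
    static String joinClasspath(String separator, String... jars) {
        return String.join(separator, jars);
    }

    public static void main(String[] args) {
        // File.pathSeparator is ":" on Unix and ";" on Windows.
        String classpath = joinClasspath(File.pathSeparator,
                "/opt/local/poi/poi-2.5.1-final-20040804.jar",
                "/opt/local/poi/poi-contrib-2.5.1-final-20040804.jar");
        System.out.println(classpath);
    }
}
```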
</div>
</body>
</html>
<!-- Keep this comment at the end of the file
Local variables:
sgml-default-dtd-file:"HTML_4.0_Strict.ced"
mode: html
sgml-omittag:t
sgml-shorttag:nil
sgml-namecase-general:t
sgml-general-insert-case:lower
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
-->