| |
| Apache UIMA (Unstructured Information Management Architecture) v2.4.0 SDK |
| ------------------------------------------------------------------------- |
| |
| Building from the Source Distribution |
| ------------------------------------- |
| |
| We use Maven 3.0 for building; download this if needed, |
| and set the environment variable MAVEN_OPTS to -Xmx800m -XX:MaxPerSize-256m. |
| |
| Then do the build by going into the .../uimaj directory, and issuing the command |
| mvn clean install |
| |
| This builds everything except the ...source-release.zip file. If you want that, |
| change the command to |
| |
| mvn clean install -Papache-release |
| |
| Look for the result here: |
| target/uimaj-[version]-source-release.zip (if run with -Papache-release) |
| |
| For more details, please see http://uima.apache.org/building-uima.html |
| |
| What's New in 2.4.0 |
| ------------------- |
| |
| There was a change in the API (methods added) used for JMX monitoring of performance statistics; |
| because of this, we incremented the version from 2.3.x to 2.4.0. |
| |
| Other than this, there are some bug fixes, and some tooling enhancements. |
| |
| CAS Editor (Eclipse Tooling) |
| ---------------------------- |
| The Cas Editor received a few important enhancements to make it |
| extensible by user provided plugins. It is now possible to save |
| generic preferences based on a type system scope and the Cas Editor |
| was split into two parts to allow the integration into RCP applications. |
| |
| Support for Cas Editor projects is now removed, existing Cas Editor |
| projects will be migrated automatically. Before this happens a dialog |
| ask for confirmation. |
| |
| There have been a couple of minor improvements: Adding of annotations |
| to a CAS is now much faster, reusing of Annotation Editor instances is |
| now possible, CAS file changes can now be detected and the changed file |
| is shown in the editor, many things are now remembered after a dialog |
| is reopened or an editor is reopened. |
| |
| Eclipse Analysis Engine Launcher Plugin |
| --------------------------------------- |
| We added a new launcher plugin to run and debug Analysis Engines directly |
| from Eclipse. The plugin can load input CASes in the XCAS or XMI format |
| from a specified folder and then processes them with the Analysis Engine. |
| The files can optionally be written to an output folder for inspection |
| with the Cas Editor. Plain text input files are supported as well. |
| |
| A command line Pear Installer tool was added. |
| |
| |
| Build |
| ----- |
| |
| The build process was redone to align it with normal Maven build procedures, where possible. |
| This includes moving the top level project, uimaj, up one level in SVN, so it now contains the |
| modules. |
| |
| |
| Supported Platforms |
| -------------------- |
| |
| Apache UIMA requires Java level 1.5; it has been tested with Sun Java SDK v5.0 and v6.0, and IBM Java 6.0. |
| Running the Eclipse plugin tooling for UIMA requires you start Eclipse using a Java 1.5 or later, as well. |
| The supported platforms are: Windows, Linux, Solaris, AIX and Mac OS X. |
| Other platforms and Java (1.5+) implementations should work, but have not been significantly tested. |
| |
| Many of the scripts in the /bin directory invoke Java. They use the value of the environment variable, JAVA_HOME, |
| to locate the Java to use; if it is not set, they invoke "java" expecting to find an appropriate Java in your PATH. |
| |
| |
| Environment Variables |
| ---------------------- |
| |
| After you have unpacked the Apache UIMA distribution from the package of your choice (e.g. .zip or .gz), |
| perform the steps below to set up UIMA so that it will function properly. |
| |
| * Set JAVA_HOME to the directory of your JRE installation you would like to use for UIMA. |
| * Set UIMA_HOME to the apache-uima directory of your unpacked Apache UIMA distribution |
| * Append UIMA_HOME/bin to your PATH |
| |
| * Please run the script UIMA_HOME/bin/adjustExamplePaths.bat (or .sh), to update |
| paths in the examples based on the actual UIMA_HOME directory path. |
| This script runs a Java program; |
| you must either have java in your PATH or set the environment variable JAVA_HOME to a |
| suitable JRE. |
| |
| Note: The Mac OS X operating system procedures for setting up global environment |
| variables are described here: see http://developer.apple.com/qa/qa2001/qa1067.html. |
| |
| |
| Verifying Your Installation |
| ---------------------------- |
| |
| To test the installation, run the documentAnalyzer.bat (or .sh) file located in the bin subdirectory. |
| This should pop up a "Document Analyzer" window. Set the values displayed in this GUI to as follows: |
| |
| * Input Directory: UIMA_HOME/examples/data |
| * Output Directory: UIMA_HOME/examples/data/processed |
| * Location of Analysis Engine XML Descriptor: UIMA_HOME/examples/descriptors/analysis_engine/PersonTitleAnnotator.xml |
| |
| Replace UIMA_HOME above with the path of your Apache UIMA installation. |
| |
| Next, click the "Run" button, which should, after a brief pause, pop up an "Analyzed Results" window. |
| Double-click on one of the documents to display the analysis results for that document. |
| |
| |
| Getting Started |
| ---------------- |
| |
| For an introduction to Apache UIMA and how to use it, please read the documentation |
| located in the docs subdirectory. A good place to start is the overview_and_setup |
| book's first chapter, which has a brief guide to the documentation. |