| |
| Apache UIMA (Unstructured Information Management Architecture) v2.3.1 SDK |
| ------------------------------------------------------------------------- |
| |
| Building from the Source Distribution |
| ------------------------------------- |
| |
| We use Maven 3.0 for building; download it if needed, |
| and set the environment variable MAVEN_OPTS to -Xmx800m -XX:MaxPermSize=256m. |
| |
| Then build everything by going into the .../uimaj directory, and issuing the command |
| mvn clean install. |
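| |
| As a minimal sketch of the steps above for a POSIX shell: the UIMAJ_DIR variable and |
| its default value are hypothetical placeholders for wherever the .../uimaj directory |
| lives on your machine, and the build is skipped if mvn or that directory is not found. |

```shell
# JVM options for Maven, as described above
export MAVEN_OPTS="-Xmx800m -XX:MaxPermSize=256m"

# UIMAJ_DIR is a hypothetical placeholder; point it at your .../uimaj directory
UIMAJ_DIR="${UIMAJ_DIR:-./uimaj}"

if command -v mvn >/dev/null 2>&1 && [ -d "$UIMAJ_DIR" ]; then
  # Run the build from inside the uimaj directory
  (cd "$UIMAJ_DIR" && mvn clean install)
else
  echo "mvn or $UIMAJ_DIR not found; skipping build" >&2
fi
```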
| |
| Look for the result in the two artifacts: |
| ../uimaj/target/uimaj-[version]-source-release.zip and |
| ../uimaj-distr/target/uimaj-[version]-bin.zip (or ...tar.gz) |
| |
| For more details, please see http://uima.apache.org/building-uima.html |
| |
| What's New in 2.3.1 |
| ------------------- |
| |
| CAS Editor |
| ---------- |
| The CAS Editor received a few important enhancements aimed at making it |
| the preferred editor and viewer for CAS files inside a UIMA project. |
| It is now possible to open CAS files from any location within an |
| Eclipse workspace. |
| |
| Based on feedback from our users, the CAS Editor received a number of |
| usability enhancements and bug fixes. |
| |
| A new view showing the annotation colors and visualization styles was added. |
| The view interacts with the editor and can be used to define which |
| kinds of annotations are displayed. |
| |
| There is now a new annotation style that displays feature values in between |
| the text lines. This is especially useful for showing Part-Of-Speech tags, |
| entity categories or stems. |
| |
| There have been a couple of minor improvements: the editor context menu |
| now shows keyboard shortcuts, and it is possible to choose the encoding |
| and resulting CAS format when importing a text file. |
| |
| |
| Build |
| ----- |
| |
| The build process was redone to align it with normal Maven build procedures, where possible. |
| There is a new top-level directory in our SVN called "build" which holds the build tooling. |
| The new build tooling is described in general on the website |
| http://uima.apache.org/maven-design.html |
| |
| |
| Result Specification |
| -------------------- |
| |
| The implementation handling the result specification was redone to correct several corner-case issues. |
| |
| |
| Performance |
| ----------- |
| |
| The Component Descriptor Editor (CDE), an Eclipse tool, could perform poorly when |
| editing aggregate descriptors that referenced many remote components. Performance in this |
| case has been greatly improved by caching and by eliminating repetitive remote accesses. |
| |
| A complete list of fixes is in issuesFixed/jira-report.html. |
| |
| Supported Platforms |
| -------------------- |
| |
| Apache UIMA requires Java level 1.5; it has been tested with Sun Java SDK v5.0 and v6.0, and IBM Java 6.0. |
| Running the Eclipse plugin tooling for UIMA also requires that you start Eclipse using Java 1.5 or later. |
| The supported platforms are: Windows, Linux, Solaris, AIX and Mac OS X. |
| Other platforms and Java (1.5+) implementations should work, but have not been significantly tested. |
| |
| Many of the scripts in the /bin directory invoke Java. They use the value of the environment variable JAVA_HOME |
| to locate the Java to use; if it is not set, they invoke "java", expecting to find an appropriate Java in your PATH. |
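| |
| The lookup described above can be sketched as follows. This is an illustration only, |
| not the actual contents of the scripts in /bin: the function name resolve_java is |
| hypothetical, and the bin/java location under JAVA_HOME is assumed from the |
| conventional JRE layout. |

```shell
# Sketch of the Java lookup: prefer JAVA_HOME, otherwise rely on PATH.
# resolve_java is a hypothetical helper name used only for this illustration.
resolve_java() {
  if [ -n "${JAVA_HOME:-}" ]; then
    # Assumes the conventional layout with the launcher under bin/
    echo "$JAVA_HOME/bin/java"
  else
    # JAVA_HOME is not set: fall back to whatever "java" is found in PATH
    echo "java"
  fi
}

echo "Java launcher that would be used: $(resolve_java)"
```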
| |
| |
| Environment Variables |
| ---------------------- |
| |
| After you have unpacked the Apache UIMA distribution from the package of your choice (e.g. .zip or .gz), |
| perform the steps below to set up UIMA so that it will function properly. |
| |
| * Set JAVA_HOME to the directory of the JRE installation you would like to use for UIMA. |
| * Set UIMA_HOME to the apache-uima directory of your unpacked Apache UIMA distribution. |
| * Append UIMA_HOME/bin to your PATH. |
| |
| * Please run the script UIMA_HOME/bin/adjustExamplePaths.bat (or .sh), to update |
| paths in the examples based on the actual UIMA_HOME directory path. |
| This script runs a Java program; |
| you must either have java in your PATH or set the environment variable JAVA_HOME to a |
| suitable JRE. |
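| |
| For a POSIX shell, the steps above might look like the following sketch. Both default |
| paths are hypothetical placeholders and must be replaced with your actual JRE and |
| apache-uima directories; the script is only run if it is found and executable. |

```shell
# Hypothetical install locations; substitute your own paths.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
export UIMA_HOME="${UIMA_HOME:-/opt/apache-uima}"

# Append the UIMA scripts directory to PATH
export PATH="$PATH:$UIMA_HOME/bin"

# Update the paths in the examples to match the actual UIMA_HOME
if [ -x "$UIMA_HOME/bin/adjustExamplePaths.sh" ]; then
  "$UIMA_HOME/bin/adjustExamplePaths.sh"
else
  echo "adjustExamplePaths.sh not found under $UIMA_HOME/bin" >&2
fi
```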
| |
| Note: The procedures for setting up global environment variables on Mac OS X |
| are described at http://developer.apple.com/qa/qa2001/qa1067.html. |
| |
| |
| Verifying Your Installation |
| ---------------------------- |
| |
| To test the installation, run the documentAnalyzer.bat (or .sh) file located in the bin subdirectory. |
| This should pop up a "Document Analyzer" window. Set the values displayed in this GUI as follows: |
| |
| * Input Directory: UIMA_HOME/examples/data |
| * Output Directory: UIMA_HOME/examples/data/processed |
| * Location of Analysis Engine XML Descriptor: UIMA_HOME/examples/descriptors/analysis_engine/PersonTitleAnnotator.xml |
| |
| Replace UIMA_HOME above with the path of your Apache UIMA installation. |
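| |
| As a small convenience sketch, the three values can be expanded from UIMA_HOME in a |
| shell and copied into the GUI. The variable names INPUT_DIR, OUTPUT_DIR and DESCRIPTOR |
| and the default UIMA_HOME path are hypothetical; they are not used by UIMA itself. |

```shell
# Hypothetical default; replace with the path of your Apache UIMA installation
UIMA_HOME="${UIMA_HOME:-/opt/apache-uima}"

# Expand the three Document Analyzer settings listed above
INPUT_DIR="$UIMA_HOME/examples/data"
OUTPUT_DIR="$UIMA_HOME/examples/data/processed"
DESCRIPTOR="$UIMA_HOME/examples/descriptors/analysis_engine/PersonTitleAnnotator.xml"

echo "Input Directory:  $INPUT_DIR"
echo "Output Directory: $OUTPUT_DIR"
echo "Descriptor:       $DESCRIPTOR"
```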
| |
| Next, click the "Run" button, which should, after a brief pause, pop up an "Analyzed Results" window. |
| Double-click on one of the documents to display the analysis results for that document. |
| |
| |
| Getting Started |
| ---------------- |
| |
| For an introduction to Apache UIMA and how to use it, please read the documentation |
| located in the docs subdirectory. A good place to start is the overview_and_setup |
| book's first chapter, which has a brief guide to the documentation. |