<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
]>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<chapter id="org.apche.opennlp.uima">
<title>UIMA Integration</title>
<para>
The UIMA Integration wraps the OpenNLP components in UIMA Analysis Engines, which can
be used to automatically annotate text and to train new OpenNLP models from annotated text.
</para>
<section id="org.apche.opennlp.running-pear-sample">
<title>Running the PEAR sample in CVD</title>
<para>
The CAS Visual Debugger (CVD) is shipped as part of the UIMA distribution and is a tool which can run
the OpenNLP UIMA Annotators and display their analysis results. The source distribution comes with a script
which creates a sample UIMA application that includes the sentence detector, tokenizer,
POS tagger, chunker and name finders for English. This sample application is packaged in the
PEAR format and must be installed with the PEAR installer before it can be run by CVD.
Please consult the UIMA documentation for further information about the PEAR installer.
</para>
<para>
The OpenNLP UIMA PEAR file must be built manually.
First download the source distribution, unzip it and go to the apache-opennlp/opennlp folder.
Type "mvn install" to build everything. To build the PEAR file, go to apache-opennlp/opennlp-uima
and run the Ant build as shown below. Note that the models will be downloaded
from the old SourceForge repository and are not licensed under the Apache License 2.0.
<screen>
<![CDATA[
$ ant -f createPear.xml
Buildfile: createPear.xml
createPear:
[echo] ##### Creating OpenNlpTextAnalyzer pear #####
[copy] Copying 13 files to OpenNlpTextAnalyzer/desc
[copy] Copying 1 file to OpenNlpTextAnalyzer/metadata
[copy] Copying 1 file to OpenNlpTextAnalyzer/lib
[copy] Copying 3 files to OpenNlpTextAnalyzer/lib
[mkdir] Created dir: OpenNlpTextAnalyzer/models
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-token.bin
[get] To: OpenNlpTextAnalyzer/models/en-token.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-sent.bin
[get] To: OpenNlpTextAnalyzer/models/en-sent.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-date.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-date.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-location.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-money.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-money.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-organization.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-percentage.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-percentage.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-person.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-time.bin
[get] To: OpenNlpTextAnalyzer/models/en-ner-time.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-pos-maxent.bin
[get] To: OpenNlpTextAnalyzer/models/en-pos-maxent.bin
[get] Getting: http://opennlp.sourceforge.net/models-1.5/en-chunker.bin
[get] To: OpenNlpTextAnalyzer/models/en-chunker.bin
[zip] Building zip: OpenNlpTextAnalyzer.pear
BUILD SUCCESSFUL
Total time: 3 minutes 20 seconds]]>
</screen>
</para>
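<para>
The manual build steps described above can be summarized as the following command sequence.
This is only a sketch: it assumes that Maven and Ant are available on the path and that the
source distribution has already been downloaded and unzipped into the current directory.
The folder names and commands are the ones mentioned in this section.
<screen>
<![CDATA[
# Build all OpenNLP modules first
$ cd apache-opennlp/opennlp
$ mvn install

# Then create the sample PEAR (this step downloads the models)
$ cd ../opennlp-uima
$ ant -f createPear.xml]]>
</screen>
According to the build log shown above, a successful run ends with the OpenNlpTextAnalyzer.pear
archive being written.
</para>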
<para>
After the PEAR is installed, start the CAS Visual Debugger shipped with the UIMA framework
and click on Tools -> Load AE. Then select the opennlp.uima.OpenNlpTextAnalyzer_pear.xml
file in the file dialog. Now enter some text and start the analysis engine with
"Run -> Run OpenNLPTextAnalyzer". Afterwards the results will be displayed.
You should see sentences, tokens, chunks, POS tags and possibly some names. Remember that the input text
must be written in English.
</para>
</section>
<section id="org.apche.opennlp.further-help">
<title>Further Help</title>
<para>
For more information about how to use the integration, please consult the Javadoc of the individual
Analysis Engines and check out the included XML descriptors.
</para>
<para>
TODO: Extend this documentation with information about the individual components.
If you want to contribute, please contact us on the mailing list
or comment on the Jira issue <ulink url="https://issues.apache.org/jira/browse/OPENNLP-49">OPENNLP-49</ulink>.
</para>
</section>
</chapter>