| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" |
| "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[ |
| ]> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| <chapter id="org.apche.opennlp.uima"> |
| <title>UIMA Integration</title> |
| <para> |
| The UIMA Integration wraps the OpenNLP components in UIMA Analysis Engines which can |
| be used to automatically annotate text and train new OpenNLP models from annotated text. |
| </para> |
| <section id="org.apche.opennlp.running-pear-sample"> |
| <title>Running the pear sample in CVD</title> |
| <para> |
| The CAS Visual Debugger (CVD) is shipped as part of the UIMA distribution and is a tool which can run |
| the OpenNLP UIMA Annotators and display their analysis results. The source distribution comes with a script |
| which creates a sample UIMA application that includes the sentence detector, tokenizer, |
| POS tagger, chunker and name finders for English. This sample application is packaged in the |
| pear format and must be installed with the pear installer before it can be run by CVD. |
| Please consult the UIMA documentation for further information about the pear installer. |
| </para> |
| <para> |
| The OpenNLP UIMA pear file must be built manually. |
| First download the source distribution, unzip it and go to the apache-opennlp/opennlp folder. |
| Type "mvn install" to build everything. Then go to apache-opennlp/opennlp-uima |
| and build the pear file as shown below. Note that the models will be downloaded |
| from the old SourceForge repository and are not licensed under the Apache License 2.0. |
| <screen> |
| <![CDATA[ |
| $ ant -f createPear.xml |
| Buildfile: createPear.xml |
| |
| createPear: |
| [echo] ##### Creating OpenNlpTextAnalyzer pear ##### |
| [copy] Copying 13 files to OpenNlpTextAnalyzer/desc |
| [copy] Copying 1 file to OpenNlpTextAnalyzer/metadata |
| [copy] Copying 1 file to OpenNlpTextAnalyzer/lib |
| [copy] Copying 3 files to OpenNlpTextAnalyzer/lib |
| [mkdir] Created dir: OpenNlpTextAnalyzer/models |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-token.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-token.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-sent.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-sent.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-date.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-date.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-location.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-money.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-money.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-organization.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-percentage.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-percentage.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-person.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-ner-time.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-ner-time.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-pos-maxent.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-pos-maxent.bin |
| [get] Getting: http://opennlp.sourceforge.net/models-1.5/en-chunker.bin |
| [get] To: OpenNlpTextAnalyzer/models/en-chunker.bin |
| [zip] Building zip: OpenNlpTextAnalyzer.pear |
| |
| BUILD SUCCESSFUL |
| Total time: 3 minutes 20 seconds]]> |
| </screen> |
| </para> |
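| <para> |
| Because the build downloads the models over the network, it can be useful to check that the |
| resulting pear contains all of them. The following is a minimal sketch, not part of OpenNLP or UIMA: |
| the class name PearModelCheck is hypothetical, the expected file names are copied from the build log above, |
| and the check compares file names only because the directory layout inside the pear is an assumption. |
| It relies on the pear being a zip archive, as the [zip] step in the log suggests. |
| </para> |
| <programlisting> |
| <![CDATA[ |
```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper for illustration; not part of OpenNLP or UIMA. */
public class PearModelCheck {

    // Model file names as they appear in the createPear.xml build log.
    public static final List<String> EXPECTED_MODELS = Arrays.asList(
        "en-token.bin", "en-sent.bin", "en-ner-date.bin", "en-ner-location.bin",
        "en-ner-money.bin", "en-ner-organization.bin", "en-ner-percentage.bin",
        "en-ner-person.bin", "en-ner-time.bin", "en-pos-maxent.bin", "en-chunker.bin");

    /** Returns the expected model names that have no matching entry in the pear (zip) file. */
    public static List<String> missingModels(Path pearFile) throws IOException {
        Set<String> found = new HashSet<>();
        try (ZipFile zip = new ZipFile(pearFile.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Keep only the file name; the directory inside the pear is an assumption.
                found.add(name.substring(name.lastIndexOf('/') + 1));
            }
        }
        List<String> missing = new ArrayList<>();
        for (String model : EXPECTED_MODELS) {
            if (!found.contains(model)) {
                missing.add(model);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name reported by the [zip] step of the build log.
        Path pear = Paths.get(args.length > 0 ? args[0] : "OpenNlpTextAnalyzer.pear");
        List<String> missing = missingModels(pear);
        if (missing.isEmpty()) {
            System.out.println("All expected models found in " + pear);
        } else {
            System.out.println("Missing models: " + missing);
        }
    }
}
```
| ]]> |
| </programlisting> |
| <para> |
| Run it from the apache-opennlp/opennlp-uima folder after the build, optionally passing the path of the |
| pear file as the first argument. An empty result only shows that the files are present, not that the |
| downloads are complete or that the models load correctly. |
| </para> |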
| <para> |
| After the pear is installed, start the CAS Visual Debugger shipped with the UIMA framework |
| and click on Tools -> Load AE. Then select the opennlp.uima.OpenNlpTextAnalyzer_pear.xml |
| file in the file dialog. Now enter some text and start the analysis engine with |
| "Run -> Run OpenNLPTextAnalyzer". Afterwards the results will be displayed. |
| You should see sentences, tokens, chunks, POS tags and possibly some names. Remember that the input text |
| must be written in English. |
| </para> |
| </section> |
| <section id="org.apche.opennlp.further-help"> |
| <title>Further Help</title> |
| <para> |
| For more information about how to use the integration, please consult the Javadoc of the individual |
| Analysis Engines and check out the included XML descriptors. |
| </para> |
| <para> |
| TODO: Extend this documentation with information about the individual components. |
| If you want to contribute, please contact us on the mailing list |
| or comment on the Jira issue <ulink url="https://issues.apache.org/jira/browse/OPENNLP-49">OPENNLP-49</ulink>. |
| </para> |
| </section> |
| </chapter> |