<?xml version="1.0" encoding="UTF-8"?> | |
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" | |
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[ | |
<!ENTITY imgroot "images/tools/tools.jcasgen/" > | |
<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" > | |
%uimaents; | |
]> | |
<!-- | |
Licensed to the Apache Software Foundation (ASF) under one | |
or more contributor license agreements. See the NOTICE file | |
distributed with this work for additional information | |
regarding copyright ownership. The ASF licenses this file | |
to you under the Apache License, Version 2.0 (the | |
"License"); you may not use this file except in compliance | |
with the License. You may obtain a copy of the License at | |
http://www.apache.org/licenses/LICENSE-2.0 | |
Unless required by applicable law or agreed to in writing, | |
software distributed under the License is distributed on an | |
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | |
KIND, either express or implied. See the License for the | |
specific language governing permissions and limitations | |
under the License. | |
--> | |
<chapter id="ugr.tools.jcasgen"> | |
<title>JCasGen User's Guide</title> | |
<para>JCasGen reads a descriptor for an application (either an Analysis Engine Descriptor, | |
or a Type System Descriptor), creates the merged type system | |
specification by merging all the type system information from all the components | |
referred to in the descriptor, and then uses this merged type system to create Java source | |
files for classes that enable JCas access to the CAS. Java classes are not produced for the | |
built-in types, since these classes are already provided by the UIMA SDK. (An exception is | |
the built-in type <literal>uima.tcas.DocumentAnnotation</literal>, see the warning below.) </para> | |
<warning><para>If the components comprising the input to the type merging process | |
have different definitions for the same type name, | |
JCasGen will show a warning, and in some environments may offer to abort the operation. | |
If you continue past this warning, | |
JCasGen will produce correct Java source files representing the merged types | |
(that is, the | |
type definition containing all of the features defined on that type by all of the | |
components). It is recommended that you do not use this capability (of having | |
two different definitions for the same type name, with different feature sets) since it can make it | |
difficult to combine/package your annotator with others. See <olink targetdoc="&uima_docs_ref;"/> | |
<olink targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.jcas.merging_types_from_other_specs"/> for more information. | |
</para> | |
<para>Also note that if your type system declares a custom version of the | |
<literal>uima.tcas.DocumentAnnotation</literal> | |
built-in type, then JCasGen will generate a Java source file for it. If you do this, you need to be | |
aware of the issues discussed in <olink targetdoc="&uima_docs_ref;"/> | |
<olink | |
targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.jcas.documentannotation_issues"/>.</para></warning> | |
<para>JCasGen can be run in many ways. For Eclipse users using the Component Descriptor Editor, there's a button | |
on the Type System Description page to run it on that type system. There's also a jcasgen-maven-plugin to use | |
in maven build scripts. There's a menu-driven GUI tool for it. | |
And, there are command line scripts you can use to invoke it.</para> | |
<para>There are several versions of JCasGen. The basic version reads an XML descriptor | |
which contains a type system descriptor, and generates the corresponding Java Class | |
Models for those types. Variants exist for the Eclipse environment that allow merging the | |
newly generated Java source code with previously augmented versions; see <olink | |
targetdoc="&uima_docs_ref;"/> <olink targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.jcas.augmenting_generated_code"/> for a discussion of how the | |
Java Class Models can be augmented by adding additional methods and fields.</para> | |
<para>Input to JCasGen needs to be mostly self-contained. In particular, any types that are | |
defined to depend on user-defined supertypes must have that supertype defined, if the | |
supertype is <literal>uima.tcas.Annotation </literal>or a subtype of it. Any features | |
referencing ranges which are subtypes of uima.cas.String must have those subtypes | |
included. If this is not followed, a warning message is given stating that the resulting | |
generation may be inaccurate.</para> | |
<para>JCasGen is typically invoked automatically when using the Component Descriptor | |
Editor (see <olink targetdoc="&uima_docs_tools;" | |
targetptr="ugr.tools.cde.auto_jcasgen"/>), but can also be run using a shell | |
script. These scripts can take 0, 1, or 2 arguments. The first argument is the location of | |
the file containing the input XML descriptor. The second argument specifies where the | |
generated Java source code should go. If it isn't given, JCasGen generates its | |
output into a subfolder called JCas (or sometimes JCasNew – see below), of the first | |
argument's path.</para> | |
<para>The first argument, the input file, can be written as | |
<literal>jar:<url>!{entry}</literal>, for example: | |
<literal>jar:http://www.foo.com/bar/baz.jar!/COM/foo/quux.class</literal></para> | |
<para>If no arguments are given to JCasGen, then it launches a GUI to interact with the user | |
and ask for the same input. The GUI will remember the arguments you previously used. | |
Here's what it looks like: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/> | |
</imageobject> | |
<textobject><phrase>JCasGen tool showing fields for input arguments</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>When running with automatic merging of the generated Java source with previously | |
augmented versions, the output location is where the merge function obtains the source | |
for the merge operation.</para> | |
<para>As is customary for Java, the generated class source files are placed in the | |
appropriate subdirectory structure according to Java conventions that correspond to | |
the package (name space) name.</para> | |
<para>The Java classes must be compiled and the resulting class files included in the class | |
path of your application; you make these classes available for other annotator writers | |
using your types, perhaps packaged as an xxx.jar file. If the xxx.jar file is made to | |
contain only the Java Class Models for the CAS types, it can be reused by any users of these | |
types.</para> | |
<section id="ugr.tools.jcasgen.running_without_eclipse"> | |
<title>Running stand-alone without Eclipse</title> | |
<para>There is no capability to automatically merge the generated Java source with | |
previous versions, unless running with Eclipse. If run without Eclipse, no automatic | |
merging of the generated Java source is done with any previous versions. In this case, | |
the output is put in a folder called <quote>JCasNew</quote> unless overridden by | |
specifying a second argument.</para> | |
<para>The distribution includes a shell script/bat file to run the stand-alone version, | |
called jcasgen.</para> | |
</section> | |
<section id="ugr.tools.jcasgen.running_standalone_with_eclipse"> | |
<title>Running stand-alone with Eclipse</title> | |
<para>If you have Eclipse and EMF (EMF = Eclipse Modeling Framework; both of these are | |
available from <ulink url="http://www.eclipse.org"/>) installed (version 3 or | |
later) JCasGen can merge the Java code it generates with previous versions, picking up | |
changes you might have inserted by hand. The output (and source of the merge input) is in a | |
folder <quote>JCas</quote> under the same path as the input XML file, unless | |
overridden by specifying a second argument.</para> | |
<para>You must install the UIMA plug-ins into Eclipse to enable this function.</para> | |
<para>The distribution includes a shell script/bat file to run the stand-alone with | |
Eclipse version, called jcasgen_merge. This works by starting Eclipse in | |
<quote>headless</quote> mode (no GUI) and invoking JCasGen within Eclipse. You will | |
need to set the ECLIPSE_HOME environment variable or modify the jcasgen_merge shell | |
script to specify where to find Eclipse. The version of Eclipse needed is 3 or higher, | |
with the EMF plug-in and the UIMA runtime plug-in installed. A temporary workspace is | |
used; the name/location of this is customizable in the shell script.</para> | |
<para>Log and error messages are written to the UIMA log. This file is called uima.log, and | |
is located in the default working directory, which if not overridden, is the startup | |
directory of Eclipse.</para> | |
</section> | |
<section id="ugr.tools.jcasgen.running_within_eclipse"> | |
<title>Running within Eclipse</title> | |
<para>There are two ways to run JCasGen within Eclipse. The first way is to configure an | |
Eclipse external tools launcher, and use it to run the stand-alone shell scripts, with | |
the arguments filled in. Here's a picture of a typical launcher configuration | |
screen (you get here by navigating from the top menu: Run –> External Tools | |
–> External tools...). | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.8in" format="JPG" fileref="&imgroot;image004.jpg"/> | |
</imageobject> | |
<textobject><phrase>Running JCasGen within Eclipse using the external tool launcher</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The second way (which is the normal way it's done) to run within Eclipse is to use the | |
Component Descriptor Editor (CDE) (see <olink targetdoc="&uima_docs_tools;" | |
targetptr="ugr.tools.cde"/>). This tool can be configured to automatically | |
launch JCasGen whenever the type system descriptor is modified. In this release, this | |
operation completely regenerates the files, even if just a small thing changed. For | |
very large type systems, you probably don't want to enable this all the time. The | |
configurator tool has an option to enable/disable this function.</para> | |
</section> | |
<section id="ugr.tools.jcasgen.maven_plugin"> | |
<title>Using the jcasgen-maven-plugin</title> | |
<para>For Maven builds, you can use the jcasgen-maven-plugin to take one or more | |
top level descriptors (Type System or Analysis Engine descriptors), merge them | |
together in the standard way UIMA merges type definitions, and produce the corresponding | |
JCas source classes. These, by default, are generated to the standard spot for Maven | |
builds for generated files.</para> | |
<para>You can use ant-like include / exclude patterns to specify the top level descriptor | |
files. If you set <limitToProject> to true, then after a complete UIMA type system | |
merge is done with all of the types, including those that are imported, only those | |
types which are defined within this Maven project (that is, in some subdirectory of the project) | |
will be generated.</para> | |
<para>To use the jcasgen-maven-plugin, specify it in the POM as follows:</para> | |
<programlisting><![CDATA[<plugin> | |
<groupId>org.apache.uima</groupId> | |
<artifactId>jcasgen-maven-plugin</artifactId> | |
<version>2.4.1</version> <!-- change this to the latest version --> | |
<executions> | |
<execution> | |
<goals><goal>generate</goal></goals> <!-- this is the only goal --> | |
<!-- runs in phase process-resources by default --> | |
<configuration> | |
<!-- REQUIRED --> | |
<typeSystemIncludes> | |
<!-- one or more ant-like file patterns | |
identifying top level descriptors --> | |
<typeSystemInclude>src/main/resources/MyTs.xml | |
</typeSystemInclude> | |
</typeSystemIncludes> | |
<!-- OPTIONAL --> | |
<!-- a sequence of ant-like file patterns | |
to exclude from the above include list --> | |
<typeSystemExcludes> | |
</typeSystemExcludes> | |
<!-- OPTIONAL --> | |
<!-- where the generated files go --> | |
<!-- default value: | |
${project.build.directory}/generated-sources/jcasgen" --> | |
<outputDirectory> | |
</outputDirectory> | |
<!-- true or false, default = false --> | |
<!-- if true, then although the complete merged type system | |
will be created internally, only those types whose | |
definition is contained within this maven project will be | |
generated. The others will be presumed to be | |
available via other projects. --> | |
<!-- OPTIONAL --> | |
<limitToProject>false</limitToProject> | |
</configuration> | |
</execution> | |
</executions> | |
</plugin>]]></programlisting> | |
</section> | |
</chapter> |