blob: e88bf6a49b63cd8bcf34773d911b6ddf1f31d0c5 [file] [log] [blame]
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
<!ENTITY % uimaents SYSTEM "../entities.ent" >
%uimaents;
]>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<chapter id="ugr.tools.pear.merger">
<title>PEAR Merger User&apos;s Guide</title>
<para>The PEAR Merger utility takes two or more PEAR files and merges their contents,
creating a new PEAR which has, in turn, a new Aggregate analysis engine whose delegates are
the components from the original files being merged. It does this by (1) copying the
contents of the input components into the output component, placing each component into a
separate subdirectory, (2) generating a UIMA descriptor for the output Aggregate
analysis engine and (3) creating an output PEAR file that encapsulates the output
Aggregate.</para>
<para>The merge logic is quite simple, and is intended to work for simple cases. More complex
merging needs to be done by hand. Please see the Restrictions and Limitations section,
below.</para>
<para>To run the PearMerger command line utility you can use the runPearMerger scripts (.bat for Windows, and .sh for
Unix). The usage of the tooling is shown below:</para>
<para><programlisting>runPearMerger 1st_input_pear_file ... nth_input_pear_file
-n output_analysis_engine_name [-f output_pear_file ]</programlisting></para>
<para>The first group of parameters are the input PEAR files. No duplicates are allowed
here. The <literal>-n</literal> parameter is the name of the generated Aggregate
Analysis Engine. The optional <literal>-f</literal> parameter specifies the name of
the output file. If it is omitted, the output is written to
<literal>output_analysis_engine_name.pear</literal> in the current working directory.</para>
<para>During the running of this tool, work files are written to a temporary directory
created in the user&apos;s home directory.</para>
<section id="ugr.tools.pear.merger.merge_details">
<title>Details of the merging process</title>
<para>The PEARs are merged using the following steps:</para>
<orderedlist><listitem><para>A temporary working directory, is created for the
output aggregate component.</para></listitem>
<listitem><para>Each input PEAR file is extracted into a separate
&apos;input_component_name&apos; folder under the working directory.</para>
</listitem>
<listitem><para>The extracted files are processed to adjust the
&apos;$main_root&apos; macros. This operation differs from the PEAR installation
operation, because it does not replace the macros with absolute paths.</para>
</listitem>
<listitem><para>The output PEAR directory structure, &apos;metadata&apos; and
&apos;desc&apos; folders under the working directory, are created.</para>
</listitem>
<listitem><para>The UIMA AE descriptor for the output aggregate component is built
in the &apos;desc&apos; folder. This aggregate descriptor refers to the input
delegate components, specifying &apos;fixed flow&apos; based on the original
order of the input components in the command line. The aggregate descriptor&apos;s
&apos;capabilities&apos; and
&apos;operational properties&apos; sections are built based on the input
components&apos; specifications.</para></listitem>
<listitem><para>A new PEAR installation descriptor is created in the
&apos;metadata&apos; folder, referencing the new output aggregate descriptor
built in the previous step. </para></listitem>
<listitem><para>The content of the temporary output working directory is zipped to
created the output PEAR, and then the temporary working directory is deleted.
</para></listitem></orderedlist>
<para>The PEAR merger utility logs all the operations both to standard console output and
to a log file, pm.log, which is created in the current working directory.</para>
</section>
<section id="ugr.tools.pear.merger.testing_modifying_resulting_pear">
<title>Testing and Modifying the resulting PEAR</title>
<para>The output PEAR file can be installed and tested using the PEAR Installer. The
output aggregate component can also be tested by using the CVD or DocAnalyzer
tools.</para>
<para>The PEAR Installer creates Eclipse project files (.classpath and .project) in the
root directory of the installer PEAR, so the installed component can be imported into
the Eclipse IDE as an external project. Once the component is in the Eclipse IDE,
developers may use the Component Descriptor Editor and the PEAR Packager to modify the
output aggregate descriptor and re-package the component.</para>
</section>
<section id="ugr.tools.pear.merger.restrictions_limitations">
<title>Restrictions and Limitations</title>
<para>The PEAR Merger utility only does basic merging operations, and is limited as
follows. You can overcome these by editing the resulting PEAR file or the resulting
Aggregate Descriptor.</para>
<orderedlist><listitem><para>The Merge operation specifies Fixed Flow sequencing
for the Aggregate.</para></listitem>
<listitem><para>The merged aggregate does not define any parameters, so the delegate
parameters cannot be overridden.</para></listitem>
<listitem><para>No External Resource definitions are generated for the
aggregate.</para></listitem>
<listitem><para>No Sofa Mappings are generated for the aggregate.</para>
</listitem>
<listitem><para>Name collisions are not checked for. Possible name collisions could
occur in the fully-qualified class names of the implementing Java classes, the names
of JAR files, the names of descriptor files, and the names of resource bindings or
resource file paths.</para></listitem>
<listitem><para>The input and output capabilities are generated based on merging the
capabilities from the components (removing duplicates). Capability sets are
ignored - only the first of the set is used in this process, and only one set is created
for the generated Aggregate. There is no support for merging Sofa
specifications.</para></listitem>
<listitem><para>No Indexes or Type Priorities are created for the generated
Aggregate. No checking is done to see if the Indexes or Type Priorities of the
components conflict or are inconsistent.</para></listitem>
<listitem><para>You can only merge Analysis Engines and CAS Consumers. </para>
</listitem>
<listitem><para>Although PEAR file installation descriptors that are being merged
can have specific XML elements describing Collection Reader and CAS Consumer
descriptors, these elements are ignored during the merge, in the sense that the
installation descriptor that is created by the merge does not set these elements. The
merge process does not use these elements; the output PEAR&apos;s new aggregate only
references the merged components&apos; main PEAR descriptor element, as
identified by the PEAR element:
<programlisting><![CDATA[<SUBMITTED_COMPONENT>
<DESC>the_component.xml</DESC>...
</SUBMITTED_COMPONENT>
]]></programlisting></para>
</listitem></orderedlist>
</section>
</chapter>