 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
 <!ENTITY % uimaents SYSTEM "../entities.ent" >
 %uimaents;
 ]>
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
 <chapter id="ugr.tools.pear.merger">
   <title>PEAR Merger User&apos;s Guide</title>

   <para>The PEAR Merger utility takes two or more PEAR files and merges their contents,
     creating a new PEAR which has, in turn, a new Aggregate analysis engine whose delegates are
     the components from the original files being merged. It does this by (1) copying the
     contents of the input components into the output component, placing each component into a
     separate subdirectory, (2) generating a UIMA descriptor for the output Aggregate
     analysis engine and (3) creating an output PEAR file that encapsulates the output
     Aggregate.</para>

   <para>The merge logic is deliberately simple and is intended for straightforward cases.
     More complex merges must be done by hand. See the Restrictions and Limitations
     section below.</para>

   <para>To run the PEAR Merger from the command line, use the runPearMerger script
     (.bat for Windows, .sh for Unix). Its usage is shown below:</para>

   <para><programlisting>runPearMerger 1st_input_pear_file ... nth_input_pear_file
   -n output_analysis_engine_name [-f output_pear_file ]</programlisting></para>

   <para>The first group of parameters lists the input PEAR files; duplicates are not
     allowed. The <literal>-n</literal> parameter is the name of the generated Aggregate
     Analysis Engine. The optional <literal>-f</literal> parameter specifies the name of
     the output file. If it is omitted, the output is written to
     <literal>output_analysis_engine_name.pear</literal> in the current working directory.</para>

   <para>While the tool runs, work files are written to a temporary directory
     created in the user&apos;s home directory.</para>

   <section id="ugr.tools.pear.merger.merge_details">
     <title>Details of the merging process</title>

     <para>The PEARs are merged using the following steps:</para>

     <orderedlist><listitem><para>A temporary working directory is created for the
       output aggregate component.</para></listitem>

       <listitem><para>Each input PEAR file is extracted into a separate
         &apos;input_component_name&apos; folder under the working directory.</para>
         </listitem>

       <listitem><para>The extracted files are processed to adjust the
         &apos;$main_root&apos; macros. This operation differs from the PEAR installation
         operation, because it does not replace the macros with absolute paths.</para>
         </listitem>

       <listitem><para>The output PEAR directory structure, &apos;metadata&apos; and
         &apos;desc&apos; folders under the working directory, are created.</para>
         </listitem>

       <listitem><para>The UIMA AE descriptor for the output aggregate component is built
         in the &apos;desc&apos; folder. This aggregate descriptor refers to the input
         delegate components, specifying &apos;fixed flow&apos; based on the original
         order of the input components in the command line. The aggregate descriptor&apos;s
         &apos;capabilities&apos; and
         &apos;operational properties&apos; sections are built based on the input
         components&apos; specifications.</para></listitem>

       <listitem><para>A new PEAR installation descriptor is created in the
         &apos;metadata&apos; folder, referencing the new output aggregate descriptor
         built in the previous step. </para></listitem>

       <listitem><para>The content of the temporary output working directory is zipped to
         create the output PEAR, and then the temporary working directory is deleted.
         </para></listitem></orderedlist>

     <para>The PEAR merger utility logs all the operations both to standard console output and
       to a log file, pm.log, which is created in the current working directory.</para>
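
     <para>As a rough illustration of step 5, the generated aggregate descriptor might
       resemble the sketch below for two input components. The element names are those of
       the UIMA analysis engine descriptor format; the delegate keys
       (<literal>Tokenizer</literal>, <literal>Tagger</literal>), the aggregate name
       <literal>MergedAggregate</literal>, and the import locations are hypothetical, and
       the exact keys and paths written by the tool may differ:

       <programlisting><![CDATA[<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <!-- one delegate per input PEAR; keys and locations are assumed -->
    <delegateAnalysisEngine key="Tokenizer">
      <import location="../Tokenizer/desc/Tokenizer.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="Tagger">
      <import location="../Tagger/desc/Tagger.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>MergedAggregate</name>
    <flowConstraints>
      <!-- fixed flow follows the command-line order of the inputs -->
      <fixedFlow>
        <node>Tokenizer</node>
        <node>Tagger</node>
      </fixedFlow>
    </flowConstraints>
    <!-- capabilities and operationalProperties, derived from the
         input components, would follow here -->
  </analysisEngineMetaData>
</analysisEngineDescription>
]]></programlisting></para>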

   </section>

   <section id="ugr.tools.pear.merger.testing_modifying_resulting_pear">
     <title>Testing and Modifying the resulting PEAR</title>

     <para>The output PEAR file can be installed and tested using the PEAR Installer. The
       output aggregate component can also be tested by using the CVD or DocAnalyzer
       tools.</para>

     <para>The PEAR Installer creates Eclipse project files (.classpath and .project) in the
       root directory of the installed PEAR, so the installed component can be imported into
       the Eclipse IDE as an external project. Once the component is in the Eclipse IDE,
       developers may use the Component Descriptor Editor and the PEAR Packager to modify the
       output aggregate descriptor and re-package the component.</para>

   </section>
   <section id="ugr.tools.pear.merger.restrictions_limitations">
     <title>Restrictions and Limitations</title>

     <para>The PEAR Merger utility performs only basic merging operations and has the
       following limitations. You can work around them by editing the resulting PEAR file
       or the resulting Aggregate Descriptor.</para>

     <orderedlist><listitem><para>The Merge operation specifies Fixed Flow sequencing
       for the Aggregate.</para></listitem>

       <listitem><para>The merged aggregate does not define any parameters, so the delegate
         parameters cannot be overridden.</para></listitem>

       <listitem><para>No External Resource definitions are generated for the
         aggregate.</para></listitem>

       <listitem><para>No Sofa Mappings are generated for the aggregate.</para>
         </listitem>

       <listitem><para>The tool does not check for name collisions. Collisions can
         occur in the fully-qualified class names of the implementing Java classes, the names
         of JAR files, the names of descriptor files, and the names of resource bindings or
         resource file paths.</para></listitem>

       <listitem><para>The input and output capabilities are generated based on merging the
         capabilities from the components (removing duplicates). Capability sets are
         ignored - only the first of the set is used in this process, and only one set is created
         for the generated Aggregate. There is no support for merging Sofa
         specifications.</para></listitem>

       <listitem><para>No Indexes or Type Priorities are created for the generated
         Aggregate. No checking is done to see if the Indexes or Type Priorities of the
         components conflict or are inconsistent.</para></listitem>

       <listitem><para>You can only merge Analysis Engines and CAS Consumers. </para>
         </listitem>

       <listitem><para>Although PEAR file installation descriptors that are being merged
         can have specific XML elements describing Collection Reader and CAS Consumer
         descriptors, these elements are ignored during the merge, in the sense that the
         installation descriptor that is created by the merge does not set these elements. The
         merge process does not use these elements; the output PEAR&apos;s new aggregate only
         references the merged components&apos; main PEAR descriptor element, as
         identified by the PEAR element:

         <programlisting><![CDATA[<SUBMITTED_COMPONENT>
   <DESC>the_component.xml</DESC>...
 </SUBMITTED_COMPONENT>
 ]]></programlisting></para>
         </listitem></orderedlist>
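
     <para>As an example of working around one of these limitations, the missing parameter
       overrides can be added by hand-editing the generated aggregate descriptor. The
       fragment below is a minimal sketch, placed inside the aggregate&apos;s
       <literal>analysisEngineMetaData</literal> element, that declares an aggregate
       parameter overriding a delegate parameter. The parameter name
       <literal>Language</literal> and the delegate key <literal>Tokenizer</literal> are
       hypothetical; substitute the key and parameter names found in your own merged
       descriptor:

       <programlisting><![CDATA[<configurationParameters>
  <configurationParameter>
    <name>Language</name>
    <type>String</type>
    <multiValued>false</multiValued>
    <mandatory>false</mandatory>
    <overrides>
      <!-- form: delegateKey/parameterName, as declared by the delegate -->
      <parameter>Tokenizer/Language</parameter>
    </overrides>
  </configurationParameter>
</configurationParameters>
]]></programlisting></para>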

   </section>

 </chapter>