 <!--
 	Licensed to the Apache Software Foundation (ASF) under one
 	or more contributor license agreements. See the NOTICE file
 	distributed with this work for additional information
 	regarding copyright ownership. The ASF licenses this file
 	to you under the Apache License, Version 2.0 (the
 	"License"); you may not use this file except in compliance
 	with the License. You may obtain a copy of the License at

 	http://www.apache.org/licenses/LICENSE-2.0

 	Unless required by applicable law or agreed to in writing,
 	software distributed under the License is distributed on an
 	"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 	KIND, either express or implied. See the License for the
 	specific language governing permissions and limitations
 	under the License.
 -->
 <chapter id="ugr.tools.uimafit.experiments">
   <title>Running Experiments</title>
   <para>The <emphasis>uimafit-examples</emphasis> module contains a package,
       <package>org.apache.uima.fit.examples.experiment.pos</package>, that demonstrates a very simple
     experimental setup for testing a part-of-speech tagger. You may find this example more
     accessible if you check out the code from Subversion and build it in your own
     environment.</para>
   <para>The documentation for this example can be found in the code itself. Please refer to
       <classname>RunExperiment</classname> as a starting point. The following is copied from the
     javadoc comments of that file:</para>
   <blockquote>
     <para><classname>RunExperiment</classname> demonstrates a very common (though simplified)
       experimental setup in which gold standard data is available for some task and you want to
       evaluate how well your analysis engine works against that data. Here we are evaluating
         <classname>BaselineTagger</classname> which is a (ridiculously) simple part-of-speech tagger
       against the part-of-speech tags found in
         <filename>src/main/resources/org/apache/uima/fit/examples/pos/sample-gold.txt</filename></para>
   </blockquote>
   <para>The basic strategy is as follows:</para>
   <itemizedlist>
     <listitem>
       <para>post the data <emphasis>as is</emphasis> into the default view,</para>
     </listitem>
     <listitem>
       <para>parse the gold-standard tokens and part-of-speech tags and put the results into another
         view we will call <emphasis>GOLD_VIEW</emphasis>,</para>
     </listitem>
     <listitem>
       <para>create another view called <emphasis>SYSTEM_VIEW</emphasis> and copy the text and
           <classname>Token</classname> annotations from the <emphasis>GOLD_VIEW</emphasis> into this
         view,</para>
     </listitem>
     <listitem>
       <para>run the <classname>BaselineTagger</classname> on the <emphasis>SYSTEM_VIEW</emphasis>
         over the copied <classname>Token</classname> annotations,</para>
     </listitem>
     <listitem>
       <para>evaluate the part-of-speech tags found in the <emphasis>SYSTEM_VIEW</emphasis> against
         those in the <emphasis>GOLD_VIEW</emphasis>.</para>
     </listitem>
   </itemizedlist>
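   <para>The five steps above can be illustrated with a minimal, plain-Java sketch. It does not use
     the UIMA or uimaFIT APIs and is not the code of <classname>RunExperiment</classname>. The
     <literal>word/TAG</literal> input format, the <classname>Token</classname> stand-in class, the
     map of views, and the baseline that tags every token as <literal>NN</literal> are all
     assumptions made for illustration only.</para>
   <programlisting language="java"><![CDATA[import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of the five-step strategy; no UIMA or uimaFIT classes are used. */
final class ExperimentSketch {

    /** Hypothetical stand-in for a Token annotation: a word plus an optional tag. */
    static final class Token {
        final String text;
        final String pos; // null until a tagger assigns a tag

        Token(String text, String pos) {
            this.text = text;
            this.pos = pos;
        }
    }

    /** Step 2: parse "word/TAG" pairs (an assumed input format) into gold tokens. */
    static List<Token> parseGold(String line) {
        List<Token> tokens = new ArrayList<>();
        for (String pair : line.trim().split("\\s+")) {
            int slash = pair.lastIndexOf('/');
            if (slash <= 0) {
                continue; // skip empty or malformed entries
            }
            tokens.add(new Token(pair.substring(0, slash), pair.substring(slash + 1)));
        }
        return tokens;
    }

    /** Step 3: copy only the token text, dropping the gold tags. */
    static List<Token> copyWithoutTags(List<Token> gold) {
        List<Token> copy = new ArrayList<>();
        for (Token t : gold) {
            copy.add(new Token(t.text, null));
        }
        return copy;
    }

    /** Step 4: a made-up baseline that tags every token as a noun. */
    static List<Token> tagAllAsNoun(List<Token> tokens) {
        List<Token> tagged = new ArrayList<>();
        for (Token t : tokens) {
            tagged.add(new Token(t.text, "NN"));
        }
        return tagged;
    }

    /** Step 5: fraction of positions where the system tag equals the gold tag. */
    static double accuracy(List<Token> gold, List<Token> system) {
        if (gold.size() != system.size()) {
            throw new IllegalArgumentException("token counts differ");
        }
        if (gold.isEmpty()) {
            return 0.0;
        }
        int correct = 0;
        for (int i = 0; i < gold.size(); i++) {
            if (gold.get(i).pos.equals(system.get(i).pos)) {
                correct++;
            }
        }
        return (double) correct / gold.size();
    }

    /** Steps 1 to 4: take the raw line as is and build both views from it. */
    static Map<String, List<Token>> run(String line) {
        Map<String, List<Token>> views = new HashMap<>();
        List<Token> gold = parseGold(line);
        views.put("GOLD_VIEW", gold);
        views.put("SYSTEM_VIEW", tagAllAsNoun(copyWithoutTags(gold)));
        return views;
    }

    public static void main(String[] args) {
        Map<String, List<Token>> views = run("The/DT dog/NN barks/VBZ loudly/RB");
        // Only "dog" is tagged NN in the gold data, so this prints 0.25.
        System.out.println(accuracy(views.get("GOLD_VIEW"), views.get("SYSTEM_VIEW")));
    }
}]]></programlisting>
   <para>Keeping the gold tags and the system tags in separate views means the tagger never sees
     the answers it is scored against, while both views share the same tokens so the tags can be
     compared position by position.</para>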
 </chapter>