| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <chapter id="ugr.tools.uimafit.experiments"> |
| <title>Running Experiments</title> |
| <para>The <emphasis>uimafit-examples</emphasis> module contains a package |
| <package>org.apache.uima.fit.examples.experiment.pos</package> which demonstrates a very simple |
| experimental setup for testing a part-of-speech tagger. You may find this example more |
| accessible if you check out the code from subversion and build it in your own |
| environment.</para> |
| <para>The documentation for this example can be found in the code itself. Please refer to |
| <classname>RunExperiment</classname> as a starting point. The following is copied from the |
| javadoc comments of that file:</para> |
| <blockquote> |
| <para><classname>RunExperiment</classname> demonstrates a very common (though simplified) |
| experimental setup in which gold standard data is available for some task and you want to |
| evaluate how well your analysis engine works against that data. Here we are evaluating |
| <classname>BaselineTagger</classname> which is a (ridiculously) simple part-of-speech tagger |
| against the part-of-speech tags found in |
| <filename>src/main/resources/org/apache/uima/fit/examples/pos/sample-gold.txt</filename></para> |
| </blockquote> |
| <para>The basic strategy is as follows:</para> |
| <itemizedlist> |
| <listitem> |
| <para>post the data <emphasis>as is</emphasis> into the default view,</para> |
| </listitem> |
| <listitem> |
| <para>parse the gold-standard tokens and part-of-speech tags and put the results into another |
| view we will call <emphasis>GOLD_VIEW</emphasis>,</para> |
| </listitem> |
| <listitem> |
| <para>create another view called <emphasis>SYSTEM_VIEW</emphasis> and copy the text and |
| <classname>Token</classname> annotations from the <emphasis>GOLD_VIEW</emphasis> into this |
| view,</para> |
| </listitem> |
| <listitem> |
| <para>run the <classname>BaselineTagger</classname> on the <emphasis>SYSTEM_VIEW</emphasis> |
| over the copied <classname>Token</classname> annoations,</para> |
| </listitem> |
| <listitem> |
| <para>evaluate the part-of-speech tags found in the <emphasis>SYSTEM_VIEW</emphasis> with |
| those in the <emphasis>GOLD_VIEW.</emphasis></para> |
| </listitem> |
| </itemizedlist> |
| </chapter> |