<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
<!ENTITY imgroot "images/tools/tm/workbench/" >
<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
<section id="">
<para> The TextMarker workbench comes bundled with its own testing environment that allows you to
test and evaluate TextMarker scripts. It provides full back-end testing capabilities and allows
you to examine test results in detail.
<para> To test the quality of a written TextMarker script, the testing procedure compares a
previously annotated gold standard file with the resulting xmiCAS file created by the selected
TextMarker script. The test run produces a new xmiCAS file
containing detailed information about the test results. The evaluators compare the offsets of
annotations and, depending on the selected evaluator, add true positive, false positive or false
negative annotations for each tested annotation to the resulting xmiCAS file. Afterwards,
precision, recall and f1-score are calculated for each test file and each type in the test file.
The f1-score is also calculated for the whole test set. The testing environment consists of four
views: Annotation Test, True Positive, False Positive and False Negative. The Annotation Test
view is associated with the TextMarker perspective by default.
<xref linkend='' />
shows the script explorer. Every TextMarker project contains a folder called
. This folder is the default location for the test files. In the folder each script file has its
own subfolder with a relative path equal to the script's package path in the
folder. This folder contains the test files. In every script's test folder you will also find a
result folder where the results of the tests are saved. If you would like to use test files from
another location in the file system, the results will be saved in the
subfolder of the project's test folder. All files in the temp folder will be deleted once
Eclipse is closed.
<figure id="">
<title>Test folder structure. </title>
<imageobject role="html">
<imagedata width="576px" format="PNG" align="center"
fileref="&imgroot;testing/script_explorer.png" />
<imageobject role="fo">
<imagedata width="3.0in" format="PNG" align="center"
fileref="&imgroot;testing/script_explorer.png" />
<phrase> The test folder structure. </phrase>
<section id="">
<para> This section describes the general procedure for using the testing environment. </para>
The testing environment currently has no perspective of its own. It is recommended
to start within the TextMarker perspective. There, the Annotation Test view is open by
default. The True Positive, False Positive and False Negative views have to be opened manually via
<quote>Window -> Show View -> True Positive/False Positive/False Negative </quote>.
<para> The TextMarker example project is used again to explain the usage of the TextMarker
testing environment, so open this project first.
First, select a script for testing: TextMarker always tests the script that
is currently open and active in the script editor. So open the
script file of the TextMarker example project.
The next <link linkend=''>figure</link>
shows the Annotation Test view after doing this.
<figure id="">
The Annotation Test view.
<emphasis role="bold">(1)</emphasis>
Start Button;
<emphasis role="bold">(2)</emphasis>
Buttons to switch test cases;
<emphasis role="bold">(3)</emphasis>
<quote>Load all test files from selected folder</quote>
button;
<emphasis role="bold">(4)</emphasis>
<quote>Select excluded types</quote>
button;
<emphasis role="bold">(5)</emphasis>
<quote>Select evaluator</quote>
button;
<emphasis role="bold">(6)</emphasis>
Export button;
<emphasis role="bold">(7)</emphasis>
Active/Tested script file;
<emphasis role="bold">(8)</emphasis>
Selected view;
<emphasis role="bold">(9)</emphasis>
Statistics summary over all tests run;
<emphasis role="bold">(10)</emphasis>
List of all tested files;
<emphasis role="bold">(11)</emphasis>
Results per test file.
<imageobject role="html">
<imagedata width="576px" format="PNG" align="center"
fileref="&imgroot;testing/annotation_test_initial_view.png" />
<imageobject role="fo">
<imagedata width="5.5in" format="PNG" align="center"
fileref="&imgroot;testing/annotation_test_initial_view.png" />
<phrase> The Annotation Test view. </phrase>
<para> All control elements needed to interact with the testing environment
are located here. At the top right is the button bar (labels (1)-(6)). At the top left
of the view (label (7)), the name of the script that is going to be tested is shown. It is
always the script that is active in the editor. Below this (label (10)) is the test list,
which contains the files to be tested. Right next to the name of the script
file (label (8)), you can select the desired view. To the right of this (label (9)), you get statistics
over all tests run: the number of true positives (TP), false positives (FP) and false
negatives (FN). In the field below (label (11)), you will find a table with statistics
for a single selected test file. To change this view, select a file in the test
list. The table shows the total TP, FP and FN counts, as well as precision, recall
and f1-score for every type and for the whole file. </para>
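<para> The ratios shown in this table follow the usual definitions of precision, recall and
f1-score. The following Python sketch is not part of the workbench; the function name
<literal>prf1</literal>, the sample counts and the handling of empty denominators are
assumptions made for illustration only. </para>
<programlisting language="python"><![CDATA[
```python
def prf1(tp, fp, fn):
    """Return (precision, recall, f1) for the given TP, FP and FN counts."""
    # Assumed convention: a ratio with an empty denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1


# Hypothetical counts for one type in one test file.
precision, recall, f1 = prf1(tp=8, fp=2, fn=4)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.667 0.727
```
]]></programlisting>
<para> Applying the same function to the summed counts of all test files gives one plausible
way to obtain an overall score; whether the workbench aggregates in exactly this way is not
stated here. </para>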
Next you have to add test files to your project. A test file is a previously annotated xmiCAS
file that can be used as a gold standard for the test. You can use any xmiCAS file. The
TextMarker example project already contains such test files, so these files are listed
in the Annotation Test view. Try to delete these files by selecting them and clicking on
. Add these files again by simply dragging them from the Script Explorer into the test file
list. A different way to add test files is to use the
<quote>Load all test files from selected folder</quote>
button (green plus). It can be used to add all xmiCAS files from a selected folder.
Sometimes it is necessary to create annotations manually. To do so,
use the
<quote>Cas Editor</quote>
perspective delivered with the UIMA workbench.
Selecting a CAS View to test: TextMarker supports different views that allow you to operate
on different levels in a document. The
is selected by default; however, you can also switch the evaluation to another view by typing
the view's name into the list or selecting the view you wish to use from the list.
Selecting the evaluator: The testing environment supports different evaluators that allow a
sophisticated analysis of the behavior of a TextMarker script. The evaluator can be chosen in
the testing environment's preference page. The preference page can be opened either through
the menu or by clicking on the
<quote>Select evaluator</quote>
button (blue gear wheels) in the testing view's toolbar. Clicking the button will open a
filtered version of the TextMarker preference page. The default evaluator is the "Exact CAS
Evaluator" which compares the offsets of the annotations between the test file and the file
annotated by the tested script. To get an overview of all available evaluators, see
<xref linkend='' />.
This preference page (see
<xref linkend='' />
) offers a few options that modify the plug-in's general behavior. For example, the
preloading of previously collected result data can be turned off. An important option in the
preference page is the evaluator you can select. By default the "exact evaluator" is selected,
which compares the offsets of the annotations that are contained in the file produced by the
selected script with the annotations in the test file. Other evaluators compare
annotations in a different way.
<figure id="">
<title>The testing preference page view </title>
<imageobject role="html">
<imagedata width="576px" format="PNG" align="center" fileref="&imgroot;testing/preference.png" />
<imageobject role="fo">
<imagedata width="3.5in" format="PNG" align="center" fileref="&imgroot;testing/preference.png" />
<phrase> The testing preference page view. </phrase>
Excluding Types: During a test run it might be convenient to disable testing for specific
types like punctuation or tags. The
<quote>Select excluded types</quote>
button (white exclamation mark in a red disk) will open a dialog (see
<xref linkend='' />
) where all types can be selected that should not be considered in the test.
<figure id="">
<title>Excluded types window </title>
<imageobject role="html">
<imagedata width="576px" format="PNG" align="center"
fileref="&imgroot;testing/excluded_types.png" />
<imageobject role="fo">
<imagedata width="3.0in" format="PNG" align="center"
fileref="&imgroot;testing/excluded_types.png" />
<phrase> Excluded types window. </phrase>
Running the test: A test run can be started by clicking on the start button. Do this for the
TextMarker example project.
<xref linkend='' />
shows the results.
<figure id="">
<title>The Annotation Test view. </title>
<imageobject role="html">
<imagedata width="576px" format="PNG" align="center"
fileref="&imgroot;testing/annotation_test_test_run.png" />
<imageobject role="fo">
<imagedata width="5.5in" format="PNG" align="center"
fileref="&imgroot;testing/annotation_test_test_run.png" />
<phrase> The Annotation Test view. </phrase>
<para> Result Overview: After every test run, the main testing view displays information on
how well the script performed. It shows the overall number of true positive, false positive
and false negative annotations across all result files as well as an overall f1-score.
Furthermore, a table is displayed that contains the overall statistics of the selected
test file as well as statistics for every single type in the test file. The information
displayed comprises true positives, false positives, false negatives, precision, recall and
f1-measure. </para>
The testing environment also supports the export of the overall data in the form of a
comma-separated table. Clicking the
<quote>export data</quote>
button will open a dialog window that contains this table. The text in this table can be
copied and easily imported into other applications.
Result Files: When running a test, the evaluator will create a new result xmiCAS file and will
add new true positive, false positive and false negative annotations. By clicking on a file in
the test-file list, you can open the corresponding result xmiCAS file in the CAS
Editor. While displaying the result xmiCAS file in the CAS Editor, the True Positive, False
Positive and False Negative views allow easy navigation through the new tp, fp and fn
annotations. The corresponding annotations are displayed in a hierarchical tree structure. This
makes it easy to trace the results inside the test document. Clicking on one of the
annotations in those views will highlight the annotation in the CAS Editor. Opening
in the TextMarker example project changes the True Positive view as shown in
<xref linkend='' />.
Note that the type system used by the CAS Editor to open the evaluated file
can only be resolved for the tested script if the test files are located in the associated
folder structure, that is, the folder with the name of the script. If the files are located
in the temp folder, for example because they were added to the list of test cases by drag and drop,
other strategies are applied to find the correct type system. For TextMarker projects,
for example, this will be the type system of the last launched script in this project.
<figure id="">
<title>The True Positive view. </title>
<imageobject role="html">
<imagedata width="576px" format="PNG" align="center"
fileref="&imgroot;testing/true_positive.png" />
<imageobject role="fo">
<imagedata width="5.0in" format="PNG" align="center"
fileref="&imgroot;testing/true_positive.png" />
<phrase> The True Positive view. </phrase>
<section id="">
<para> When testing a CAS file, the system compares the offsets of the annotations of a
previously annotated gold standard file with the offsets of the annotations of the result file
the script produced. Evaluators are responsible for comparing the annotations in the two CAS
files. The evaluators implement different methods and strategies for comparing the
annotations. An extension point is also provided that allows easy implementation
of new evaluators. </para>
<para> Exact Match Evaluator: The Exact Match Evaluator compares the offsets of the annotations
in the result and the gold standard file. Any difference will be marked with either a false
positive or a false negative annotation. </para>
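<para> The exact comparison can be pictured as set operations on annotation spans. The
following Python sketch is only an illustration and not the workbench implementation; the
representation of an annotation as a <literal>(begin, end, type)</literal> tuple, the function
name <literal>exact_match</literal> and the sample data are assumptions. </para>
<programlisting language="python"><![CDATA[
```python
def exact_match(gold, result):
    """Split annotations into (tp, fp, fn) by exact (begin, end, type) equality."""
    gold_set, result_set = set(gold), set(result)
    tp = sorted(gold_set.intersection(result_set))  # found in both files
    fp = sorted(result_set - gold_set)              # created by the script only
    fn = sorted(gold_set - result_set)              # present in the gold standard only
    return tp, fp, fn


# Hypothetical annotations; the second one differs by a single offset.
gold = [(0, 4, "Author"), (10, 20, "Title")]
result = [(0, 4, "Author"), (10, 19, "Title")]
tp, fp, fn = exact_match(gold, result)
print(tp)  # [(0, 4, 'Author')]
print(fp)  # [(10, 19, 'Title')]
print(fn)  # [(10, 20, 'Title')]
```
]]></programlisting>
<para> In this sketch a single shifted offset yields one false positive and one false
negative, which illustrates how strict an exact comparison is. </para>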
<para> Partial Match Evaluator: The Partial Match Evaluator compares the offsets of the
annotations in the result and the gold standard file. It allows differences at the beginning
or the end of an annotation. For example, "corresponding" and "corresponding " will not be
annotated as an error. </para>
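<para> One possible reading of a partial match is that two annotations of the same type are
accepted as soon as their spans overlap. The following Python sketch uses that assumption; the
overlap criterion, the greedy one-to-one pairing and all names are hypothetical and may differ
from what the workbench actually does. </para>
<programlisting language="python"><![CDATA[
```python
def overlaps(a, b):
    """True if two (begin, end, type) annotations share a type and overlap."""
    return a[2] == b[2] and b[1] > a[0] and a[1] > b[0]


def partial_match(gold, result):
    """Greedily pair each gold annotation with one overlapping result annotation."""
    unmatched = list(result)
    tp, fn = [], []
    for g in gold:
        hit = next((r for r in unmatched if overlaps(g, r)), None)
        if hit is None:
            fn.append(g)           # nothing in the result overlaps this annotation
        else:
            tp.append(g)
            unmatched.remove(hit)  # each result annotation is used at most once
    return tp, unmatched, fn       # leftovers in the result are false positives


# "corresponding" (13 characters) versus "corresponding " with a trailing space.
gold = [(0, 13, "Word")]
result = [(0, 14, "Word")]
print(partial_match(gold, result))  # ([(0, 13, 'Word')], [], [])
```
]]></programlisting>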
<para> Core Match Evaluator: The Core Match Evaluator accepts annotations that share a core
expression. In this context a core expression is at least four characters long and starts with a
capital letter. For example, the two annotations "L404-123-421" and "L404-321-412" would be
considered a true positive match, because "L404" is a core expression that is
contained in both annotations. </para>
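<para> The idea of a shared core expression can be sketched with a regular expression. The
pattern below is an assumption that reads a core expression as a capital letter followed by at
least three further letters or digits; the names <literal>CORE</literal>,
<literal>core_expressions</literal> and <literal>core_match</literal> are hypothetical and the
real evaluator may define the core differently. </para>
<programlisting language="python"><![CDATA[
```python
import re

# Assumed reading of "core expression": a capital letter followed by at least
# three more letters or digits, for example "L404".
CORE = re.compile(r"\b[A-Z][A-Za-z0-9]{3,}")


def core_expressions(covered_text):
    """Return the set of core expressions found in an annotation's covered text."""
    return set(CORE.findall(covered_text))


def core_match(gold_text, result_text):
    """True if both covered texts share at least one core expression."""
    shared = core_expressions(gold_text).intersection(core_expressions(result_text))
    return bool(shared)


print(core_match("L404-123-421", "L404-321-412"))  # True
print(core_match("L404-123-421", "M512-123-421"))  # False
```
]]></programlisting>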
<para> Word Accuracy Evaluator: Compares the labels of all words/numbers in an annotation,
where the label equals the type of the annotation. As a consequence, for example,
each word or number that is not part of the annotation is counted as a single false
negative. For example, consider the sentence: "Christmas is on the 24.12 every year." The script
labels "Christmas is on the 12" as a single sentence, while the test file labels the sentence
correctly with a single sentence annotation. While, for example, the Exact CAS Evaluator will
only assign a single False Negative annotation, the Word Accuracy Evaluator will mark every word
or number as a single False Negative. </para>
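<para> A token-level comparison of this kind can be sketched as follows. The tokenization with
<literal>\w+</literal>, the function names and the script output used here (a sentence
annotation that ends at the first period) are assumptions chosen for illustration; they are a
simplified variant of the example above, not the evaluator's actual code. </para>
<programlisting language="python"><![CDATA[
```python
import re


def token_labels(text, annotations):
    """Map each word/number token span to the type of an annotation covering it."""
    labels = {}
    for m in re.finditer(r"\w+", text):
        for begin, end, type_name in annotations:
            if begin <= m.start() and m.end() <= end:
                labels[(m.start(), m.end())] = type_name
                break
    return labels


def word_accuracy(text, gold, result):
    """Return (tp, fp, fn) lists of token spans, compared label by label."""
    gold_labels = token_labels(text, gold)
    result_labels = token_labels(text, result)
    tp = [t for t, lab in gold_labels.items() if result_labels.get(t) == lab]
    fn = [t for t in gold_labels if t not in tp]
    fp = [t for t, lab in result_labels.items() if gold_labels.get(t) != lab]
    return tp, fp, fn


text = "Christmas is on the 24.12 every year."
gold = [(0, len(text), "Sentence")]
# Hypothetical script output: the sentence wrongly ends at the first period.
result = [(0, text.index("."), "Sentence")]
tp, fp, fn = word_accuracy(text, gold, result)
print(len(tp), len(fp), len(fn))  # 5 0 3
```
]]></programlisting>
<para> Here the three tokens "12", "every" and "year" each count as one false negative, whereas
an exact comparison would report the whole sentence as a single error. </para>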
<para> Template Only Evaluator: This evaluator compares the offsets of the annotations and the
features that have been created by the script. For example, the text "Alan Mathison Turing" is
marked with the author annotation, and "author" contains two features: "FirstName" and
"LastName". If the script now creates an author annotation with only one feature, the
annotation will be marked as a false positive. </para>
<para> Template on Word Level Evaluator: The Template on Word Level Evaluator compares the offsets of
the annotations. In addition, it also compares the features and feature structures and the
values stored in the features. For example, the annotation "author" might have features like
"FirstName" and "LastName". The author's name is "Alan Mathison Turing" and the script correctly
assigns the author annotation. The features assigned by the script are "FirstName : Alan",
"LastName : Mathison", while the correct feature values would be "FirstName : Alan", "LastName :
Turing". In this case the Template on Word Level Evaluator will mark the annotation as a false positive,
since the feature values differ. </para>
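<para> The difference between the two template evaluators can be sketched by comparing feature
names only versus feature names and values. The dictionary representation of an annotation and
the function names below are assumptions made for this illustration; they do not mirror the UIMA
API. </para>
<programlisting language="python"><![CDATA[
```python
def same_span(a, b):
    """True if both annotations have the same offsets and type."""
    return (a["begin"], a["end"], a["type"]) == (b["begin"], b["end"], b["type"])


def same_template(a, b):
    """Offsets, type and the set of feature names agree (values are ignored)."""
    return same_span(a, b) and set(a["features"]) == set(b["features"])


def same_template_values(a, b):
    """Offsets, type, feature names and feature values all agree."""
    return same_span(a, b) and a["features"] == b["features"]


# "Alan Mathison Turing" covers offsets 0 to 20 in this hypothetical document.
gold = {"begin": 0, "end": 20, "type": "Author",
        "features": {"FirstName": "Alan", "LastName": "Turing"}}
script = {"begin": 0, "end": 20, "type": "Author",
          "features": {"FirstName": "Alan", "LastName": "Mathison"}}

print(same_template(gold, script))         # True: both features are present
print(same_template_values(gold, script))  # False: the LastName values differ
```
]]></programlisting>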