<?xml version="1.0" encoding="UTF-8"?> | |
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" | |
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[ | |
<!ENTITY imgroot "images/tools/tools.ruta/" > | |
<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" > | |
%uimaents; | |
]> | |
<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor | |
license agreements. See the NOTICE file distributed with this work for additional | |
information regarding copyright ownership. The ASF licenses this file to | |
you under the Apache License, Version 2.0 (the "License"); you may not use | |
this file except in compliance with the License. You may obtain a copy of | |
the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required | |
by applicable law or agreed to in writing, software distributed under the | |
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS | |
OF ANY KIND, either express or implied. See the License for the specific | |
language governing permissions and limitations under the License. --> | |
<section id="ugr.tools.ruta.language.expressions"> | |
<title>Expressions</title> | |
<para> | |
UIMA Ruta provides six different kinds of expressions. These are | |
type expressions, annotations expressions, number expressions, string expressions, | |
boolean expressions and list expressions. | |
</para> | |
<para> | |
<emphasis role="bold">Definition:</emphasis> | |
<programlisting><![CDATA[RutaExpression -> TypeExpression | AnnotationExpression | |
| StringExpression | BooleanExpression | |
| NumberExpression | ListExpression]]></programlisting> | |
</para> | |
<section id="ugr.tools.ruta.language.expressions.type"> | |
<title>Type Expressions</title> | |
<para> | |
UIMA Ruta provides several kinds of type expressions. | |
<orderedlist numeration="arabic"> | |
<listitem> | |
Declared annotation types (see | |
<xref linkend='ugr.tools.ruta.language.declarations.type' /> | |
) also including any types present in the type system of the CAS or defined in imported type systems. | |
</listitem> | |
<listitem> | |
Type variables | |
(see | |
<xref linkend='ugr.tools.ruta.language.declarations.variable' /> | |
). | |
</listitem> | |
<listitem> | |
Type of an annotation expression | |
(see | |
<xref linkend='ugr.tools.ruta.language.expressions.features' /> | |
). | |
</listitem> | |
</orderedlist> | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Definition:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[TypeExpression -> AnnotationType | TypeVariable | |
| AnnotationExpression.type]]></programlisting> | |
</para> | |
</section> | |
<section> | |
<title> | |
<emphasis role="bold">Examples:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[DECLARE Author; // Author defines a type, therefore it is | |
// a type expression | |
TYPE typeVar; // type variable typeVar is a type expression | |
Document{->ASSIGN(typeVar, Author)};]]></programlisting> | |
</para> | |
<para> | |
<programlisting><![CDATA[e:Entity{-> e.type}; // the dot notation type refers to the type | |
// of the anotation stored in the label a. In this example, | |
// this type expression refers to the type Entity or a specific | |
// subtype of Entity.]]></programlisting> | |
</para> | |
</section> | |
</section> | |
<section id="ugr.tools.ruta.language.expressions.annotation"> | |
<title>Annotation Expressions</title> | |
<para> | |
UIMA Ruta provides several kinds of annotation expressions. | |
<orderedlist numeration="arabic"> | |
<listitem> | |
Annotation variables | |
(see | |
<xref linkend='ugr.tools.ruta.language.declarations.variable' /> | |
). | |
</listitem> | |
<listitem> | |
Label expressions storing matched annotations (see | |
<xref linkend=' ugr.tools.ruta.language.labels' /> | |
). Label expressions are on-the-fly defined (local) variables in the context of a rule. | |
</listitem> | |
<listitem> | |
Annotation implicit referenced by a type expression in the match context (see | |
<xref linkend='ugr.tools.ruta.language.expressions.type' /> | |
). | |
</listitem> | |
<listitem> | |
Annotations stored in features of other annotations. | |
(see | |
<xref linkend='ugr.tools.ruta.language.expressions.features' /> | |
). | |
</listitem> | |
</orderedlist> | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Definition:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[AnnotationExpression -> AnnotationVariable | LabelExpression | |
| TypeExpression | FeatureExpression]]></programlisting> | |
</para> | |
</section> | |
<section> | |
<title> | |
<emphasis role="bold">Examples:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[ANNOTATION anno; // a variable declaration for storing an annotation. | |
e:Entity; // label expression e stored the annotation matched by the | |
// rule element with the matching condition Entity. | |
er:EmplRelation{-> er.employer = Employer}; // the type expression | |
// Employer implicitly refers to annotations of the | |
// type Employer in the context of the EmplRelation match. | |
e:EmplRelation.employer; // this feature expression represents the | |
// annotation stored in the feature employer. | |
]]></programlisting> | |
</para> | |
</section> | |
</section> | |
<section id="ugr.tools.ruta.language.expressions.number"> | |
<title>Number Expressions</title> | |
<para> | |
UIMA Ruta provides several possibilities to define number | |
expressions. As expected, every number expression evaluates to a | |
number. UIMA Ruta supports integer and floating-point numbers. A | |
floating-point number can be in single or in double precision. To get | |
a complete overview, have a look at the following syntax definition | |
of number expressions. | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Definition:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[NumberExpression -> AdditiveExpression | |
AdditiveExpression -> MultiplicativeExpression ( ( "+" | "-" ) | |
MultiplicativeExpression )* | |
MultiplicativeExpression -> SimpleNumberExpression ( ( "*" | "/" | "%" ) | |
SimpleNumberExpression )* | |
| ( "EXP" | "LOGN" | "SIN" | "COS" | "TAN" ) | |
"(" NumberExpression ")" | |
SimpleNumberExpression -> "-"? ( DecimalLiteral | FloatingPointLiteral | |
| NumberVariable) | "(" NumberExpression ")" | |
DecimalLiteral -> ('0' | '1'..'9' Digit*) IntegerTypeSuffix? | |
IntegerTypeSuffix -> ('l'|'L') | |
FloatingPointLiteral -> Digit+ '.' Digit* Exponent? FloatTypeSuffix? | |
| '.' Digit+ Exponent? FloatTypeSuffix? | |
| Digit+ Exponent FloatTypeSuffix? | |
| Digit+ Exponent? FloatTypeSuffix | |
FloatTypeSuffix -> ('f'|'F'|'d'|'D') | |
Exponent -> ('e'|'E') ('+'|'-')? Digit+ | |
Digit -> ('0'..'9') ]]></programlisting> | |
</para> | |
</section> | |
<para> | |
For more information on number variables, see | |
<xref linkend='ugr.tools.ruta.language.declarations.variable' /> | |
. | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Examples:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[98 // a integer number literal | |
104 // a integer number literal | |
170.02 // a floating-point number literal | |
1.0845 // a floating-point number literal]]></programlisting> | |
</para> | |
<para> | |
<programlisting><![CDATA[INT intVar1; | |
INT intVar2; | |
... | |
Document{->ASSIGN(intVar1, 12 * intVar1 - SIN(intVar2))};]]></programlisting> | |
</para> | |
</section> | |
</section> | |
<section id="ugr.tools.ruta.language.expressions.string"> | |
<title>String Expressions</title> | |
<para> | |
There are two kinds of string expressions in UIMA Ruta. | |
<orderedlist numeration="arabic"> | |
<listitem> | |
<para> | |
String literals: String literals are defined by any sequence of | |
characters within quotation marks. | |
</para> | |
</listitem> | |
<listitem> | |
<para> | |
String variables (see | |
<xref linkend='ugr.tools.ruta.language.declarations.variable' /> | |
) | |
</para> | |
</listitem> | |
</orderedlist> | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Definition:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[StringExpression -> SimpleStringExpression | |
SimpleStringExpression -> StringLiteral ("+" StringExpression)* | |
| StringVariable]]></programlisting> | |
</para> | |
</section> | |
<section> | |
<title> | |
<emphasis role="bold">Example:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[STRING strVar; // define string variable | |
// add prefix "strLiteral" to variable strVar | |
Document{->ASSIGN(strVar, "strLiteral" + strVar)};]]></programlisting> | |
</para> | |
</section> | |
</section> | |
<section id="ugr.tools.ruta.language.expressions.boolean"> | |
<title>Boolean Expressions</title> | |
<para> | |
UIMA Ruta provides several possibilities to define boolean | |
expressions. As expected, every boolean expression evaluates to | |
either | |
true or false. To get a complete overview, have a look at the | |
following syntax definition of boolean expressions. | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Definition:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[BooleanExpression -> ComposedBooleanExpression | |
| SimpleBooleanExpression | |
ComposedBooleanExpression -> BooleanCompare | BooleanTypeExpression | |
| BooleanNumberExpression | BooleanFunction | |
SimpleBooleanExpression -> BooleanLiteral | BooleanVariable | |
BooleanCompare -> SimpleBooleanExpression ( "==" | "!=" ) | |
BooleanExpression | |
BooleanTypeExpression -> TypeExpression ( "==" | "!=" ) TypeExpression | |
BooleanNumberExpression -> "(" NumberExpression ( "<" | "<=" | ">" | |
| ">=" | "==" | "!=" ) NumberExpression ")" | |
BooleanFunction -> XOR "(" BooleanExpression "," BooleanExpression ")" | |
BooleanLiteral -> "true" | "false"]]></programlisting> | |
</para> | |
<para> | |
Boolean variables are defined in | |
<xref linkend='ugr.tools.ruta.language.declarations.variable' /> | |
. | |
</para> | |
</section> | |
<section> | |
<title> | |
<emphasis role="bold">Examples:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[Document{->ASSIGN(boolVar, false)};]]></programlisting> | |
</para> | |
<para> | |
The boolean literal 'false' is assigned to boolean variable | |
boolVar. | |
</para> | |
<para> | |
<programlisting><![CDATA[Document{->ASSIGN(boolVar, typeVar == Author)};]]></programlisting> | |
</para> | |
<para> | |
If the type variable typeVar represents annotation type Author, | |
the | |
boolean | |
type expression evaluates to true, otherwise it evaluates | |
to | |
false. The result is assigned to boolean variable boolVar. | |
</para> | |
<para> | |
<programlisting><![CDATA[Document{->ASSIGN(boolVar, (intVar == 10))};]]></programlisting> | |
</para> | |
<para> | |
This rule shows a boolean number expression. If the value in | |
variable intVar is equal to 10, the boolean number expression | |
evaluates to true, otherwise it evaluates to | |
false. The result is | |
assigned to boolean variable boolVar. The brackets | |
surrounding the number expression are necessary. | |
</para> | |
<para> | |
<programlisting><![CDATA[Document{->ASSIGN(booleanVar1, booleanVar2 == (10 > intVar))};]]></programlisting> | |
</para> | |
<para> | |
This rule shows a more complex boolean expression. If the | |
value | |
in | |
variable intVar is equal to 10, the boolean number | |
expression | |
evaluates to true, otherwise it evaluates to | |
false. The | |
result of | |
this | |
evaluation is compared to booleanVar2. The end result | |
is | |
assigned to | |
boolean variable boolVar1. Realize that the syntax | |
definition defines | |
exactly this order. It is not possible to have | |
the | |
boolean number | |
expression on the left side of the complex number | |
expression. | |
</para> | |
</section> | |
</section> | |
<section id="ugr.tools.ruta.language.expressions.lists"> | |
<title>List Expressions</title> | |
<para> | |
List expression are a rather simple kind of expression. | |
</para> | |
<section> | |
<title> | |
<emphasis role="bold">Definition:</emphasis> | |
</title> | |
<para> | |
<programlisting><![CDATA[ListExpression -> WordListExpression | WordTableExpression | |
| TypeListExpression | AnnotationListExpression | |
| NumberListExpression | StringListExpression | |
| BooleanListExpression | |
WordListExpression -> RessourceLiteral | WordListVariable | |
WordTableExpression -> RessourceLiteral | WordTableVariable | |
TypeListExpression -> TypeListVariable | |
| "{" TypeExpression ("," TypeExpression)* "}" | |
NumberListExpression -> IntListVariable | FloatListVariable | |
| DoubleListVariable | |
| "{" NumberExpression | |
("," NumberExpression)* "}" | |
StringListExpression -> StringListVariable | |
| "{" StringExpression | |
("," StringExpression)* "}" | |
BooleanListExpression -> BooleanListVariable | |
| "{" BooleanExpression | |
("," BooleanExpression)* "}" | |
AnnotationListExpression -> AnnotationListVariable | |
| "{" AnnotationExpression | |
("," AnnotationExpression)* "}" | |
]]></programlisting> | |
A ResourceLiteral is something | |
like 'folder/file.txt' (Attention: Use single quotes). | |
</para> | |
<para> | |
List variables are defined in | |
<xref linkend='ugr.tools.ruta.language.declarations.variable' />. | |
</para> | |
</section> | |
</section> | |
<section id="ugr.tools.ruta.language.expressions.features"> | |
<title>Feature Expressions</title> | |
<para> | |
Feature expression can be used in different situations, e.g., for restricting the match of a rule element, | |
as an implicit condition or as an implicit action. | |
<programlisting><![CDATA[FeatureExpression -> TypeExpression "." DottedIdentifier | |
FeatureMatchExpression -> FeatureExpression | |
("==" | "!=" | "<=" | "<" | ">=" | ">") | |
Expression | |
FeatureAssignmentExpression -> FeatureExpression "=" Expression | |
]]></programlisting> | |
</para> | |
<para> | |
Ruta allows the access of two special attributes of an annotation with the feature notation: | |
The covered text of an annotation can be accessed as a string expression and the type of | |
an annotation can be accessed as an type expression. | |
</para> | |
<para> | |
The covered text of an annotation can be referred to with "coveredText" or "ct". | |
The latter one is an abbreviation and returns the covered text of an annotation | |
only if the type of the annotation does not define a feature with the name "ct". | |
The following example creates an annotation of the type TypeA for each word with the | |
covered text "A". | |
<programlisting><![CDATA[W.ct == "A" {-> TypeA};]]></programlisting> | |
</para> | |
<para> | |
The type of an annotation can be referred to with "type". | |
The following example creates an annotation of the type TypeA for each pair of ANY annotation. | |
<programlisting><![CDATA[(a1:ANY a2:ANY){a1.type == a2.type -> TypeA};]]></programlisting> | |
</para> | |
</section> | |
</section> |