---
active_crumb: Intent Matching
layout: documentation
id: intent_matching
---
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<div id="intent-matching" class="col-md-8 second-column">
<section>
<h2 class="section-title">Overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
<a href="/data-model.html">Data Model</a> processing logic is defined as a collection of one or more intents. The sections
below explain what an intent is, how to define it in your model, and how it works.
</p>
</section>
<section id="intent">
<h2 class="section-title">Intent <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
The goal of the data model implementation is to take the user input text and
match it to a specific user-defined code that will execute for that input. The mechanism that
provides this matching is called an <em>intent</em>.
</p>
<p>
An intent generally refers to the goal that the end-user had in mind when speaking or typing the input utterance.
The intent has a <em>declarative part or template</em> written in <a href="#idl">Intent Definition Language</a> that strictly defines
a particular form of the user input.
An intent is also <a href="#binding">bound</a> to a callback method that will be executed when that intent, i.e. its template, is detected as the best match
for a given input. A typical data model will have multiple intents defined, one for each form of the expected user input
that the model wants to react to.
</p>
<p>
For example, a data model for a banking chat bot or analytics application can have multiple intents for each domain-specific group of input such as
opening an account, closing an account, transferring money, getting statements, etc.
</p>
<p>
Intents can be specific or generic in terms of what input they match.
Multiple intents can overlap and NLPCraft will disambiguate such cases to select the intent with the
overall best match. In general, the most specific intent match wins.
</p>
</section>
<section id="idl">
<h2 class="section-title">IDL - Intent Definition Language <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
NLPCraft intents are written in Intent Definition Language (IDL).
IDL is a relatively straightforward declarative language. For example,
here's a simple intent <code>x</code> with two terms <code>a</code> and <code>b</code>:
</p>
<pre class="brush: idl">
intent=x
term(a)~{tok_id() == 'my_elm'}
term(b)={has(tok_groups(), "my_group")}
</pre>
<p>
An IDL intent defines a match between the parsed user input, represented as a collection of
<a class="not-code" target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">tokens</a>,
and the user-defined callback method. IDL intents are <a href="#binding">bound</a> to their callbacks via <a href="#binding">Java
annotations</a> and can be <a href="#idl_location">located</a> in the same Java annotations, placed in the model YAML/JSON file,
or kept in external <code>*.idl</code> files.
</p>
<p>
You can review the formal
<a target="github" href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft/src/main/scala/org/apache/nlpcraft/model/intent/compiler/antlr4/NCIdl.g4">ANTLR4 grammar</a> for IDL,
but here are the general properties of IDL:
</p>
<ul>
<li>
IDL has a
<a target="wiki" href="https://en.wikipedia.org/wiki/Context-free_grammar">context-free grammar</a>. In practical terms,
all whitespace outside of string literals is ignored.
</li>
<li>
IDL supports Java-style comments, both single line <code>// Comment.</code> as well as multi-line <code>/* Comment. */</code>.
</li>
<li>
String literals can use either single quotes (<code>'text'</code>) or double quotes (<code>"text"</code>), which simplifies IDL usage in JSON or Java: you
don't have to escape double quotes. Both kinds of quotes can be escaped in a string, i.e. <code>"text with \" quote"</code> or
<code>'text with \' quote'</code>.
</li>
<li>
Built-in literals <code>true</code>, <code>false</code> and <code>null</code> for boolean and null values.
</li>
<li>
Algebraic and logical expressions, including operator precedence, follow standard Java language conventions.
</li>
<li>
Both integer and real numeric literals can use underscore <code>'_'</code> character for separation as in <code>200_000</code>.
</li>
<li>
Numeric literals use Java string conversions.
</li>
<li>
IDL has only 10 reserved keywords: <code>flow fragment import intent meta options term true false null</code>
</li>
<li>
Identifiers and literals can use the same Unicode space as Java.
</li>
<li>
IDL provides over 140 <a href="#idl_functions">built-in functions</a> to aid in intent matching. IDL functions are pure immutable mathematical functions
that work on a runtime stack. In other words, they look like Python functions: IDL <code>length(trim(" text "))</code> vs.
OOP-style <code>" text ".trim().length()</code>.
</li>
<li>
IDL is a lazily evaluated language, i.e. expressions are evaluated only when required at runtime. That
means that the logical AND and OR operators, which are evaluated left to right, skip their right-hand expressions if the left-hand result
determines the overall result (so-called short-circuit evaluation). Some IDL functions like
<code>if</code> and <code>or_else</code> provide similar short-circuit evaluation.
</li>
</ul>
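<p>
The lazy evaluation described in the last item above can be illustrated outside of IDL. The following is a minimal Java sketch, not NLPCraft code: the class <code>LazyEvalSketch</code> and its methods are hypothetical names, and operands are modeled as <code>Supplier</code> instances so that an operand is computed only when its value is actually needed.
</p>

```java
import java.util.function.Supplier;

// Hypothetical illustration class; it is not part of NLPCraft.
class LazyEvalSketch {
    // Counts how often the "expensive" operand was really evaluated.
    static int evaluations = 0;

    // Stand-in for a costly sub-expression.
    static boolean expensive() {
        evaluations++;
        return true;
    }

    // Logical AND: the right operand is skipped when the left one is false.
    static boolean and(Supplier<Boolean> left, Supplier<Boolean> right) {
        return left.get() && right.get();
    }

    // Logical OR: the right operand is skipped when the left one is true.
    static boolean or(Supplier<Boolean> left, Supplier<Boolean> right) {
        return left.get() || right.get();
    }

    // Lazy 'if': only the selected branch is evaluated.
    static <T> T lazyIf(Supplier<Boolean> cond, Supplier<T> thenBranch, Supplier<T> elseBranch) {
        return cond.get() ? thenBranch.get() : elseBranch.get();
    }

    public static void main(String[] args) {
        boolean a = and(() -> false, LazyEvalSketch::expensive);
        boolean b = or(() -> true, LazyEvalSketch::expensive);
        String c = lazyIf(() -> true, () -> "then", () -> "else");
        // The expensive operand was never needed, so the counter stays at zero.
        System.out.println(a + " " + b + " " + c + " evaluations=" + evaluations);
    }
}
```

<p>
In this sketch the counter only grows when the left operand leaves the result undecided, which is the behavior the IDL operators are described as having.
</p>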
<p>
An IDL program consists of
<a href="#intent_statement">intent</a>,
<a href="#fragment_statement">fragment</a>, or
<a href="#import_statement">import</a> statements in any order or combination:
</p>
<ul>
<li>
<p id="intent_statement">
<b><code>intent</code> statement</b>
</p>
<p>
An intent is defined as one or more terms. Each term is a predicate over an instance of the
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a> interface.
For an intent to match, all of its terms have to evaluate to true.
Intent definition can be informally explained using the following full-feature example:
</p>
<pre class="brush: idl">
intent=xa
flow="^(?:login)(^:logout)*$"
meta={'enabled': true}
term(a)={month() >= 6 && !(tok_id()) != "z" && meta_intent('enabled') == true}[1,3]
term(b)~{
@tokId = tok_id()
@usrTypes = meta_model('user_types')
(@tokId == 'order' || @tokId == 'order_cancel') && has_all(@usrTypes, list(1, 2, 3))
}
intent=xb
options={
'ordered': false,
'unused_free_words': true,
'unused_sys_toks': true,
'unused_usr_toks': false,
'allow_stm_only': false
}
flow=/#flowModelMethod/
term(a)=/org.mypackage.MyClass#termMethod/?
fragment(frag, {'p1': 25, 'p2': {'a': false}})
</pre>
<p><b>NOTES:</b></p>
<dl>
<dt>
<code>intent=xa</code> <sup><small>line 1</small></sup><br/>
<code>intent=xb</code> <sup><small>line 12</small></sup>
</dt>
<dd>
<code>xa</code> and <code>xb</code> are the mandatory intent IDs. Intent ID is any arbitrary unique string matching the following lexer template:
<code>(UNI_CHAR|UNDERSCORE|LETTER|DOLLAR)+(UNI_CHAR|DOLLAR|LETTER|[0-9]|COLON|MINUS|UNDERSCORE)*</code>
</dd>
<dt><code>options={...}</code> <sup><small>line 13</small></sup></dt>
<dd>
<em>Optional.</em>
Matching options specified as a JSON object. The entire JSON object as well as each individual JSON field is optional. They allow
you to customize the matching algorithm for this intent:
<table class="gradient-table">
<thead>
<tr>
<td>Option</td>
<td>Type</td>
<td>Description</td>
<td>Default Value</td>
</tr>
</thead>
<tbody>
<tr>
<td><code>ordered</code></td>
<td><code>Boolean</code></td>
<td>
<p>
Whether or not this intent is ordered.
For ordered intent the specified order of terms is important for matching this intent.
If intent is unordered its terms can be found in any order in the input text.
Note that ordered intent significantly limits the user input it can match. In most cases
the ordered intent is only applicable to processing of a formal grammar (like a programming language)
and mostly unsuitable for the natural language processing.
</p>
<p>
Note that while the <code>ordered</code> flag affects the entire intent and all its
terms, you can define an individual term that depends on the position of the token.
This, in fact, allows you to have a subset of terms that are order-dependent. See
the following <a href="#idl_functions">IDL functions</a> for details:
</p>
<ul>
<li><code>tok_index()</code></li>
<li><code>tok_all()</code></li>
<li><code>tok_count()</code></li>
<li><code>tok_is_last()</code></li>
<li><code>tok_is_first()</code></li>
<li><code>tok_is_before_id()</code></li>
<li><code>tok_is_before_parent()</code></li>
<li><code>tok_is_before_group()</code></li>
<li><code>tok_is_after_id()</code></li>
<li><code>tok_is_after_parent()</code></li>
<li><code>tok_is_after_group()</code></li>
</ul>
</td>
<td><code>false</code></td>
</tr>
<tr>
<td><code>unused_free_words</code></td>
<td><code>Boolean</code></td>
<td>
Whether or not free words - that are unused by intent matching - should be
ignored (value <code>true</code>) or reject the intent match (value <code>false</code>).
Free words are the words in the user input that were not recognized as any user or system
token. Typically, for the natural language comprehension it is safe to ignore free
words. For the formal grammar, however, this could make the matching logic too loose.
</td>
<td><code>true</code></td>
</tr>
<tr>
<td><code>unused_sys_toks</code></td>
<td><code>Boolean</code></td>
<td>
Whether or not unused <a href="/data-model.html#builtin">system tokens</a> should be
ignored (value <code>true</code>) or reject the intent match (value <code>false</code>).
By default, the unused system tokens are ignored.
</td>
<td><code>true</code></td>
</tr>
<tr>
<td><code>unused_usr_toks</code></td>
<td><code>Boolean</code></td>
<td>
Whether or not unused user-defined tokens should be
ignored (value <code>true</code>) or reject the intent match (value <code>false</code>).
By default, the unused user tokens are not ignored since it is assumed that the user would
define his or her own tokens on purpose and construct the intent logic appropriately.
</td>
<td><code>false</code></td>
</tr>
<tr>
<td><code>allow_stm_only</code></td>
<td><code>Boolean</code></td>
<td>
Whether or not the intent can match when all of the matching tokens came from STM.
By default, this special case is disabled (value <code>false</code>). However, in specific intents
designed for free-form language comprehension scenario, like, for example, SMS messaging - you
may want to enable this option.
</td>
<td><code>false</code></td>
</tr>
</tbody>
</table>
</dd>
<dt>
<code>flow="^(?:login)(^:logout)*$"</code> <sup><small>line 2</small></sup><br/>
<code>flow=/#flowModelMethod/</code> <sup><small>line 20</small></sup>
</dt>
<dd>
<p>
<em>Optional.</em> Dialog flow is a history of previously matched intents to match on. If provided,
the intent will first match on the history of the previously matched intents before processing its
terms. There are two ways to define a match on the dialog flow:
</p>
<ul>
<li>
<p><b>Regular Expression</b></p>
<p>
In this case dialog flow specification is a string with the standard <a target=_blank href="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html">Java regular expression</a>.
The history of previously matched intents is presented as a space separated string of intent IDs that were
selected as the best match during the current conversation, in the chronological order with the most
recent matched intent ID being the first element in the string. Dialog flow regular expression
will be matched against that string representing intent IDs.
</p>
<p>
In line 2, the <code>^(?:login)(^:logout)*$</code> dialog flow regular expression defines that intent
should only match when the immediate previous intent was <code>login</code> and no <code>logout</code> intents
are in the history. If the history is <code>"login order order"</code> - this intent will match. However, for
<code>"login logout"</code> or <code>"order login"</code> history this dialog flow will not match.
</p>
</li>
<li>
<p><b>User-Defined Callback</b></p>
<p>
In this case the dialog flow specification is defined as a callback in the form <code>/x.y.z.Class#method/</code>,
where <code>x.y.z.Class</code> should be a fully qualified name of the class where callback is defined, and
<code>method</code> must be the name of the callback method. This method should take one
parameter of type <code>java.util.List[<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCDialogFlowItem.html">NCDialogFlowItem</a>]</code>
and return <code>boolean</code> result.
</p>
<p>
Class name is optional in which case the model class will be used by default. Note that if the custom class
is in fact specified, the instance of this class will be created for each dialog flow test.
This class must have a no-arg constructor to instantiate via standard Java reflection
and its creation must be as light as possible to avoid performance degradation during its
instantiation. For this reason it is recommended to have the dialog flow callback
on the model class itself which will avoid instantiating the class on each dialog flow evaluation.
</p>
</li>
</ul>
<p>
Note that if dialog flow is defined and it doesn't match the history the terms of the intent won't be tested at all.
</p>
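<p>
To make the regular expression variant more concrete, here is a minimal Java sketch of matching a pattern against a history of intent IDs. It is an illustration under stated assumptions and not the NLPCraft implementation: the class <code>DialogFlowRegexSketch</code> is hypothetical, the pattern is written for this sketch only, and the use of <code>Matcher.find()</code> is assumed. The history is joined into a space separated string with the most recent intent ID first, as described above.
</p>

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; it is not part of NLPCraft.
class DialogFlowRegexSketch {
    // Pattern written for this sketch: the most recent intent must be 'login'
    // and the ID 'logout' must not appear anywhere later in the history string.
    static final Pattern FLOW = Pattern.compile("^login\\b(?!.*\\blogout\\b).*$");

    // Joins intent IDs (most recent first) with spaces and tests the pattern.
    static boolean flowMatches(List<String> history) {
        String flat = String.join(" ", history);
        return FLOW.matcher(flat).find();
    }

    public static void main(String[] args) {
        System.out.println(flowMatches(List.of("login", "order", "order"))); // true
        System.out.println(flowMatches(List.of("login", "logout")));         // false
        System.out.println(flowMatches(List.of("order", "login")));          // false
    }
}
```

<p>
The negative lookahead is one way to express "no such intent in the history"; other formulations are possible, and word boundaries are only an approximation when intent IDs contain characters such as <code>-</code> or <code>:</code>.
</p>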
</dd>
<dt>
<code>meta={'enabled': true}</code> <sup><small>line 3</small></sup>
</dt>
<dd>
<p>
<em>Optional.</em>
Just like most of the components in NLPCraft, the intent can have its own metadata.
Intent metadata is defined as a standard JSON object which will be converted into <code>java.util.Map</code>
instance and can be accessed in intent's terms via
<a href="#idl_functions"><code>meta_intent()</code></a> IDL function.
The typical use case for declarative intent metadata is to parameterize its behavior, i.e. the behavior of its terms,
with clearly defined properties that are provided right inside of the intent definition itself.
</p>
</dd>
<dt>
<code>term(a)={month() >= 6 && !(tok_id()) != "z" && meta_intent('enabled') == true}[1,3]</code> <sup><small>line 4</small></sup><br>
<code>term(b)~{</code> <sup><small>line 5</small></sup><br>
<code style="padding-left: 20px">@tokId = tok_id()</code><br>
<code style="padding-left: 20px">@usrTypes = meta_model('user_types')</code><br>
<code style="padding-left: 20px">(@tokId == 'order' || @tokId == 'order_cancel') && has_all(@usrTypes, list(1, 2, 3))</code><br>
<code>}</code><br>
<code>term(a)=/org.mypackage.MyClass#termMethod/?</code> <sup><small>line 21</small></sup>
</dt>
<dd>
<p>
Term is a building block of the intent. Intent must have at least one term.
A term has an optional ID, a token predicate and an optional quantifier.
It supports conversation context if it uses the <code>'~'</code> symbol and does not if it uses the <code>'='</code>
symbol in its definition. For the conversational term the system will search for a match using tokens from
the current request as well as the tokens from conversation STM (short-term-memory). For a non-conversational
term - only tokens from the current request will be considered.
</p>
<p>
A term is matched if its token predicate returns true.
The matched term represents one or more tokens, sequential or not, that were detected in the user input. Intent has a list of terms
(always at least one) that all have to be matched in the user input for the intent to match. Note that term
can be optional if its min quantifier is zero. Whether or not the order of the terms is important
for matching is governed by intent's <code>ordered</code> parameter.
</p>
<p>
Term ID (<code>a</code> and <code>b</code>) is optional. It is only required by
<a href="#binding"><code>@NCIntentTerm</code></a>
annotation to link term's tokens to a formal parameter of the callback method. Note that term ID follows
the same lexical rules as intent ID.
</p>
<p>
Term's body can be defined in two ways:
</p>
<ul>
<li>
<p><b>IDL Expression</b></p>
<p>
Inside of curly brackets <code>{</code> <code>}</code> you can have an optional list of term variables
and the mandatory term expression that must evaluate to a boolean value. Term variable name must start with
<code>@</code> symbol and be unique within the scope of the current term. All term variables must be defined
and initialized before term expression which must be the last statement in the term:
</p>
<pre class="brush: idl">
term(b)~{
@a = meta_model('a')
@lst = list(1, 2, 3, 4)
has_all(@lst, list(@a, 2))
}
</pre>
<p>
Term variable initialization expression as well as term's expression follow
<em>Java-like expression grammar</em> including precedence rules, brackets and logical combinators, as well as
built-in <a href="#idl_functions">IDL functions</a> calls:
</p>
<pre class="brush: idl">
term={true} // Special case of 'constant' term.
term={
// Variable declarations.
@a = round(1.25)
@b = meta_model('my_prop')
// Last expression must evaluate to boolean.
(@a + 2) * @b > 0
}
term={
// Variable declarations.
@c = meta_tok('prop')
@lst = list(1, 2, 3)
// Last expression must evaluate to boolean.
abs(@c) > 1 && size(@lst) != 5
}
</pre>
<div class="bq info">
<p>
<b>NOTE:</b> while term variable initialization expressions can have any type - the
term's expression itself, i.e. the last expression in the term's body, <em>must evaluate to a boolean result only.</em>
Failure to do so will result in a runtime exception during intent evaluation. Note also
that such errors cannot be detected during intent compilation phase.
</p>
</div>
</li>
<li>
<p><b>User-Defined Callback</b></p>
<p>
In this case the term's body is defined as a callback in the form <code>/x.y.z.Class#method/</code>,
where <code>x.y.z.Class</code> should be a fully qualified name of the class where callback is defined, and
<code>method</code> must be the name of the callback method. This method should take one
parameter of type <code><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCTokenPredicateContext.html">NCTokenPredicateContext</a></code>
and return an instance of <code><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCTokenPredicateResult.html">NCTokenPredicateResult</a></code>
as its result:
</p>
<pre class="brush: idl">
term(a)=/org.mypackage.MyClass#termMethod/?
</pre>
<p>
Class name is optional in which case the model class will be used by default. Note that if the custom class
is in fact specified, the instance of this class will be created for each term evaluation.
This class must have a no-arg constructor to instantiate via standard Java reflection
and its creation must be as light as possible to avoid performance degradation during its
instantiation. For this reason it is recommended to have the user-defined term callback
on the model class itself which will avoid instantiating the class on each term evaluation.
</p>
</li>
</ul>
<p>
<code>?</code> and <code>[1,3]</code> define an inclusive quantifier for that term, i.e. how many times
the match for this term should be found. You can use the following quick abbreviations:
</p>
<ul class="recover-bottom-margin">
<li><code>*</code> is equal to <code>[0,∞]</code></li>
<li><code>+</code> is equal to <code>[1,∞]</code></li>
<li><code>?</code> is equal to <code>[0,1]</code></li>
<li>No quantifier defaults to <code>[1,1]</code></li>
</ul>
<p>
As mentioned above the quantifier is inclusive, i.e. the <code>[1,3]</code> means that
the term should appear once, two times or three times.
</p>
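<p>
The quantifier abbreviations listed above can be read as inclusive <code>[min,max]</code> ranges. The following Java sketch shows that reading in a simplified form; the class <code>QuantifierSketch</code> and its methods are hypothetical and do not mirror the NLPCraft matching engine, and <code>Integer.MAX_VALUE</code> is used here as a stand-in for infinity.
</p>

```java
// Hypothetical illustration class; it is not part of NLPCraft.
class QuantifierSketch {
    // Converts a quantifier string into an inclusive {min, max} pair.
    static int[] parse(String q) {
        switch (q) {
            case "":  return new int[] {1, 1};                 // no quantifier
            case "*": return new int[] {0, Integer.MAX_VALUE};
            case "+": return new int[] {1, Integer.MAX_VALUE};
            case "?": return new int[] {0, 1};
            default:
                // Explicit "[min,max]" form, e.g. "[1,3]".
                String[] parts = q.substring(1, q.length() - 1).split(",");
                return new int[] {
                    Integer.parseInt(parts[0].trim()),
                    Integer.parseInt(parts[1].trim())
                };
        }
    }

    // Checks whether a number of matches falls into the inclusive range.
    static boolean satisfied(String q, int matchCount) {
        int[] range = parse(q);
        return matchCount >= range[0] && matchCount <= range[1];
    }

    public static void main(String[] args) {
        System.out.println(satisfied("[1,3]", 3)); // true: upper bound is inclusive
        System.out.println(satisfied("[1,3]", 0)); // false: at least one match needed
        System.out.println(satisfied("?", 0));     // true: the term is optional
    }
}
```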
</dd>
<dt>
<code>fragment(frag, {'p1': 25, 'p2': {'a': false}})</code> <sup><small>line 22 </small></sup><br>
</dt>
<dd>
<p>
A fragment reference allows you to insert the terms defined by that fragment in place of the fragment reference.
A fragment reference has a mandatory fragment ID parameter and an optional second JSON parameter. The optional
JSON parameter allows you to parameterize the inserted terms' behavior and it is available to the
terms via <code>meta_frag()</code> <a href="#idl_functions">IDL function.</a>
</p>
</dd>
</dl>
</li>
<li>
<p id="fragment_statement">
<b><code>fragment</code> statement</b>
</p>
<p>
Fragments allow you to group and name a set of reusable terms. Such groups can be further parameterized
at the place of reference and enable the reuse of one or more terms by multiple intents. For
example:
</p>
<pre class="brush: idl, highlight: [2, 3, 15, 18]">
// Fragments.
fragment=buzz term~{tok_id() == meta_frag('id')}
fragment=when
term(nums)~{
// Term variable.
@type = meta_tok('nlpcraft:num:unittype')
@iseq = meta_tok('nlpcraft:num:isequalcondition')
tok_id() == 'nlpcraft:num' && @type == 'datetime' && @iseq == true
}[0,7]
// Intents.
intent=alarm
// Insert parameterized terms from fragment 'buzz'.
fragment(buzz, {"id": "x:alarm"})
// Insert terms from fragment 'when'.
fragment(when)
</pre>
<p><b>NOTES:</b></p>
<ul class="recover-bottom-margin">
<li>
Fragment statements (lines 2 and 3) have a name (<code>buzz</code> and <code>when</code>) and a list of terms.
</li>
<li>
Terms follow the same syntax as in intent definition.
</li>
<li>
When a fragment is referenced in intent (lines 15 and 18) it is replaced with its terms.
</li>
</ul>
</li>
<li>
<p id="import_statement">
<b><code>import</code> statement</b>
</p>
<p>
The import statement allows you to import IDL declarations from a local file, a classpath resource or a URL:
</p>
<pre class="brush: idl">
// Import using absolute path.
import('/opt/globals.idl')
// Import using classpath resource.
import('org/apache/nlpcraft/examples/alarm/intents.idl')
// Import using URL.
import('ftp://user:password@myhost:22/opt/globals.idl')
</pre>
<p>
<b>NOTES:</b>
</p>
<ul class="recover-bottom-margin">
<li>
The effect of importing is the same as if the imported declarations were inserted in place of import
statement.
</li>
<li>
Recursive and cyclic imports are detected and safely ignored.
</li>
<li>
Import statement starts with <code>import</code> keyword and has a string parameter that indicates
the location of the resource to import.
</li>
<li>
For the classpath resource you don't need to specify leading forward slash.
</li>
</ul>
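<p>
The notes above state that an import behaves like textual insertion and that recursive and cyclic imports are ignored. One common way to get that behavior is to track already visited locations. The Java sketch below illustrates the idea on an in-memory map of hypothetical files; the class <code>ImportSketch</code>, the simplified line-based <code>import('...')</code> detection and the map of files are all assumptions made for this example and are not how the NLPCraft compiler is implemented.
</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; it is not part of NLPCraft.
class ImportSketch {
    // Expands imports in place. 'files' maps a location to its lines,
    // 'visited' collects every location that was already expanded.
    static List<String> resolve(String location, Map<String, List<String>> files, Set<String> visited) {
        List<String> out = new ArrayList<>();
        // A location seen before is skipped: this ends recursive and cyclic imports.
        if (!visited.add(location))
            return out;
        for (String line : files.getOrDefault(location, List.of())) {
            String t = line.trim();
            if (t.startsWith("import('") && t.endsWith("')")) {
                // Strip the leading "import('" (8 characters) and the trailing "')".
                String target = t.substring(8, t.length() - 2);
                out.addAll(resolve(target, files, visited));
            }
            else
                out.add(line);
        }
        return out;
    }
}
```

<p>
With two files that import each other, each declaration ends up in the result exactly once, in the position where its import statement appeared.
</p>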
<p></p>
</li>
</ul>
<h2 class="section-sub-title">Intent Lifecycle <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
When the NLPCraft data probe starts, it scans the models provided in its configuration for intents. The
scanning process goes through JSON/YAML external configurations as well as model classes when looking
for IDL intents. All found intents are compiled into an internal representation before the data probe
completes its start up sequence.
</p>
<p>
Note that not all intent problems can be detected at the compilation phase, and the probe can start with intents
not being completely validated. For example, each term in the intent must evaluate to a boolean result. This can
only be checked at runtime. Another example is the number and the types of parameters passed into IDL function
which is only checked at runtime as well.
</p>
<p>
Intents are compiled only once during the data probe start up sequence and cannot be re-compiled
without data probe restart. Model logic, however, can affect the intent behavior through <a href="/data-model.html#callbacks">model callbacks</a>,
<a target=_ href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getMetadata()">model metadata</a>,
user and company metadata, as well as request data all of which can change at runtime and
are accessible through <a href="#idl_functions">IDL functions.</a>
</p>
<h2 class="section-sub-title">Intent Examples <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
Here are a few intent examples with explanations:
</p>
<p>
<b>Example 1:</b>
</p>
<pre class="brush: idl">
intent=a
term~{tok_id() == 'x:id'}
term(nums)~{tok_id() == 'nlpcraft:num' && lowercase(meta_tok('nlpcraft:num:unittype')) == 'datetime'}[0,2]
</pre>
<p><b>NOTES:</b></p>
<ul>
<li>
Intent has ID <code>a</code>.
</li>
<li>
Intent uses default conversational support (<code>true</code>) and default order (<code>false</code>).
</li>
<li>
Intent has two conversational terms (<code>~</code>) that have to be found for the intent to match. Note that second
term is optional as it has <code>[0,2]</code> quantifier.
</li>
<li>
Both terms have to match for the intent to match; the second term also matches when no such token is present because its minimum quantifier is zero.
</li>
<li>
First term matches any single token with ID <code>x:id</code>.
</li>
<li>
Second term can appear zero, one or two times and it matches a token with ID <code>nlpcraft:num</code> with
<code>nlpcraft:num:unittype</code> metadata property equal to <code>'datetime'</code> string.
</li>
<li>
IDL function <code>lowercase</code> is applied to the <code>nlpcraft:num:unittype</code> metadata property value before the comparison.
</li>
<li>
Note that since the second term has an ID (<code>nums</code>) it can be referenced by the <code>@NCIntentTerm</code>
annotation on the callback formal parameter.
</li>
</ul>
<br/>
<p>
<b>Example 2:</b>
</p>
<pre class="brush: idl">
intent=id2
flow='id1 id2'
term={tok_id() == 'mytok' && signum(get(meta_tok('score'), 'best')) != -1}
term={has_any(tok_groups(), list('actors', 'owners')) && size(meta_part('partAlias', 'text')) > 10}
</pre>
<p><b>NOTES:</b></p>
<ul>
<li>
Intent has ID <code>id2</code>.
</li>
<li>
Intent has dialog flow pattern: <code>'id1 id2'</code>. It expects the sequence of intents <code>id1</code> and
<code>id2</code> somewhere in the history of previously matched intents in the course of the current conversation.
</li>
<li>
Intent has two non-conversational terms (<code>=</code>). Both terms have to be present only once (their implicit quantifiers are <code>[1,1]</code>).
</li>
<li>
Both terms have to be found in the user input for the intent to match.
</li>
<li>
First term should be a token with ID <code>mytok</code> and have metadata property <code>score</code> of type
map. This map should have a value with the string key <code>'best'</code>. <code>signum</code> of this map value
should not equal <code>-1</code>. Note that <code>meta_tok()</code>, <code>get()</code> and
<code>signum()</code> are all built-in <a href="#idl_functions">IDL functions</a>.
</li>
<li>
Second term should be a token that belongs to either <code>actors</code> or <code>owners</code> group.
It should have a part token with alias <code>partAlias</code>. That
part token should have metadata property <code>text</code> of type string, list or map. The length of
this string, list or map should be greater than <code>10</code>.
</li>
</ul>
<h2 id="idl_functions" class="section-title">IDL Functions <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
IDL provides over 140 built-in functions that can be used in IDL intent definitions.
IDL function call takes on traditional
<code><b>fun_name</b>(p1, p2, ... pk)</code> syntax form.
An IDL function operates on a stack - its parameters
are taken from the stack and its result is put back onto stack which in turn can become a parameter for the next function
call and so on. IDL functions can have zero or more parameters and always have one result value (i.e. no pure side-effect functions). Some IDL
functions support variable number of parameters. Note that you cannot define your own functions in IDL - in such
cases you need to use the term with the user-defined callback method.
</p>
<p>
When chaining the function
calls IDL uses mathematical notation (a-la Python) rather than object-oriented one: IDL <code>length(trim(" text "))</code> vs. OOP-style <code>" text ".trim().length()</code>.
</p>
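<p>
The stack-based description above can be pictured with a small Java sketch that evaluates the equivalent of <code>length(trim(" text "))</code> step by step. This is only an illustration of the idea: the class <code>StackEvalSketch</code> is hypothetical and the real IDL runtime is not shown here.
</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration class; it is not part of NLPCraft.
class StackEvalSketch {
    // Evaluates the equivalent of length(trim(text)) on an explicit stack.
    static long lengthOfTrimmed(String text) {
        Deque<Object> stack = new ArrayDeque<>();
        // The literal is pushed first.
        stack.push(text);
        // trim: pops its parameter and pushes its result.
        stack.push(((String) stack.pop()).trim());
        // length: pops the trimmed string and pushes its length as a Long.
        stack.push((long) ((String) stack.pop()).length());
        // The single remaining value is the result of the whole expression.
        return (Long) stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(lengthOfTrimmed(" text ")); // 4
    }
}
```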
<p>
IDL functions operate with the following types:
</p>
<table class="gradient-table">
<thead>
<tr>
<th>JVM Type</th>
<th>IDL Name</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr><td><code>java.lang.String</code></td><td><code>String</code></td><td></td></tr>
<tr>
<td>
<code>java.lang.Long</code><br/>
<code>java.lang.Integer</code><br/>
<code>java.lang.Short</code><br/>
<code>java.lang.Byte</code>
</td>
<td><code>Long</code></td>
<td>
Smaller numerical types will be converted to <code>java.lang.Long</code>.
</td>
</tr>
<tr>
<td>
<code>java.lang.Double</code><br/>
<code>java.lang.Float</code>
</td>
<td><code>Double</code></td>
<td>
<code>java.lang.Float</code> will be converted to <code>java.lang.Double</code>.
</td>
</tr>
<tr><td><code>java.lang.Boolean</code></td><td><code>Boolean</code></td><td>You can use <code><b>true</b></code> or <code><b>false</b></code> literals.</td></tr>
<tr><td><code>java.util.List&lt;T&gt;</code></td><td><code>List[T]</code></td><td>Use <code>list(...)</code> IDL function to create new list.</td></tr>
<tr><td><code>java.util.Map&lt;K,V&gt;</code></td><td><code>Map[K,V]</code></td><td></td></tr>
<tr><td><code><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a></code></td><td><code>Token</code></td><td></td></tr>
<tr><td><code>java.lang.Object</code></td><td><code>Any</code></td><td>Any of the supported types above. Use <code><b>null</b></code> literal for null value.</td></tr>
</tbody>
</table>
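<p>
The numeric conversions listed in the table can be sketched in Java as follows. The class <code>TypeNormalizationSketch</code> and its <code>normalize</code> method are hypothetical names used for illustration and are not the NLPCraft implementation; the sketch only mirrors the two rules stated in the table, namely that smaller integer types become <code>java.lang.Long</code> and <code>java.lang.Float</code> becomes <code>java.lang.Double</code>.
</p>

```java
// Hypothetical illustration class; it is not part of NLPCraft.
class TypeNormalizationSketch {
    // Widens numeric values as described in the table; other values pass through.
    static Object normalize(Object v) {
        if (v instanceof Byte || v instanceof Short || v instanceof Integer)
            return ((Number) v).longValue();   // boxed as java.lang.Long
        if (v instanceof Float)
            return ((Number) v).doubleValue(); // boxed as java.lang.Double
        return v;
    }

    public static void main(String[] args) {
        System.out.println(normalize(5).getClass().getName());    // java.lang.Long
        System.out.println(normalize(1.5f).getClass().getName()); // java.lang.Double
        System.out.println(normalize("x").getClass().getName());  // java.lang.String
    }
}
```

<p>
When placing values into metadata containers it can help to apply the same widening up front, so that comparisons in IDL expressions see the types listed above.
</p>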
<p>
Some IDL functions are polymorphic, i.e. they can accept arguments and return result of multiple types.
Encountering unsupported types will result in a runtime error during intent matching. It is
especially important to watch out for the types when adding objects to various metadata containers
and using that metadata in the IDL expressions.
</p>
<div class="bq warn">
<p><b>Unsupported Types</b></p>
<p>
Detection of the unsupported types by IDL functions cannot be done during IDL compilation and
can <em>only be done during runtime execution</em>. This means that even though the data probe compiles IDL
intents and starts successfully - it does not guarantee that intents will operate correctly.
</p>
</div>
<p>
All IDL functions are organized into the following groups:
</p>
<nav>
<div class="nav nav-tabs" role="tablist">
<a class="nav-item nav-link active" data-toggle="tab" href="#fn_token" role="tab">Token</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_text" role="tab">Text</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_math" role="tab">Math</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_collection" role="tab">Collection</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_metadata" role="tab">Metadata</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_datetime" role="tab">Date <span class="amp">&amp;</span> Time</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_req" role="tab">Request</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_user" role="tab">User</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_company" role="tab">Company</a>
<a class="nav-item nav-link" data-toggle="tab" href="#fn_other" role="tab">Other</a>
</div>
</nav>
<div class="tab-content">
<div class="tab-pane fade show active" id="fn_token" role="tabpanel">
<div class="accordion" id="token_fns">
{% for fn in site.data.idl-fns.fn-token %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#token_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_text" role="tabpanel">
<div class="accordion" id="text_fns">
{% for fn in site.data.idl-fns.fn-text %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#text_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_math" role="tabpanel">
<div class="accordion" id="math_fns">
{% for fn in site.data.idl-fns.fn-math %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#math_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_collection" role="tabpanel">
<div class="accordion" id="collections_fns">
{% for fn in site.data.idl-fns.fn-collections %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#collections_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_metadata" role="tabpanel">
<div class="accordion" id="metadata_fns">
{% for fn in site.data.idl-fns.fn-metadata %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#metadata_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_datetime" role="tabpanel">
<div class="accordion" id="datetime_fns">
{% for fn in site.data.idl-fns.fn-datetime %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#datetime_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_req" role="tabpanel">
<div class="accordion" id="req_fns">
{% for fn in site.data.idl-fns.fn-req %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#req_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_user" role="tabpanel">
<div class="accordion" id="user_fns">
{% for fn in site.data.idl-fns.fn-user %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#user_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_company" role="tabpanel">
<div class="accordion" id="company_fns">
{% for fn in site.data.idl-fns.fn-company %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#company_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
<div class="tab-pane fade show" id="fn_other" role="tabpanel">
<div class="accordion" id="other_fns">
{% for fn in site.data.idl-fns.fn-other %}
<div class="card">
<div class="card-header">
<h2 class="mb-0">
<button class="btn btn-link btn-block text-left" type="button" data-toggle="collapse" data-target="#fn_{{fn.name}}">
<span><code>{{fn.sig}}</code></span>
<span class="fn-short-desc">{{fn.synopsis}}</span>
</button>
</h2>
</div>
<div id="fn_{{fn.name}}" class="collapse" data-parent="#other_fns">
<div class="card-body">
<p class="fn-desc">
<em>Description:</em><br>
{{fn.desc}}
</p>
<p class="fn-usage">
<em>Usage:</em><br>
</p>
<pre class="brush:idl">{{fn.usage}}</pre>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
</div>
</section>
<section>
<h2 id="idl_location" class="section-title">IDL Location <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
IDL declarations can be placed in different locations based on user preferences:
</p>
<ul>
<li>
<p>
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntent.html">@NCIntent</a> annotation
takes a string parameter that must be a valid IDL declaration. For example, in Scala:
</p>
<pre class="brush: scala, highlight: [1, 2]">
&#64;NCIntent("import('/opt/myproj/global_fragments.idl')") // Importing.
&#64;NCIntent("intent=act term(act)={has(tok_groups(), 'act')} fragment(f1)") // Defining in place.
def onMatch(
&#64;NCIntentTerm("act") actTok: NCToken,
&#64;NCIntentTerm("loc") locToks: List[NCToken]
): NCResult = {
...
}
</pre>
</li>
<li>
<p>
External JSON/YAML <a href="/data-model.html#config">data model configuration</a> can provide one or more
IDL declarations in <code>intents</code> field. For example:
</p>
<pre class="brush: js, highlight: [7]">
{
"id": "nlpcraft.alarm.ex",
"name": "Alarm Example Model",
.
.
.
"intents": [
"import('/opt/myproj/global_fragments.idl')", // Importing.
"import('/opt/myproj/my_intents.idl')", // Importing.
"intent=alarm term~{tok_id()=='x:alarm'}" // Defining in place.
]
}
</pre>
</li>
<li>
External <code>*.idl</code> files contain IDL declarations and can be imported in any other place where
IDL declarations are allowed. See <code>import()</code> statement explanation below. For example:
<pre class="brush: idl">
/*
* File 'my_intents.idl'.
* ======================
*/
import('/opt/globals.idl') // Import global intents and fragments.
// Fragments.
// ----------
fragment=buzz term~{tok_id() == 'x:alarm'}
fragment=when
term(nums)~{
// Term variables.
@type = meta_tok('nlpcraft:num:unittype')
@iseq = meta_tok('nlpcraft:num:isequalcondition')
tok_id() == 'nlpcraft:num' && @type != 'datetime' && @iseq == true
}[0,7]
// Intents.
// --------
intent=alarm
fragment(buzz)
fragment(when)
</pre>
</li>
</ul>
</section>
<section id="binding">
<h2 class="section-title">Binding Intent <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
IDL intents must be bound to their callback methods. This binding is accomplished using the
following Java annotations:
</p>
<table class="gradient-table">
<thead>
<tr>
<th>Annotation</th>
<th>Target</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntent.html">@NCIntent</a></td>
<td>Callback method or model class</td>
<td>
<p>
When applied to a method, this annotation defines an IDL intent in-place on the method
serving as its callback.
This annotation can also be applied to a model's class, in which case it only declares the intent
without binding it, and the
callback method will need to use <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentRef.html">@NCIntentRef</a> annotation to actually bind it to the
declared intent. Note that multiple intents can be bound to the same callback method, but only
one callback method can be bound to a given intent.
</p>
<p>
This approach is ideal for simple intents and quick declarations, and it keeps the IDL
right next to the code that handles it. However, a multi-line IDL declaration can be awkward
to add and maintain in JVM languages that lack multi-line string literals. In such
cases it is advisable to move IDL declarations into a separate <code>*.idl</code> file or files
and import them either in the JSON/YAML model or at the model class level.
</p>
</td>
</tr>
<tr>
<td><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentRef.html">@NCIntentRef</a></td>
<td>Callback method</td>
<td>
This annotation references an intent defined elsewhere, such as in an external JSON or YAML
model definition, a <code>*.idl</code> file, or another <code>@NCIntent</code> annotation. In real
applications, this is the most common way to bind an externally defined intent to its callback method.
</td>
</tr>
<tr>
<td><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentTerm.html">@NCIntentTerm</a></td>
<td>Callback method parameter</td>
<td>
This annotation marks a formal callback method parameter to receive term's tokens when the intent
to which this term belongs is selected as the best match.
</td>
</tr>
<tr>
<td><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentSample.html">@NCIntentSample</a></td>
<td>Callback method</td>
<td>
This annotation provides one or more samples of the input that the associated intent should match.
Although this annotation is optional, it's <b>highly recommended</b> to provide at least several samples per intent. There's no upper
limit on how many samples can be provided, and typically the more samples the better for the built-in tools.
These samples serve as documentation and are also used by the built-in model <a href="/tools/test_framework.html">auto-validation</a>
and <a href="/tools/syn_tool.html">synonym suggesting</a> tools.
</td>
</tr>
<tr>
<td><a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentSampleRef.html">@NCIntentSampleRef</a></td>
<td>Callback method</td>
<td>
This annotation loads samples of the input that the associated intent should match from external
sources such as a local file, classpath resource or URL.
Although this annotation is optional, it's <b>highly recommended</b> to provide at least several samples per intent. There's no upper
limit on how many samples can be provided, and typically the more samples the better for the built-in tools.
These samples serve as documentation and are also used by the built-in model <a href="/tools/test_framework.html">auto-validation</a>
and <a href="/tools/syn_tool.html">synonym suggesting</a> tools.
</td>
</tr>
</tbody>
</table>
<p>
Here are a couple of examples that illustrate the basics of intent declaration and usage.
</p>
<p>
An intent from
<a href="examples/light_switch.html">Light Switch</a> Scala example:
</p>
<pre class="brush: scala">
&#64;NCIntent("intent=act term(act)={has(tok_groups(), 'act')} term(loc)={trim(tok_id()) == 'ls:loc'}*")
&#64;NCIntentSample(Array(
"Turn the lights off in the entire house.",
"Switch on the illumination in the master bedroom closet.",
"Get the lights on.",
"Please, put the light out in the upstairs bedroom.",
"Set the lights on in the entire house.",
"Turn the lights off in the guest bedroom.",
"Could you please switch off all the lights?",
"Dial off illumination on the 2nd floor.",
"Please, no lights!",
"Kill off all the lights now!",
"No lights in the bedroom, please."
))
def onMatch(
&#64;NCIntentTerm("act") actTok: NCToken,
&#64;NCIntentTerm("loc") locToks: List[NCToken]
): NCResult = {
...
}
</pre>
<p>
<b>NOTES:</b>
</p>
<ul>
<li>
The intent is defined in-place using <code>@NCIntent</code> annotation.
</li>
<li>
A term match is defined as one or more tokens. Term can be optional if its min quantifier is zero.
</li>
<li>
Intent <code>act</code> has two non-conversational terms, one mandatory and one
that can match zero or more tokens, and uses method <code>onMatch(...)</code> as its callback.
</li>
<li>
A term is conversational if it uses the <code>'~'</code> symbol and non-conversational if it uses the <code>'='</code>
symbol in its definition. If a term is conversational, the matching algorithm will look into the conversation
context, i.e. short-term memory (STM), to seek the matching tokens for this term. Note that terms that were fully or partially matched using tokens from
the conversation context will contribute a smaller weight to the overall intent matching weight since these terms are <em>less specific.</em>
Non-conversational terms will be matched using only tokens found in the current user input, without looking at the conversation context.
</li>
<li>
Method <code>onMatch(...)</code> will be called if and when this intent is selected as the best match.
</li>
<li>
Note that terms have <code>min=1, max=1</code> quantifiers by default, i.e. one and only one.
</li>
<li>
First term defines any single token that belongs to the group <code>act</code>. Note that model elements
can belong to multiple groups.
</li>
<li>
Second term would match zero or more tokens with ID <code>ls:loc</code>. Note that we use function <code>trim</code>
on the token ID.
</li>
<li>
Note that both terms have IDs (<code>act</code> and <code>loc</code>) that are used in <code>onMatch(...)</code>
method parameters to automatically assign terms' tokens to the formal method parameters using <code>@NCIntentTerm</code>
annotations.
</li>
</ul>
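<p>
The difference between conversational and non-conversational terms described in the notes above can be
sketched in plain Java. This is a minimal illustration and not NLPCraft code: the class
<code>ConversationalTermSketch</code>, its <code>TermMatch</code> type, and the use of plain strings in place of
tokens are all assumptions made for this example.
</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types are part of the NLPCraft API.
final class ConversationalTermSketch {
    // Result of matching one term: the token found and where it came from.
    static final class TermMatch {
        final String token;
        final boolean fromStm; // true if the token was taken from the conversation context (STM)

        TermMatch(String token, boolean fromStm) {
            this.token = token;
            this.fromStm = fromStm;
        }
    }

    // Looks for a token satisfying the term predicate. The current input is always
    // searched first; STM is consulted only for conversational ('~') terms.
    static Optional<TermMatch> matchTerm(
        Predicate<String> term, boolean conversational, List<String> input, List<String> stm
    ) {
        for (String tok : input)
            if (term.test(tok))
                return Optional.of(new TermMatch(tok, false));
        if (conversational)
            for (String tok : stm)
                if (term.test(tok))
                    return Optional.of(new TermMatch(tok, true));
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("x:alarm");    // token IDs in the current request
        List<String> stm = Arrays.asList("nlpcraft:num"); // token IDs remembered from earlier requests
        Predicate<String> isNum = "nlpcraft:num"::equals;

        // Conversational term: falls back to STM and finds the token there.
        System.out.println(matchTerm(isNum, true, input, stm).isPresent());
        // Non-conversational term: only the current input is searched.
        System.out.println(matchTerm(isNum, false, input, stm).isPresent());
    }
}
```

<p>
The <code>fromStm</code> flag is where a real matcher would lower the weight of the match, since a term
satisfied from the conversation context is less specific than one satisfied by the current input.
</p>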
<br/>
<p>
In the following <a href="examples/alarm_clock.html">Alarm Clock</a> Java example
the intent is defined in the JSON model definition and referenced in Java code using the <code>@NCIntentRef</code>
annotation:
</p>
<pre class="brush: js, highlight: [19]">
{
"id": "nlpcraft.alarm.ex",
"name": "Alarm Example Model",
"version": "1.0",
"enabledBuiltInTokens": [
"nlpcraft:num"
],
"elements": [
{
"id": "x:alarm",
"description": "Alarm token indicator.",
"synonyms": [
"{ping|buzz|wake|call|hit} {me|up|me up|_}",
"{set|_} {my|_} {wake|wake up|_} {alarm|timer|clock|buzzer|call} {up|_}"
]
}
],
"intents": [
"intent=alarm term~{tok_id()=='x:alarm'} term(nums)~{tok_id() == 'nlpcraft:num' && meta_tok('nlpcraft:num:unittype') == 'datetime' && meta_tok('nlpcraft:num:isequalcondition') == true}[0,7]"
]
}
</pre>
<pre class="brush: java, highlight: [1]">
&#64;NCIntentRef("alarm")
&#64;NCIntentSample({
"Ping me in 3 minutes",
"Buzz me in an hour and 15mins",
"Set my alarm for 30s"
})
private NCResult onMatch(
NCIntentMatch ctx,
&#64;NCIntentTerm("nums") List&lt;NCToken&gt; numToks
) {
...
}
</pre>
<p>
<b>NOTES:</b>
</p>
<ul>
<li>
Intent is defined in the external JSON model declaration (see line 19 in JSON file).
</li>
<li>
This intent is referenced by annotation <code>@NCIntentRef("alarm")</code> with method <code>onMatch(...)</code>
as its callback.
</li>
<li>
This example defines an intent with two conversational terms: the first is mandatory and the
second is optional because its <code>[0,7]</code> quantifier allows zero tokens.
</li>
<li>
A term is conversational if it uses the <code>'~'</code> symbol and non-conversational if it uses the <code>'='</code>
symbol in its definition. If a term is conversational, the matching algorithm will look into the conversation
context, i.e. short-term memory (STM), to seek the matching tokens for this term. Note that terms that were fully or partially matched using tokens from
the conversation context will contribute a smaller weight to the overall intent matching weight since these terms are <em>less specific.</em>
Non-conversational terms will be matched using only tokens found in the current user input, without looking at the conversation context.
</li>
<li>
Method <code>onMatch(...)</code> will be called when this intent is the best match detected.
</li>
<li>
Note that terms have <code>min=1, max=1</code> quantifiers by default.
</li>
<li>
First term is defined as a single mandatory (<code>min=1, max=1</code>) user token with ID <code>x:alarm</code>
whose element is defined in the model.
</li>
<li>
Second term is defined as zero to seven numeric built-in <code>nlpcraft:num</code> tokens that
have unit type of <code>datetime</code> and are single numbers. Note that the <a href="examples/alarm_clock.html">Alarm Clock</a>
model allows zero tokens in this term, which would mean the current time.
</li>
<li>
Given data model definition above the following sentences will be matched by this intent:
<ul>
<li><code>Ping me in 3 minutes</code></li>
<li><code>Buzz me in an hour and 15mins</code></li>
<li><code>Set my alarm for 30s</code></li>
</ul>
</li>
</ul>
</section>
<section id="logic">
<h2 class="section-title">Intent Matching Logic <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
In order to understand the intent matching logic, let's review the overall user request processing workflow:
</p>
<figure>
<img class="img-fluid" style="border: none; padding: 0;" src="/images/intent_matching1.png" alt="">
<figcaption><b>Fig. 1</b> User Request Workflow</figcaption>
</figure>
<ul>
<li>
<b>Step: 0</b><br>
<p>
Server receives REST call <code>/ask</code> or <code>/ask/sync</code> that contains the text
of the sentence that needs to be processed.
</p>
</li>
<li>
<b>Step: 1</b><br>
<p>
At this step the server attempts to find additional variations of the input sentence by substituting
certain words in the original text with synonyms from Google's BERT dataset. Note that server will not use the synonyms that
are already defined in the model itself - it only tries to compensate for the potential incompleteness
of the model. The result of this step is one or more sentences that all have the same meaning as the
original text.
</p>
</li>
<li>
<b>Step: 2</b><br>
<p>
At this step the server takes one or more sentences from the previous step and tokenizes them. This
process involves converting the text into a sequence of enriched tokens representing named entities.
This step also performs the initial server-side enrichment and detection of the
<a href="/data-model.html#builtin">built-in named entities</a>.
</p>
<p>
The result of this step is a sequence of converted sentences, where each element is a sequence
of tokens. These sequences are sent down to the data probe that has the requested data model deployed.
</p>
</li>
<li>
<b>Step: 3</b><br>
<p>
This is the first step of the probe-side processing. At this point the data probe receives one or more
sequences of tokens. The probe then takes each sequence and performs the final enrichment by detecting user-defined
elements in addition to the built-in tokens that were detected on the server during step 2 above.
</p>
</li>
<li>
<b>Step: 4</b><br>
<p>
This is an important step for understanding intent matching logic. At this step the data probe
takes the sequences of tokens generated at the previous step and comes up with one or more parsing
variants. A parsing variant is a sequence of tokens that is free from token overlapping and other parsing
ambiguities. A single sequence of tokens always produces at least one parsing variant and can produce several.
</p>
<p>
Let's consider the input text <code>'A B C D'</code> and the following elements defined in our model:
</p>
<pre class="brush: js">
"elements": [
{
"id": "elm1",
"synonyms": ["A B"]
},
{
"id": "elm2",
"synonyms": ["B C"]
},
{
"id": "elm3",
"synonyms": ["D"]
}
],
</pre>
<p>
All of these elements will be detected but since two of them are overlapping (<code>elm1</code> and
<code>elm2</code>) there should be <b>two</b> parsing variants at the output of this step:
</p>
<ol>
<li><code>elm1</code>('A', 'B') <code>freeword</code>('C') <code>elm3</code>('D')</li>
<li><code>freeword</code>('A') <code>elm2</code>('B', 'C') <code>elm3</code>('D')</li>
</ol>
<p></p>
<p>
Note that at this point the <em>system cannot determine which of these variants is the best one
for matching - there's simply not enough information at this stage</em>. It can only be determined
when each variant is matched against model's intents - which happens in the next step.
</p>
</li>
<li>
<b>Step: 5</b><br>
<p>
At this step the actual matching between intents and variants happens. Each parsing variant from the previous
step is matched against each intent. Each matching pair of a variant and an intent produces a match with a
<em>certain weight</em>. If there are no matches at all, an error is returned. If matches were found, the match
with the biggest weight is selected as the winning match. If multiple matches have the same weight, their
respective variants' weights will be used to further sort them out. Finally, the intent's callback from the winning match is
called.
</p>
<p>
Although the exact weight calculation algorithm is too complex to detail here, these are the general guidelines
on what determines the weight of the match between a parsing variant and the intent. Note that these rules
coalesce around the principal idea that the <b>more specific match always wins</b>:
</p>
<ul>
<li>
A match that captures more tokens has more weight than a match with fewer tokens. As a corollary, a match
with fewer free words (i.e. unused words) has bigger weight than a match with more free words.
</li>
<li>
Tokens for user-defined elements are more important than built-in tokens.
</li>
<li>
A more specific match has bigger weight. In other words, a match that uses a token from the conversation
context (i.e. short-term memory) has less weight than a match that only uses tokens from the current request. In the same
way, older tokens from the conversation give less weight than the more recent ones.
</li>
</ul>
</li>
</ul>
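<p>
The variant generation described in step 4 can be sketched as follows. This is only an illustration under
simplifying assumptions and not NLPCraft's actual algorithm: the class <code>ParsingVariantsSketch</code> and its
methods are hypothetical, each element has a single synonym, and words are compared literally. The sketch
detects every element occurrence and then enumerates all overlap-free combinations to which no further
detected element could be added.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of step 4; this is not NLPCraft's actual algorithm.
final class ParsingVariantsSketch {
    // One detected element covering input words [start, end).
    static final class Match {
        final String id;
        final int start;
        final int end;

        Match(String id, int start, int end) {
            this.id = id;
            this.start = start;
            this.end = end;
        }
    }

    // Finds every place where an element's synonym occurs in the input.
    static List<Match> detect(List<String> words, Map<String, String> synonyms) {
        List<Match> found = new ArrayList<>();
        for (Map.Entry<String, String> e : synonyms.entrySet()) {
            List<String> syn = Arrays.asList(e.getValue().split(" "));
            for (int i = 0; i + syn.size() <= words.size(); i++)
                if (words.subList(i, i + syn.size()).equals(syn))
                    found.add(new Match(e.getKey(), i, i + syn.size()));
        }
        return found;
    }

    // Enumerates all overlap-free variants that cannot accept any further match.
    static List<String> variants(List<String> words, Map<String, String> synonyms) {
        List<String> out = new ArrayList<>();
        build(words, detect(words, synonyms), 0, new ArrayList<>(), out);
        return out;
    }

    private static void build(List<String> words, List<Match> all, int pos, List<Match> chosen, List<String> out) {
        if (pos == words.size()) {
            if (isMaximal(all, chosen))
                out.add(render(words, chosen));
            return;
        }
        // Option 1: use a detected element that starts at this word.
        for (Match m : all)
            if (m.start == pos) {
                chosen.add(m);
                build(words, all, m.end, chosen, out);
                chosen.remove(chosen.size() - 1);
            }
        // Option 2: leave this word as a free word.
        build(words, all, pos + 1, chosen, out);
    }

    // A variant is kept only if every unused match overlaps some chosen match.
    private static boolean isMaximal(List<Match> all, List<Match> chosen) {
        for (Match m : all) {
            if (chosen.contains(m))
                continue;
            boolean overlaps = false;
            for (Match c : chosen)
                if (m.start < c.end && c.start < m.end)
                    overlaps = true;
            if (!overlaps)
                return false;
        }
        return true;
    }

    private static String render(List<String> words, List<Match> chosen) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        int next = 0; // index of the next chosen match (they are ordered by start)
        while (pos < words.size()) {
            if (sb.length() > 0)
                sb.append(' ');
            if (next < chosen.size() && chosen.get(next).start == pos) {
                Match m = chosen.get(next++);
                sb.append(m.id).append("('").append(String.join("', '", words.subList(m.start, m.end))).append("')");
                pos = m.end;
            }
            else {
                sb.append("freeword('").append(words.get(pos)).append("')");
                pos++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> synonyms = new LinkedHashMap<>();
        synonyms.put("elm1", "A B");
        synonyms.put("elm2", "B C");
        synonyms.put("elm3", "D");
        for (String v : variants(Arrays.asList("A", "B", "C", "D"), synonyms))
            System.out.println(v);
    }
}
```

<p>
For the input <code>'A B C D'</code> and the three elements from step 4, the sketch prints the same two
variants listed there. Choosing between them is left to the weight-based matching of step 5.
</p>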
</section>
<section id="intent_callback">
<h2 class="section-title">Intent Callback <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
Whether the intent is defined directly in <code>@NCIntent</code> annotation or indirectly via <code>@NCIntentRef</code>
annotation - it is always bound to a callback method:
</p>
<ul>
<li>
Callback can only be an instance method on the class implementing <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModel.html">NCModel</a>
interface.
</li>
<li>
Method must have return type of <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCResult.html">NCResult</a>.
</li>
<li>
Method can have zero or more parameters:
<ul>
<li>
Parameter of type <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentMatch.html">NCIntentMatch</a>,
if present, must be first.
</li>
<li>
Any other parameters (other than the first optional
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentMatch.html">NCIntentMatch</a>) must
have <code>@NCIntentTerm</code> annotation.
</li>
</ul>
</li>
<li>
Method must support reflection-based invocation.
</li>
</ul>
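<p>
Most of the signature rules above can be checked mechanically with reflection. The following sketch shows one
way to do that; it is an assumption-laden illustration and not how NLPCraft validates callbacks. The nested
types <code>Result</code>, <code>IntentMatch</code>, <code>Token</code> and <code>@Term</code> are hypothetical
stand-ins for <code>NCResult</code>, <code>NCIntentMatch</code>, <code>NCToken</code> and
<code>@NCIntentTerm</code> so that the example compiles without NLPCraft on the classpath. It does not check
that the declaring class implements <code>NCModel</code>.
</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.lang.reflect.Parameter;

// Hypothetical sketch; the nested types only imitate the NLPCraft types named in the text.
final class CallbackRulesSketch {
    static final class Result {}
    static final class IntentMatch {}
    static final class Token {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface Term {
        String value();
    }

    // Returns true if the method satisfies the signature rules listed above.
    static boolean isValidCallback(Method m) {
        if (Modifier.isStatic(m.getModifiers()))
            return false; // must be an instance method
        if (m.getReturnType() != Result.class)
            return false; // must return the result type
        Parameter[] params = m.getParameters();
        for (int i = 0; i < params.length; i++) {
            boolean isCtx = params[i].getType() == IntentMatch.class;
            if (isCtx && i > 0)
                return false; // match context is allowed only in the first position
            if (!isCtx && !params[i].isAnnotationPresent(Term.class))
                return false; // every other parameter needs the term annotation
        }
        return true;
    }

    // Sample callbacks used to exercise the check.
    Result valid(IntentMatch ctx, @Term("act") Token act) { return new Result(); }
    Result ctxNotFirst(@Term("act") Token act, IntentMatch ctx) { return new Result(); }
    Result missingTerm(Token act) { return new Result(); }

    static Method find(String name) {
        for (Method m : CallbackRulesSketch.class.getDeclaredMethods())
            if (m.getName().equals(name))
                return m;
        throw new IllegalArgumentException("No such method: " + name);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"valid", "ctxNotFirst", "missingTerm"})
            System.out.println(name + " -> " + isValidCallback(find(name)));
    }
}
```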
<p>
<code>@NCIntentTerm</code> annotation marks a callback parameter to receive a term's tokens. This annotation can
only be used for the parameters of the callbacks, i.e. methods that are annotated with <code>@NCIntent</code> or
<code>@NCIntentRef</code>. <code>@NCIntentTerm</code> takes a term ID as its only mandatory parameter and
should be applied to callback method parameters to get the tokens associated with that term (if and when
the intent was matched and that callback was invoked).
</p>
<p>
Depending on the term quantifier the method parameter type can only be one of the following types:
</p>
<table class="gradient-table">
<thead>
<tr>
<th>Quantifier</th>
<th>Java Type</th>
<th>Scala Type</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>[1,1]</code></td>
<td><code>NCToken</code></td>
<td><code>NCToken</code></td>
</tr>
<tr>
<td><code>[0,1]</code></td>
<td><code>Optional&lt;NCToken&gt;</code></td>
<td><code>Option[NCToken]</code></td>
</tr>
<tr>
<td><code>[1,∞]</code> or <code>[0,∞]</code></td>
<td><code>java.util.List&lt;NCToken&gt;</code></td>
<td><code>List[NCToken]</code></td>
</tr>
</tbody>
</table>
<p>
For example:
</p>
<pre class="brush: java">
&#64;NCIntent("intent=id term(termId)~{tok_id() == 'my_token'}?")
private NCResult onMatch(
&#64;NCIntentTerm("termId") Optional&lt;NCToken&gt; myTok
) {
...
}
</pre>
<p><b>NOTES:</b></p>
<ul>
<li>
Conversational term <code>termId</code> has <code>[0,1]</code> quantifier (it's optional).
</li>
<li>
The formal parameter on the callback has a type of <code>Optional&lt;NCToken&gt;</code> because the
term's quantifier is <code>[0,1]</code>.
</li>
<li>
Note that callback doesn't have an optional <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentMatch.html">NCIntentMatch</a>
parameter.
</li>
</ul>
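<p>
The quantifier-to-type table above can be restated as a small helper. This is a hypothetical function written
for illustration only and is not part of the NLPCraft API. It assumes that any quantifier with a maximum above
one is treated like the unbounded case, which is consistent with the Alarm Clock example on this page where a
<code>[0,7]</code> term is bound to a <code>List&lt;NCToken&gt;</code> parameter.
</p>

```java
// Hypothetical helper; it only restates the table above and is not an NLPCraft API.
final class TermParamTypeSketch {
    // Returns the Java parameter type expected for a term with quantifier [min,max].
    // An unbounded maximum is represented here by Integer.MAX_VALUE.
    static String javaParamType(int min, int max) {
        if (min < 0 || max < 1 || min > max)
            throw new IllegalArgumentException("Invalid quantifier: [" + min + "," + max + "]");
        if (max > 1)
            return "java.util.List<NCToken>"; // assumption: any max above 1 needs a list
        return min == 0 ? "Optional<NCToken>" : "NCToken";
    }

    public static void main(String[] args) {
        System.out.println("[1,1] -> " + javaParamType(1, 1));
        System.out.println("[0,1] -> " + javaParamType(0, 1));
        System.out.println("[0,7] -> " + javaParamType(0, 7));
        System.out.println("[1,inf] -> " + javaParamType(1, Integer.MAX_VALUE));
    }
}
```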
<h2 class="section-sub-title"><code>NCRejection</code> and <code>NCIntentSkip</code> Exceptions <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
There are two exceptions that can be used by intent callback logic to control intent matching process.
</p>
<p>
When <a href="/apis/latest/org/apache/nlpcraft/model/NCRejection.html">NCRejection</a> exception is thrown by the callback it indicates that user input cannot
be processed as is. This exception typically indicates that user has not provided enough information in
the input string to have it processed automatically. In most cases this means that the user's input is
either too short or too simple, too long or too complex, missing required context, or is unrelated to the
requested data model.
</p>
<p>
<a href="/apis/latest/org/apache/nlpcraft/model/NCIntentSkip.html">NCIntentSkip</a> is a control flow exception to
skip current intent. This exception can be thrown by the intent callback to indicate that current intent
should be skipped (even though it was matched and its callback was called). If there's more than one intent
matched the next best matching intent will be selected and its callback will be called.
</p>
<p>
This exception becomes useful when it is hard or impossible to encode the entire matching logic using only
declarative IDL. In these cases the intent definition can be relaxed and the "last mile" of intent
matching can happen inside of the intent callback's user logic. If it is determined that the intent in fact does
not match, then throwing this exception lets the next best matching intent, if any, be tried.
</p>
<p>
Note that there's a significant difference between <a href="/apis/latest/org/apache/nlpcraft/model/NCIntentSkip.html">NCIntentSkip</a>
exception and model's <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModel.html">onMatchedIntent(...)</a>
callback. Unlike this callback, the exception does not force re-matching of all intents, it simply
picks the next best intent from the list of already matched ones. The model's callback can force
a full reevaluation of all intents against the user input.
</p>
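<p>
The skip behavior described above can be sketched without NLPCraft as follows. All names here are
hypothetical: <code>SkipException</code> stands in for <code>NCIntentSkip</code>, <code>Match</code> stands in
for an already matched intent with its weight, and <code>dispatch(...)</code> imitates the part of the
workflow that picks the next best match. Throwing <code>IllegalStateException</code> when every callback skips
is an assumption of this sketch, not documented NLPCraft behavior.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; SkipException and Match only imitate NCIntentSkip and a matched intent.
final class IntentSkipSketch {
    static final class SkipException extends RuntimeException {
        SkipException(String msg) {
            super(msg);
        }
    }

    static final class Match {
        final String intentId;
        final int weight;
        final Supplier<String> callback;

        Match(String intentId, int weight, Supplier<String> callback) {
            this.intentId = intentId;
            this.weight = weight;
            this.callback = callback;
        }
    }

    // Calls callbacks from the heaviest match down. A skip moves on to the next
    // already matched intent without re-matching anything.
    static String dispatch(List<Match> matches) {
        List<Match> sorted = new ArrayList<>(matches);
        sorted.sort(Comparator.comparingInt((Match m) -> m.weight).reversed());
        for (Match m : sorted) {
            try {
                return m.intentId + ": " + m.callback.get();
            }
            catch (SkipException e) {
                // The intent matched declaratively but its callback declined; try the next best one.
            }
        }
        // Assumption of this sketch: running out of matches is reported as an error.
        throw new IllegalStateException("No matching intent accepted the input");
    }

    public static void main(String[] args) {
        boolean marketOpen = false; // external state that IDL alone cannot express

        Match trade = new Match("trade", 10, () -> {
            if (!marketOpen)
                throw new SkipException("Market is closed");
            return "order placed";
        });
        Match help = new Match("help", 5, () -> "showing help");

        System.out.println(dispatch(Arrays.asList(help, trade)));
    }
}
```

<p>
Here the broad <code>trade</code> intent wins on weight, but its callback performs the "last mile" check and
skips, so the lighter <code>help</code> intent handles the request.
</p>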
<div class="bq info">
<p>
<b>IDL Expressiveness</b>
</p>
<p>
Note that usage of <code>NCIntentSkip</code> exception (as well as model's life-cycle callbacks) is a
required technique when you cannot express the desired matching logic with IDL alone.
IDL is a high-level declarative language and it does
not support complex programmable logic or other types of sophisticated matching algorithms. In such cases, you can
define a broad intent that would <em>broadly match</em> and then define the rest of the more complex matching logic in the callback,
using <code>NCIntentSkip</code> exception to indicate when the intent doesn't match and other
intents, if any, have to be tried.
</p>
<p>
There are many use cases where IDL is not expressive enough. For example, if your intent matching depends
on financial market conditions, weather, state from external systems or details of the current user geographical
location or social network status - you will need to use <code>NCIntentSkip</code>-based logic or model's callbacks to support
that type of matching.
</p>
</div>
<h2 class="section-sub-title"><code>NCIntentMatch</code> Interface <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentMatch.html">NCIntentMatch</a> interface
can be passed into the intent callback as its first parameter. This interface provides runtime information
about the intent that was matched (i.e. the intent with which this callback was annotated). Note also that
the intent context can only be the first parameter in the callback, and if not declared as such, it won't be passed in.
</p>
</section>
<section id="model_callbacks">
<h2 class="section-title">Model Callbacks <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModel.html">NCModel</a> interface provides
several callbacks that are invoked before, during and after intent matching. They provide an opportunity to inject
user cross-cutting concerns into the standard intent matching workflow of NLPCraft. Usage of these callbacks
is completely optional, yet they provide convenient join points for logging, statistics collection, security
audit and validation, explicit conversation context management, model metadata updates, and many other aspects
that may depend on the standard intent matching workflow:
</p>
<table class="gradient-table">
<thead>
<tr>
<th>Callback</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onParsedVariant(org.apache.nlpcraft.model.NCVariant)"><code>NCModel#<b>onParsedVariant(...)</b></code></a></td>
<td>
<p>
A callback to accept or reject a parsed variant. This callback is called before any other
callbacks at the beginning of the processing pipeline and it is called for each parsed variant.
Note that a given user input can have one or more possible parsing variants. Depending on
model configuration a user input can produce hundreds or even thousands of parsing variants that
can significantly slow down the overall processing. This callback lets you filter out unnecessary
parsing variants based on a variety of user-defined factors like number of tokens, presence
of a particular token in the variant, etc.
</p>
</td>
</tr>
<tr>
<td><a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onContext(org.apache.nlpcraft.model.NCContext)"><code>NCModel#<b>onContext(...)</b></code></a></td>
<td>
<p>
A callback that is called when a fully assembled query context is ready. This callback is called
after all <a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onParsedVariant(org.apache.nlpcraft.model.NCVariant)"><code>onParsedVariant(...)</code></a>
callbacks are called but before any <a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onMatchedIntent(org.apache.nlpcraft.model.NCIntentMatch)"><code>onMatchedIntent(...)</code></a>
are called, i.e. right before the intent matching is performed. It is always called exactly once per user request.
A typical use case for this callback is logging, debugging, statistics or usage collection,
explicit update or initialization of conversation context, security audit or validation, etc.
</p>
</td>
</tr>
<tr>
<td><a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onMatchedIntent(org.apache.nlpcraft.model.NCIntentMatch)"><code>NCModel#<b>onMatchedIntent(...)</b></code></a></td>
<td>
<p>
A callback that is called when an intent was successfully matched but right before its callback is called.
This callback is called after <a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onContext(org.apache.nlpcraft.model.NCContext)"><code>onContext(...)</code></a>
is called and may be called multiple times depending on its return value. If <code>true</code> is
returned, then the default workflow will continue and the matched intent's callback will be called.
However, if <code>false</code> is returned, then the entire existing set of parsing variants will be
re-matched against all declared intents. Returning <code>false</code> allows this method to alter the state
of the model (like soft-resetting the conversation or changing metadata) and force the full re-evaluation
of the parsing variants against all declared intents. Note that user logic should be careful not
to induce an infinite loop in this behavior.
</p>
<p>
Note that this callback may not be called at all based on the return value of the
<a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onContext(org.apache.nlpcraft.model.NCContext)"><code>onContext(...)</code></a> callback.
A typical use case for this callback is logging, debugging, statistics or usage collection,
explicit update or initialization of conversation context, security audit or validation, etc. This
callback is especially useful for a soft reset of the conversation context when a condition for such
a reset can only be derived from within the intent callback.
</p>
</td>
</tr>
<tr>
<td><a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onResult(org.apache.nlpcraft.model.NCIntentMatch,org.apache.nlpcraft.model.NCResult)"><code>NCModel#<b>onResult(...)</b></code></a></td>
<td>
<p>
A callback that is called when a successful result is obtained from the intent callback and right
before it is sent back to the caller. This callback is called after
<a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onMatchedIntent(org.apache.nlpcraft.model.NCIntentMatch)"><code>onMatchedIntent(...)</code></a> is called.
Note that this callback may not be called at all, and if it is called, it is called only once. A typical
use case for this callback is logging, debugging, statistics or usage collection,
explicit update or initialization of conversation context, security audit or validation, etc.
</p>
</td>
</tr>
<tr>
<td><a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onRejection(org.apache.nlpcraft.model.NCIntentMatch,org.apache.nlpcraft.model.NCRejection)"><code>NCModel#<b>onRejection(...)</b></code></a></td>
<td>
<p>
A callback that is called when the intent callback threw an <a href="/apis/latest/org/apache/nlpcraft/model/NCRejection.html"><code>NCRejection</code></a> exception.
This callback is called after <a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onMatchedIntent(org.apache.nlpcraft.model.NCIntentMatch)"><code>onMatchedIntent(...)</code></a> is called.
Note that this callback may not be called at all, and if it is called, it is called only once. A typical
use case for this callback is logging, debugging, statistics or usage collection,
explicit update or initialization of conversation context, security audit or validation, etc.
</p>
</td>
</tr>
<tr>
<td><a href="/apis/latest/org/apache/nlpcraft/model/NCModel.html#onError(org.apache.nlpcraft.model.NCContext,java.lang.Throwable)"><code>NCModel#<b>onError(...)</b></code></a></td>
<td>
<p>
A callback that is called when the intent callback failed with an unexpected exception. Note that this
callback may not be called at all, and if it is called, it is called only once. A typical use case for
this callback is logging, debugging, statistics or usage collection, explicit update
or initialization of conversation context, security audit or validation, etc.
</p>
</td>
</tr>
</tbody>
</table>
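<p>
The order in which the callbacks in the table above fire can be easier to follow in code. The sketch below is
a simplified, self-contained simulation and does not use the NLPCraft API: the <code>Model</code>,
<code>Rejection</code> and <code>process(...)</code> names are hypothetical, variants are plain lists of strings,
results are plain strings, and intent matching itself is reduced to a fixed intent ID. It also assumes that a
non-null value returned from <code>onContext(...)</code> is what short-circuits the rest of the workflow.
</p>

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the callback workflow. None of these types are NLPCraft classes.
class ModelCallbackFlow {
    /** Hypothetical rejection signal, standing in for NCRejection. */
    static class Rejection extends RuntimeException {
        Rejection(String msg) { super(msg); }
    }

    /** Hypothetical, heavily reduced model that records which callbacks were invoked. */
    static class Model {
        final List<String> trace = new ArrayList<>();
        int rematchesLeft = 1; // Guard against an infinite re-matching loop.

        boolean onParsedVariant(List<String> variant) {
            trace.add("onParsedVariant");
            return variant.size() <= 3; // Filter out overly long variants.
        }

        String onContext(List<List<String>> variants) {
            trace.add("onContext");
            return null; // Assumption: null means "continue with intent matching".
        }

        boolean onMatchedIntent(String intentId) {
            trace.add("onMatchedIntent");
            if (rematchesLeft > 0) {
                rematchesLeft--;
                return false; // Request one full re-match, e.g. after a soft reset.
            }
            return true;
        }

        String intentCallback(String intentId) {
            trace.add("intentCallback");
            if (intentId.equals("reject")) throw new Rejection("cannot answer");
            return "result for " + intentId;
        }

        String onResult(String intentId, String res) {
            trace.add("onResult");
            return res;
        }

        String onRejection(String intentId, Rejection e) {
            trace.add("onRejection");
            return "rejected: " + e.getMessage();
        }

        String onError(Throwable e) {
            trace.add("onError");
            return "error: " + e.getMessage();
        }
    }

    /** Invokes the callbacks in the order described in the table above. */
    static String process(Model mdl, List<List<String>> parsed, String intentId) {
        // 1. Each parsed variant is accepted or rejected individually.
        List<List<String>> variants = new ArrayList<>();
        for (List<String> v : parsed)
            if (mdl.onParsedVariant(v)) variants.add(v);

        // 2. Called once, right before intent matching.
        String early = mdl.onContext(variants);
        if (early != null) return early;

        try {
            // 3. May be called several times; a real engine would re-run matching here.
            while (!mdl.onMatchedIntent(intentId)) { }
            // 4. Intent callback, followed by onResult on success.
            return mdl.onResult(intentId, mdl.intentCallback(intentId));
        }
        catch (Rejection e) { return mdl.onRejection(intentId, e); }
        catch (RuntimeException e) { return mdl.onError(e); }
    }

    public static void main(String[] args) {
        Model mdl = new Model();
        List<List<String>> parsed =
            List.of(List.of("lights", "on"), List.of("turn", "the", "lights", "on"));
        System.out.println(process(mdl, parsed, "lights"));
        System.out.println(mdl.trace);
    }
}
```

<p>
The <code>rematchesLeft</code> counter is one possible way to keep a <code>false</code> return from
<code>onMatchedIntent(...)</code> from looping forever; any other bounded condition would serve the same purpose.
</p>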
</section>
</div>
<div class="col-md-2 third-column">
<ul class="side-nav">
<li class="side-nav-title">On This Page</li>
<li><a href="#intent">Overview</a></li>
<li><a href="#idl">IDL Syntax</a></li>
<li><a href="#idl_functions">IDL Functions</a></li>
<li><a href="#idl_location">IDL Location</a></li>
<li><a href="#binding">Intent Binding</a></li>
<li><a href="#logic">Intent Matching</a></li>
<li><a href="#intent_callback">Intent Callback</a></li>
<li><a href="#model_callbacks">Model Callbacks</a></li>
{% include quick-links.html %}
</ul>
</div>