| --- |
| active_crumb: Docs |
| layout: documentation |
| id: api-review |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <div class="col-md-8 second-column"> |
| <section id="overview"> |
| <h2 class="section-title">API review <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The NLPCraft library is based on two main concepts, <code>Data Model</code> and <code>Client</code>, |
| which are represented in the API by |
| <a href="apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> and |
| <a href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a>. |
| To work with the system, you prepare an <a href="apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> instance, |
| which is built from a configuration and a list of components called the <code>pipeline</code>. |
| You then communicate with the prepared model through the client's methods. |
| </p> |
| |
| <ul> |
| <li> |
| <code>Data Model</code> is a domain-specific object that is responsible for interpreting user input. |
| </li> |
| <li> |
| <code>Client</code> is an object that lets you communicate with a given data model. |
| </li> |
| </ul> |
| |
| <p>A typical usage looks like this:</p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Instantiates a prepared domain model. |
| val mdl = new CustomNlpModel() |
| |
| // Creates a client for the given model. |
| val client = new NCModelClient(mdl) |
| |
| // Sends a text request to the model on behalf of the user with ID "userId". |
| val result = client.ask("Some user command", "userId") |
| |
| // Clears the dialog session for the user with ID "userId". |
| client.clearDialog("userId") |
| </pre> |
| </section> |
| |
| <section id="model"> |
| <h2 class="section-title">Data Model responsibility <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Let's start with the terminology and describe the system's workflow. |
| </p> |
| |
| <ul> |
| <li> |
| <code>Token</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCToken.html">NCToken</a>. |
| It is a simple string, a part of the user input, obtained by splitting the input according to some rules, |
| for instance by spaces plus additional conditions that depend on the language and on what the model expects. |
| For example, the user input "<b>Where is it?</b>" contains four tokens: |
| "<code>Where</code>", "<code>is</code>", "<code>it</code>", "<code>?</code>". |
| Usually <code>tokens</code> are words and punctuation symbols, which can also carry additional |
| information such as part of speech. |
| <code>Tokens</code> are the input for the <code>entity</code> search. |
| </li> |
| <li> |
| <code>Entity</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCEntity.html">NCEntity</a>. |
| According to Wikipedia, a named entity is a real-world object, such as a person, location, organization, |
| product, etc., that can be denoted with a proper name. It can be abstract or have a physical existence. |
| Each <code>entity</code> can contain one or more tokens. |
| <code>Entities</code> are the input for the <code>intent</code> search, which follows the <a href="intent-matching.html">Intent matching</a> conditions. |
| </li> |
| <li> |
| <code>Variant</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCVariant.html">NCVariant</a>. |
| It is a list of <code>entities</code>. Potentially, each <code>token</code> can be recognized as |
| different <code>entities</code>, |
| so the user input can be processed as a set of <code>variants</code>. |
| For example, the user input "Mercedes" can be processed as two <code>variants</code>, |
| each containing a single-element list of <code>entities</code>: <b>car brand</b> or <b>Spanish female name</b>. |
| When no word can be recognized as more than one <code>entity</code>, only one |
| <code>variant</code> is detected. |
| </li> |
| <li> |
| <code>Intent</code> is a user-defined callback together with a rule that determines when this callback is called. |
| The rule is most often a template based on the expected set of <code>entities</code> in the user input, |
| but it can be more flexible. |
| Parameters extracted from the user text input are passed to the callback methods. |
| The results of these methods are returned to the user as the answer to the request. |
| <code>Intent</code> callbacks are methods defined in the <code>Data Model</code> class and annotated with |
| <code>intent</code> rules written in <a href="intent-matching.html">IDL</a>. |
| </li> |
| </ul> |
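| |
| <p> |
| The terms above can be illustrated with a minimal, self-contained sketch. It does not use the NLPCraft API: |
| <code>tokenize</code>, <code>candidates</code> and <code>variants</code> are hypothetical names introduced |
| only for this illustration, and the splitting rule is a deliberately naive assumption, not the behavior of a real |
| <code>NCTokenParser</code>. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Illustration only: a naive tokenizer, not the NCTokenParser implementation. |
| // Each run of letters or digits and each single punctuation symbol becomes a token. |
| def tokenize(text: String): List[String] = |
|     "[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]".r.findAllIn(text).toList |
| |
| // Hypothetical lookup of the entity kinds a word may be recognized as. |
| val candidates: Map[String, List[String]] = |
|     Map("mercedes" -> List("car brand", "Spanish female name")) |
| |
| // One variant per alternative interpretation; each variant is a list of entities. |
| def variants(word: String): List[List[String]] = |
|     candidates.getOrElse(word.toLowerCase, Nil).map(kind => List(kind)) |
| |
| val tokens = tokenize("Where is it?") // List("Where", "is", "it", "?") |
| val vars = variants("Mercedes") // List(List("car brand"), List("Spanish female name")) |
| </pre> |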
| |
| <p> |
| A <code>Data Model</code> must be able to do the following three things: |
| </p> |
| |
| <ul> |
| <li> |
| Parse the user input text into <code>tokens</code>. |
| They are the input for the <code>named entity</code> search. |
| <code>Token</code> parsing components should be included in the <a href="#model-pipeline">Model pipeline</a>. |
| </li> |
| <li> |
| Find <code>named entities</code> based on the parsed <code>tokens</code>. |
| They are the input for the <code>intent</code> search. |
| <code>Entity</code> parsing components should be included in the <a href="#model-pipeline">Model pipeline</a>. |
| </li> |
| <li> |
| Prepare <code>intents</code> with their callback methods, which contain the business logic. |
| These methods should be defined directly in the model class, or the model should hold references to them. |
| This is described below. |
| </li> |
| </ul> |
| |
| <p> |
| As an example, let's prepare a system that can call people from your contact list. |
| Typical commands are: "<b>Please call to John Smith</b>" or "<b>Connect me with Barbara Dillan</b>". |
| To solve this task, the model should be able to recognize the following entities in the user text: |
| a <code>command</code> and the <code>person</code> to whom the command applies. |
| </p> |
| |
| <p> |
| So, when the request "<b>Please call to John Smith</b>" is received, our model should be able to: |
| </p> |
| |
| <ul> |
| <li> |
| Split the user text input into tokens: |
| "<code>please</code>", "<code>call</code>", "<code>to</code>", "<code>john</code>", "<code>smith</code>". |
| </li> |
| <li> |
| Find two named entities: |
| <ul> |
| <li> |
| <code>command</code> from the token "<code>call</code>". |
| </li> |
| <li> |
| <code>person</code> from the tokens "<code>john</code>" and "<code>smith</code>". |
| </li> |
| </ul> |
| </li> |
| <li> |
| Have a prepared intent: |
| <pre class="brush: scala, highlight: [1, 2, 5, 6]"> |
| @NCIntent("intent=call term(command)={# == 'command'} term(person)={# == 'person'}") |
| def onCommand( |
|     ctx: NCContext, |
|     im: NCIntentMatch, |
|     @NCIntentTerm("command") command: NCEntity, |
|     @NCIntentTerm("person") person: NCEntity |
| ): NCResult = ??? // Implement business logic here. |
| </pre> |
| |
| <ul> |
| <li> |
| <code>Line 1</code> defines the intent <code>call</code> with two conditions. |
| </li> |
| <li> |
| <code>Line 2</code> defines the related callback method <code>onCommand()</code>. |
| </li> |
| <li> |
| <code>Lines 5 and 6</code> define the two callback method arguments that correspond to |
| the term conditions of the <code>call</code> intent. You can extract the normalized value |
| <code>john smith</code> from the <code>person</code> parameter and use it in the method body, |
| for example to look up the person's phone number. |
| </li> |
| </ul> |
| |
| <p> |
| Note that there are many ways to define intents and their callback methods. |
| They can be defined in configuration files, nested classes, etc. |
| See the <a href="intent-matching.html">Intent Matching</a> chapter for more details. |
| </p> |
| </li> |
| </ul> |
| |
| </section> |
| |
| <section id="client"> |
| <h2 class="section-title">Client responsibility <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The client, represented as <a href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a>, |
| is required for communication with the model. Its basic methods are described below. |
| </p> |
| |
| <ul> |
| <li> |
| <code>ask()</code> passes the user text input to the model and returns the execution |
| <a href="apis/latest/org/apache/nlpcraft/NCResult.html">NCResult</a>, or |
| throws a rejection exception if no intent is triggered. |
| <a href="apis/latest/org/apache/nlpcraft/NCResult.html">NCResult</a> is a wrapper around the |
| callback method's execution result with additional information. |
| </li> |
| <li> |
| <code>debugAsk()</code> passes the user text input to the model and returns the matched callback and its parameters, or |
| throws a rejection exception if no intent is triggered. |
| The main difference from <code>ask()</code> is that the callback method of the triggered intent is not called. |
| This method can be useful in test scenarios. |
| </li> |
| <li> |
| <code>clearStm()</code> clears the STM (short-term memory) state. The memory is cleared either entirely or according to a predicate. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| </li> |
| <li> |
| <code>clearDialog()</code> clears the dialog state. The dialog is cleared either entirely or according to a predicate. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| </li> |
| <li> |
| <code>close()</code> closes the client. You cannot call any other client methods after the client has been closed. |
| </li> |
| </ul> |
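| |
| <p> |
| The methods above can be combined as in the following sketch. It is an illustration under assumptions, not a reference |
| implementation: <code>askOnce</code> is a hypothetical helper, <code>"userId"</code> is a placeholder, and the |
| rejection exception is assumed to be <code>NCRejection</code> from the <code>org.apache.nlpcraft</code> package. |
| Check the exact types against the API documentation linked above. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| import org.apache.nlpcraft.* |
| |
| // Hypothetical helper: asks one question and always releases the client. |
| def askOnce(mdl: NCModel, txt: String): Unit = { |
|     val client = new NCModelClient(mdl) |
| |
|     try { |
|         // Triggers intent matching and the matched intent's callback. |
|         val res = client.ask(txt, "userId") |
|         println(s"Result: $res") |
|     } |
|     catch { |
|         // Assumption: no triggered intent is reported as NCRejection. |
|         case e: NCRejection => println(s"Rejected: ${e.getMessage}") |
|     } |
|     finally { |
|         // Forget the conversation and dialog state of this user, then close the client. |
|         client.clearStm("userId") |
|         client.clearDialog("userId") |
|         client.close() |
|     } |
| } |
| </pre> |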
| </section> |
| |
| <section id="model-configuration"> |
| <h2 class="section-title">Model configuration <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The Data Model configuration, <a href="apis/latest/org/apache/nlpcraft/NCModelConfig.html">NCModelConfig</a>, represents a set of model parameter values. |
| Its properties are described below. |
| </p> |
| <ul> |
| <li> |
| <code>id</code>, <code>name</code> and <code>version</code> are mandatory model descriptors. |
| </li> |
| <li> |
| <code>description</code> and <code>origin</code> are optional model descriptors. |
| </li> |
| <li> |
| <code>conversationTimeout</code> is the timeout of the user's conversation. |
| If the user does not communicate with the model during this period, the STM is cleared. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| Mandatory parameter with a default value. |
| </li> |
| <li> |
| <code>conversationDepth</code> is the maximum supported depth of the user's conversation. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| Mandatory parameter with a default value. |
| </li> |
| </ul> |
| </section> |
| |
| <section id="model-pipeline"> |
| <h2 class="section-title">Model pipeline <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The model <code>Pipeline</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a> and |
| contains the following components: |
| </p> |
| <ul> |
| <li> |
| <code>Token parser</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCTokenParser.html">NCTokenParser</a>. |
| A mandatory pipeline component that parses the plain text of the user input and splits it |
| into a list of tokens. NLPCraft provides a default English implementation of the token parser. |
| The project also contains examples for the <a href="examples/light_switch_fr.html">French</a> and |
| <a href="examples/light_switch_ru.html">Russian</a> languages. |
| </li> |
| <li> |
| <code>Token enrichers</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCTokenEnricher.html">NCTokenEnricher</a>. |
| A token enricher is a component that adds properties to prepared tokens, |
| such as part of speech, quote or stop-word flags, or anything else. |
| NLPCraft provides a default set of English token enricher implementations. |
| </li> |
| <li> |
| <code>Token validators</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCTokenValidator.html">NCTokenValidator</a>. |
| A token validator is a user-defined component that inspects the tokens and can throw an exception |
| from user code to stop the processing of the user input. |
| </li> |
| <li> |
| <code>Entity parsers</code>, represented as a mandatory list of <a href="apis/latest/org/apache/nlpcraft/NCEntityParser.html">NCEntityParser</a>. |
| At least one entity parser must be defined. Taking the prepared tokens as input, |
| each entity parser tries to find user-defined named entities. |
| NLPCraft provides wrappers for the named-entity recognition components of the OpenNLP and Stanford libraries. |
| </li> |
| <li> |
| <code>Entity enrichers</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityEnricher.html">NCEntityEnricher</a>. |
| An entity enricher is a component that adds properties to prepared entities. |
| It can be useful for extending the functionality of existing entity enrichers. |
| </li> |
| <li> |
| <code>Entity mappers</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityMapper.html">NCEntityMapper</a>. |
| An entity mapper is a component that maps one set of entities into another after the entities |
| have been parsed and enriched. It can be useful for building complex parsers on top of existing ones. |
| </li> |
| <li> |
| <code>Entity validators</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityValidator.html">NCEntityValidator</a>. |
| An entity validator is a user-defined component that inspects the prepared entities and |
| can throw an exception from user code to stop the processing of the user input. |
| </li> |
| <li> |
| <code>Variant filter</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCVariantFilter.html">NCVariantFilter</a>. |
| An optional component that filters the detected variants and rejects the undesirable ones. |
| </li> |
| </ul> |
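| |
| <p> |
| The order in which the first pipeline stages run can be pictured with the following self-contained sketch. |
| It is a conceptual model only and does not use NLPCraft types: <code>Tok</code>, <code>Ent</code>, |
| <code>stopWords</code> and all functions are hypothetical names, and the entity rule (first meaningful word is the |
| command, the rest is the person) is an assumption made for the contact-call example above. Entity enrichers, mappers, |
| validators and the variant filter are omitted for brevity. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Conceptual model of the stage order only; these are not NLPCraft types. |
| final case class Tok(text: String, stopWord: Boolean = false) |
| final case class Ent(kind: String, toks: List[Tok]) |
| |
| val stopWords = Set("please", "to") // Hypothetical stop-word list. |
| |
| // Token parser: splits the lower-cased text by whitespace. |
| def parseTokens(txt: String): List[Tok] = |
|     txt.toLowerCase.split("\\s+").toList.filter(_.nonEmpty).map(t => Tok(t)) |
| |
| // Token enricher: adds the stop-word flag to each token. |
| def enrich(toks: List[Tok]): List[Tok] = |
|     toks.map(t => t.copy(stopWord = stopWords.contains(t.text))) |
| |
| // Token validator: throws an exception to stop the processing. |
| def validate(toks: List[Tok]): Unit = |
|     require(toks.nonEmpty, "Empty user input.") |
| |
| // Entity parser: the first meaningful word is the command, the rest is the person. |
| def parseEntities(toks: List[Tok]): List[Ent] = |
|     toks.filterNot(_.stopWord) match { |
|         case cmd :: rest if rest.nonEmpty => List(Ent("command", List(cmd)), Ent("person", rest)) |
|         case _ => Nil |
|     } |
| |
| // Runs the stages in pipeline order. |
| def process(txt: String): List[Ent] = { |
|     val toks = enrich(parseTokens(txt)) |
|     validate(toks) |
|     parseEntities(toks) |
| } |
| |
| // Yields a "command" entity for "call" and a "person" entity for "john", "smith". |
| val entities = process("Please call to John Smith") |
| </pre> |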
| |
| <p> |
| Below is a <code>Model</code> creation example. |
| The <code>Pipeline</code> is prepared using the <code>NCPipelineBuilder</code> helper class. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| val pipeline = |
| new NCPipelineBuilder(). |
| withTokenParser(new NCFrTokenParser()). |
| withTokenEnricher(new NCFrLemmaPosTokenEnricher()). |
| withTokenEnricher(new NCFrStopWordsTokenEnricher()). |
| withEntityParser(new NCFrSemanticEntityParser("lightswitch_model_fr.yaml")). |
| build |
| val cfg = NCModelConfig("nlpcraft.lightswitch.fr.ex", "LightSwitch Example Model FR", "1.0") |
| |
| val mdl = new NCModelAdapter(cfg, pipeline) |
| </pre> |
| |
| <p> |
| This flexible system lets you create pipelines for any language. |
| You can combine NLPCraft's predefined components, write your own, and easily reuse custom components. |
| </p> |
| </section> |
| |
| <section id="model-behavior"> |
| <h2 class="section-title">Model behavior overriding <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| There are also several <a href="/apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> |
| callbacks that you can override to affect model behavior during |
| <a href="/intent-matching.html#model_callbacks">intent matching</a> |
| to perform logging, debugging, statistics or usage collection, explicit update or initialization of |
| conversation context, security audit or validation: |
| </p> |
| <ul> |
| <li> |
| Overriding <code>onMatchedIntent()</code> lets you reject a matched intent and continue the matching process. |
| </li> |
| <li> |
| Overriding <code>onResult()</code> lets you replace the execution result of the callback method. |
| </li> |
| <li> |
| Overriding <code>onRejection()</code> lets you change the operation result when a rejection occurs. |
| </li> |
| <li> |
| Overriding <code>onError()</code> lets you change the operation result when an error occurs. |
| </li> |
| </ul> |
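| |
| <p> |
| As a rough sketch, one of these callbacks could be overridden as shown below. <code>AuditedModel</code> is a |
| hypothetical class name, and both the callback signature and the meaning of its return value are assumptions that |
| should be verified against the <a href="/apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> API documentation. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| import org.apache.nlpcraft.* |
| |
| // Hypothetical model that logs every matched intent before its callback runs. |
| class AuditedModel(cfg: NCModelConfig, pipeline: NCPipeline) extends NCModelAdapter(cfg, pipeline) { |
|     // Assumed signature; check it against the NCModel API documentation. |
|     override def onMatchedIntent(ctx: NCContext, im: NCIntentMatch): Boolean = { |
|         println("An intent was matched.") |
| |
|         // Assumption: `true` accepts the match, `false` rejects it and continues matching. |
|         true |
|     } |
| } |
| </pre> |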
| </section> |
| </div> |
| <div class="col-md-2 third-column"> |
| <ul class="side-nav"> |
| <li class="side-nav-title">On This Page</li> |
| <li><a href="#overview">Overview</a></li> |
| <li><a href="#model">Model responsibility overview</a></li> |
| <li><a href="#client">Client responsibility overview</a></li> |
| <li><a href="#model-configuration">Model configuration</a></li> |
| <li><a href="#model-pipeline">Model pipeline</a></li> |
| <li><a href="#model-behavior">Model behavior overriding</a></li> |
| {% include quick-links.html %} |
| </ul> |
| </div> |
| |
| |
| |
| |