| --- |
| active_crumb: Docs |
| layout: documentation |
| id: overview |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <div class="col-md-8 second-column"> |
| <section id="overview"> |
| <h2 class="section-title">Library API overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The NlpCraft library is built around two base concepts, <code>Model</code> and <code>Client</code>, which are represented in the API by |
| <a href="apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> and |
| <a href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a>. |
| To work with the system, you first prepare a model by configuring its parameters and defining its components. |
| After that, you communicate with the model through the client's methods. |
| </p> |
| |
| <ul> |
| <li> |
| <code>Model</code> is a domain-specific object that is responsible for interpreting user input. |
| A <code>Model</code> contains intents, which are defined via NlpCraft IDL together with their related code callbacks. |
| An intent is a user-defined callback plus a rule that determines when this callback should be called. |
| The rule is most often a template based on the set of entities expected in the user input, but it can be more flexible. |
| </li> |
| |
| <li> |
| <code>Client</code> is an object that lets you communicate with a given model. |
| </li> |
| </ul> |
| |
| <p>A typical code fragment:</p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Prepares domain model. |
| val mdl = new CustomNlpModel() |
| |
| // Prepares client for given model. |
| val client = new NCModelClient(mdl) |
| |
| // Sends a text request to the model on behalf of the user with ID "userId". |
| val result = client.ask("Some user command", "userId") |
| |
| // Clears dialog session for user with ID "userId". |
| client.clearDialog("userId") |
| </pre> |
| |
| <p> |
| A <code>Model</code> definition consists of two parts: |
| </p> |
| <ul> |
| <li> |
| <code>Configuration</code>. Static configuration parameters including name, version, etc. |
| </li> |
| <li> |
| <code>Pipeline</code>. The most important component, which defines the processing chain for user input. |
| A <code>Pipeline</code> can be built from standard components and from custom, user-defined ones. |
| </li> |
| </ul> |
| |
| <p> |
| <b>Base client methods:</b> |
| </p> |
| <ul> |
| <li> |
| <code>ask()</code> passes the user's text input to the model and returns the execution result of the triggered callback method, |
| or throws a rejection exception if no intent was triggered. |
| </li> |
| <li> |
| <code>debugAsk()</code> passes the user's text input to the model and returns the matched callback and its parameters, |
| or throws a rejection exception if no intent was triggered. |
| The main difference from <code>ask()</code> is that the callback method of the triggered intent is not called. |
| This method can be useful in test scenarios. |
| </li> |
| <li> |
| <code>clearStm()</code> clears the STM (short-term memory) state. The memory is cleared either entirely or according to a predicate. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| </li> |
| <li> |
| <code>clearDialog()</code> clears the dialog state. The dialog is cleared either entirely or according to a predicate. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| </li> |
| <li> |
| <code>close()</code> closes the client. You can't call any other client methods after this method has been called. |
| </li> |
| </ul> |
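| |
| <p> |
| The difference between <code>ask()</code> and <code>debugAsk()</code> described above can be illustrated with a toy model. |
| The sketch below is self-contained and does not use the NlpCraft API: <code>ToyClient</code>, <code>ToyCallback</code> and |
| <code>ToyRejection</code> are hypothetical stand-ins, and intent matching is reduced to a simple keyword lookup. |
| </p> |

```scala
// Toy stand-ins, NOT the NlpCraft API: all names here are hypothetical.
final case class ToyRejection(msg: String) extends RuntimeException(msg)

// A matched callback: the intent ID plus the code that would run for it.
final case class ToyCallback(intentId: String, run: () => String)

final class ToyClient(intents: Map[String, () => String]) {
  // Counts how many callbacks were actually executed.
  var executed: Int = 0

  // Toy matching: an intent triggers when its ID occurs in the input text.
  private def matchIntent(text: String): ToyCallback =
    intents.collectFirst {
      case (id, body) if text.toLowerCase.contains(id) =>
        ToyCallback(id, () => { executed += 1; body() })
    }.getOrElse(throw ToyRejection(s"No intent matched: $text"))

  // Like ask(): matches an intent and runs its callback.
  def ask(text: String): String = matchIntent(text).run()

  // Like debugAsk(): matches an intent but only returns the callback.
  def debugAsk(text: String): ToyCallback = matchIntent(text)
}

val toyClient = new ToyClient(Map("light" -> (() => "Lights switched")))

val answer = toyClient.ask("Turn the light on")       // the callback runs
val pending = toyClient.debugAsk("Turn the light on") // the callback does not run
```

| <p> |
| In this sketch a test can inspect <code>pending.intentId</code> to check which intent would fire without triggering any side effects of the callback. |
| </p> |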
| </section> |
| |
| <section id="model-configuration"> |
| <h2 class="section-title">Model configuration <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| <code>Model configuration</code> is a set of model parameter values. |
| Its API representation is <a href="apis/latest/org/apache/nlpcraft/NCModelConfig.html">NCModelConfig</a>. |
| </p> |
| <ul> |
| <li> |
| <code>id</code>, <code>name</code> and <code>version</code> are mandatory model descriptors. |
| </li> |
| <li> |
| <code>description</code> and <code>origin</code> are optional model descriptors. |
| </li> |
| <li> |
| <code>conversationTimeout</code> - timeout of the user's conversation. |
| If the user doesn't communicate with the model during this time period, the STM is cleared. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| Mandatory parameter with a default value. |
| </li> |
| <li> |
| <code>conversationDepth</code> - maximum supported depth of the user's conversation. |
| See the <a href="short-term-memory.html">Conversation</a> chapter for more details. |
| Mandatory parameter with a default value. |
| </li> |
| </ul> |
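| |
| <p> |
| To make the role of <code>conversationTimeout</code> more concrete, here is a minimal, self-contained sketch of a timeout check. |
| It does not use the NlpCraft API: <code>ToyConfig</code> and <code>stmExpired</code> are hypothetical names, |
| and the default values shown are assumptions chosen for illustration only. |
| </p> |

```scala
import java.time.{Duration, Instant}

// Toy stand-in, NOT the NlpCraft API: names and default values are hypothetical.
final case class ToyConfig(
  id: String,
  name: String,
  version: String,
  conversationTimeout: Duration = Duration.ofMinutes(60), // assumed default
  conversationDepth: Int = 3                              // assumed default
)

// Short-term memory is considered expired once the user has been silent
// for longer than the configured conversation timeout.
def stmExpired(cfg: ToyConfig, lastRequest: Instant, now: Instant): Boolean =
  Duration.between(lastRequest, now).compareTo(cfg.conversationTimeout) > 0

val toyCfg = ToyConfig("toy.model", "Toy Model", "1.0")
val lastSeen = Instant.parse("2024-01-01T10:00:00Z")

val afterHalfHour = stmExpired(toyCfg, lastSeen, lastSeen.plus(Duration.ofMinutes(30)))
val afterTwoHours = stmExpired(toyCfg, lastSeen, lastSeen.plus(Duration.ofMinutes(120)))
```

| <p> |
| Only the three descriptors are passed explicitly here; the timeout and depth fall back to their defaults, which mirrors the idea of mandatory parameters with default values. |
| </p> |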
| </section> |
| |
| <section id="model-pipeline"> |
| <h2 class="section-title">Model pipeline <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Before looking at the <code>Pipeline</code> elements more thoroughly, let's start with the terminology. |
| </p> |
| |
| <ul> |
| <li> |
| <code>Token</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCToken.html">NCToken</a>. |
| It is a simple string, a part of the user input, which is split according to some rules, for instance by spaces plus additional conditions that depend on the language and on specific expectations. |
| So the user input "<b>Where is it?</b>" contains four tokens: "<b>Where</b>", "<b>is</b>", "<b>it</b>", "<b>?</b>". |
| Token data is the input for searching for <code>entities</code>. |
| </li> |
| <li> |
| <code>Entity</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCEntity.html">NCEntity</a>. |
| According to Wikipedia, a named entity is a real-world object, such as a person, location, organization, |
| product, etc., that can be denoted with a proper name. It can be abstract or have a physical existence. |
| Each entity can contain one or more tokens. |
| Entity data is the input for evaluating <a href="intent-matching.html">Intent matching</a> conditions. |
| </li> |
| <li> |
| <code>Variant</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCVariant.html">NCVariant</a>. |
| A variant is a list of entities. Potentially, each token can be recognized as different entities, |
| so the user input can be processed as a set of variants. |
| For example, the user input "Mercedes" can be processed as two variants, |
| each containing a single-element list of entities: a car brand or a Spanish family name. |
| When different <code>entities</code> do not overlap on the same words, only one |
| <code>variant</code> is detected. |
| </li> |
| </ul> |
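| |
| <p> |
| The terminology above can be sketched with a few plain data types. The example below is a simplified illustration, not the NlpCraft implementation: |
| <code>ToyToken</code>, <code>ToyEntity</code>, <code>ToyVariant</code>, the regular-expression tokenizer and the one-word dictionary are all hypothetical. |
| It reproduces the two examples from the text: the four tokens of "Where is it?" and the two variants of "Mercedes". |
| </p> |

```scala
// Toy stand-ins, NOT the NlpCraft API: all names here are hypothetical.
final case class ToyToken(text: String, index: Int)
final case class ToyEntity(id: String, tokens: List[ToyToken])
final case class ToyVariant(entities: List[ToyEntity])

// Splits text into words and standalone punctuation marks.
def tokenize(text: String): List[ToyToken] =
  """\w+|[^\w\s]""".r.findAllIn(text).toList.zipWithIndex.map {
    case (word, idx) => ToyToken(word, idx)
  }

// Hypothetical dictionary: a word may be recognized as several entities.
val dictionary: Map[String, List[String]] = Map(
  "mercedes" -> List("car:brand", "person:surname")
)

// Builds every combination of per-token interpretations as a variant.
def variants(tokens: List[ToyToken]): List[ToyVariant] = {
  // For each recognized token, the list of its alternative entities.
  val choices: List[List[ToyEntity]] = tokens.flatMap { tok =>
    dictionary
      .get(tok.text.toLowerCase)
      .map(ids => ids.map(id => ToyEntity(id, List(tok))))
      .toList
  }
  // Cartesian product of the alternatives.
  choices
    .foldRight(List(List.empty[ToyEntity])) { (alts, acc) =>
      for (e <- alts; rest <- acc) yield e :: rest
    }
    .map(ToyVariant(_))
}

val whereTokens = tokenize("Where is it?").map(_.text)
val mercedesVariants = variants(tokenize("Mercedes"))
```

| <p> |
| When no word has more than one interpretation, the product collapses to a single variant, which matches the last sentence of the <code>Variant</code> description. |
| </p> |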
| |
| <p> |
| Now back to the <code>pipeline</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a>. |
| A <code>Pipeline</code> is created from the following components: |
| </p> |
| <ul> |
| <li> |
| <code>Token parser</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCTokenParser.html">NCTokenParser</a>. |
| A mandatory pipeline component. It is required to parse plain text user input and split it |
| into a list of tokens. NlpCraft provides a default English implementation of the token parser. |
| The project also contains examples for the <a href="examples/light_switch_fr.html">French</a> and |
| <a href="examples/light_switch_ru.html">Russian</a> languages. |
| </li> |
| <li> |
| <code>Token enrichers</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCTokenEnricher.html">NCTokenEnricher</a>. |
| A token enricher is a component that adds properties to prepared tokens, |
| such as part of speech, quote and stop-word flags, or any others. |
| NlpCraft provides a default set of English token enricher implementations. |
| </li> |
| <li> |
| <code>Token validators</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCTokenValidator.html">NCTokenValidator</a>. |
| A token validator is a user-defined component in which tokens are inspected and an exception can be thrown |
| from user code to stop processing of the user input. |
| </li> |
| <li> |
| <code>Entity parsers</code>, represented as a mandatory list of <a href="apis/latest/org/apache/nlpcraft/NCEntityParser.html">NCEntityParser</a>. |
| At least one entity parser must be defined. Taking the prepared tokens as input, |
| each entity parser tries to find user-defined named entities. |
| NlpCraft provides wrappers for the named-entity recognition components of the OpenNLP and Stanford libraries. |
| </li> |
| <li> |
| <code>Entity enrichers</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityEnricher.html">NCEntityEnricher</a>. |
| An entity enricher is a component that adds properties to prepared entities. |
| It can be useful for extending the functionality of existing entity enrichers. |
| </li> |
| <li> |
| <code>Entity mappers</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityMapper.html">NCEntityMapper</a>. |
| An entity mapper is a component that maps one set of entities into another after the entities |
| have been parsed and enriched. It can be useful for building complex parsers on top of existing ones. |
| </li> |
| <li> |
| <code>Entity validators</code>, represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityValidator.html">NCEntityValidator</a>. |
| An entity validator is a user-defined component in which prepared entities are inspected and |
| an exception can be thrown from user code to stop processing of the user input. |
| </li> |
| <li> |
| <code>Variant filter</code>, represented as <a href="apis/latest/org/apache/nlpcraft/NCVariantFilter.html">NCVariantFilter</a>. |
| An optional component that filters the detected variants and rejects undesirable ones. |
| </li> |
| </ul> |
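| |
| <p> |
| The way these components chain together can be sketched with plain functions. The example below is a simplified, self-contained model and not the |
| NlpCraft implementation: <code>Tok</code>, <code>Ent</code> and <code>ToyPipeline</code> are hypothetical names, the stage order simply follows the list above, |
| and entity enrichers, entity mappers and the variant filter are left out for brevity. |
| </p> |

```scala
// Toy stand-ins, NOT the NlpCraft API: all names here are hypothetical.
final case class Tok(text: String, props: Map[String, Any] = Map.empty)
final case class Ent(id: String, tokens: List[Tok])

final case class ToyPipeline(
  tokenParser: String => List[Tok],                // mandatory
  tokenEnrichers: List[Tok => Tok] = Nil,          // optional
  tokenValidators: List[List[Tok] => Unit] = Nil,  // optional, may throw
  entityParsers: List[List[Tok] => List[Ent]],     // at least one required
  entityValidators: List[List[Ent] => Unit] = Nil  // optional, may throw
) {
  require(entityParsers.nonEmpty, "At least one entity parser must be defined")

  def process(text: String): List[Ent] = {
    val parsed = tokenParser(text)
    // Every enricher gets a chance to add properties to every token.
    val enriched = parsed.map(t => tokenEnrichers.foldLeft(t)((tok, enrich) => enrich(tok)))
    // A validator stops processing by throwing an exception.
    tokenValidators.foreach(validate => validate(enriched))
    val entities = entityParsers.flatMap(parse => parse(enriched))
    entityValidators.foreach(validate => validate(entities))
    entities
  }
}

val stopWords = Set("the", "a")

val toyPipeline = ToyPipeline(
  tokenParser = text => text.split("\\s+").toList.filter(_.nonEmpty).map(w => Tok(w)),
  tokenEnrichers = List(t => t.copy(props = t.props.updated("stopword", stopWords(t.text.toLowerCase)))),
  tokenValidators = List(toks => if (toks.size > 10) throw new IllegalArgumentException("Too many tokens")),
  entityParsers = List(toks => toks.filter(_.text.equalsIgnoreCase("light")).map(t => Ent("ls:loc", List(t))))
)

val found = toyPipeline.process("Switch the light")
```

| <p> |
| Modelling each stage as a function keeps the stages independent, which is the property that makes it possible to mix predefined and custom components in one pipeline. |
| </p> |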
| |
| <p> |
| Below is a <code>Model</code> creation example. |
| The <code>Pipeline</code> is prepared using the <code>NCPipelineBuilder</code> helper class. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| val pipeline = |
|     new NCPipelineBuilder(). |
|         withTokenParser(new NCFrTokenParser()). |
|         withTokenEnricher(new NCFrLemmaPosTokenEnricher()). |
|         withTokenEnricher(new NCFrStopWordsTokenEnricher()). |
|         withEntityParser(new NCFrSemanticEntityParser("lightswitch_model_fr.yaml")). |
|         build |
| val cfg = NCModelConfig("nlpcraft.lightswitch.fr.ex", "LightSwitch Example Model FR", "1.0") |
| |
| val mdl = new NCModelAdapter(cfg, pipeline) |
| </pre> |
| |
| <p> |
| This flexible system lets you create pipelines for any language. |
| You can combine predefined NlpCraft components, write your own, and easily reuse custom components. |
| </p> |
| </section> |
| <section id="model-intents"> |
| <h2 class="section-title">Model intents and callbacks <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| Each model class should contain one or more callback methods that are mapped to their intent definitions. |
| </p> |
| |
| <pre class="brush: scala, highlight: [1, 2, 5, 6]"> |
| @NCIntent("intent=ls term(act)={# == 'ls:on'} term(loc)={# == 'ls:loc'}*") |
| def onMatch( |
|     ctx: NCContext, |
|     im: NCIntentMatch, |
|     @NCIntentTerm("act") act: NCEntity, |
|     @NCIntentTerm("loc") locs: List[NCEntity] |
| ): NCResult = NCResult("OK") |
| </pre> |
| |
| <ul> |
| <li> |
| <code>Line 1</code> defines the intent <code>ls</code> with two conditions. |
| </li> |
| <li> |
| <code>Line 2</code> defines the related callback method. |
| </li> |
| <li> |
| <code>Lines 5 and 6</code> define the two callback method arguments that correspond to the |
| <code>ls</code> intent conditions. |
| </li> |
| </ul> |
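| |
| <p> |
| The two conditions of the <code>ls</code> intent can be read as "exactly one <code>ls:on</code> entity" and, because of the trailing <code>*</code>, |
| "zero or more <code>ls:loc</code> entities". The self-contained sketch below illustrates that reading with a toy matcher. |
| It is not the NlpCraft IDL engine: <code>Term</code>, <code>lsTerms</code> and <code>matchIntent</code> are hypothetical names, |
| and entities are reduced to their string IDs. |
| </p> |

```scala
// Toy stand-ins, NOT the NlpCraft API or IDL engine: all names are hypothetical.
final case class Term(id: String, accepts: String => Boolean, min: Int, max: Int)

// Mirrors the "ls" intent: exactly one 'ls:on' and any number of 'ls:loc'.
val lsTerms = List(
  Term("act", _ == "ls:on", min = 1, max = 1),
  Term("loc", _ == "ls:loc", min = 0, max = Int.MaxValue)
)

// Collects the entity IDs accepted by each term. The intent matches only
// when every term's count falls inside its [min, max] range.
def matchIntent(terms: List[Term], entityIds: List[String]): Option[Map[String, List[String]]] = {
  val assigned = terms.map(t => t.id -> entityIds.filter(t.accepts)).toMap
  val ok = terms.forall { t =>
    val n = assigned(t.id).size
    n >= t.min && n <= t.max
  }
  if (ok) Some(assigned) else None
}

val withLocations = matchIntent(lsTerms, List("ls:on", "ls:loc", "ls:loc"))
val withoutLocations = matchIntent(lsTerms, List("ls:on"))
val withoutAction = matchIntent(lsTerms, List("ls:loc"))
```

| <p> |
| This also suggests why the callback declares <code>act</code> as a single <code>NCEntity</code> and <code>locs</code> as a <code>List[NCEntity]</code>: |
| the first term yields exactly one entity, while the second may yield none or several. |
| </p> |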
| |
| <p> |
| Note that there are many ways to define intents and their callback methods. |
| They can be defined in configuration files, nested classes, etc. |
| See the <a href="intent-matching.html">Intent Matching</a> chapter for more details. |
| </p> |
| </section> |
| |
| <section id="model-behavior"> |
| <h2 class="section-title">Model behavior overriding <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| There are also several <a href="/apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> |
| callbacks that you can override to affect model behavior during |
| <a href="/intent-matching.html#model_callbacks">intent matching</a> |
| to perform logging, debugging, statistics or usage collection, explicit update or initialization of |
| conversation context, security audit or validation: |
| </p> |
| <ul> |
| <li> |
| Overriding <code>onMatchedIntent()</code> lets you reject a matched intent and continue the matching process. |
| </li> |
| <li> |
| Overriding <code>onResult()</code> lets you replace the execution result of the callback method. |
| </li> |
| <li> |
| Overriding <code>onRejection()</code> lets you change the operation result when a rejection occurs. |
| </li> |
| <li> |
| Overriding <code>onError()</code> lets you change the operation result when an error occurs. |
| </li> |
| </ul> |
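| |
| <p> |
| How these four hooks influence request handling can be sketched with a toy dispatcher. The example below is self-contained and does not use the NlpCraft API: |
| <code>ToyModel</code> and <code>dispatch</code> are hypothetical, the hook signatures are simplified assumptions, and intents are reduced to |
| an ID paired with a callback. |
| </p> |

```scala
// Toy stand-ins, NOT the NlpCraft API: names and signatures are hypothetical.
trait ToyModel {
  // Return false to reject the matched intent and try the next one.
  def onMatchedIntent(intentId: String): Boolean = true
  // Return Some(...) to replace the callback result.
  def onResult(intentId: String, result: String): Option[String] = None
  // Return Some(...) to answer instead of reporting a rejection.
  def onRejection(reason: String): Option[String] = None
  // Return Some(...) to answer instead of propagating an error.
  def onError(error: Throwable): Option[String] = None
}

// Tries candidate intents in order, consulting the model hooks at each step.
def dispatch(model: ToyModel, candidates: List[(String, () => String)]): String =
  candidates.find { case (id, _) => model.onMatchedIntent(id) } match {
    case None =>
      model.onRejection("No intent accepted")
        .getOrElse(throw new IllegalStateException("Rejected"))
    case Some((id, callback)) =>
      try {
        val result = callback()
        model.onResult(id, result).getOrElse(result)
      } catch {
        case e: Exception => model.onError(e).getOrElse(throw e)
      }
  }

val auditModel = new ToyModel {
  override def onMatchedIntent(intentId: String): Boolean = intentId != "blocked"
  override def onResult(intentId: String, result: String): Option[String] = Some(result.toUpperCase)
  override def onError(error: Throwable): Option[String] = Some("Sorry, something went wrong")
}

val candidateIntents = List[(String, () => String)](
  "blocked" -> (() => "never returned"),
  "ls" -> (() => "ok")
)

val handled = dispatch(auditModel, candidateIntents)
```

| <p> |
| In this sketch the <code>blocked</code> intent is skipped by <code>onMatchedIntent()</code>, matching continues with <code>ls</code>, and <code>onResult()</code> replaces its result before it is returned. |
| </p> |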
| </section> |
| </div> |
| <div class="col-md-2 third-column"> |
| <ul class="side-nav"> |
| <li class="side-nav-title">On This Page</li> |
| <li><a href="#overview">Overview</a></li> |
| <li><a href="#model-configuration">Model configuration</a></li> |
| <li><a href="#model-pipeline">Model pipeline</a></li> |
| <li><a href="#model-intents">Model intents and callbacks</a></li> |
| <li><a href="#model-behavior">Model behavior overriding</a></li> |
| {% include quick-links.html %} |
| </ul> |
| </div> |
| |
| |
| |
| |