| --- |
| active_crumb: Docs |
| layout: documentation |
| id: overview |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <div class="col-md-8 second-column"> |
| <section id="overview"> |
| <h2 class="section-title">Library API overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The NlpCraft library is built around two base concepts, <code>Model</code> and <code>Client</code>, represented in the API by |
| <a href="apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> and |
| <a href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a>. |
| To work with the system, you first prepare a model by configuring its parameters and defining its components. |
| You then communicate with this model via the client's methods. |
| </p> |
| |
| <ul> |
| <li> |
| <code>Model</code> is a domain-specific object responsible for interpreting user input. |
| A <code>Model</code> contains intents, defined via NlpCraft IDL, with their related code callbacks. |
| An intent is a user-defined callback together with a rule that determines when this callback should be called. |
| The rule is most often a template based on the expected set of entities in the user input, but it can be more flexible. |
| </li> |
| |
| <li> |
| <code>Client</code> is an object that lets you communicate with a given model. |
| </li> |
| </ul> |
| |
| <p>Typical usage:</p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Prepares domain model. |
| val mdl = new CustomNlpModel() |
| |
| // Prepares client for given model. |
| val client = new NCModelClient(mdl) |
| |
| // Sends a text request to the model on behalf of the user with ID "userId". |
| val result = client.ask("Some user command", "userId") |
| |
| // Clears dialog session for user with ID "userId". |
| client.clearDialog("userId") |
| </pre> |
| |
| <p> |
| <code>Model</code> definition includes two parts: |
| </p> |
| <ul> |
| <li> |
| <code>Configuration</code>. Static configuration parameters including name, version, etc. |
| </li> |
| <li> |
| <code>Pipeline</code>. The most important component, which defines the user input processing chain. |
| A <code>Pipeline</code> can combine standard components with custom, user-defined ones. |
| </li> |
| </ul> |
| </section> |
| |
| <section id="model"> |
| <h2 class="section-title">Model overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Let's start with terminology and describe the workflow. |
| </p> |
| |
| <ul> |
| <li> |
| <code>Token</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCToken.html">NCToken</a>. |
| It is a simple string, a part of the user input that is split according to some rules, |
| for instance by spaces plus additional conditions that depend on the language and on what the model expects. |
| So the user input "<b>Where is it?</b>" contains four tokens: "<b>Where</b>", "<b>is</b>", "<b>it</b>", "<b>?</b>". |
| Usually <code>tokens</code> are words and punctuation symbols, which can also carry additional |
| information such as part of speech. |
| Tokens are the input for detecting <code>entities</code>. |
| </li> |
| <li> |
| <code>Entity</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCEntity.html">NCEntity</a>. |
| According to Wikipedia, a named entity is a real-world object, such as a person, location, organization, |
| product, etc., that can be denoted with a proper name. It can be abstract or have a physical existence. |
| Each entity can contain one or more tokens. |
| Entities are the input for evaluating <a href="intent-matching.html">intent matching</a> conditions. |
| </li> |
| <li> |
| <code>Variant</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCVariant.html">NCVariant</a>. |
| A variant is a list of entities. Potentially, each token can be recognized as different entities, |
| so the user input can be processed as a set of variants. |
| For example, the user input "Mercedes" can be processed as two variants, |
| each containing a single-element list of entities: a car brand or a Spanish family name. |
| When no words are claimed by more than one <code>entity</code>, only one |
| <code>variant</code> is detected. |
| </li> |
| </ul> |
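| |
| <p> |
| To make these terms concrete, here is a minimal, self-contained sketch. It does not use the NlpCraft API: |
| <code>Token</code>, <code>Entity</code>, <code>Variant</code>, the tokenizing rule and the toy dictionary are all |
| hypothetical stand-ins chosen for illustration, and real token and entity parsers are far more elaborate. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Hypothetical stand-ins for NCToken, NCEntity and NCVariant (not the NlpCraft API). |
| final case class Token(text: String, index: Int) |
| final case class Entity(id: String, tokens: List[Token]) |
| final case class Variant(entities: List[Entity]) |
| |
| object VariantsSketch { |
|     // Naive tokenizer: every run of word characters and every single punctuation character is a token. |
|     def tokenize(text: String): List[Token] = |
|         "\\w+|[^\\w\\s]".r.findAllIn(text).toList.zipWithIndex.map { case (t, i) => Token(t, i) } |
| |
|     // Toy dictionary (an assumption): one word may map to several entity IDs. |
|     val dictionary: Map[String, List[String]] = |
|         Map("mercedes" -> List("car:brand", "person:surname")) |
| |
|     // Builds one variant per possible interpretation of a single-token input. |
|     def variants(text: String): List[Variant] = |
|         tokenize(text) match { |
|             case tok :: Nil => |
|                 dictionary.getOrElse(tok.text.toLowerCase, Nil).map(id => Variant(List(Entity(id, List(tok))))) |
|             case _ => Nil |
|         } |
| |
|     def main(args: Array[String]): Unit = { |
|         println(tokenize("Where is it?").map(_.text))           // List(Where, is, it, ?) |
|         println(variants("Mercedes").map(_.entities.map(_.id))) // List(List(car:brand), List(person:surname)) |
|     } |
| } |
| </pre> |
| |
| <p> |
| Under these assumptions, "Where is it?" yields four tokens and "Mercedes" yields two variants, each holding a single entity. |
| </p> |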
| |
| <p> |
| A <code>Model</code> must be able to do the following three things: |
| </p> |
| |
| <ul> |
| <li> |
| Parse the user input text into <code>tokens</code>. |
| They are the input for detecting <code>named entities</code>. |
| Token parsing components should be included in the <a href="#model-pipeline">Model pipeline</a>. |
| </li> |
| <li> |
| Find <code>named entities</code> based on these parsed tokens. |
| They are the input for matching <code>intents</code>. |
| Entity parsing components should be included in the <a href="#model-pipeline">Model pipeline</a>. |
| </li> |
| <li> |
| Provide callback methods that contain the business logic, |
| together with rules for matching user requests to them. |
| These callbacks, with their rules, are called <code>intents</code>. |
| A matched callback method is executed with parameters extracted from the user text input, and |
| its execution result is returned to the user as the answer to the request. |
| These callback methods should be defined in the model, or the model should hold references to them, |
| as described below. |
| </li> |
| </ul> |
| |
| <p> |
| Let's build a system that can call people from your contact list. |
| Typical commands are: "Please call to John Smith" or "Connect me with Barbara Dillan". |
| The model should be able to recognize the following entities in the user text: |
| a <code>command</code> and the <code>person</code> to apply this command to. |
| </p> |
| |
| <ul> |
| <li> |
| Parsing splits the input into the tokens ["<code>please</code>", "<code>call</code>", "<code>to</code>", "<code>john</code>", "<code>smith</code>"]. |
| </li> |
| <li> |
| From these tokens, the model should be able to find two named entities: |
| <ul> |
| <li> |
| <code>command</code> from the token <code>call</code>. |
| </li> |
| <li> |
| <code>person</code> from the tokens <code>john</code> and <code>smith</code>. |
| </li> |
| </ul> |
| </li> |
| <li> |
| Intents should also be prepared: |
| <pre class="brush: scala, highlight: [1, 2, 5, 6]"> |
| @NCIntent("intent=call term(command)={# == 'command'} term(person)={# == 'person'}") |
| def onMatch( |
| ctx: NCContext, |
| im: NCIntentMatch, |
| @NCIntentTerm("command") command: NCEntity, |
| @NCIntentTerm("person") person: NCEntity |
| ): NCResult = ... |
| </pre> |
| |
| <ul> |
| <li> |
| <code>Line 1</code> defines the intent <code>call</code> with two conditions. |
| </li> |
| <li> |
| <code>Line 2</code> defines the related callback method. |
| </li> |
| <li> |
| <code>Lines 5 and 6</code> define the two callback method arguments that correspond to the |
| <code>call</code> intent conditions. You can extract the normalized value |
| <code>john smith</code> from the <code>person</code> parameter and use it in the method body. |
| </li> |
| </ul> |
| |
| <p> |
| Note that there are many ways to define intents and their callback methods. |
| They can be defined in configuration files, nested classes, etc. |
| See the <a href="intent-matching.html">Intent Matching</a> chapter for more details. |
| </p> |
| </li> |
| </ul> |
| |
| </section> |
| |
| <section id="client"> |
| <h2 class="section-title">Client overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| Base client methods: |
| </p> |
| <ul> |
| <li> |
| <code>ask()</code> passes the user text input to the model and returns the execution |
| <a href="apis/latest/org/apache/nlpcraft/NCResult.html">NCResult</a>, or throws |
| a rejection exception if no intent is triggered. |
| <a href="apis/latest/org/apache/nlpcraft/NCResult.html">NCResult</a> is a wrapper around the |
| callback method execution result with additional information. |
| </li> |
| <li> |
| <code>debugAsk()</code> passes the user text input to the model and returns the matched callback and its parameters, or throws |
| a rejection exception if no intent is triggered. |
| The main difference from <code>ask()</code> is that the triggered intent callback method is not called. |
| This method can be useful in test scenarios. |
| </li> |
| <li> |
| <code>clearStm()</code> clears the STM (short-term memory) state. The memory is cleared entirely or according to a predicate. |
| Look at the <a href="short-term-memory.html">Conversation</a> chapter to get more details. |
| </li> |
| <li> |
| <code>clearDialog()</code> clears the dialog state. The dialog is cleared entirely or according to a predicate. |
| Look at the <a href="short-term-memory.html">Conversation</a> chapter to get more details. |
| </li> |
| <li> |
| <code>close()</code> closes the client. You can't call any other client methods after this method has been called. |
| </li> |
| </ul> |
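| |
| <p> |
| The fragment below sketches how these methods might be combined around a single request. It is an illustration under |
| assumptions, not a reference: it presumes that the rejection exception thrown by <code>ask()</code> is |
| <code>NCRejection</code> and that <code>clearStm()</code> accepts a user ID alone, as <code>clearDialog()</code> does |
| in the overview snippet. <code>handle</code> is a hypothetical helper name. Check the |
| <a href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a> API documentation for the exact signatures. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| import org.apache.nlpcraft.* |
| |
| // Hypothetical helper: sends one request and then releases everything held for the user. |
| def handle(client: NCModelClient, text: String, userId: String): Unit = |
|     try { |
|         // Runs the matched intent callback and returns its wrapped result. |
|         val result = client.ask(text, userId) |
|         println(s"Result: $result") |
|     } |
|     catch { |
|         // Assumption: a request that triggers no intent surfaces as NCRejection. |
|         case e: NCRejection => println(s"Request rejected: ${e.getMessage}") |
|     } |
|     finally { |
|         client.clearStm(userId)    // Assumption: single-argument overload clears the whole STM. |
|         client.clearDialog(userId) // Clears the whole dialog state for this user. |
|         client.close()             // No other client methods may be called after this. |
|     } |
| </pre> |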
| </section> |
| |
| <section id="model-configuration"> |
| <h2 class="section-title">Model configuration <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| <code>Model configuration</code> is a set of model parameter values. |
| Its API representation is <a href="apis/latest/org/apache/nlpcraft/NCModelConfig.html">NCModelConfig</a>. |
| </p> |
| <ul> |
| <li> |
| <code>id</code>, <code>name</code> and <code>version</code> are mandatory model descriptors. |
| </li> |
| <li> |
| <code>description</code> and <code>origin</code> are optional model descriptors. |
| </li> |
| <li> |
| <code>conversationTimeout</code> - timeout of the user's conversation. |
| If the user doesn't communicate with the model for this period, the STM is cleared. |
| Look at the <a href="short-term-memory.html">Conversation</a> chapter to get more details. |
| Mandatory parameter with a default value. |
| </li> |
| <li> |
| <code>conversationDepth</code> - maximum supported depth of the user's conversation. |
| Look at the <a href="short-term-memory.html">Conversation</a> chapter to get more details. |
| Mandatory parameter with a default value. |
| </li> |
| </ul> |
| </section> |
| |
| <section id="model-pipeline"> |
| <h2 class="section-title">Model pipeline <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| The model <code>Pipeline</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a> and |
| contains the following components: |
| </p> |
| <ul> |
| <li> |
| <code>Token parser</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCTokenParser.html">NCTokenParser</a>. |
| This mandatory pipeline component parses the plain text of the user input and splits it |
| into a list of tokens. NlpCraft provides a default EN implementation of the token parser. |
| The project also contains examples for the <a href="examples/light_switch_fr.html">French</a> and |
| <a href="examples/light_switch_ru.html">Russian</a> languages. |
| </li> |
| <li> |
| <code>Token enrichers</code> are represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCTokenEnricher.html">NCTokenEnricher</a>. |
| A token enricher is a component that adds properties to the prepared tokens, |
| such as part of speech, quote and stop-word flags, or any other. |
| NlpCraft provides a default set of EN token enricher implementations. |
| </li> |
| <li> |
| <code>Token validators</code> are represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCTokenValidator.html">NCTokenValidator</a>. |
| A token validator is a user-defined component in which tokens are inspected and an exception can be thrown |
| from user code to stop the processing of the user input. |
| </li> |
| <li> |
| <code>Entity parsers</code> are represented as a mandatory list of <a href="apis/latest/org/apache/nlpcraft/NCEntityParser.html">NCEntityParser</a>. |
| At least one entity parser must be defined. Taking the prepared tokens as input, |
| each entity parser tries to find user-defined named entities. |
| NlpCraft provides wrappers for the named-entity recognition components of the OpenNLP and Stanford libraries. |
| </li> |
| <li> |
| <code>Entity enrichers</code> are represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityEnricher.html">NCEntityEnricher</a>. |
| An entity enricher is a component that adds properties to the prepared entities. |
| It can be useful for extending the functionality of existing entity enrichers. |
| </li> |
| <li> |
| <code>Entity mappers</code> are represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityMapper.html">NCEntityMapper</a>. |
| An entity mapper is a component that maps one set of entities into another after the entities |
| have been parsed and enriched. It can be useful for building complex parsers on top of existing ones. |
| </li> |
| <li> |
| <code>Entity validators</code> are represented as an optional list of <a href="apis/latest/org/apache/nlpcraft/NCEntityValidator.html">NCEntityValidator</a>. |
| An entity validator is a user-defined component in which the prepared entities are inspected and |
| exceptions can be thrown from user code to stop the processing of the user input. |
| </li> |
| <li> |
| <code>Variant filter</code> is represented as <a href="apis/latest/org/apache/nlpcraft/NCVariantFilter.html">NCVariantFilter</a>. |
| This optional component filters the detected variants, rejecting undesirable ones. |
| </li> |
| </ul> |
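| |
| <p> |
| Conceptually, these components form a chain. The self-contained sketch below illustrates that idea with plain functions, |
| assuming the stages run in the order listed above. It does not use the NlpCraft API: <code>PipelineSketch</code>, |
| <code>Token</code>, <code>Entity</code> and the stop-word set are hypothetical, and entity enrichers, entity mappers |
| and the variant filter are left out for brevity. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| // Hypothetical, simplified stand-ins; the real NlpCraft component traits differ. |
| final case class Token(text: String, props: Map[String, String] = Map.empty) |
| final case class Entity(id: String, tokens: List[Token]) |
| |
| final case class PipelineSketch( |
|     tokenParser: String => List[Token],                 // Mandatory. |
|     tokenEnrichers: List[Token => Token] = Nil,         // Optional. |
|     tokenValidators: List[List[Token] => Unit] = Nil,   // Optional, may throw to stop processing. |
|     entityParsers: List[List[Token] => List[Entity]],   // Mandatory, at least one. |
|     entityValidators: List[List[Entity] => Unit] = Nil  // Optional, may throw to stop processing. |
| ) { |
|     require(entityParsers.nonEmpty, "At least one entity parser must be defined.") |
| |
|     def process(text: String): List[Entity] = { |
|         val parsed = tokenParser(text) |
|         // Each token passes through every enricher in turn. |
|         val enriched = parsed.map(t => tokenEnrichers.foldLeft(t)((acc, f) => f(acc))) |
|         tokenValidators.foreach(_(enriched)) |
|         // Every entity parser sees the same enriched tokens. |
|         val entities = entityParsers.flatMap(_(enriched)) |
|         entityValidators.foreach(_(entities)) |
|         entities |
|     } |
| } |
| |
| object PipelineDemo { |
|     private val stopWords = Set("please", "to") |
| |
|     val pipeline: PipelineSketch = PipelineSketch( |
|         tokenParser = txt => txt.toLowerCase.split("\\s+").toList.map(w => Token(w)), |
|         tokenEnrichers = List(t => t.copy(props = t.props + ("stopword" -> stopWords.contains(t.text).toString))), |
|         entityParsers = List(toks => toks.filter(_.text == "call").map(t => Entity("command", List(t)))) |
|     ) |
| |
|     def main(args: Array[String]): Unit = |
|         println(pipeline.process("Please call to John Smith").map(_.id)) // List(command) |
| } |
| </pre> |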
| |
| <p> |
| Below is a <code>Model</code> creation example. |
| The <code>Pipeline</code> is built using the <code>NCPipelineBuilder</code> helper class. |
| </p> |
| |
| <pre class="brush: scala, highlight: []"> |
| val pipeline = |
| new NCPipelineBuilder(). |
| withTokenParser(new NCFrTokenParser()). |
| withTokenEnricher(new NCFrLemmaPosTokenEnricher()). |
| withTokenEnricher(new NCFrStopWordsTokenEnricher()). |
| withEntityParser(new NCFrSemanticEntityParser("lightswitch_model_fr.yaml")). |
| build |
| val cfg = NCModelConfig("nlpcraft.lightswitch.fr.ex", "LightSwitch Example Model FR", "1.0") |
| |
| val mdl = new NCModelAdapter(cfg, pipeline) |
| </pre> |
| |
| <p> |
| This flexible system lets you create pipelines for any language. |
| You can combine NlpCraft's predefined components, write your own, and easily reuse custom components. |
| </p> |
| </section> |
| |
| <section id="model-behavior"> |
| <h2 class="section-title">Model behavior overriding <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| |
| <p> |
| There are also several <a href="/apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a> |
| callbacks that you can override to affect model behavior during |
| <a href="/intent-matching.html#model_callbacks">intent matching</a> |
| to perform logging, debugging, statistics or usage collection, explicit update or initialization of |
| conversation context, security audit or validation: |
| </p> |
| <ul> |
| <li> |
| Overriding <code>onMatchedIntent()</code> lets you reject a matched intent and continue the matching process. |
| </li> |
| <li> |
| Overriding <code>onResult()</code> lets you replace the callback method execution result. |
| </li> |
| <li> |
| Overriding <code>onRejection()</code> lets you change the operation result when a rejection occurs. |
| </li> |
| <li> |
| Overriding <code>onError()</code> lets you change the operation result when any error occurs. |
| </li> |
| </ul> |
| </section> |
| </div> |
| <div class="col-md-2 third-column"> |
| <ul class="side-nav"> |
| <li class="side-nav-title">On This Page</li> |
| <li><a href="#overview">Overview</a></li> |
| <li><a href="#model">Model overview</a></li> |
| <li><a href="#client">Client overview</a></li> |
| <li><a href="#model-configuration">Model configuration</a></li> |
| <li><a href="#model-pipeline">Model pipeline</a></li> |
| <li><a href="#model-behavior">Model behavior overriding</a></li> |
| {% include quick-links.html %} |
| </ul> |
| </div> |
| |
| |
| |
| |