 ---
 active_crumb: Short-Term Memory
 layout: blog
 blog_title: Short-Term Memory - Maintaining Conversation Context
 author_name: Aaron Radzinski
 author_avatar: images/lion.jpg
 author_twitter_id: aaron_radzinski
 publish_date: July 26, 2019
 ---

 <!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
 -->

 <section>
     {% include latest-ver-blog-warn.html %}
     <p>
         In this blog post, I'll give a high-level overview of STM - Short-Term Memory, a technique used to
         maintain conversational context in NLPCraft. Maintaining the proper conversation context - remembering
         what the current conversation is about - is essential for all human interaction and thus essential for
         computer-based natural language understanding. To my knowledge, NLPCraft provides one of the most advanced
         implementations of STM, especially considering how tightly it is integrated with NLPCraft's unique
         intent-based matching (Google's <a target=google href="https://cloud.google.com/dialogflow/">DialogFlow</a> is very similar, though).
     </p>
     <p>
         Let's dive in.
     </p>
 </section>
 <section>
     <h2 class="section-title">Parsing User Input <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         One of the key objectives when parsing a user input sentence for Natural Language Understanding (NLU) is to
         detect all possible semantic entities, a.k.a. <em>named entities</em>. Let's consider a few examples:
     </p>
     <ul>
         <li>
             <code>"What's the current weather in Tokyo?"</code><br/>
             This sentence is fully sufficient for the processing
             since it contains the topic <code>weather</code> as well as all necessary parameters
             like time (<code>current</code>) and location (<code>Tokyo</code>).
         </li>
         <li>
             <code>"What about Tokyo?"</code><br/>
             This is an unclear sentence since it does not have the subject of the
             question - what is it about Tokyo?
         </li>
         <li>
             <code>"What's the weather?"</code><br/>
             This is also unclear since we are missing important parameters
             of location and time for our request.
         </li>
     </ul>
     <p>
         Sometimes we can use default values like the current user's location and the current time (if they are missing).
         However, this can lead to the wrong interpretation if the conversation has an existing context.
     </p>
     <p>
         In real life, as well as in NLP-based systems, we always try to start a conversation with a fully defined
         sentence, since without a context the missing information cannot be obtained and the sentence cannot be interpreted.
     </p>
 </section>
 <section>
     <h2 class="section-title">Semantic Entities <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         Let's take a closer look at the named entities from the above examples:
     </p>
     <ul>
         <li>
             <code>weather</code> - this is an indicator of the subject of the conversation. Note that it indicates
             the type of question rather than being an entity with multiple possible values.
         </li>
         <li>
             <code>current</code> - this is an entity of type <code>Date</code> with the value of <code>now</code>.
         </li>
         <li>
             <code>Tokyo</code> - this is an entity of type <code>Location</code> with two values:
             <ul>
                 <li><code>city</code> - type of the location.</li>
                 <li><code>Tokyo, Japan</code> - normalized name of the location.</li>
             </ul>
         </li>
     </ul>
     <p>
         We have two distinct classes of entities:
     </p>
     <ul>
         <li>
             Entities that have no values and only act as indicators or types. The entity <code>weather</code> is the
             type indicator for the subject of the user input.
         </li>
         <li>
             Entities that additionally have one or more specific values like <code>current</code> and <code>Tokyo</code> entities.
         </li>
     </ul>
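     <p>
         The two classes of entities can be pictured with a small data structure. This is only an illustrative
         sketch: the <code>Entity</code> class, its field names and the <code>Subject</code> type label are
         assumptions made for this example and are not NLPCraft's actual types.
     </p>

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A detected semantic entity (hypothetical structure, for illustration only)."""
    entity_id: str
    entity_type: str
    # Empty for pure indicators such as "weather".
    values: dict = field(default_factory=dict)

    def is_indicator(self) -> bool:
        """True when the entity carries no values and only signals a type or subject."""
        return not self.values


# The three entities from "What's the current weather in Tokyo?"
weather = Entity("weather", "Subject")
current = Entity("current", "Date", {"value": "now"})
tokyo = Entity("Tokyo", "Location", {"kind": "city", "name": "Tokyo, Japan"})

for entity in (weather, current, tokyo):
    print(entity.entity_id, "indicator" if entity.is_indicator() else entity.values)
```

     <p>
         Only <code>weather</code> is reported as an indicator; the other two carry the values listed above.
     </p>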
     <div class="bq info">
         <div style="display: inline-block; margin-bottom: 20px">
             <a style="margin-right: 10px" target=_ href="https://opennlp.apache.org"><img src="/images/opennlp-logo-h32.png" alt=""></a>
             <a style="margin-right: 10px" target=_ href="https://cloud.google.com/natural-language/"><img src="/images/google-cloud-logo-small-h32.png" alt=""></a>
             <a style="margin-right: 10px" target=_ href="https://stanfordnlp.github.io/CoreNLP"><img src="/images/corenlp-logo-h48.png" alt=""></a>
             <a style="margin-right: 10px" target=_ href="https://spacy.io"><img src="/images/spacy-logo-h32.png" alt=""></a>
         </div>
         <p>
             Note that NLPCraft provides <a href="/integrations.html">support</a> for a wide variety of named entities (with all built-in ones being properly normalized)
             including <a href="/integrations.html">integrations</a> with
             <a target="spacy" href="https://spacy.io/">spaCy</a>,
             <a target="stanford" href="https://stanfordnlp.github.io/CoreNLP">Stanford CoreNLP</a>,
             <a target="opennlp" href="https://opennlp.apache.org/">OpenNLP</a> and
             <a target="google" href="https://cloud.google.com/natural-language/">Google Natural Language</a>.
         </p>
     </div>
 </section>
 <section>
     <h2 class="section-title">Incomplete Sentences <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         Assuming we have already asked about the weather in Tokyo (within the ongoing conversation), we
         could ask the following questions in a <em>shorter, incomplete</em> form:
     </p>
     <ul>
         <li>
             <code>"What about Kyoto?"</code><br/>
             This question is missing both the subject and the time. However, we
             can safely assume we are still talking about current weather.
         </li>
         <li>
             <code>"What about tomorrow?"</code><br/>
             As above, we automatically assume the weather subject but
             use <code>Kyoto</code> as the location since it was mentioned last.
         </li>
     </ul>
     <p>
         These are incomplete sentences. This type of shorthand cannot be interpreted without prior context (neither
         by humans nor by machines) since by itself it is missing necessary information.
         In the context of the conversation, however, these incomplete sentences work. We can simply provide one or two
         entities and rely on the <em>"listener"</em> to recall the rest of the missing information from a
         <em>short-term memory</em>, a.k.a. the conversation context.
     </p>
     <p>
         In NLPCraft, the intent-matching logic will automatically try to find missing information in the
         conversation context (that is automatically maintained). Moreover, it will properly treat such recalled
         information during weighted intent matching since it naturally has less "weight" than something that was
         found explicitly in the user input.
     </p>
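     <p>
         The idea of recalling missing entities with a lower weight can be sketched roughly as follows. The
         function <code>fill_slots</code>, the slot names and the weights <code>1.0</code> and <code>0.5</code>
         are assumptions made for this illustration; the actual weighting used by NLPCraft is not described here.
     </p>

```python
def fill_slots(required, detected, stm, recalled_weight=0.5):
    """Fill required slots from the user input first, then from STM.

    Returns (slots, score), or None when a required slot is found nowhere.
    Explicitly detected entities score 1.0; recalled ones score `recalled_weight`
    (both numbers are illustrative assumptions).
    """
    slots, score = {}, 0.0
    for slot in required:
        if slot in detected:
            slots[slot] = detected[slot]
            score += 1.0
        elif slot in stm:
            slots[slot] = stm[slot]
            score += recalled_weight
        else:
            return None
    return slots, score


required = ["subject", "date", "location"]
# Context left behind by "What's the current weather in Tokyo?"
stm = {"subject": "weather", "date": "now", "location": "Tokyo"}

# "What about Kyoto?" only supplies a location.
result = fill_slots(required, {"location": "Kyoto"}, stm)
print(result)

# The same question with an empty context cannot be interpreted.
print(fill_slots(required, {"location": "Kyoto"}, {}))
```

     <p>
         With the context in place the question resolves to the weather in Kyoto now, but with a lower score than
         a fully explicit sentence would get. With an empty context the function gives up.
     </p>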
 </section>
 <section>
     <h2 class="section-title">Short-Term Memory <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         The short-term memory is exactly that... a memory that keeps only a small amount of recently used information
         and that evicts its contents after a short period of inactivity.
     </p>
     <p>
         Let's look at a real-life example. If you called your friend a couple of hours later and asked <code>"What about a day after?"</code>
         (still talking about the weather in Kyoto), they would likely be thoroughly confused. The conversation has timed out, and
         your friend has lost (forgotten) its context. You will have to explain again to your confused friend what it is that you are asking about...
     </p>
     <p>
         NLPCraft has a simple rule that a 5-minute pause in the conversation resets the conversation context. However,
         what happens if we switch the topic before this timeout elapses?
     </p>
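     <p>
         A timeout-based reset of this kind might look like the minimal sketch below. The class
         <code>ShortTermMemory</code> and its methods are hypothetical names for this example, not NLPCraft's API;
         only the 5-minute value comes from the rule above. The injectable <code>clock</code> parameter is an
         assumption that makes the behavior easy to demonstrate.
     </p>

```python
import time


class ShortTermMemory:
    """Toy STM that forgets everything after a period of inactivity (illustrative only)."""

    def __init__(self, timeout_sec=5 * 60, clock=time.monotonic):
        self._timeout = timeout_sec
        self._clock = clock
        self._items = {}
        self._last_used = clock()

    def _touch(self):
        # Drop the whole context if the pause since the last access was too long.
        now = self._clock()
        if now - self._last_used > self._timeout:
            self._items.clear()
        self._last_used = now

    def put(self, key, value):
        self._touch()
        self._items[key] = value

    def get(self, key):
        self._touch()
        return self._items.get(key)


# Simulated clock so the example does not have to wait five minutes.
now = [0.0]
stm = ShortTermMemory(clock=lambda: now[0])
stm.put("location", "Kyoto")

now[0] = 60.0                      # one minute later: still remembered
recalled_soon = stm.get("location")

now[0] = 60.0 + 301.0              # a pause longer than five minutes: forgotten
recalled_late = stm.get("location")
print(recalled_soon, recalled_late)
```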
 </section>
 <section>
     <h2 class="section-title">Context Switch <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         Resetting the context on a timeout is, obviously, not a hard thing to do. What can be trickier is to detect
         when the conversation topic has switched and the previous context needs to be forgotten to avoid very
         confusing interpretation errors. It is uncanny how humans can detect such a switch with seemingly no effort, and yet
         automating this task on a computer is anything but effortless...
     </p>
     <p>
         Let's continue our weather-related conversation. All of a sudden, we ask about something completely different:
     </p>
     <ul>
         <li>
             <code>"How much is a mocha latte at Starbucks?"</code><br/>
             At this point we should forget all about the previous conversation about the weather and assume going forward
             that we are talking about coffee at Starbucks.
         </li>
         <li>
             <code>"What about Peet's?"</code><br/>
             We are talking about a latte at Peet's.
         </li>
         <li>
             <code>"...and croissant?"</code><br/>
             Asking about Peet's crescent-shaped fresh rolls.
         </li>
     </ul>
     <p>
         Despite the somewhat obvious logic, the implementation of a context switch is not an exact science. Sometimes, you
         can have a "soft" context switch where you don't change the topic of the conversation completely, but change it
         enough to forget at least some parts of the previously collected context. NLPCraft has a built-in algorithm
         to detect a hard switch in the conversation. It also exposes an API to perform a selective reset of the
         conversation in case of a "soft" switch.
     </p>
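     <p>
         The difference between a full and a selective reset can be illustrated with a predicate-based helper.
         The function <code>clear_stm</code> and the dictionary layout of the remembered entities are hypothetical
         and only stand in for whatever API the real system exposes.
     </p>

```python
def clear_stm(stm, should_forget=lambda entity: True):
    """Return the entities that survive a reset.

    With the default predicate everything is forgotten ("hard" switch);
    a narrower predicate forgets only part of the context ("soft" switch).
    """
    return [entity for entity in stm if not should_forget(entity)]


stm = [
    {"id": "weather", "type": "subject"},
    {"id": "Tokyo", "type": "location"},
    {"id": "current", "type": "date"},
]

# "Soft" switch: forget what we were talking about, keep where and when.
after_soft = clear_stm(stm, lambda e: e["type"] == "subject")

# "Hard" switch: forget everything.
after_hard = clear_stm(stm)

print([e["id"] for e in after_soft], after_hard)
```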
 </section>
 <section>
     <h2 class="section-title">Overriding Entities <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         As we've seen above, one named entity can replace or override an older entity in the STM, e.g. <code>"Peet's"</code>
         replaced <code>"Starbucks"</code> in our previous questions. <b>The actual algorithm that governs this logic is one
         of the most important parts of the STM implementation.</b> In human conversations we perform this logic seemingly
         subconsciously, but the computer algorithm to do it is not that trivial. Let's see how it is done in NLPCraft.
     </p>
     <p>
         One of the important supporting design decisions is that an entity can belong to one or more groups. You can think of
         groups as types, or classes of entities (to be mathematically precise these are the categories). The entity's
         membership in such groups is what drives the rule of overriding.
     </p>
     <p>
         Let's look at the specific example.
     </p>
     <p>
         Consider a data model that defines 3 entities:
     </p>
     <ul>
         <li>
             <code>"sell"</code> (with synonym <code>"sales"</code>)
         </li>
         <li>
             <code>"buy"</code> (with synonym <code>"purchase"</code>)
         </li>
         <li>
             <code>"best_employee"</code> (with synonyms like <code>"best"</code>, <code>"best employee"</code>, <code>"best colleague"</code>)
         </li>
     </ul>
     <p>
         Our task is to support the following conversation:
     </p>
     <ul>
         <li>
             <code>"Give me the sales data"</code><br/>
             We return sales information since we detected <code>"sell"</code> entity by its synonym <code>"sales"</code>.
         </li>
         <li>
             <code>"Who was the best?"</code><br/>
             We return the best salesperson since we detected the <code>"best_employee"</code> entity and picked up the <code>"sell"</code> entity from the STM.
         </li>
         <li>
             <code>"OK, give me the purchasing report now."</code><br/>
             This is a bit trickier. We should return general purchasing data and not the best purchasing employee.
             It feels counter-intuitive, but we should NOT take the <code>"best_employee"</code> entity from STM and, in fact, we should remove it from STM.
         </li>
         <li>
             <code>"...and who's the best there?"</code><br/>
             Now, we should return the best purchasing employee. We detected the <code>"best_employee"</code> entity and we should pick up the <code>"buy"</code> entity from STM.
         </li>
         <li>
             <code>"One more time - show me the general purchasing data again"</code><br/>
             Once again, we should return a general purchasing report and ignore (and remove) <code>"best_employee"</code> from STM.
         </li>
     </ul>
 </section>
 <section>
     <h2 class="section-title">Overriding Rule <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         Here's the rule we developed at NLPCraft and have been successfully using in various models:
     </p>
     <div class="bq success">
         <div class="bq-idea-container">
             <div><div>💡</div></div>
             <div>
                 <b>Overriding Rule</b>
                 <p>
                     A newly detected entity will override any entity or entities in STM whose group set is the same as its own group set or a superset of it.
                 </p>
             </div>
         </div>
     </div>
     <p>
         In other words, an entity with a smaller (more specific) group set will override an entity with the same
         group set or with a larger group set that contains it (a more generic one).
         Let's consider an entity that belongs to the following groups: <code>{G1, G2, G3}</code>. This entity:
     </p>
     <ul>
         <li>
             WILL override existing entity belonging to <code>{G1, G2, G3}</code> groups (same set).
         </li>
         <li>
             WILL override existing entity belonging to <code>{G1, G2, G3, G4}</code> groups (superset).
         </li>
         <li>
             WILL NOT override existing entity belonging to <code>{G1, G2}</code> groups.
         </li>
         <li>
             WILL NOT override existing entity belonging to <code>{G10, G20}</code> groups.
         </li>
     </ul>
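     <p>
         Expressed in code, the rule boils down to a subset check on group sets. The function name
         <code>overrides</code> is an assumption made for this sketch and is not part of NLPCraft's API.
     </p>

```python
def overrides(new_groups, existing_groups):
    """True when the existing entity's group set equals, or is a superset of, the new one."""
    return set(new_groups) <= set(existing_groups)


new_entity = {"G1", "G2", "G3"}

checks = {
    "same set": overrides(new_entity, {"G1", "G2", "G3"}),
    "superset": overrides(new_entity, {"G1", "G2", "G3", "G4"}),
    "subset": overrides(new_entity, {"G1", "G2"}),
    "disjoint": overrides(new_entity, {"G10", "G20"}),
}
for case, result in checks.items():
    print(case, "->", "WILL override" if result else "WILL NOT override")
```

     <p>
         The four cases correspond one-to-one to the four bullets above.
     </p>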
     <p>
         Let's come back to our sell/buy/best example. To interpret the questions we've outlined above we need to
         have the following 4 intents:
     </p>
     <ul>
         <li><code>id=sale term={# == 'sell'}</code></li>
         <li><code>id=best_sale_person term={# == 'sell'} term={# == 'best_employee'}</code></li>
         <li><code>id=buy term={# == 'buy'}</code></li>
         <li><code>id=buy_best_person term={# == 'buy'} term={# == 'best_employee'}</code></li>
     </ul>
     <p>
         (This is the actual <a href="/intent-matching.html">Intent Definition Language</a> (IDL) used by NLPCraft;
         a <code>term</code> here is basically what other systems often call a slot.)
     </p>
     <p>
         We also need to properly configure groups for our entities (names of the groups are arbitrary):
     </p>
     <ul>
         <li>Entity <code>"sell"</code> belongs to group <b>A</b></li>
         <li>Entity <code>"buy"</code> belongs to group <b>A</b> (the same group as <code>"sell"</code>, so that one overrides the other)</li>
         <li>Entity <code>"best_employee"</code> belongs to groups <b>A</b> and <b>B</b></li>
     </ul>
     <p>
         Let’s run through our example again with this configuration:
     </p>
     <ul>
         <li>
             <code>"Give me the sales data"</code>
             <ul>
                 <li>We detected entity from group <b>A</b>.</li>
                 <li>STM is empty at this point.</li>
                 <li>Return general sales report.</li>
                 <li>Store <code>"sell"</code> entity with group <b>A</b> in STM.</li>
             </ul>
         </li>
         <li>
             <code>"Who was the best?"</code>
             <ul>
                 <li>We detected entity belonging to groups <b>A</b> and <b>B</b>.</li>
                 <li>STM has entity belonging to group <b>A</b>.</li>
                 <li><b>{A, B}</b> does NOT override <b>{A}</b>.</li>
                 <li>Return best salesmen report.</li>
                 <li>Store detected <code>"best_employee"</code> entity.</li>
                 <li>STM now has two entities with <b>{A}</b> and <b>{A, B}</b> group sets.</li>
             </ul>
         </li>
         <li>
             <code>"OK, give me the purchasing report now."</code>
             <ul>
                 <li>We detected <code>"buy"</code> entity with group <b>A</b>.</li>
                 <li>STM has two entities with <b>{A}</b> and <b>{A, B}</b> group sets.</li>
                 <li><b>{A}</b> overrides both <b>{A}</b> and <b>{A, B}</b>.</li>
                 <li>Return general purchasing report.</li>
                 <li>Store <code>"buy"</code> entity with group <b>A</b> in STM.</li>
             </ul>
         </li>
     </ul>
     <p>
         And so on... easy, huh 😇 In fact, the logic is indeed relatively straightforward. It also follows
         common sense: the behavior produced by this rule matches what a human would expect.
     </p>
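     <p>
         The whole conversation can be replayed with a few lines of code. This is a simplified simulation, not
         NLPCraft's implementation: the names <code>GROUPS</code>, <code>process</code> and <code>run</code> are
         made up for this sketch, and the group assignment follows the step-by-step run above, where
         <code>"buy"</code> is treated as a member of group <b>A</b>.
     </p>

```python
# Group membership as used in the walkthrough (assumption: "buy" shares group A with "sell").
GROUPS = {
    "sell": {"A"},
    "buy": {"A"},
    "best_employee": {"A", "B"},
}


def process(stm, detected):
    """Apply the overriding rule for one detected entity and return the new STM."""
    new_groups = GROUPS[detected]
    # Drop every remembered entity whose group set is the same set or a superset.
    kept = [e for e in stm if not new_groups <= GROUPS[e]]
    return kept + [detected]


def run(conversation):
    """Replay detected entities one request at a time, recording STM after each step."""
    stm, history = [], []
    for detected in conversation:
        stm = process(stm, detected)
        history.append(list(stm))
    return history


# Entities detected in the five requests of the example conversation.
history = run(["sell", "best_employee", "buy", "best_employee", "buy"])
for step in history:
    print(step)
```

     <p>
         After each request the STM holds exactly the entities needed to pick the right intent: general sales,
         best salesperson, general purchasing, best purchaser and general purchasing again.
     </p>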
 </section>
 <section>
     <h2 class="section-title">Explicit Context Switch <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         In some cases you may need to explicitly clear the conversation STM without relying on algorithmic behavior.
         This happens when the current and the new topics of the conversation share some of the same entities. Although at first
         it sounds counter-intuitive, there are many examples of that in day-to-day life.
     </p>
     <p>
         Let’s look at this sample conversation:
     </p>
     <ul>
         <li>
             <b>Q</b>: <code>"What's the weather in Tokyo?"</code><br/>
             <b>A</b>: Current weather in Tokyo...
         </li>
         <li>
             <b>Q</b>: <code>"Let’s do New York after all then!"</code><br/>
             <b>A</b>: Without an explicit conversation reset we would return current New York weather 🤔
         </li>
     </ul>
     <p>
         The second question was about going to New York (booking tickets, etc.). In real life, your
         counterpart will likely ask what you mean by "doing New York after all" and you’ll have to explain
         the abrupt change of topic.
         You can avoid this confusion by simply saying: "Enough about weather! Let’s talk about this weekend's plans", after
         which the second question becomes clear. That sentence is an explicit context switch, which you can also detect
         in the NLPCraft model.
     </p>
     <p>
         In NLPCraft you can also explicitly reset the conversation context through the API or by switching the model on the request.
     </p>
 </section>
 <section>
     <h2 class="section-title">Summary <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     <p>
         Let’s collect all our thoughts on STM into a few bullet points:
     </p>
     <ul>
         <li>
             Missing entities in incomplete sentences can be auto-recalled from STM.
         </li>
         <li>
             A newly detected type/category entity likely indicates a change of topic.
         </li>
         <li>
             The key properties of STM are its short-term storage and its overriding rule.
         </li>
         <li>
             The explicit context switch is an important mechanism.
         </li>
     </ul>
     <div class="bq info">
         <b>
             Short-Term Memory
         </b>
         <p>
             It is uncanny how a properly implemented STM can make a conversational interface <b>feel like a normal human
             conversation</b>. It minimizes the amount of parasitic dialogs and Q&A driven interfaces
             without unnecessarily complicating the implementation of such systems.
         </p>
     </div>
 </section>
	---
	active_crumb: Short-Term Memory
	layout: blog
	blog_title: Short-Term Memory - Maintaining Conversation Context
	author_name: Aaron Radzinski
	author_avatar: images/lion.jpg
	author_twitter_id: aaron_radzinski
	publish_date: July 26, 2019
	---

	<!--
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->

	<section>
	{% include latest-ver-blog-warn.html %}
	<p>
	In this blog, I'll try to give a high-level overview of STM - Short-Term Memory, a technique used to
	maintain conversational context in NLPCraft. Maintaining the proper conversation context - remembering
	what the current conversation is about - is essential for all human interaction and thus essential for
	computer-based natural language understanding. To my knowledge, NLPCraft provides one of the most advanced
	implementations of STM, especially considering how tightly it is integrated with NLPCraft's unique
	intent-based matching (Google's <a target=google href="https://cloud.google.com/dialogflow/">DialogFlow</a> is very similar yet).
	</p>
	<p>
	Let's dive in.
	</p>
	</section>
	<section>
	<h2 class="section-title">Parsing User Input <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	One of the key objectives when parsing user input sentence for Natural Language Understanding (NLU) is to
	detect all possible semantic entities, a.k.a <em>named entities</em>. Let's consider a few examples:
	</p>
	<ul>
	<li>
	<code>"What's the current weather in Tokyo?"</code><br/>
	This sentence is fully sufficient for the processing
	since it contains the topic <code>weather</code> as well as all necessary parameters
	like time (<code>current</code>) and location (<code>Tokyo</code>).
	</li>
	<li>
	<code>"What about Tokyo?"</code><br/>
	This is an unclear sentence since it does not have the subject of the
	question - what is it about Tokyo?
	</li>
	<li>
	<code>"What's the weather?"</code><br/>
	This is also unclear since we are missing important parameters
	of location and time for our request.
	</li>
	</ul>
	<p>
	Sometimes we can use default values like the current user's location and the current time (if they are missing).
	However, this can lead to the wrong interpretation if the conversation has an existing context.
	</p>
	<p>
	In real life, as well as in NLP-based systems, we always try to start a conversation with a fully defined
	sentence since without a context the missing information cannot be obtained and the sentenced cannot be interpreted.
	</p>
	</section>
	<section>
	<h2 class="section-title">Semantic Entities <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	Let's take a closer look at the named entities from the above examples:
	</p>
	<ul>
	<li>
	<code>weather</code> - this is an indicator of the subject of the conversation. Note that it indicates
	the type of question rather than being an entity with multiple possible values.
	</li>
	<li>
	<code>current</code> - this is an entity of type <code>Date</code> with the value of <code>now</code>.
	</li>
	<li>
	<code>Tokyo</code> - this is an entity of type <code>Location</code> with two values:
	<ul>
	<li><code>city</code> - type of the location.</li>
	<li><code>Tokyo, Japan</code> - normalized name of the location.</li>
	</ul>
	</li>
	</ul>
	<p>
	We have two distinct classes of entities:
	</p>
	<ul>
	<li>
	Entities that have no values and only act as indicators or types. The entity <code>weather</code> is the
	type indicator for the subject of the user input.
	</li>
	<li>
	Entities that additionally have one or more specific values like <code>current</code> and <code>Tokyo</code> entities.
	</li>
	</ul>
	<div class="bq info">
	<div style="display: inline-block; margin-bottom: 20px">
	<a style="margin-right: 10px" target=_ href="https://opennlp.apache.org"><img src="/images/opennlp-logo-h32.png" alt=""></a>
	<a style="margin-right: 10px" target=_ href="https://cloud.google.com/natural-language/"><img src="/images/google-cloud-logo-small-h32.png" alt=""></a>
	<a style="margin-right: 10px" target=_ href="https://stanfordnlp.github.io/CoreNLP"><img src="/images/corenlp-logo-h48.png" alt=""></a>
	<a style="margin-right: 10px" target=_ href="https://spacy.io"><img src="/images/spacy-logo-h32.png" alt=""></a>
	</div>
	<p>
	Note that NLPCraft provides <a href="/integrations.html">support</a> for wide variety of named entities (with all built-in ones being properly normalized)
	including <a href="/integrations.html">integrations</a> with
	<a target="spacy" href="https://spacy.io/">spaCy</a>,
	<a target="stanford" href="https://stanfordnlp.github.io/CoreNLP">Stanford CoreNLP</a>,
	<a target="opennlp" href="https://opennlp.apache.org/">OpenNLP</a> and
	<a target="google" href="https://cloud.google.com/natural-language/">Google Natural Language</a>.
	</p>
	</div>
	</section>
	<section>
	<h2 class="section-title">Incomplete Sentences <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	Assuming previously asked questions about the weather in Tokyo (in the span of the ongoing conversation) one
	could presumably ask the following questions using a <em>shorter, incomplete</em>, form:
	</p>
	<ul>
	<li>
	<code>"What about Kyoto?</code><br/>
	This question is missing both the subject and the time. However, we
	can safely assume we are still talking about current weather.
	</li>
	<li>
	<code>"What about tomorrow?"</code><br/>
	Like above we automatically assume the weather subject but
	use <code>Kyoto</code> as the location since it was mentioned the last.
	</li>
	</ul>
	<p>
	These are incomplete sentences. This type of short-hands cannot be interpreted without prior context (neither
	by humans or by machines) since by themselves they are missing necessary information.
	In the context of the conversation, however, these incomplete sentences work. We can simply provide one or two
	entities and rely on the <em>"listener"</em> to recall the rest of missing information from a
	<em>short-term memory</em>, a.k.a conversation context.
	</p>
	<p>
	In NLPCraft, the intent-matching logic will automatically try to find missing information in the
	conversation context (that is automatically maintained). Moreover, it will properly treat such recalled
	information during weighted intent matching since it naturally has less "weight" than something that was
	found explicitly in the user input.
	</p>
	</section>
	<section>
	<h2 class="section-title">Short-Term Memory <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	The short-term memory is exactly that... a memory that keeps only small amount of recently used information
	and that evicts its contents after a short period of inactivity.
	</p>
	<p>
	Let's look at the example from a real life. If you would call your friend in a couple of hours asking <code>"What about a day after?"</code>
	(still talking about weather in Kyoto) - he'll likely be thoroughly confused. The conversation is timed out, and
	your friend has lost (forgotten) its context. You will have to explain again to your confused friend what is that you are asking about...
	</p>
	<p>
	NLPCraft has a simple rule that 5 minutes pause in conversation leads to the conversation context reset. However,
	what happens if we switch the topic before this timeout elapsed?
	</p>
	</section>
	<section>
	<h2 class="section-title">Context Switch <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	Resetting the context by the timeout is, obviously, not a hard thing to do. What can be trickier is to detect
	when conversation topic is switched and the previous context needs to be forgotten to avoid very
	confusing interpretation errors. It is uncanny how humans can detect such switch with seemingly no effort, and yet
	automating this task by the computer is anything but effortless...
	</p>
	<p>
	Let's continue our weather-related conversation. All of a sudden, we ask about something completely different:
	</p>
	<ul>
	<li>
	<code>"How much mocha latter at Starbucks?"</code><br/>
	At this point we should forget all about previous conversation about weather and assume going forward
	that we are talking about coffee in Starbucks.
	</li>
	<li>
	<code>"What about Peet's?"</code><br/>
	We are talking about latter at Peet's.
	</li>
	<li>
	<code>"...and croissant?"</code><br/>
	Asking about Peet's crescent-shaped fresh rolls.
	</li>
	</ul>
	<p>
	Despite somewhat obvious logic the implementation of context switch is not an exact science. Sometimes, you
	can have a "soft" context switch where you don't change the topic of the conversation 100% but yet sufficiently
	enough to forget at least some parts of the previously collected context. NLPCraft has a built-in algorithm
	to detect the hard switch in the conversation. It also exposes API to perform a selective reset on the
	conversation in case of "soft" switch.
	</p>
	</section>
	<section>
	<h2 class="section-title">Overriding Entities <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	As we've seen above one named entity can replace or override an older entity in the STM, e.g. <code>"Peet's"</code>
	replaced <code>"Starbucks"</code> in our previous questions. <b>The actual algorithm that governs this logic is one
	of the most important part of STM implementation.</b> In human conversations we perform this logic seemingly
	subconsciously — but the computer algorithm to do it is not that trivial. Let's see how it is done in NLPCraft.
	</p>
	<p>
	One of the important supporting design decision is that an entity can belong to one or more groups. You can think of
	groups as types, or classes of entities (to be mathematically precise these are the categories). The entity's
	membership in such groups is what drives the rule of overriding.
	</p>
	<p>
	Let's look at the specific example.
	</p>
	<p>
	Consider a data model that defined 3 entities:
	</p>
	<ul>
	<li>
	<code>"sell"</code> (with synonym <code>"sales"</code>)
	</li>
	<li>
	<code>"buy"</code> (with synonym <code>"purchase"</code>)
	</li>
	<li>
	<code>"best_employee"</code> (with synonyms like <code>"best"</code>, <code>"best employee"</code>, <code>"best colleague"</code>)
	</li>
	</ul>
	<p>
	Our task is to support the following conversation:
	</p>
	<ul>
	<li>
	<code>"Give me the sales data"</code><br/>
	We return sales information since we detected <code>"sell"</code> entity by its synonym <code>"sales"</code>.
	</li>
	<li>
	<code>"Who was the best?"</code><br/>
	We return the best salesmen since we detected <code>"best_employee"</code> and we should pick <code>"sell"</code> entity from the STM.
	</li>
	<li>
	<code>"OK, give me the purchasing report now."</code><br/>
	This is a bit trickier. We should return general purchasing data and not the best purchasing employee.
	It feels counter-intuitive, but we should NOT take the <code>"best_employee"</code> entity from STM and, in fact, we should remove it from STM.
	</li>
	<li>
	<code>"...and who's the best there?"</code><br/>
	Now, we should return the best purchasing employee. We detected <code>"best_employee"</code> entity and we should pick <code>"buy"</code> entity from STM.
	</li>
	<li>
	<code>"One more time - show me the general purchasing data again"</code><br/>
	Once again, we should return a general purchasing report and ignore (and remove) <code>"best_employee"</code> from STM.
	</li>
	</ul>
	</section>
	<section>
	<h2 class="section-title">Overriding Rule <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	Here's the rule we developed at NLPCraft and have been successfully using in various models:
	</p>
	<div class="bq success">
	<div class="bq-idea-container">
	<div><div>💡</div></div>
	<div>
	<b>Overriding Rule</b>
	<p>
	An entity will override any other entity or entities in STM whose group set is the same as, or a superset of, its own group set.
	</p>
	</div>
	</div>
	</div>
	<p>
	In other words, an entity with a smaller (more specific) group set will override an entity with the same
	or a larger (more generic) group set.
	Let's consider an entity that belongs to the following groups: <code>{G1, G2, G3}</code>. This entity:
	</p>
	<ul>
	<li>
	WILL override existing entity belonging to <code>{G1, G2, G3}</code> groups (same set).
	</li>
	<li>
	WILL override existing entity belonging to <code>{G1, G2, G3, G4}</code> groups (superset).
	</li>
	<li>
	WILL NOT override existing entity belonging to <code>{G1, G2}</code> groups (subset).
	</li>
	<li>
	WILL NOT override existing entity belonging to <code>{G10, G20}</code> groups (unrelated set).
	</li>
	</ul>
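	<p>
	The rule boils down to a single subset test. The sketch below is a minimal Python illustration, not NLPCraft's
	actual code; the function name <code>overrides</code> is an assumption made for this example.
	</p>

```python
def overrides(new_groups, existing_groups):
    """Return True if an entity with new_groups should evict one with existing_groups."""
    # The existing entity is overridden when its group set is the same set
    # as, or a superset of, the new entity's group set.
    return frozenset(new_groups).issubset(existing_groups)


incoming = {"G1", "G2", "G3"}
for existing in (
    {"G1", "G2", "G3"},        # same set
    {"G1", "G2", "G3", "G4"},  # superset
    {"G1", "G2"},              # subset
    {"G10", "G20"},            # unrelated set
):
    print(sorted(existing), overrides(incoming, existing))
```

	<p>
	The loop walks through the four cases listed above: the first two checks report an override and the last two do not.
	</p>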
	<p>
	Let's come back to our sell/buy/best example. To interpret the questions we've outlined above we need to
	have the following 4 intents:
	</p>
	<ul>
	<li><code>id=sale term={# == 'sell'}</code></li>
	<li><code>id=best_sale_person term={# == 'sell'} term={# == 'best_employee'}</code></li>
	<li><code>id=buy term={# == 'buy'}</code></li>
	<li><code>id=buy_best_person term={# == 'buy'} term={# == 'best_employee'}</code></li>
	</ul>
	<p>
	(This is the actual <a href="/intent-matching.html">Intent Definition Language</a> (IDL) used by NLPCraft;
	<code>term</code> here is roughly what other systems often call a slot.)
	</p>
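	<p>
	To see how these intents interact with STM, here is a simplified Python sketch. It assumes that each intent is
	just a set of required entity IDs, that every entity detected in the input must be used by the intent, and that
	any missing term may be recalled from STM. The names <code>INTENTS</code> and <code>candidate_intents</code> are
	hypothetical, and the real NLPCraft matching is considerably more elaborate.
	</p>

```python
# Hypothetical intent table: intent ID -> entity IDs required by its terms.
INTENTS = {
    "sale": {"sell"},
    "best_sale_person": {"sell", "best_employee"},
    "buy": {"buy"},
    "buy_best_person": {"buy", "best_employee"},
}


def candidate_intents(detected, stm):
    """List intents that use every detected entity and fill the rest from STM."""
    detected, stm = set(detected), set(stm)
    matches = []
    for intent_id, terms in INTENTS.items():
        missing = terms - detected  # terms that must be recalled from STM
        if detected.issubset(terms) and missing.issubset(stm):
            matches.append(intent_id)
    return sorted(matches)


print(candidate_intents({"sell"}, set()))                   # "Give me the sales data"
print(candidate_intents({"best_employee"}, {"sell"}))       # "Who was the best?"
print(candidate_intents({"buy"}, {"sell", "best_employee"}))  # stale STM
print(candidate_intents({"buy"}, set()))                    # STM after overriding
```

	<p>
	The third call is the interesting one: with a stale <code>"best_employee"</code> still in STM, both
	<code>buy</code> and <code>buy_best_person</code> qualify. Removing the stale entity, as in the fourth call,
	leaves only <code>buy</code>.
	</p>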
	<p>
	We also need to properly configure groups for our entities (names of the groups are arbitrary):
	</p>
	<ul>
	<li>Entity <code>"sell"</code> belongs to group <b>A</b></li>
	<li>Entity <code>"buy"</code> belongs to group <b>A</b></li>
	<li>Entity <code>"best_employee"</code> belongs to groups <b>A</b> and <b>B</b></li>
	</ul>
	<p>
	Let’s run through our example again with this configuration:
	</p>
	<ul>
	<li>
	<code>"Give me the sales data"</code>
	<ul>
	<li>We detected entity from group <b>A</b>.</li>
	<li>STM is empty at this point.</li>
	<li>Return general sales report.</li>
	<li>Store <code>"sell"</code> entity with group <b>A</b> in STM.</li>
	</ul>
	</li>
	<li>
	<code>"Who was the best?"</code>
	<ul>
	<li>We detected entity belonging to groups <b>A</b> and <b>B</b>.</li>
	<li>STM has entity belonging to group <b>A</b>.</li>
	<li><b>{A, B}</b> does NOT override <b>{A}</b>.</li>
	<li>Return best salesmen report.</li>
	<li>Store detected <code>"best_employee"</code> entity.</li>
	<li>STM now has two entities with <b>{A}</b> and <b>{A, B}</b> group sets.</li>
	</ul>
	</li>
	<li>
	<code>"OK, give me the purchasing report now."</code>
	<ul>
	<li>We detected <code>"buy"</code> entity with group <b>A</b>.</li>
	<li>STM has two entities with <b>{A}</b> and <b>{A, B}</b> group sets.</li>
	<li><b>{A}</b> overrides both <b>{A}</b> and <b>{A, B}</b>.</li>
	<li>Return general purchasing report.</li>
	<li>Store <code>"buy"</code> entity with group <b>A</b> in STM.</li>
	</ul>
	</li>
	</ul>
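	<p>
	The three steps above can be replayed with a toy STM in Python. This is only a sketch under stated assumptions:
	the <code>ShortTermMemory</code> class is hypothetical, and the group table follows the walkthrough, where both
	<code>"sell"</code> and <code>"buy"</code> carry group <b>A</b> and <code>"best_employee"</code> carries
	<b>{A, B}</b>.
	</p>

```python
# Group sets as used in the walkthrough (an assumption of this sketch).
GROUPS = {
    "sell": frozenset({"A"}),
    "buy": frozenset({"A"}),
    "best_employee": frozenset({"A", "B"}),
}


class ShortTermMemory:
    """Toy STM: remembers entity IDs and applies the overriding rule on update."""

    def __init__(self):
        self.entities = []  # entity IDs, oldest first

    def update(self, detected):
        """Evict entities overridden by detected, store it, return evicted IDs."""
        new_groups = GROUPS[detected]
        # Overridden: the stored entity's group set is the same or a superset.
        evicted = [e for e in self.entities if new_groups.issubset(GROUPS[e])]
        self.entities = [e for e in self.entities if e not in evicted]
        self.entities.append(detected)
        return evicted


stm = ShortTermMemory()
for entity in ("sell", "best_employee", "buy"):
    evicted = stm.update(entity)
    print(entity, "evicted", evicted, "STM is now", stm.entities)
```

	<p>
	After the third update the STM holds only <code>"buy"</code>: both <code>"sell"</code> and
	<code>"best_employee"</code> were evicted, which is why the purchasing question returns the general report.
	</p>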
	<p>
	And so on... easy, huh 😇 In fact, the logic is indeed relatively straightforward. It also follows
	common sense: the behavior produced by this rule matches what a human would expect.
	</p>
	</section>
	<section>
	<h2 class="section-title">Explicit Context Switch <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	In some cases you may need to explicitly clear the conversation STM without relying on algorithmic behavior.
	This happens when the current and the new topics of the conversation share some of the same entities. Although
	it sounds counter-intuitive at first, there are many examples of this in day-to-day life.
	</p>
	<p>
	Let’s look at this sample conversation:
	</p>
	<ul>
	<li>
	<b>Q</b>: <code>"What's the weather in Tokyo?"</code><br/>
	<b>A</b>: Current weather in Tokyo...
	</li>
	<li>
	<b>Q</b>: <code>"Let’s do New York after all then!"</code><br/>
	<b>A</b>: Without an explicit conversation reset we would return current New York weather 🤔
	</li>
	</ul>
	<p>
	The second question was about going to New York (booking tickets, etc.). In real life, your
	counterpart will likely ask what you mean by "doing New York after all" and you’ll have to explain
	the abrupt change of topic.
	You can avoid this confusion by simply saying: "Enough about weather! Let’s talk about this weekend's plans", after
	which the second question becomes clear. That sentence is an explicit context switch, which you can also detect
	in the NLPCraft model.
	</p>
	<p>
	In NLPCraft you can also explicitly reset the conversation context through the API or by switching the model on the request.
	</p>
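	<p>
	The Tokyo/New York dialog can be sketched in a few lines of Python. Everything here is an assumption made for
	illustration: the trigger phrases, the <code>process</code> function and the representation of STM as a plain
	dictionary keyed by entity category (which is a cruder replacement policy than the overriding rule). It is not
	how NLPCraft exposes the reset.
	</p>

```python
RESET_PHRASES = ("enough about", "let's talk about")  # hypothetical triggers


def process(text, detected, stm):
    """Return the entities used to answer text, updating stm (a dict) in place."""
    if any(phrase in text.lower() for phrase in RESET_PHRASES):
        stm.clear()  # explicit context switch: drop everything remembered
    context = {**stm, **detected}  # entities from the input win over recalled ones
    stm.update(detected)
    return context


stm = {}
process("What's the weather in Tokyo?", {"topic": "weather", "city": "Tokyo"}, stm)

# Without a reset the weather topic is silently recalled from STM.
print(process("Let's do New York after all then!", {"city": "New York"}, stm))

# An explicit switch clears STM before the new topic is stored.
process("Enough about weather! Let's talk about this weekend's plans", {"topic": "plans"}, stm)
print(process("Let's do New York after all then!", {"city": "New York"}, stm))
```

	<p>
	The same New York sentence is interpreted as a weather question the first time and as part of the weekend plans
	after the explicit switch.
	</p>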
	</section>
	<section>
	<h2 class="section-title">Summary <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
	<p>
	Let’s collect all our thoughts on STM into a few bullet points:
	</p>
	<ul>
	<li>
	Missing entities in incomplete sentences can be auto-recalled from STM.
	</li>
	<li>
	A newly detected type/category entity likely indicates a change of topic.
	</li>
	<li>
	The key properties of STM are its short-term storage and its overriding rule.
	</li>
	<li>
	The explicit context switch is an important mechanism.
	</li>
	</ul>
	<div class="bq info">
	<b>
	Short-Term Memory
	</b>
	<p>
	It is uncanny how a properly implemented STM can make a conversational interface <b>feel like a normal human
	conversation</b>. It minimizes the amount of parasitic dialogs and Q&amp;A-driven interfaces
	without unnecessarily complicating the implementation of such systems.
	</p>
	</div>
	</section>