NLPCraft provides automatic conversation management right out of the box that is fully integrated with intent matching. Conversation management is based on the idea of short-term memory (STM) - the mechanism by which NLPCraft "remembers" the context of the conversation for a certain amount of time and to a certain depth of the conversation. STM is automatically maintained by NLPCraft for each user and data model combination. The intent matching algorithm can "recall" missing tokens from STM when trying to find a match for the conversational terms.
Why is this so important? Maintaining conversation state is necessary for effective context resolution, so that users can ask, for example, the following sequence of questions using the example weather model:
User gets the current London’s weather. STM is empty at this moment so NLPCraft expects to get all necessary information from the user sentence. Meaningful parts of the sentence get stored in STM.
User gets the current Berlin’s weather. The only useful data in the user sentence is the name of the city, Berlin. But since NLPCraft now has data from the previous question in its STM it can safely deduce that we are asking about weather for today. Berlin overrides London in STM.
User gets the next week forecast for Berlin. Again, the only useful data in the user sentence is next week and forecast. STM supplies Berlin. Next week overrides today, and forecast overrides weather in STM.
Note that STM is maintained per user and per data model. The conversation management implementation is also smart enough to clear STM after a certain period of time, i.e. it “forgets” the conversational context after a few minutes of inactivity. Note also that the conversational context can be cleared explicitly using NCConversation.
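The London/Berlin exchange above can be approximated with a small sketch. This is not NLPCraft's implementation or API: the class name, the method names and the idea of keying values by a slot kind ("subject", "date", "city") are assumptions made purely for illustration, and the override rule NLPCraft actually applies is group-based, as described later in this section. The timeout is passed in rather than hard-coded, and time is supplied by the caller so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: class, method and slot names are illustrative, not NLPCraft API.
final class StmSketch {
    private final long timeoutMs;
    // One context per "user/model" key: slot kind (e.g. "city") -> last value seen.
    private final Map<String, Map<String, String>> contexts = new HashMap<>();
    private final Map<String, Long> lastSeenMs = new HashMap<>();

    StmSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Merges the slots found in a new sentence into STM and returns the resolved context. */
    Map<String, String> ask(String userId, String modelId, long nowMs, Map<String, String> found) {
        String key = userId + "/" + modelId;
        Long prev = lastSeenMs.get(key);
        if (prev != null && nowMs - prev > timeoutMs) {
            contexts.remove(key); // The context is stale: forget it.
        }
        Map<String, String> ctx = contexts.computeIfAbsent(key, k -> new HashMap<>());
        ctx.putAll(found); // Newer values replace older values of the same kind.
        lastSeenMs.put(key, nowMs);
        return new HashMap<>(ctx); // Defensive copy for the caller.
    }

    /** Explicit reset for one user/model pair. */
    void clear(String userId, String modelId) {
        String key = userId + "/" + modelId;
        contexts.remove(key);
        lastSeenMs.remove(key);
    }
}
```

With this sketch, asking for subject "weather", date "today" and city "London", and then supplying only city "Berlin" a minute later, resolves the second question to weather, today, Berlin. The same question asked after the timeout resolves to Berlin alone.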
To understand the algorithm behind the STM management let's back up a few steps...
One of the key objectives when parsing a user input sentence for Natural Language Understanding (NLU) is to detect all possible semantic entities, a.k.a. named entities. Let's consider a few examples:
"What's the current weather in Tokyo?" - this sentence contains the subject weather as well as all necessary parameters like time (current) and location (Tokyo).
"What about Tokyo?" - the subject of the question is missing.
"What's the weather?" - the location and time parameters are missing.
Sometimes we can use default values like the current user's location and the current time (if they are missing). However, this can lead to the wrong interpretation if the conversation has an existing context.
In real life, as well as in NLP-based systems, we always try to start a conversation with a fully defined sentence since without a context the missing information cannot be obtained and the sentence cannot be interpreted.
Let's take a closer look at the named entities from the above examples:
weather - this is an indicator of the subject of the conversation. Note that it indicates the type of question rather than being an entity with multiple possible values.
current - this is an entity of type Date with the value of now.
Tokyo - this is an entity of type Location with two values:
  city - type of the location.
  Tokyo, Japan - normalized name of the location.
We have two distinct classes of entities:
Type indicators: weather is the type indicator for the subject of the user input.
Entities with specific values: the current and Tokyo entities.
Assuming previously asked questions about the weather in Tokyo (in the span of the ongoing conversation) one could presumably ask the following questions using a shorter, incomplete, form:
"What about Kyoto?"
"What about tomorrow?" - here we use Kyoto as the location since it was mentioned last.
These are incomplete sentences. This type of shorthand cannot be interpreted without prior context (neither by humans nor by machines) since by themselves they are missing necessary information. In the context of the conversation, however, these incomplete sentences work. We can simply provide one or two entities and rely on the "listener" to recall the rest of the missing information from short-term memory.
In NLPCraft, the intent-matching logic will automatically try to find missing information in the conversation context (that is automatically maintained). Moreover, it will properly treat such recalled information during weighted intent matching since it naturally has less "weight" than something that was found explicitly in the user input.
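The idea of recalling missing entities and giving them less weight can be sketched as follows. Everything here is an assumption for illustration: the class and method names are hypothetical, the term names ("weather", "date", "location") are simplified, and the 2:1 weighting of explicit versus recalled entities is an arbitrary choice that only demonstrates the principle. It is not a description of NLPCraft's actual scoring.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names and the 2:1 weighting are assumptions, not NLPCraft internals.
final class RecallSketch {
    /** Outcome of matching one intent against the user input plus STM. */
    static final class Match {
        final boolean matched;
        final int fromInput; // Terms satisfied by entities found in the user sentence.
        final int fromStm;   // Terms satisfied by entities recalled from STM.

        Match(boolean matched, int fromInput, int fromStm) {
            this.matched = matched;
            this.fromInput = fromInput;
            this.fromStm = fromStm;
        }

        /** Assumed weighting: an explicit entity counts twice as much as a recalled one. */
        int weight() {
            return matched ? 2 * fromInput + fromStm : 0;
        }
    }

    /** Tries to satisfy every term from the input first and from STM second. */
    static Match match(List<String> terms, Set<String> input, Set<String> stm) {
        int explicit = 0;
        int recalled = 0;
        for (String term : terms) {
            if (input.contains(term)) {
                explicit++;
            } else if (stm.contains(term)) {
                recalled++;
            } else {
                return new Match(false, explicit, recalled); // Nothing can fill this term.
            }
        }
        return new Match(true, explicit, recalled);
    }
}
```

Under these assumptions a fully specified sentence that fills all three terms explicitly outweighs "What about Kyoto?", which fills only the location explicitly and recalls the other two terms from STM. With an empty STM the shorter question does not match at all.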
The short-term memory is exactly that - a memory that keeps only a small amount of recently used information and that evicts its contents after a short period of inactivity.
Let's look at an example from real life. If you called your friend a couple of hours later asking "What about a day after?" (still talking about weather in Kyoto) - he'll likely be thoroughly confused. The conversation has timed out, and your friend has lost (forgotten) its context. You will have to explain again to your confused friend what it is that you are asking about...
NLPCraft has a simple rule that a 5-minute pause in the conversation leads to a reset of the conversation context. However, what happens if we switch the topic before this timeout elapses?
Resetting the context by the timeout is, obviously, not a hard thing to do. What can be trickier is to detect when the conversation topic is switched and the previous context needs to be forgotten to avoid very confusing interpretation errors. It is uncanny how humans can detect such a switch with seemingly no effort, and yet automating this task on a computer is anything but effortless...
Let's continue our weather-related conversation. All of a sudden, we ask about something completely different:
"How much is mocha latte at Starbucks?"
"What about Peet's?"
"...and croissant?"
Despite the somewhat obvious logic, the implementation of a context switch is not an exact science. Sometimes you can have a "soft" context switch where you don't change the topic of the conversation entirely, but enough to forget at least some parts of the previously collected context. NLPCraft has a built-in algorithm to detect a hard switch in the conversation. You can also use NCConversation to perform a selective reset of the conversation in case of a "soft" switch.
See the NCConversation interface for API details on STM management.
As we've seen above, one named entity can replace or override an older entity in the STM, e.g. "Peet's" replaced "Starbucks" in our previous questions. The actual algorithm that governs this logic is one of the most important parts of the STM implementation. In human conversations we perform this logic seemingly subconsciously, but the computer algorithm to do it is not that trivial. Let's see how it is done in NLPCraft.
One of the important supporting design decisions is that an entity can belong to one or more groups. You can think of groups as types, or classes of entities (to be mathematically precise these are the categories). The entity's membership in such groups is what drives the rule of overriding.
Let's look at the specific example.
Consider a data model that defines 3 entities:
"sell" (with synonym "sales")
"buy" (with synonym "purchase")
"best_employee" (with synonyms like "best", "best employee", "best colleague")
Our task is to support the following conversation:
"Give me the sales data" - we detect the "sell" entity by its synonym "sales".
"Who was the best?" - we detect "best_employee" and we should pick the "sell" entity from the STM.
"OK, give me the purchasing report now." - we should not take the "best_employee" entity from STM and, in fact, we should remove it from STM.
"...and who's the best there?" - we detect the "best_employee" entity and we should pick the "buy" entity from STM.
"One more time - show me the general purchasing data again" - we should again remove "best_employee" from STM.
Here's the rule we developed at NLPCraft and have been successfully using in various models:
An entity will override any other entity or entities in STM that belong to the same group set or its superset.
In other words, an entity with a smaller group set (a more specific one) will override an entity with the same or a larger group set (a more generic one). Let's consider an entity that belongs to the following groups: {G1, G2, G3}. This entity:
will override an existing entity belonging to {G1, G2, G3} groups (same set).
will override an existing entity belonging to {G1, G2, G3, G4} groups (superset).
will not override an existing entity belonging to {G1, G2} groups.
will not override an existing entity belonging to {G10, G20} groups.
Let's come back to our sell/buy/best example. To interpret the questions we've outlined above we need to have the following 4 intents:
intent=sale term={# == 'sell'}
intent=best_sale_person term={# == 'sell'} term={# == 'best_employee'}
intent=buy term={# == 'buy'}
intent=buy_best_person term={# == 'buy'} term={# == 'best_employee'}
(This is actual Intent Definition Language (IDL) used by NLPCraft - term here is basically what's often referred to as a slot in other systems.)
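The override rule stated earlier (a new entity evicts stored entities whose group set is the same as, or a superset of, its own) reduces to a single set-containment check. The sketch below is a minimal illustration of that rule only. The class, the Entity type and the store method are hypothetical names and not part of NLPCraft's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the group-set override rule; not NLPCraft's implementation.
final class GroupOverrideSketch {
    /** Minimal stand-in for an entity: an ID plus the groups it belongs to. */
    static final class Entity {
        final String id;
        final Set<String> groups;

        Entity(String id, Set<String> groups) {
            this.id = id;
            this.groups = groups;
        }
    }

    /** True if the stored group set equals or is a superset of the incoming one. */
    static boolean overrides(Set<String> incoming, Set<String> stored) {
        return stored.containsAll(incoming);
    }

    /** Returns a new STM list with overridden entities evicted and the new one appended. */
    static List<Entity> store(List<Entity> stm, Entity incoming) {
        List<Entity> next = new ArrayList<>();
        for (Entity e : stm) {
            if (!overrides(incoming.groups, e.groups)) {
                next.add(e); // Unrelated or more specific entities survive.
            }
        }
        next.add(incoming);
        return next;
    }
}
```

For an incoming entity with groups {G1, G2, G3}, the check holds for stored sets {G1, G2, G3} and {G1, G2, G3, G4} and fails for {G1, G2} and {G10, G20}, matching the four cases listed above. Note that in this sketch an entity with an empty group set would evict everything, so a real implementation would need to decide how to treat that case.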
We also need to properly configure groups for our entities (names of the groups are arbitrary):
"sell" belongs to group A
"buy" belongs to group A
"best_employee" belongs to groups A and B
Let’s run through our example again with this configuration:
"Give me the sales data" - store the "sell" entity with group A in STM.
"Who was the best?" - store the detected "best_employee" entity.
"OK, give me the purchasing report now." - we detect the "buy" entity with group A. Store the "buy" entity with group A in STM.
In some cases you may need to explicitly clear the conversation STM without relying on algorithmic behavior. This happens when the current and new topics of the conversation share some of the same entities. Although at first it sounds counter-intuitive there are many examples of that in day-to-day life.
Let’s look at this sample conversation:
"What's the weather in Tokyo?"
"Let’s do New York after all then!"
The second question was about going to New York (booking tickets, etc.). In real life, your counterpart will likely ask what you mean by "doing New York after all" and you’ll have to explain the abrupt change in the topic. You can avoid this confusion by simply saying: "Enough about weather! Let’s talk about this weekend’s plans" - after which the second question becomes clear. That sentence is an explicit context switch which you can also detect in the NLPCraft model.
In NLPCraft you can also explicitly reset the conversation context through the NCConversation interface or by switching the model on the request.