
Natural language understanding (NLU)

tip
NLU is a field of Natural Language Processing concerned with understanding the meaning of statements made in natural language.

NLU is one of the key capabilities a chatbot must have. It enables the bot to interpret requests and respond in line with its clients’ expectations.

Chatbots apply NLU algorithms to solve two basic problems: detecting the communicative intention (intent) of their interlocutor and recognizing any mentioned named entities.
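
The two problems can be illustrated with a deliberately simple sketch. It is not the algorithm JAICP uses: it assumes a toy word-overlap (Jaccard) scorer for intents and a regular expression for a single time entity. The intent names, phrases, and helper functions are all hypothetical.

```python
import re

# Hypothetical training phrases per intent; not taken from any real project.
INTENT_PHRASES = {
    "book_flight": ["book a flight", "i need a plane ticket"],
    "check_weather": ["what is the weather", "weather forecast for today"],
}

# A toy entity: times written as HH:MM (24-hour clock).
TIME_PATTERN = re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def score(request, phrase):
    """Jaccard overlap between the token sets of the request and a phrase."""
    a, b = set(tokenize(request)), set(tokenize(phrase))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_intent(request):
    """Return (intent, score) for the best-matching training phrase."""
    return max(
        ((intent, score(request, phrase))
         for intent, phrases in INTENT_PHRASES.items()
         for phrase in phrases),
        key=lambda pair: pair[1],
    )


def extract_times(request):
    """Return every HH:MM substring found in the request."""
    return [m.group(0) for m in TIME_PATTERN.finditer(request)]


request = "Book a flight at 18:30"
print(detect_intent(request))   # best intent and its overlap score
print(extract_times(request))   # recognized time entities
```

A production classifier would rely on morphological parsing and trained models rather than raw token overlap; the sketch only shows how a request yields both an intent and a list of entities.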

NLU features

The NLU core has the following features:

  • Detecting user intents. An intent is the central object of the NLU service: it groups a set of phrases with the user intention they express and other metadata.
  • System and custom entities. An entity is a unit of the NLU core: a sequence of words linked by an intent or rule, such as a name, a date and time, or a location.
  • Client entities are entities that can be personalized by the client during a conversation with the bot. The contents of such an entity are accessible to the client only. Client entities are used when personalization is required to identify intents.
  • Patterns are formal rules that describe keywords and expressions. You can use patterns to assign a client reply to one of the existing system states that defines state-specific reactions.
  • Slot filling is the process of inquiring about additional details in order to process a client request. The data acquired during an additional data request are available for use in the script.
  • Data labeling is a tool for extracting, from loaded data, the message subjects that the bot will respond to.
  • Extended NLU settings. You can configure additional NLU options separately for each project.
  • The NLP Direct API provides a means of managing NLU.
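
Slot filling, as described in the list above, can be sketched in a few lines. This is only an illustration under stated assumptions: the slot names, the prompts, and the functions `missing_slots` and `next_question` are hypothetical and are not part of the JAICP API.

```python
def missing_slots(required, filled):
    """Return the required slot names that have no value yet, in order."""
    return [name for name in required if filled.get(name) is None]


def next_question(required, filled, prompts):
    """Return the prompt for the first missing slot, or None if all are filled."""
    missing = missing_slots(required, filled)
    return prompts[missing[0]] if missing else None


# Hypothetical request: booking a ticket needs a city and a date.
required = ["city", "date"]
prompts = {"city": "Which city?", "date": "What date?"}

filled = {"city": "Paris"}                       # recognized in the first message
print(next_question(required, filled, prompts))  # asks for the date

filled["date"] = "tomorrow"                      # acquired by the follow-up question
print(next_question(required, filled, prompts))  # nothing left to ask
```

Once no question remains, the collected values in `filled` play the role of the data that becomes available to the script.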

NLU languages

Supported languages

When you create a project, the mandatory NLU language parameter determines which language the bot will understand. For every supported language, JAICP automatically provides the following:

  • a library for tokenization and morphological parsing;
  • built-in algorithms for intent recognition;
  • a set of standard entities.
caution
All languages other than English, Russian, and Chinese only support the STS classification algorithm.

Language      Note
English       Supports paraphrasing training phrases.
Chinese       Does not support fuzzy search and normalization of NLU entities; does not support the ~ and $morph advanced pattern elements.
Danish
Dutch
French
German
Greek
Italian
Japanese      Does not support the recognition of time and numbers written out in full.
Kazakh        If date, time, or numbers are written out in full, you can recognize them using the zb.datetime and zb.number system entities.
Lithuanian    Does not support the recognition of time and numbers written out in full.
Polish
Portuguese
Romanian
Russian       Supports spell checking and paraphrasing training phrases.
Spanish
Ukrainian     Supports spell checking.

Other languages

If your project requires support for a language not provided by JAICP, you can connect an external NLU service with support for any other language and use it instead.

You can develop such a service yourself or use a third-party one. The external NLU service must comply with the Model API specification.

Bot script

NLU core parameters

By default, the NLU core parameters are specified in chatbot.yaml:

language: en
botEngine: v2

nlp:
  intentNoMatchThresholds:
    phrases: 0.2
    patterns: 0.2

The parameters are as follows:

  • language is the classifier language.

  • intentNoMatchThresholds sets the minimum similarity that a request must have to an intent's phrases or patterns. The default value for both phrases and patterns is 0.2. If the classifier cannot categorize the request, a noMatch event is triggered.

    tip

    You can also set a threshold value for patterns from the q and q! tags using the patternNoMatchThreshold parameter.
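
The effect of intentNoMatchThresholds can be sketched as follows. This is a minimal illustration, not JAICP's implementation: the function `resolve`, the candidate tuples, and the assumption that a score equal to the threshold is accepted are all hypothetical. Only the threshold values mirror the configuration above.

```python
NO_MATCH = "noMatch"

# Mirrors the nlp.intentNoMatchThresholds section of chatbot.yaml.
THRESHOLDS = {"phrases": 0.2, "patterns": 0.2}


def resolve(candidates, thresholds=THRESHOLDS):
    """Pick the best-scoring intent that clears the threshold for its source.

    candidates: list of (intent, source, score) tuples, where source is
    "phrases" or "patterns". Returns NO_MATCH if nothing clears its threshold.
    Assumption: a score exactly equal to the threshold is accepted.
    """
    accepted = [c for c in candidates if c[2] >= thresholds[c[1]]]
    if not accepted:
        return NO_MATCH
    return max(accepted, key=lambda c: c[2])[0]


# Hypothetical classifier output for two different requests.
print(resolve([("greeting", "phrases", 0.15)]))
print(resolve([("greeting", "phrases", 0.15), ("order", "patterns", 0.6)]))
```

In the first call the only candidate falls below 0.2, so the sketch returns the noMatch event name; in the second, the pattern-based candidate clears its threshold and is selected.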

Combined use of intents and patterns

JAICP allows the combined use of intents and patterns in a single script. State activation rules defined using these engines have different priorities.

The mechanism for selecting activation rules when using intents and patterns can also be redefined manually, using either: