Troubleshooting
Transformer
Performs poorly with negative examples
The classifier assumes that it knows all intents. As a result, the classifier gives a high weight to one of the intents, even though the phrase does not match any of them.
This problem is especially common with the Transformer multi classifier.
Solution
- Create a `NoMatch` intent group.
- Add intents to the group for phrases that the classifier mistakenly classifies as other intents.
  Fill this group of intents based on real conversation logs. This approach lets you gradually collect training phrases for new topics, which can later be added to the bot.
- Add an activation to the script using `intentGroup: /NoMatch`.
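The steps above can be sketched as a catch-all state in the script. This is a minimal sketch: the state name and the answer text are placeholders, the `!` marks a global activation, and the exact syntax may vary by platform version:

```
state: NoMatch
    intentGroup!: /NoMatch
    a: Sorry, I didn't understand that. Could you rephrase?
```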
Performs poorly with similar intents
The classifier might confuse intents that differ by just one entity, for example: "apply for a credit card" and "apply for a debit card", or "consent to a conversation" and "consent to an offer".
The classifier might also distribute weight across intents, which can result in none of them reaching the activation threshold.
Solution
- Use patterns instead of intents in local transitions and clarifications in the script.
- Combine similar intents into one. In the script, determine the client intent based on the triggered entity.
- Avoid using classification rules. After you filter the results, there might not be any intents left with sufficient weight.
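The first two options can be sketched as a pattern-based clarification inside the script. This is a hedged sketch: the state names, patterns, and replies are illustrative, not taken from a real project:

```
state: ApplyForCard
    intent!: /ApplyForCard
    a: Which card would you like to apply for?

    state: Credit
        q: * credit *
        a: Starting the credit card application.

    state: Debit
        q: * debit *
        a: Starting the debit card application.
```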
Does not recognize synonyms
The model supports general language synonyms, but might not recognize domain-specific ones relevant to your project.
Solution
Add more variations with different synonyms and word combinations to your training phrases.
Requests match the wrong intents due to non-essential words
A possible cause is an imbalance of non-essential words in intents.
If there are few non-essential words in the training phrases, the classifier treats those words as important for the intent.
Solution
Options:
- Add many different non-essential words to your phrases. Non-essential words should be evenly distributed across intents.
- Remove all non-essential words from the dataset and make the phrases as semantically loaded as possible.
- Create a stop word dictionary and clean user requests in the `preMatch` handler.
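The stop-word option can be prototyped as a plain function before wiring it into the `preMatch` handler. The dictionary below is a placeholder to fill from your own logs, and the handler integration itself is assumed, not shown:

```javascript
// Placeholder stop-word dictionary; populate it from real conversation logs.
const STOP_WORDS = new Set(["hello", "hi", "please", "kindly", "thanks"]);

// Remove stop words from a user request before classification.
// Punctuation attached to a stop word is stripped together with it.
function stripStopWords(query) {
  return query
    .split(/\s+/)
    .filter((word) => {
      const cleaned = word.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");
      return !STOP_WORDS.has(cleaned);
    })
    .join(" ");
}
```

Inside a real `preMatch` handler you would assign the cleaned string back to the incoming request, so the classifier only sees the meaningful words.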
Deep Learning
Long phrases are recognized with very low weight (0–0.1)
The algorithm takes into account all tokens present in the request. The more tokens there are, the more difficult it is to classify.
Solution
Train the algorithm to ignore non-essential words in a request, such as greetings and polite expressions.
- Add non-essential words and phrases that you want to ignore to your training phrases. They should be evenly distributed across intents so that the algorithm does not consider them important.
- Increase the value of the `emb_drp` parameter to artificially deactivate some weights and avoid overfitting.
Performs poorly with negative examples
The classifier assumes that it knows all intents. As a result, the classifier gives a high weight to one of the intents, even though the phrase does not match any of them.
Solution
- Create a `NoMatch` intent group.
- Add intents to the group for phrases that the classifier mistakenly classifies as other intents.
  Fill this group of intents based on real conversation logs. This approach lets you gradually collect training phrases for new topics, which can later be added to the bot.
- Add an activation to the script using `intentGroup: /NoMatch`.
Performs poorly with similar intents
The classifier might confuse intents that differ by just one entity, for example: "apply for a credit card" and "apply for a debit card", or "consent to a conversation" and "consent to an offer".
Solution
- Use patterns instead of intents in local transitions and clarifications in the script.
- Combine similar intents into one. In the script, determine the client intent based on the triggered entity.
- Avoid using classification rules. After you filter the results, there might not be any intents left with sufficient weight.
Letters of another alphabet always match the same intent
For example, requests typed in Latin letters might always match the same intent. This happens because the algorithm uses embeddings for a single language only.
Solution
- Enable the `multi` parameter. This might reduce the weight of all requests.
- Use the `$nlp.fixKeyboardLayout` method if the user has chosen the wrong keyboard layout.
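To illustrate what a keyboard-layout fix does (this is only a sketch of the idea, not the implementation of `$nlp.fixKeyboardLayout`), a request typed in the wrong layout can be remapped character by character. The QWERTY-to-ЙЦУКЕН mapping below is deliberately simplified:

```javascript
// Simplified QWERTY → ЙЦУКЕН remapping; real layout handling
// also covers uppercase letters and layout detection.
const QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,.";
const RU     = "йцукенгшщзхъфывапролджэячсмитьбю";

function fixLayout(text) {
  return [...text.toLowerCase()]
    .map((ch) => {
      const i = QWERTY.indexOf(ch);
      return i >= 0 ? RU[i] : ch; // leave unmapped characters as-is
    })
    .join("");
}
```

For example, `fixLayout("ghbdtn")` returns "привет" ("hello" typed with a Latin keyboard layout).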
Low accuracy on a small dataset
The classifier needs to see enough examples of phrases to form an understanding of the intent.
Solution
Increase the `n_epochs` value. This parameter defines how many times the classifier sees the training phrases during training.
- It affects training speed, so increasing it is not recommended for large datasets.
- If the value is too high, the model may overfit: it will perform well on dataset examples but poorly on any other input.
Does not recognize synonyms
Embeddings are trained on subwords. The model supports general language synonyms, but might not recognize domain-specific ones relevant to your project. The model might also have a poor understanding of word combinations.
Solution
Add more variations with different synonyms and word combinations to your training phrases.
Requests match the wrong intents due to non-essential words
A possible cause is an imbalance of non-essential words in intents.
If there are few non-essential words in the training phrases, the classifier treats those words as important for the intent.
Solution
Options:
- Add many different non-essential words to your phrases. Non-essential words should be evenly distributed across intents.
- Remove all non-essential words from the dataset and make the phrases as semantically loaded as possible.
- Create a stop word dictionary and clean user requests in the `preMatch` handler.
Classic ML
Performs poorly with similar intents
Only one intent can have a high enough weight.
For example, the classifier might confuse intents that differ by just one entity, such as "apply for a credit card" and "apply for a debit card", or "consent to a conversation" and "consent to an offer".
Solution
- Combine similar intents into one. In the script, determine the client intent based on the triggered entity.
- Use patterns instead of intents in local transitions and clarifications in the script.
- Use classification rules.
Most requests match the same intent
The classifier needs to see enough examples of phrases to form an understanding of the intent. Classes with more examples tend to get more weight.
Solution
Make your intents balanced:
- The dataset should not contain intents with significantly more or fewer phrases than the others.
- If a large intent covers several meanings or formulations, split it into several smaller intents.
- If smaller intents differ only by the presence of an entity, merge them into a single intent. In the script, determine the client intent based on the triggered entity.
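The entity-based approach from the last point can be sketched in the script. This is an illustrative sketch only: `CardType` is a hypothetical entity, and the state names and transitions are placeholders:

```
state: ApplyForCard
    intent!: /ApplyForCard
    script:
        // CardType is a hypothetical entity extracted by the NLU.
        if ($parseTree._CardType === "credit") {
            $reactions.transition("/ApplyForCard/Credit");
        } else {
            $reactions.transition("/ApplyForCard/Debit");
        }
```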
Does not recognize synonyms
Embeddings are built based on the dataset, so the model is not aware of general-language synonyms.
Solution
Add more variations with different synonyms and word combinations to your training phrases.
Requests match the wrong intents due to non-essential words
A possible cause is an imbalance of non-essential words in intents.
If there are few non-essential words in the training phrases, the classifier treats those words as important for the intent.
Solution
Options:
- Add many different non-essential words to your phrases. Non-essential words should be evenly distributed across intents.
- Remove all non-essential words from the dataset and make the phrases as semantically loaded as possible.
- Create a stop word dictionary and clean user requests in the `preMatch` handler.