
Speech synthesis markup

If you write bot responses for the phone channel, you can use special speech synthesis markup to control the pronunciation of words and phrases.

tip
The supported markup elements depend on the TTS provider you use.

SSML

Speech Synthesis Markup Language (SSML) is an XML-based markup language for speech synthesis. SSML allows you to customize the bot’s speech more flexibly, making it more natural and expressive.

You can check the list of supported SSML tags in the documentation of the selected provider. JAICP supports SSML for the following TTS providers:

There are several ways you can use SSML in the script:

  • The a reaction tag. Specify the tts parameter after the tag and pass the text with markup as its value:

     a: You shall not pass! || tts = "<emphasis>You</emphasis> shall not pass!"
  • Replies with the text type. Pass the text with markup in the tts property of the reply object:

    script:
        $response.replies = $response.replies || [];
        $response.replies.push({
            "type": "text",
            "text": "Cr is my favorite chemical element.",
            "tts": "<sub alias=\"Chromium\">Cr</sub> is my favorite chemical element."
        });
  • The $reactions.answer method. Pass the text with markup in the tts property of the method argument:

    script:
        $reactions.answer({
            "value": "Lucky you! You get 10% off your next purchase!",
            "tts": "Lucky you! <break time=\"1s\"/> You get ten per cent off your next purchase!"
        });
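
All three variants above pair a plain-text value with a marked-up tts value. The following is a minimal sketch of how the plain text could be derived from the marked-up string so that the two never drift apart. The helpers stripTags and buildTextReply are hypothetical names used for illustration and are not part of JAICP. The tag removal is a naive regular expression that only suits simple markup like the examples above.

```javascript
// Hypothetical helper (not part of JAICP): removes SSML tags to get display text.
// Naive on purpose: it does not handle comments, CDATA, or ">" inside attributes.
function stripTags(ssml) {
    return ssml
        .replace(/<[^>]+>/g, "")   // drop every tag, keep the text between tags
        .replace(/\s+/g, " ")      // collapse whitespace left behind by removed tags
        .trim();
}

// Hypothetical helper: builds a reply object with the same shape as the
// text reply shown above (type, text, tts).
function buildTextReply(ssml) {
    return {
        type: "text",
        text: stripTags(ssml),
        tts: ssml
    };
}

var reply = buildTextReply("<sub alias=\"Chromium\">Cr</sub> is my favorite chemical element.");
// reply.text is "Cr is my favorite chemical element."
```

In a script, the result could then be passed on with $response.replies.push(reply). Note that this approach only works when the visible text and the spoken text differ by markup alone. The 10% versus "ten per cent" example above still needs two separately written strings.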
tip

You can omit the speak tag in the marked-up text. If the speak tag is not specified, JAICP will wrap the whole text in this tag automatically.
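
The effect of this automatic wrapping can be approximated as follows. This is only an illustration of the behavior described above, not the actual JAICP implementation, and ensureSpeak is a hypothetical name.

```javascript
// Hypothetical illustration (not JAICP code): wraps marked-up text in <speak>
// unless the text already starts with a speak element.
function ensureSpeak(tts) {
    var trimmed = tts.trim();
    if (/^<speak[\s>]/.test(trimmed)) {
        return trimmed; // already wrapped, leave as is
    }
    return "<speak>" + trimmed + "</speak>";
}

var short = ensureSpeak("<emphasis>You</emphasis> shall not pass!");
var full = ensureSpeak("<speak><emphasis>You</emphasis> shall not pass!</speak>");
// Both variables now hold the same string.
```

Under this reading, the two forms of the tts value are interchangeable, so the shorter one is usually the more convenient choice.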

Simplified markup

The Yandex v1 and v3 providers also support a simplified speech synthesis markup, which is not compatible with SSML. If the simplified markup covers your needs, use it instead of SSML.

You can use the simplified markup in the a tag, the text response type, and the $reactions.answer method.

caution
If you use the simplified markup, do not pass the optional tts field in the response.

Yandex

The simplified markup is supported in TTS v1 and v3.

info

See Yandex SpeechKit documentation for a complete list of simplified markup elements.

Examples:

  • To set the stress in a word explicitly, put + before the stressed vowel:

    a: They signed the c+ontract the following day.
  • To add a pause, use the sil<[t]> tag, where t is the pause duration in milliseconds:

    a: Stop! sil<[300]> Think about it!
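
When a response is assembled in a script, the pause tag can be generated instead of typed by hand. The sketch below assumes a hypothetical helper named pause, which is not part of JAICP or Yandex SpeechKit. In line with the caution above, the resulting reply object carries the simplified markup in the text property and has no tts field.

```javascript
// Hypothetical helper (not part of JAICP): builds a sil<[t]> pause tag,
// where ms is the pause duration in milliseconds.
function pause(ms) {
    if (typeof ms !== "number" || ms < 0 || Math.floor(ms) !== ms) {
        throw new Error("Pause duration must be a non-negative integer");
    }
    return "sil<[" + ms + "]>";
}

var phrase = ["Stop!", pause(300), "Think about it!"].join(" ");
// phrase is "Stop! sil<[300]> Think about it!"

// Reply with simplified markup: the markup goes into text, and tts is omitted.
var simplifiedReply = {
    type: "text",
    text: phrase
};
```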