Speech recognition and synthesis
Bots that make and accept calls use automatic speech recognition and synthesis:
- Automatic Speech Recognition (ASR) is the process of translating speech to text.
- Text-To-Speech (TTS), or speech synthesis, is the process of generating speech from written text.
When creating a phone channel, you can do either of the following:
- Select one of the ASR/TTS providers supported by Just AI. You can then customize speech recognition and text-to-speech settings in JAICP: for example, select a recognition model or a specific voice for speech synthesis.
- Create a connection using your own account registered with the ASR/TTS provider.
Tip: If you use your own connection, Just AI ASR/TTS limits do not apply to you.
Speech synthesis markup
To make the bot’s speech more expressive, you can use speech synthesis markup. JAICP supports Speech Synthesis Markup Language (SSML), which lets you customize the tone, pronunciation, speed, and volume of speech, among other things. Learn more about SSML in Speech synthesis markup.
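As an illustration of the kind of markup involved, here is a minimal JavaScript sketch that builds an SSML string. The `speak`, `break`, and `prosody` elements come from the W3C SSML specification. Whether a particular TTS provider honors each of them is an assumption you should verify. The helper names `escapeXml` and `buildOrderPrompt` are hypothetical and are not part of JAICP.

```javascript
// Escape characters that have special meaning in XML so that
// arbitrary text cannot break the surrounding markup.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Hypothetical helper: greets the customer, pauses briefly,
// and then reads the order number at a slower rate.
function buildOrderPrompt(customerName, orderNumber) {
    return "<speak>" +
        "Hello, " + escapeXml(customerName) + "!" +
        '<break time="300ms"/>' +                 // short pause (standard SSML)
        "Your order number is " +
        '<prosody rate="slow">' + escapeXml(orderNumber) + "</prosody>." +
        "</speak>";
}
```

The returned value is an ordinary string, so it can be used wherever the bot's reply text accepts SSML.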
Speech synthesis with variables
Use speech synthesis with variables when the bot’s phrases need to include context-dependent values that are mentioned throughout the dialog. For more information, see Speech synthesis with variables.
Changing ASR and TTS settings from the script
The settings configured for the speech recognition and synthesis provider apply to all calls made through the phone channel. However, you can override them for each individual call if necessary: for example, you can switch the recognition language mid-conversation or change the voice in which the bot talks to a specific user.
To control the ASR and TTS settings from the script, use the $dialer built-in service methods. These methods let you do the following:
- Get the ASR/TTS provider name.
- Get the current ASR/TTS settings.
- Override the ASR/TTS settings.
- Specify additional ASR settings.
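The relationship between channel-level settings and per-call overrides can be sketched in plain JavaScript. This is only a conceptual illustration and does not use the $dialer API: the function `resolveCallSettings`, the setting names `lang`, `voice`, and `speed`, and their values are all hypothetical. It also assumes that overridden fields replace the channel defaults while the remaining fields are kept.

```javascript
// Hypothetical channel-level defaults; real setting names depend on the provider.
var channelTtsSettings = { lang: "en-US", voice: "voice-a", speed: 1.0 };

// Build the settings for one call: start from the channel defaults,
// then let per-call overrides replace individual fields.
// The defaults object itself is never modified.
function resolveCallSettings(defaults, overrides) {
    var result = {};
    var key;
    for (key in defaults) {
        if (Object.prototype.hasOwnProperty.call(defaults, key)) {
            result[key] = defaults[key];
        }
    }
    for (key in overrides) {
        if (Object.prototype.hasOwnProperty.call(overrides, key)) {
            result[key] = overrides[key];
        }
    }
    return result;
}

// One call uses a different voice; every other call keeps the channel defaults.
var callSettings = resolveCallSettings(channelTtsSettings, { voice: "voice-b" });
```

Copying into a new object keeps the override local to a single call, which mirrors the idea that the channel configuration stays the same for all other calls.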