# llmRequest

Beta

This reply type lets you use LLMs in the phone channel with minimal pauses. With `llmRequest`, the bot receives text from an LLM and synthesizes speech in streaming mode.

This reply is supported only in the phone channel.

To learn more about using this reply in a script, see the LLM in telephony section.
## Properties
| Property | Type | Required | Description |
|---|---|---|---|
| provider | String | Yes | LLM provider: `CAILA_OPEN_AI` or `CUSTOM_LLM`. |
| tokenSecret | String | Yes | Name of the secret for accessing an LLM. For `CUSTOM_LLM`, you can reference this token in the `headers` parameter. |
| url | String | No | API endpoint for an LLM. For example, if you want to send requests to YandexGPT, specify: `https://llm.api.cloud.yandex.net/foundationModels/v1/completion`. Specify the property only if you are using `CUSTOM_LLM`. |
| headers | Object | No | Request headers. The format of headers and authorization requirements depend on your provider. Instead of an API key in the headers, specify the name of the secret from the JAICP project. For example: `{"Authorization": "Api-Key <TOKEN_NAME>"}`, where `<TOKEN_NAME>` is the name of the secret in JAICP. The same name must be specified in the `tokenSecret` property. Specify the property only if you are using `CUSTOM_LLM`. |
| model | String | Yes | Model for text generation. The available models depend on the provider. |
| temperature | Number | No | Adjusts the creativity level of responses. At higher values, the results are more creative and less predictable. We recommend using values between `0.0` and `1.0`: these values are supported by all providers and result in consistent model performance. See your LLM provider's documentation for details about other possible values. |
| fillersPhraseConfig | Object | No | Object with pause filler settings. |
| messages | Array | Yes | Dialog history. |
| bargeInReply | Object | No | Object with barge-in settings. To pass a `bargeInReply` object in the script, create an empty reply with the `bargeInIf` parameter. See an example in the LLM in telephony article. |
## Fill pauses

When the LLM starts generating text, a pause occurs in the bot speech: the bot waits for the first sentence of the text before playing it.

You can specify a phrase that the bot says at the beginning of generation. This helps fill the pause if it is too long.

| Property | Type | Required | Description |
|---|---|---|---|
| fillersPhraseConfig.fillerPhrase | String | Yes | Phrase text. |
| fillersPhraseConfig.activationDelayMs | Number | No | Pause duration in milliseconds before the filler phrase is played. Default: `2000`. Values less than `500` might cause errors in `llmRequest`. |
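As a sketch, the filler settings are passed inside the reply object itself. Everything below is illustrative: the token name, model, phrase text, and delay are placeholders, not required values.

```javascript
// Illustrative llmRequest reply with a pause filler.
// Token name, model, and phrase text are placeholders.
var reply = {
  type: "llmRequest",
  provider: "CAILA_OPEN_AI",
  tokenSecret: "MY_TOKEN",
  model: "gpt-4o",
  // The filler phrase plays if the pause exceeds 1500 ms
  fillersPhraseConfig: {
    fillerPhrase: "One moment, let me check.",
    activationDelayMs: 1500 // keep >= 500 to avoid llmRequest errors
  },
  messages: [{ role: "user", content: "What are your opening hours?" }]
};
```

With `activationDelayMs: 1500`, the phrase plays only when generation is noticeably slow; a much shorter delay would play it on almost every request.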
## Dialog history

| Property | Type | Required | Description |
|---|---|---|---|
| messages.role | String | Yes | Participant role: `system`, `user`, or `assistant`. |
| messages.content | String | Yes | Message text. |
The `messages` field contains the dialog history that the LLM must take into account. You can get the history of a dialog between the user and the bot in this format using the `$jsapi.chatHistoryInLlmFormat` method.

History examples:
- History with only the last user request:

  ```
  [
      {"role": "user", "content": $request.query}
  ]
  ```

- History with a prompt and the previous messages:

  ```
  [
      {"role": "system", "content": "Keep answers short. A few sentences at most"},
      {"role": "user", "content": "Recommend a movie"},
      {"role": "assistant", "content": "What genre?"},
      {"role": "user", "content": "Comedy"}
  ]
  ```
History size and prompt size can affect LLM speed. If the LLM takes a long time to respond, try shortening the history or the prompt.
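One way to shorten the history is to keep the system prompt (if any) and only the last few turns. The helper below is a hypothetical sketch, not part of JAICP:

```javascript
// Hypothetical helper: trim dialog history to the system prompt
// plus the last `maxTurns` messages, to reduce LLM latency.
function trimHistory(messages, maxTurns) {
  var system = messages.filter(function (m) { return m.role === "system"; });
  var rest = messages.filter(function (m) { return m.role !== "system"; });
  return system.concat(rest.slice(-maxTurns));
}

var history = [
  { role: "system", content: "Keep answers short." },
  { role: "user", content: "Recommend a movie" },
  { role: "assistant", content: "What genre?" },
  { role: "user", content: "Comedy" }
];

// Keeps the system prompt plus the last two messages
var trimmed = trimHistory(history, 2);
```

The system prompt is kept separately because dropping it changes the model's behavior, not just its context length.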
## Function calling

Instead of generating a text response, the LLM can call a function.

- Function calling is supported only for `provider: "CUSTOM_LLM"`.
- The LLM must support function calling.
- Currently, the bot cannot use a function to end the call. For example, if a function contains `$dialer.hangUp`, it will not end the call.
| Property | Type | Required | Description |
|---|---|---|---|
| tools | Array | No | An array with descriptions of available functions. |
| eventName | String | No | The name of the event triggered if the LLM calls a function. |
Here is an example of a `tools` array with a single function description:
```
var myTools = [
    {
        "type": "function",
        "function": {
            "name": "getWeather",
            "description": "Get the current weather in the city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city to get the weather for"
                    }
                },
                "required": ["city"]
            }
        }
    }
];

$response.replies.push({
    type: "llmRequest",
    …
    tools: myTools,
    eventName: "myEvent"
});
```
Here:

- `function.name` is the function name.
- `function.description` is the function description. It helps the LLM understand the purpose of the function.
- `function.parameters` are the function parameters, provided in the JSON Schema format.
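Because `function.parameters` is plain JSON Schema, you can sanity-check the arguments the LLM returns before acting on them. The sketch below is a minimal, hand-rolled check for the `getWeather` schema; the arguments string is a hypothetical function-call payload, and the exact payload shape depends on your LLM provider.

```javascript
// Minimal sketch: check LLM-returned arguments against the
// getWeather parameters schema. Not a full JSON Schema validator.
var schema = {
  type: "object",
  properties: { city: { type: "string" } },
  required: ["city"]
};

// Verify every required property exists and has the declared type.
function checkArgs(schema, args) {
  return schema.required.every(function (key) {
    return typeof args[key] === schema.properties[key].type;
  });
}

// Hypothetical function-call payload returned by the LLM
var args = JSON.parse('{"city": "London"}');
var ok = checkArgs(schema, args);
```

A check like this only covers string and number types; for production use, a real JSON Schema validator is a better fit.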
You can find a function calling example in the LLM in telephony article.
## How to use

Below are examples of `llmRequest` for the Caila and YandexGPT providers.

Caila:
```
state: NoMatch
    event!: noMatch
    script:
        $response.replies = $response.replies || [];
        $response.replies.push({
            type: "llmRequest",
            provider: "CAILA_OPEN_AI",
            // Secret name in JAICP
            tokenSecret: "MY_TOKEN",
            // Text generation model
            model: "gpt-4o",
            // Temperature
            temperature: 0.6,
            // Pause filler
            fillersPhraseConfig: {"fillerPhrase": "Great question!", "activationDelayMs": 1000},
            // Dialog history
            messages: $jsapi.chatHistoryInLlmFormat()
        });
```
YandexGPT:

```
state: NoMatch
    event!: noMatch
    script:
        $response.replies = $response.replies || [];
        $response.replies.push({
            type: "llmRequest",
            provider: "CUSTOM_LLM",
            // Secret name in JAICP
            tokenSecret: "MY_TOKEN",
            // API endpoint for the LLM
            url: "https://llm.api.cloud.yandex.net/v1/chat/completions",
            // Request headers
            headers: {"Authorization": "Api-Key MY_TOKEN"},
            // Model for text generation; the path contains the folder ID
            model: "gpt://folder12345/yandexgpt",
            // Temperature
            temperature: 0.6,
            // Pause filler
            fillersPhraseConfig: {"fillerPhrase": "Great question!", "activationDelayMs": 1000},
            // Dialog history
            messages: $jsapi.chatHistoryInLlmFormat()
        });
```
In this example:

- `MY_TOKEN` is the name of the secret in JAICP that contains the Yandex Cloud IAM token.
- `folder12345` is the folder ID in Yandex Cloud.

For more details on working with YandexGPT, see the documentation.