
Text-to-Video Talking Head


This section describes the asynchronous API methods for generating video based on the text and the configuration you specify.

Before you start, obtain an authorization token and pass it in the CDN-AUTH-TOKEN header. The Authorization page describes the token lifetime and how to obtain the token.

Below are descriptions of API methods with examples of queries to manage video creation.



The number of API calls is limited: no more than 10 video generation requests (POST) per minute. The following limits also apply:

| Characteristic | Limit |
| --- | --- |
| Maximum video duration | 3 minutes |
| Maximum text size | 3000 characters |
| Storage period for a video file, counted from its creation | 30 days |
| Maximum number of requests per month | 1500 |
| Maximum length of video name | 35 characters |
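A client can stay under the per-minute limit by throttling its own POST requests. The following is a minimal sketch of a sliding-window limiter. The class name and its methods are hypothetical, and how the service itself counts the window is not documented here, so treat the 60-second window as an assumption.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: allow at most max_calls calls per period seconds."""

    def __init__(self, max_calls=10, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self._calls = deque()   # timestamps of recent calls, oldest first

    def wait_time(self):
        """Return seconds to wait before the next call (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self):
        """Register that a call was just made."""
        self._calls.append(self.clock())
```

Before each video generation request, sleep for `wait_time()` seconds if it is non-zero, send the request, then call `record()`.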

API methods

Get a list of available configurations


  • Request type: GET
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object
| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | List of models and backgrounds available for the account | JSON | Successful request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Request example

curl -H cdn-auth-token:${CDN_TOKEN} \

Successful response example

    "status": "ok",
    "voices": {
        "Alena": {
            "language": "ru-RU",
            "gender": "female",
            "name": "alena",
            "provider": "Yandex",
            "emotions": [
        "Sara": {
            "language": "en-EN",
            "gender": "female",
            "name": "en-US-SaraNeural",
            "provider": "Microsoft"
        "Mia": {
            "language": "en-EN",
            "gender": "female",
            "name": "en-GB-MiaNeural",
            "provider": "Microsoft"
    "models": {
        "Natalia": {
            "1.0.0": {
                "style": "suitjacket",
                "gender": "female",
                "voices": [
                "shots": {
                    "waist": {
                        "size": "HD"
            "4.0.0": {
                "style": "dressshirt",
                "gender": "female",
                "voices": [
                "shots": {
                    "waist": {
                        "size": "HD"
    "backgrounds": {
        "background_1": {
            "sizes": [
        "globe": {
            "sizes": [
    "quota": { 
        "limit": 560.0, 
        "left": 544.0 
    "watermark": false

Unsuccessful response example

{"status": "ERROR", "message": "invalid account"}

Create a task for video generation


  • Request type: POST
  • Headers: CDN-AUTH-TOKEN
  • Request body: JSON with task parameters
  • Response data type: JSON Object
| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 202 | status: type string, id: type string | JSON | The task was enqueued and assigned an ID |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 422 | Incorrect parameter names | JSON | Bad request |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Request example

export TEXT=$(cat <<-END
{
    "script": "Hello! This is the text I am speaking.",
    "actor": {
        "name": "Natalia",
        "version": "4.0.0",
        "style": "dressshirt",
        "shot": "waist",
        "size": "HD"
    },
    "voice": "Sara",
    "background": "green_screen",
    "video_name": "Awesome Video!"
}
END
)

curl -X POST \
     -H cdn-auth-token:${CDN_TOKEN} \
     -H "Content-Type: application/json" \
     -d "${TEXT}" \

Successful response example

{
    "status": "ok",
    "id": "90d70829-134f-4957-9c13-c8bf67c1678e"
}

Unsuccessful response example

{"status": "ERROR", "message": "invalid account"}

Check task status


  • Request type: GET
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object
| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | status: type string, task: type JSON | JSON | Successful request |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Description of the returned parameter task:

| Parameter name | Description |
| --- | --- |
| id | ID of the task |
| attempts | Number of attempts used during task processing |
| status | Task status: in_queue, processing, processed, canceled |
| message | Additional message about the task status |
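Because generation is asynchronous, a client usually polls the status method until the task stops changing. The sketch below shows one way to do that. `wait_for_task` and its `fetch_status` callback are hypothetical names: the callback stands in for whatever HTTP call your client makes and must return the parsed JSON response. Treating `processed` and `canceled` as final statuses is an assumption drawn from the status list above.

```python
import time

# Statuses after which the task is assumed not to change any more.
TERMINAL_STATUSES = {"processed", "canceled"}


def wait_for_task(fetch_status, task_id, interval=5.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reaches a terminal status or the timeout expires.

    fetch_status(task_id) must return the parsed JSON of the status
    response, for example {"status": "ok", "task": {...}}.
    """
    deadline = clock() + timeout
    while True:
        response = fetch_status(task_id)
        if response.get("status") != "ok":
            raise RuntimeError(response.get("message", "status request failed"))
        task = response["task"]
        # The response example shows an upper-case status while the table
        # lists lower-case values, so compare case-insensitively.
        if task["status"].lower() in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} is still {task['status']}")
        sleep(interval)
```

Keep the polling interval moderate. The documented rate limit covers POST requests only, but frequent status checks add load without making the task finish sooner.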

Request example

export TASK_ID=427c9566-2120-4b85-b168-bz4094667b99
curl -H cdn-auth-token:${CDN_TOKEN} \${TASK_ID}

Successful response example

{
    "status": "ok",
    "task": {
        "id": "3e10edaf-41c3-4210-b92f-5d68b269c20f",
        "attempts": 1,
        "status": "CANCELED",
        "message": "canceled by user request"
    }
}

Unsuccessful response example

{"status": "ERROR", "message": "invalid account"}

Cancel task


  • Request type: DELETE
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object
| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | status: type string | JSON | Successful request |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Request example

export TASK_ID=427c9566-2120-4b85-b168-bz4094667b99
curl -X DELETE \
     -H cdn-auth-token:${CDN_TOKEN} \${TASK_ID}

Successful response example

{"status": "ok"}

Unsuccessful response example

{"status": "ERROR", "message": "wrong task status: CANCELED"}


Get video download link

  • Request type: GET
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object
| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | status: type string, url: type string | JSON | Successful request |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |


You can download the video only from the same IP address that sent the request. The link is valid for 6 hours.

Request example

export TASK_ID=427c9566-2120-4b85-b168-bz4094667b99
curl -H cdn-auth-token:${CDN_TOKEN} \${TASK_ID}

Successful response example

{
    "status": "ok",
    "url": ""
}

Unsuccessful response example

{"status": "ERROR", "message": "invalid account"}

List videos


  • Request type: GET
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object

Description of request parameters:

| Parameter name | Description | Required |
| --- | --- | --- |
| start | Date from which the selection is made (inclusive). Must match the pattern year-month-dayThours:minutes:seconds+tz, where seconds equal 00. Specified in UTC. Example: 2020-02-11T12:30:00+00 | No |
| end | Date until which the selection is made (exclusive). Must match the pattern year-month-dayThours:minutes:seconds+tz, where seconds equal 00. Specified in UTC. Example: 2020-02-11T12:30:00+00 | No |
| offset | Number of results to skip from the start of the list | No |
| limit | Maximum number of results to return | No |
| sort | Field name and the sort order for that field. The sort parameter has the form: [+-] | No |
| fields | Additional fields to be returned (background, shot, script, voice_name, actor_name, model_version, video_name, emotion, orientation, crop, shift, scale, circle) | No |
| id | ID of the task | No |
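The `start` and `end` values are easy to get wrong because seconds must be zero and the offset is written as `+00`. The helper below is a sketch of how a client might format them and assemble a query string. The function names are hypothetical, and passing `fields` as a comma-separated list is an assumption, since the table does not state how multiple fields are encoded.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def format_timestamp(moment):
    """Render an aware datetime as year-month-dayThours:minutes:00+00 in UTC."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    # Convert to UTC and zero the seconds, as the parameter pattern demands.
    utc = moment.astimezone(timezone.utc).replace(second=0, microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + "+00"


def build_list_query(start=None, end=None, offset=None, limit=None, fields=None):
    """Build a URL-encoded query string from the optional list parameters."""
    params = {}
    if start is not None:
        params["start"] = format_timestamp(start)
    if end is not None:
        params["end"] = format_timestamp(end)
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    if fields:
        # Assumption: several field names are joined with commas.
        params["fields"] = ",".join(fields)
    return urlencode(params)
```

`urlencode` percent-encodes the `+` and `:` characters, which keeps the `+00` offset from being decoded as a space on the server side.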

Possible response codes:

| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | List of the generated videos sorted by the date the task was received | JSON | Successful request |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Description of response parameters:

| Parameter name | Description |
| --- | --- |
| id | ID of the task |
| video_url | URL of the generated video |
| resolution | Resolution of the video |
| duration | Duration of the video (in seconds) |
| date | Date the service received the task, in UTC |
| background | Background of the video |
| shot | Camera position in the video |
| script | SSML document or plain text voiced by the speaker |
| voice_name | Speaker's voice |
| actor_name | Speaker's name |
| model_version | Speaker's version |
| video_name | Name of the video |
| emotion | Emotion of the speaker's voice |
| orientation | Orientation of the video |
| crop | Cropping factor for the speaker's image |
| shift | Centerwise shift factor for the speaker's image |
| scale | Scale factor for resizing the speaker's image |
| circle | JSON object containing all fields sent in the same parameter to POST /generate (location, border_color, background_color). Internal fields are None if the video was generated without a circled actor |


You can download the video only from the same IP address that sent the request. The link is valid for 6 hours.

Request example

curl -H cdn-auth-token:${CDN_TOKEN} \

Successful response example

        "id": "5a22c605-1495-414f-9266-fa780f9a1c3f",
        "video_url": "",
        "duration": 30.844976480165343,
        "resolution": "SD",
        "date": "2022-06-15T21:32:53Z",
        "shot": "waist",
        "background": "globe"
        "id": "c34d1b6c-d993-446f-8e44-e5aadfabe98e",
        "video_url": "",
        "duration": 29.270747503730167,
        "resolution": "HD",
        "date": "2022-06-13T11:17:32Z",
        "shot": "waist",
        "background": "globe"

Unsuccessful response example

{"status": "ERROR", "message": "invalid account"}

Change video properties


  • Request type: PATCH
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object

Description of request parameters:

| Parameter name | Description | Required |
| --- | --- | --- |
| video_name | New video name, no longer than 35 characters | No |

Possible response codes:

| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | status: type string, message: type string | JSON | Successful request |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Request example

export TASK_ID=427c9566-2120-4b85-b168-bz4094667b99
export TEXT=$(cat <<-END
{
    "video_name": "New name"
}
END
)
curl -X PATCH \
     -H cdn-auth-token:${CDN_TOKEN} \
     -H "Content-Type: application/json" \
     -d "${TEXT}" \${TASK_ID}

Successful response example

{"status": "ok", "message": "success"}

Unsuccessful response example

{"status": "ERROR", "message": "no such video"}

Delete video


  • Request type: DELETE
  • Headers: CDN-AUTH-TOKEN
  • Response data type: JSON Object

Possible response codes:

| Response code | Response data | Response format | Description |
| --- | --- | --- | --- |
| 200 | status: type string, message: type string | JSON | Successful request |
| 400 | status: type string, message: type string | JSON | Invalid request |
| 403 | status: type string, message: type string | JSON | Forbidden |
| 404 | None | None | Not found |
| 500 | None | None | Internal Server Error |
| 503 | None | None | Service unavailable |

Request example

export TASK_ID=427c9566-2120-4b85-b168-bz4094667b99
curl -X DELETE \
     -H cdn-auth-token:${CDN_TOKEN} \${TASK_ID}

Successful response example

{"status": "ok", "message": "success"}

Unsuccessful response example

{"status": "ERROR", "message": "no such video"}

Model configuration

To enqueue a task, send a request with the model configuration in JSON format.

To find the parameters of the available models, send a request to /configurations.

Configuration description

| Parameter name | Type | Required | Description |
| --- | --- | --- | --- |
| script | string | True | Text or SSML to be synthesized. For SSML, follow the specification of the corresponding TTS provider. SSML tags let you add pauses, change pronunciation, place stress markers, and so on. For example, for the voice Alena, use the specification that Yandex provides on its website. |
| actor | JSON | True | Speaker parameters |
| voice | string | True | Name of the speaker's voice |
| background | string | False | Background name. The background size must match the size of the model's shot. If not specified, green_screen is used. |
| emotion | string | False | Emotion of the speaker's voice. If not specified, the voice has a neutral emotion. |
| video_name | string | False | Name of the generated video |
| composition | JSON | False | Parameters of the actor's composition |
| circle | JSON | False | Parameters for generating the actor in a circle |

Parameters for actor section:

| Parameter name | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | True | Speaker's name |
| version | string | True | Speaker version (a specific version or "latest"). If not specified, the latest version matching the specified parameters is used |
| style | string | False | Speaker style (optional) |
| shot | string | True | Shot type of the speaker (the size of the shot must match the size of the background) |
| size | string | False | Video resolution (SD, HD, FullHD or 4K) that the specified speaker's shot fits (optional) |

If the specified parameters are inconsistent with each other, the API returns an error message, for example:

Unsuccessful response example

{
    "message": "bad style for Natalia-latest",
    "status": "ERROR"
}
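Some mistakes can be caught on the client before a request is sent. The sketch below assembles a request body from the parameters described above and checks the two length limits stated earlier in this document (3000 characters of text, 35 characters for the video name). `build_generate_body` is a hypothetical helper, not part of the API, and whether SSML markup counts toward the text limit is not stated, so the check simply measures the whole string.

```python
import json

MAX_SCRIPT_CHARS = 3000     # "Maximum text size" limit
MAX_VIDEO_NAME_CHARS = 35   # "Maximum length of video name" limit


def build_generate_body(script, actor_name, shot, voice, version="latest",
                        style=None, size=None, background=None, video_name=None):
    """Return a dict ready to be serialized as the JSON request body."""
    if not script:
        raise ValueError("script is required")
    if len(script) > MAX_SCRIPT_CHARS:
        raise ValueError(f"script exceeds {MAX_SCRIPT_CHARS} characters")
    if video_name is not None and len(video_name) > MAX_VIDEO_NAME_CHARS:
        raise ValueError(f"video_name exceeds {MAX_VIDEO_NAME_CHARS} characters")

    # Required actor fields first, optional ones only when given.
    actor = {"name": actor_name, "version": version, "shot": shot}
    if style is not None:
        actor["style"] = style
    if size is not None:
        actor["size"] = size

    body = {"script": script, "actor": actor, "voice": voice}
    if background is not None:
        body["background"] = background
    if video_name is not None:
        body["video_name"] = video_name
    return body


if __name__ == "__main__":
    body = build_generate_body(
        "Hello! This is the text I am speaking.", "Natalia", "waist", "Sara",
        version="4.0.0", style="dressshirt", size="HD",
        background="green_screen", video_name="Awesome Video!",
    )
    print(json.dumps(body, indent=4))
```

The helper cannot tell whether a style, shot, or voice actually exists for a given speaker. That still has to be checked against the response of /configurations, or left to the API, which reports the mismatch as shown above.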

Parameters for composition section:

| Parameter name | Type | Required | Compatible styles | Description |
| --- | --- | --- | --- | --- |
| orientation | string | True | rectangle, circle | Orientation of the video. Valid values are "horizontal" and "vertical". If "vertical" is selected, the resolution is inverted. For example, a resolution of 1280x720 in horizontal mode becomes 720x1280 in vertical mode |
| frame_style | string | False | rectangle, circle | Style of the frame around the speaker. Valid values are "rectangle" and "circle". "rectangle" produces a normal video with the speaker taking the full screen. "circle" produces a video with a smaller speaker inside a circular frame in one of the video corners. Depending on the value of this parameter, other parameters may be unavailable. Default value is rectangle |
| scale | float | False | rectangle | Scale factor for resizing the speaker's image. Must be in the range (0, 1]. A value of 1 means no scaling. Default value is 1 |
| crop | float | False | rectangle | Cropping factor for the speaker's image. The speaker's resolution after cropping is video_width:video_height*crop. Shows how much of the speaker is cropped from the bottom. Must be in the range (0, 1]. Default value is 1 |
| shift | float | False | rectangle | Centerwise shift factor for the speaker's image. Must be in the range [-1, 1]. Negative values shift the speaker to the left, positive values to the right. A value of 0 means no shift. Default value is 0 |
| location | string | True | circle | Corner in which the framed speaker is placed. The value is a combination of "{upper,lower}{left,right}". For example, "upperright" places the framed speaker in the upper-right corner of the video. Default value is upperleft |
| background_color | string | False | circle | Hex representation of the frame's background color. For example, #FFFFFF makes the background white. Default value is #FFFFFF |
| border_color | string | False | circle | Hex representation of the circle's border color. For example, #FFFFFF makes the border white. Default value is #000000 |
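The ranges and compatibility rules in this table can be checked before sending a request. The sketch below covers only orientation, frame_style, scale, crop, and shift, and it also shows the width and height swap for vertical videos. Both function names are hypothetical, and the way the service itself reports violations is not described here.

```python
def validate_composition(comp):
    """Return a list of problems found in a composition dict (empty if none)."""
    problems = []

    if comp.get("orientation") not in ("horizontal", "vertical"):
        problems.append("orientation must be 'horizontal' or 'vertical'")

    frame_style = comp.get("frame_style", "rectangle")  # documented default
    if frame_style not in ("rectangle", "circle"):
        problems.append("frame_style must be 'rectangle' or 'circle'")

    # scale and crop share the half-open range (0, 1].
    for name in ("scale", "crop"):
        if name not in comp:
            continue
        if frame_style != "rectangle":
            problems.append(f"{name} is only compatible with frame_style 'rectangle'")
        elif not 0 < comp[name] <= 1:
            problems.append(f"{name} must be in the range (0, 1]")

    if "shift" in comp:
        if frame_style != "rectangle":
            problems.append("shift is only compatible with frame_style 'rectangle'")
        elif not -1 <= comp["shift"] <= 1:
            problems.append("shift must be in the range [-1, 1]")

    return problems


def output_resolution(width, height, orientation):
    """Vertical orientation swaps the width and height of the video."""
    return (height, width) if orientation == "vertical" else (width, height)
```

For example, `output_resolution(1280, 720, "vertical")` gives `(720, 1280)`, matching the example in the orientation row.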