The voice you choose shapes how your agent is perceived. XUNA AI gives you access to over 5,000 voices across 31 languages — from professional and neutral to expressive and character-driven. You can select a voice in the dashboard or via API, and override it per session for personalized experiences.

Choosing a voice

Browse the full voice library in the XUNA AI Voice Library. Voices are organized by:
  • Gender — Male, female, or neutral.
  • Age — Young, middle-aged, or old.
  • Accent — American, British, Australian, and many more.
  • Use case — Narration, conversational, customer service, etc.
  • Language — Filtered by language support.
Once you find a voice, copy its voice ID and use it in your agent configuration.
from xuna_ai import XunaAI

client = XunaAI()

agent = client.conversational_ai.agents.update(
    agent_id="your-agent-id",
    conversation_config={
        "tts": {
            "voice_id": "JBFqnCBsd6RMkjVDRZzb",  # your chosen voice ID
        }
    }
)

Supported languages

XUNA AI Conversational AI supports 31 languages for both speech recognition and synthesis. Set the agent’s primary language to ensure the ASR model is tuned for the correct language.
English, Spanish, French, German, Italian, Portuguese, Polish, Dutch, Russian, Japanese, Korean, Chinese (Mandarin), Arabic, Hindi, Turkish, Swedish, Norwegian, Danish, Finnish, Czech, Slovak, Romanian, Hungarian, Ukrainian, Greek, Bulgarian, Croatian, Catalan, Hebrew, Malay, and Indonesian.
agent = client.conversational_ai.agents.update(
    agent_id="your-agent-id",
    conversation_config={
        "agent": {
            "language": "en",  # ISO 639-1 language code
        }
    }
)

Automatic language detection

If your agent serves multilingual users, you can enable automatic language detection. The agent detects the user’s language from their first utterance and switches to matching speech recognition and synthesis automatically. Enable language detection through the tools configuration using the built-in language detection system tool.
Automatic language detection works best when users speak a full sentence. Short utterances like “hi” may not provide enough signal to detect language reliably.
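As a sketch, enabling the system tool might look like the following. The exact schema varies by SDK version, and the "system" tool type and "language_detection" tool name here are assumptions — check the tools configuration reference for your version.

```python
def build_language_detection_config(default_language: str) -> dict:
    """Build a conversation_config that enables automatic language detection.

    The default language is used for ASR until the user's first utterance
    provides enough signal to switch languages.
    """
    return {
        "agent": {
            "language": default_language,  # fallback before detection succeeds
            "prompt": {
                "tools": [
                    # Hypothetical shape for the built-in system tool
                    {"type": "system", "name": "language_detection"},
                ]
            },
        }
    }

config = build_language_detection_config("en")
```

The config can then be passed as `conversation_config` in an agents update call, as in the earlier examples.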

Voice settings

Fine-tune how the voice sounds using these settings:
  • Stability (0.0 – 1.0) — How consistent the voice sounds across sentences. Higher = more consistent, lower = more expressive.
  • Similarity boost (0.0 – 1.0) — How closely the synthesized voice matches the original voice clone.
  • Style exaggeration (0.0 – 1.0) — Amplifies the style of the voice. Use sparingly — high values can distort quality.
For customer support agents, use a stability of 0.7–0.8 and similarity boost of 0.75. This produces clear, consistent speech without sounding robotic.
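Applying the recommended customer-support values could look like the sketch below. The `stability` and `similarity_boost` key names under the `tts` block are assumptions; confirm the exact field names in the API reference.

```python
def support_agent_tts_config(voice_id: str) -> dict:
    """TTS settings tuned for customer support: clear and consistent
    without sounding robotic."""
    return {
        "tts": {
            "voice_id": voice_id,
            "stability": 0.75,         # midpoint of the 0.7-0.8 recommendation
            "similarity_boost": 0.75,  # recommended value for support agents
        }
    }

tts_config = support_agent_tts_config("JBFqnCBsd6RMkjVDRZzb")
```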

Overriding voice per session

You can change the voice for a specific conversation at session start without modifying the agent’s default configuration. This is useful when you want to personalize voices by user preference or locale. See Personalization for how to pass per-session overrides.
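Conceptually, a per-session override merges session-specific values on top of the agent's defaults without persisting them. The following is a minimal sketch of that merge; the actual override payload shape is documented in Personalization, and the voice ID used here is hypothetical.

```python
import copy

def apply_session_overrides(agent_config: dict, overrides: dict) -> dict:
    """Return a per-session config: agent defaults with overrides merged on top.

    The agent's stored configuration is left untouched.
    """
    merged = copy.deepcopy(agent_config)
    for section, values in overrides.items():
        merged.setdefault(section, {}).update(values)
    return merged

default_config = {"tts": {"voice_id": "JBFqnCBsd6RMkjVDRZzb"}}
session_config = apply_session_overrides(
    default_config,
    {"tts": {"voice_id": "voice-for-spanish-users"}},  # hypothetical voice ID
)
```

Because the merge copies the defaults first, each session can pick a voice by user preference or locale while the agent's stored configuration stays unchanged.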