Custom Word Pronunciation for AI Voice (Brand Names)
Juan Gomez
Add support for a custom word pronunciation dictionary that allows users to define how specific words (such as brand names) should be pronounced by the AI voice agent, similar to the functionality offered by HeyGen and ElevenLabs.
Problem Statement:
Currently, voice agents rely on default Text-to-Speech (TTS) pronunciation rules. This causes incorrect pronunciation of brand names, proprietary terms, and industry-specific words.
For example, our brand name Soccerfy must be pronounced “Soc-ker-fye”, but TTS engines often mispronounce it (e.g., “Soc-ker-fee”). This leads to:
- Brand inconsistency
- Reduced professionalism
- Confusion for callers
- Poor customer experience
There is currently no reliable way to enforce consistent pronunciation across all voice interactions.
Proposed Solution:
Introduce a Custom Pronunciation Dictionary that allows users to:
- Define a custom word or phrase
- Specify the correct pronunciation using:
  - Phonetic spelling (e.g., “Soc-ker-fye”)
  - IPA or SSML phoneme notation (optional)
- Apply the pronunciation globally across:
  - Voice agents
  - Call scripts
  - Dynamic AI responses
This dictionary would override the default TTS behavior whenever the custom word appears.
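To illustrate the override behavior, here is a minimal sketch of how a dictionary lookup could be applied to text before it reaches the TTS engine. The names `PRONUNCIATIONS` and `apply_pronunciations` are hypothetical, and the whole-word, case-insensitive matching is an assumption about how the feature might behave, not a description of any existing implementation.

```python
import re

# Hypothetical dictionary: display word -> phonetic respelling sent to the TTS engine.
PRONUNCIATIONS = {
    "Soccerfy": "Soc-ker-fye",
}


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace each dictionary word with its phonetic respelling.

    Matching is whole-word and case-insensitive (an assumed policy).
    """
    if not entries:
        return text
    # Longest entries first so multi-word phrases win over shorter overlapping words.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(w) for w in words) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {word.lower(): spoken for word, spoken in entries.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)


# The display text stays "Soccerfy"; only the string handed to TTS changes.
spoken_text = apply_pronunciations("Welcome to Soccerfy!", PRONUNCIATIONS)
```

Because the substitution runs on the outgoing text, it would cover scripted lines and dynamically generated AI responses alike, without editing the scripts themselves.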
Example Use Case:
Custom Word: Soccerfy
Display Text: Soccerfy
Pronunciation: soc-ker-fye
Whenever the AI agent speaks “Soccerfy,” it should consistently pronounce it as “Soc-ker-fye” without requiring manual script changes or workarounds.
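For engines that accept SSML, the same entry could be expressed with the standard `<phoneme>` element instead of a respelling. The sketch below is only an illustration: the function name `to_ssml` is hypothetical, and the IPA string is my assumed transcription of “Soc-ker-fye”, not something confirmed for any particular voice.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Assumed IPA transcription of "Soc-ker-fye"; adjust for the target voice and accent.
IPA_ENTRIES = {"Soccerfy": "ˈsɑkɚfaɪ"}


def to_ssml(text: str, entries: dict[str, str]) -> str:
    """Wrap dictionary words in SSML <phoneme> tags and XML-escape the rest."""
    if not entries:
        return "<speak>" + escape(text) + "</speak>"
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(w) for w in words) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {word.lower(): ipa for word, ipa in entries.items()}
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches so "&" or "<" cannot break the SSML.
        parts.append(escape(text[last:match.start()]))
        ipa = lookup[match.group(1).lower()]
        parts.append(
            f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>'
            f"{escape(match.group(1))}</phoneme>"
        )
        last = match.end()
    parts.append(escape(text[last:]))
    return "<speak>" + "".join(parts) + "</speak>"


ssml = to_ssml("Thanks for calling Soccerfy.", IPA_ENTRIES)
```

Whether a given TTS voice honors `<phoneme>` tags varies by provider, which is one reason the plain phonetic spelling option is worth supporting as well.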
Comparable Implementations:
HeyGen: Custom pronunciation entries for brand and proper nouns
ElevenLabs: Pronunciation dictionaries with phoneme-level control
These platforms demonstrate that this feature is technically feasible and significantly improves voice consistency.
Business Impact:
- Improves brand trust and professionalism
- Ensures consistent pronunciation across all AI agents
- Reduces manual scripting workarounds
- Enhances enterprise readiness for branded voice deployments
- Critical for agencies and businesses using AI voice at scale
Suggested Priority:
High — essential for production-grade voice agents and branded AI deployments.
Optional Enhancements (Nice to Have):
- Per-agent pronunciation dictionaries
- Per-language pronunciation rules
- UI preview / test pronunciation button
- Bulk upload of custom words
Juan Gomez
This is urgently needed, as brand names often don't follow standard pronunciation rules.