Pronunciation problem with the word "to"

I am really impressed by how well Synth V works “out of the box”, but sometimes things don’t work as expected. The word “to” is clearly pronounced like “do” and I don’t know how to fix it. This is what I get.

Are you using the Basic or Pro version?
Which Voice DB does this happen with?
Which language is set?
Do you always get this pronunciation or just randomly?
Which word/syllable comes before and/or after it?

I cannot reproduce your problem with any voice or language.


Sorry that I did not provide details. Pro version, English and Natalie.

Your question made me try a short pause before the word “to” and that made a huge difference!
Without pause:

With pause:

A few English words are contextual. The main example is “thee” vs “thuh” based on whether “the” is followed by a consonant or vowel sound. They’re based on general behaviors of English speech, but of course they can never perfectly predict what will be most suitable for a specific scenario.
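To make the idea of a contextual rule concrete, here is a minimal sketch of the “the” example. It is purely illustrative and is not how Synthesizer V implements its conversion; the function name, the vowel set, and the phoneme spellings (`dh iy`, `dh ax`) are assumptions made for the example.

```python
# Hypothetical illustration of a contextual pronunciation rule.
# This is NOT Synthesizer V's actual conversion logic.

# Assumed set of vowel phoneme symbols (Arpabet-style, lowercase).
VOWELS = {
    "aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
    "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw",
}

def pronounce_the(next_phonemes):
    """Pick phonemes for "the" from the first sound of the following word."""
    if next_phonemes and next_phonemes[0] in VOWELS:
        return ["dh", "iy"]  # "thee" before a vowel sound
    return ["dh", "ax"]      # "thuh" before a consonant sound or a pause

print(pronounce_the(["ae", "p", "ax", "l"]))  # "the apple"
print(pronounce_the(["d", "ao", "g"]))        # "the dog"
```

A rule like this only looks at the neighbouring sound, which is why it cannot know what a particular singer or song actually calls for.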

Adding space before the note changes its context. However, if you don’t want a pause before the note, you can just double-click the phonemes above the note and change the dx to t.


There must be another cause.
For me, “to” is pronounced correctly even without a pause (also Natalie English, without spaces in the syllables).


Did you enter the lyrics manually using the plugin version, or generate them automatically from an audio file using the ARA extension?

Thanks Claire,

Now that you mention it, I can see that from my screen captures. Will remove the pause/gap and see if it is enough to change dx uw to t uw. :grinning:

I used the plugin version and entered everything manually. Note by note, word by word. Full ARA integration is not yet supported in Logic. Will test some more!!

The difference in this case is likely that hummersallad has “Use relaxed consonants” enabled in the Voice panel.

The method of entering the lyrics shouldn’t make any difference.

I wasn’t able to reproduce the exact timing, but the contextual behavior seems to depend on both that setting and the length of the previous note.


That’s interesting - thx!

Claire, you nailed it! :grinning:
“Use relaxed consonants” is exactly what causes the mispronunciation. Thanks!!!
When or why should “Use relaxed consonants” be used?
Don’t know if it was selected by default or if I enabled it for some unknown reason…

According to Dreamtonics: Synthesizer V Studio 1.4.0 Update | Dreamtonics株式会社

“Use relaxed consonants” option for phoneme conversion rules that better suit American English.

In this case the “to” example is just a hard one to get right with an automatic/contextual phoneme conversion. These are all common examples of how people might pronounce it:

t uw (clearly enunciated)
t ax “tuh”
dx ax (remember dx and d are not the same, dx is the alveolar tap as in American pronunciations of “better” or “water” where the “t” is not enunciated)
dx uw (more likely if the following word starts with a vowel)


Are the different options documented somewhere or do you have to try it out?

To be clear these are just examples of how humans say the word, not how SynthV Studio converts it to phonemes.

Dreamtonics can try to make conversion rules that will be “correct” for certain accents most of the time, but there’s no way for them to always predict what each user wants to hear in each specific scenario due to dialect/accent differences and just the nature of English as a highly inconsistent language.

If you want a specific sound in SynthV, the best approach will always be to just type in the pronunciation you want instead of trying to figure out which context results in the right automatic conversion.


Yes, I understand, and it makes perfect sense, but it is not easy to “speak” phonemes. It is almost like a new language! Especially for a dyslectic like me!

I find it easier to keep “relaxed consonants” and just respell “to” when I want the strong “T.” You can do this in various ways, such as typing the English as “too” or “two” or “2,” or changing the phoneme (above in green) to “t uw.” This is easier than fixing the many over-pronounced consonants you will get if you turn off “relaxed consonants,” which sounds very stilted and unnatural. I also make a habit of creating a dictionary for a song if a certain word is used a lot, especially if there are many parts, like a choir.

There is a difference between spoken English, sung English, and proper pronunciation of individual words, so there is no setting that will automatically make singing sound natural. If you want things to sound like a real singer, you need to tweak the phonemes, either in the dictionary or on the individual note.


Thanks ThomasS,

Makes sense. Will turn “Relaxed Consonants” back on and try your strategy.

The letter “T” is tricky to get right in singing if you want to sound natural. In real singing it is sometimes a D (or dx in phonemes), which is what relaxed does to “to,” but sometimes you need the hard T, and other times it is not sounded at all.

You may have to change it yourself if relaxed does not work: “better” or “sweater” should be “bedder” or “swedder,” etc. Other times you have to eliminate the “T” from the end of words, because singers won’t really sound it: “don’t” should be “doan” and “won’t” should be “woan.” When a word ends with “T” and the next word is “you,” eliminate the “T” from the first word and change “you” to “chew” (or perhaps “Jew”), so “don’t you” should be spelled “doan chew” for singing purposes. “Want to…” should normally be sung as “wanna.” There are many other cases like this with the letter T, where you just have to use your ears to make it sound right.
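If you respell the same words over and over, the substitutions above can be collected into a small lookup that you run over the lyrics before pasting them in. The following is only a rough sketch: the `respell` function and both tables are hypothetical helpers invented for this example, not a Synthesizer V feature, and the “T before you” rule is deliberately crude, so check the result by ear.

```python
# Hypothetical lyric respelling helper; not part of Synthesizer V.
# The tables only contain the examples mentioned in this thread.

WORD_RESPELLINGS = {
    "better": "bedder",
    "sweater": "swedder",
    "don't": "doan",
    "won't": "woan",
}

PHRASE_RESPELLINGS = {
    ("want", "to"): ["wanna"],
}

def respell(lyrics):
    """Return lowercase lyrics with singing-oriented respellings applied."""
    # Normalize curly apostrophes so "don’t" matches the table key "don't".
    words = lyrics.lower().replace("\u2019", "'").split()
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        nxt = words[i + 1] if i + 1 < len(words) else None
        if (word, nxt) in PHRASE_RESPELLINGS:
            # Two-word phrase replaced as a unit, e.g. "want to" -> "wanna".
            out.extend(PHRASE_RESPELLINGS[(word, nxt)])
            i += 2
        elif nxt == "you" and word.endswith("t"):
            # Crude rule: drop the final "t" and sing "you" as "chew".
            out.extend([WORD_RESPELLINGS.get(word, word[:-1]), "chew"])
            i += 2
        else:
            out.append(WORD_RESPELLINGS.get(word, word))
            i += 1
    return " ".join(out)

print(respell("Don't you want to feel better"))
```

The “T before you” rule fires on every word ending in “t,” so it will also rewrite phrases where the hard T is what you want. Treat the output as a starting point and keep the exceptions in the tables.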